Interface StoreSerializationStrategy<T>
- Type Parameters:
T - the type of embedded objects stored in the embedding store (typically TextSegment)
- All Known Implementing Classes:
JsonStoreSerializationStrategy
Strategy interface for serializing and deserializing MemFileEmbeddingStore instances.
This interface defines the contract for converting embedding stores to and from various serialized formats (such as JSON, XML, binary, etc.) and provides methods for both string-based and file-based operations. Implementations can choose different serialization formats and strategies while maintaining a consistent API.
Design Pattern: This follows the Strategy design pattern, allowing different serialization approaches to be plugged in without changing the client code. The embedding store delegates all serialization concerns to the strategy implementation.
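The delegation described above can be sketched roughly as follows. This is a simplified illustration, not the library's code: TinyStore, TinyStrategy, CsvTinyStrategy, and TinyBackup are hypothetical stand-ins for the store, the strategy contract, one concrete format, and a client.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an embedding store; it only holds entry IDs.
final class TinyStore {
    final List<String> ids = new ArrayList<>();
}

// Hypothetical stand-in for the strategy contract.
interface TinyStrategy {
    String serialize(TinyStore store);

    TinyStore deserialize(String data);
}

// One interchangeable format: IDs joined by commas.
final class CsvTinyStrategy implements TinyStrategy {
    @Override
    public String serialize(TinyStore store) {
        if (store == null) {
            throw new IllegalArgumentException("store must not be null");
        }
        return String.join(",", store.ids);
    }

    @Override
    public TinyStore deserialize(String data) {
        if (data == null || data.isBlank()) {
            throw new IllegalArgumentException("data must not be null or blank");
        }
        TinyStore store = new TinyStore();
        for (String id : data.split(",")) {
            store.ids.add(id);
        }
        return store;
    }
}

// Client code depends only on the interface, so the format can be swapped
// without touching this class.
final class TinyBackup {
    static TinyStore roundTrip(TinyStrategy strategy, TinyStore store) {
        return strategy.deserialize(strategy.serialize(store));
    }
}
```

Swapping CsvTinyStrategy for another TinyStrategy implementation would change the format without any change to TinyBackup, which is the property the Strategy pattern is meant to provide.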
Serialization Scope: Implementations should serialize the embedding store's metadata and structure, including:
- All embedding vectors and their associated IDs
- References to embedded content files (chunk file paths)
- Configuration settings (chunk storage directory, cache size)
- Any other metadata necessary to fully restore the store's state
Important Note: The actual embedded content (e.g., TextSegment objects) stored in separate chunk files is typically NOT included in the serialized data. Only references to these files are serialized. This design keeps the serialized data compact while requiring that the original chunk files remain accessible for full functionality after deserialization.
Thread Safety: Implementations should be thread-safe for concurrent serialization operations, but individual method calls may not be atomic. Callers should ensure appropriate synchronization when serializing stores that are being modified concurrently.
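One possible way for callers to provide that synchronization is a read/write lock around the store. The sketch below is an assumption about caller-side code, not part of this API: GuardedStore is a hypothetical wrapper, and the serializer function stands in for a call such as strategy.serialize(store).

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical wrapper that guards a mutable store with a read/write lock.
final class GuardedStore<S> {
    private final S store;
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    GuardedStore(S store) {
        this.store = store;
    }

    // Mutations take the write lock, so no snapshot can observe a
    // half-applied change.
    void mutate(Consumer<S> change) {
        lock.writeLock().lock();
        try {
            change.accept(store);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Serialization takes the read lock: several snapshots may run in
    // parallel, but never alongside a mutation.
    String snapshot(Function<S, String> serializer) {
        lock.readLock().lock();
        try {
            return serializer.apply(store);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

This only helps if every mutation of the store goes through the wrapper; direct access to the underlying store bypasses the lock.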
Error Handling: All methods may throw RuntimeException or its subclasses to indicate
serialization/deserialization failures. Implementations should provide meaningful error messages
and preserve stack traces for debugging.
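As an illustration of a meaningful message with a preserved stack trace, an implementation might wrap checked I/O exceptions as shown below. ErrorWrappingSketch and readSerialized are hypothetical names, and the choice of UncheckedIOException (a RuntimeException subclass) is an assumption, not something this interface prescribes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ErrorWrappingSketch {
    // Reads serialized store data from disk, converting the checked
    // IOException into an unchecked exception.
    static String readSerialized(Path file) {
        if (file == null) {
            throw new IllegalArgumentException("file must not be null");
        }
        try {
            return Files.readString(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Name the file in the message and pass the original exception
            // as the cause so its stack trace is preserved.
            throw new UncheckedIOException(
                    "Failed to read serialized store from " + file, e);
        }
    }
}
```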
Example usage:
// Choose a serialization strategy
StoreSerializationStrategy<TextSegment> strategy = new JsonStoreSerializationStrategy<>();
// Create and populate an embedding store
MemFileEmbeddingStore<TextSegment> store = new MemFileEmbeddingStore<>();
store.add(embedding, textSegment);
// Serialize to string
String serializedData = strategy.serialize(store);
// Serialize to file
strategy.serializeToFile(store, Paths.get("backup.json"));
// Deserialize from string
MemFileEmbeddingStore<TextSegment> restoredStore = strategy.deserialize(serializedData);
// Deserialize from file
MemFileEmbeddingStore<TextSegment> loadedStore = strategy.deserializeFromFile("backup.json");
-
Method Summary
Modifier and Type | Method | Description
MemFileEmbeddingStore<T> | deserialize(String data) | Deserializes an embedding store from its string representation.
default MemFileEmbeddingStore<T> | deserializeFromFile(String filePath) |
MemFileEmbeddingStore<T> | deserializeFromFile(Path filePath) | Deserializes an embedding store from a file.
String | serialize(MemFileEmbeddingStore<T> store) | Serializes the given embedding store to a string representation.
default void | serializeToFile(MemFileEmbeddingStore<T> store, String filePath) |
void | serializeToFile(MemFileEmbeddingStore<T> store, Path filePath) | Serializes the given embedding store directly to a file.
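The summary marks the String-based file methods as default but does not describe them. One plausible reading, which the source does not confirm, is that they convert the string to a Path and delegate to the Path-based method. A minimal sketch of that arrangement, using a hypothetical FileOverloadSketch interface in which S stands in for MemFileEmbeddingStore<T>:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical, simplified contract; not the real interface.
interface FileOverloadSketch<S> {
    // The Path-based method is the one an implementation must provide.
    void serializeToFile(S store, Path filePath);

    // Assumed behavior: the String overload validates, converts, and delegates.
    default void serializeToFile(S store, String filePath) {
        if (filePath == null) {
            throw new IllegalArgumentException("filePath must not be null");
        }
        serializeToFile(store, Paths.get(filePath));
    }
}
```

With this layout an implementation only has to write the Path-based method, and callers may pass either a Path or a plain string.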
-
Method Details
-
serialize
Serializes the given embedding store to a string representation.
This method converts the complete state of the embedding store into a string format that can be persisted, transmitted, or cached. The exact format depends on the implementation (JSON, XML, etc.).
Serialization Content: The serialized string should include:
- All embedding entries with their IDs and vector data
- References to chunk files (not the actual content)
- Store configuration (chunk directory, cache size)
- Any metadata required for complete restoration
- Parameters:
store - the embedding store to serialize; must not be null
- Returns:
a string representation of the embedding store that can be used with deserialize(String)
- Throws:
IllegalArgumentException - if the store is null
RuntimeException - if serialization fails due to I/O errors or format-specific issues
-
serializeToFile
Serializes the given embedding store directly to a file.
This method writes the serialized representation of the embedding store directly to the specified file path. The file will be created if it doesn't exist, or overwritten if it does exist. Parent directories will be created as needed.
File Operations: The implementation should handle:
- Creating parent directories if they don't exist
- Overwriting existing files atomically where possible
- Proper cleanup in case of write failures
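The three points above can be sketched with the standard java.nio.file API. This is one possible approach under stated assumptions, not the behavior of any particular implementation: AtomicWriteSketch and writeAtomically are hypothetical names, and the write-to-temp-then-move technique is a common way to approximate atomic overwrites.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AtomicWriteSketch {
    static void writeAtomically(Path target, String data) {
        if (target == null || data == null) {
            throw new IllegalArgumentException("target and data must not be null");
        }
        Path absolute = target.toAbsolutePath();
        Path parent = absolute.getParent();
        if (parent == null) {
            throw new IllegalArgumentException("target must name a file: " + target);
        }
        Path temp = null;
        try {
            // Create missing parent directories.
            Files.createDirectories(parent);
            // Write to a temp file in the same directory so the final move
            // stays on one file system.
            temp = Files.createTempFile(parent, "store-", ".tmp");
            Files.writeString(temp, data, StandardCharsets.UTF_8);
            try {
                // Whether an existing target is replaced by an atomic move is
                // implementation specific; common platforms replace it.
                Files.move(temp, absolute, StandardCopyOption.ATOMIC_MOVE);
            } catch (AtomicMoveNotSupportedException e) {
                // Fall back to a regular replace where atomic moves are unavailable.
                Files.move(temp, absolute, StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            throw new UncheckedIOException("Failed to write " + absolute, e);
        } finally {
            // Clean up the temp file if it was created but never moved.
            if (temp != null) {
                try {
                    Files.deleteIfExists(temp);
                } catch (IOException ignored) {
                    // Best-effort cleanup; the original failure matters more.
                }
            }
        }
    }
}
```

A reader that opens the target during a write sees either the old contents or the new contents, never a partially written file, provided the platform supports atomic moves.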
- Parameters:
store - the embedding store to serialize; must not be null
filePath - the path where the serialized data should be written; must not be null
- Throws:
IllegalArgumentException - if either parameter is null
RuntimeException - if the file cannot be created or written to, or if serialization fails
-
serializeToFile
-
deserialize
Deserializes an embedding store from its string representation.
This method reconstructs a MemFileEmbeddingStore from data that was previously created by serialize(MemFileEmbeddingStore). The deserialized store will have the same configuration and embedding entries as the original.
Restoration Process: Deserialization typically involves:
- Parsing the serialized format to extract metadata
- Recreating the store with original configuration
- Restoring all embedding entries and their references
- Setting up internal structures (cache, etc.) but not preloading content
Dependencies: For the deserialized store to be fully functional:
- The original chunk storage directory must be accessible
- All referenced chunk files must exist and be readable
- The chunk directory and files must have suitable file permissions
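A caller that wants to check these dependencies up front could do so with a small helper like the one below. It is a sketch under assumptions: ChunkFileCheck and findUnreadable are hypothetical, and the chunk directory and file names are supplied by the caller, since this page does not show how to obtain them from a store.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class ChunkFileCheck {
    // Returns the chunk files that are missing or unreadable; an empty list
    // means every referenced file is available.
    static List<Path> findUnreadable(Path chunkDir, Collection<String> chunkFileNames) {
        List<Path> problems = new ArrayList<>();
        for (String name : chunkFileNames) {
            Path file = chunkDir.resolve(name);
            if (!Files.isRegularFile(file) || !Files.isReadable(file)) {
                problems.add(file);
            }
        }
        return problems;
    }
}
```

Running such a check right after deserialization surfaces missing chunk files immediately, before a later lookup tries to load one.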
- Parameters:
data - the serialized string representation of an embedding store; must not be null or blank
- Returns:
a new MemFileEmbeddingStore instance restored from the serialized data
- Throws:
IllegalArgumentException - if the data is null, blank, or has an invalid format
RuntimeException - if deserialization fails due to parsing errors or I/O issues
-
deserializeFromFile
Deserializes an embedding store from a file.
This method reads the serialized data from the specified file and reconstructs the embedding store. The file must contain data that was previously created by serializeToFile(MemFileEmbeddingStore, Path) or a compatible serialization method.
File Requirements: The file must:
- Exist and be readable
- Contain valid serialized store data
- Be in the format expected by this strategy implementation
- Not be corrupted or partially written
- Parameters:
filePath - the path to the file containing serialized store data; must not be null
- Returns:
a new MemFileEmbeddingStore instance restored from the file data
- Throws:
IllegalArgumentException - if the file path is null
RuntimeException - if the file doesn't exist, cannot be read, or contains invalid data
-
deserializeFromFile
-