Interface StoreSerializationStrategy<T>

Type Parameters:
T - the type of embedded objects stored in the embedding store (typically TextSegment)
All Known Implementing Classes:
JsonStoreSerializationStrategy

public interface StoreSerializationStrategy<T>
Strategy interface for serializing and deserializing MemFileEmbeddingStore instances.

This interface defines the contract for converting embedding stores to and from serialized formats (such as JSON, XML, or binary) and provides methods for both string-based and file-based operations. Implementations can choose different serialization formats and strategies while maintaining a consistent API.

Design Pattern: This follows the Strategy design pattern, allowing different serialization approaches to be plugged in without changing the client code. The embedding store delegates all serialization concerns to the strategy implementation.
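
As a rough illustration of the Strategy pattern described above, the following sketch uses simplified stand-in types. `SimpleStore`, `SimpleSerializer`, `JsonLikeSerializer`, `LineSerializer`, and `BackupClient` are hypothetical names invented for this example and are not part of this API; IDs are assumed to contain no characters that need escaping.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for an embedding store; NOT the real MemFileEmbeddingStore.
final class SimpleStore {
    final List<String> ids;

    SimpleStore(List<String> ids) {
        this.ids = List.copyOf(ids);
    }
}

// Hypothetical, reduced strategy contract used only for this illustration.
interface SimpleSerializer {
    String serialize(SimpleStore store);
}

// One interchangeable strategy: a JSON-like array of quoted IDs.
final class JsonLikeSerializer implements SimpleSerializer {
    @Override
    public String serialize(SimpleStore store) {
        return store.ids.stream()
                .map(id -> "\"" + id + "\"")
                .collect(Collectors.joining(",", "[", "]"));
    }
}

// Another interchangeable strategy: one ID per line.
final class LineSerializer implements SimpleSerializer {
    @Override
    public String serialize(SimpleStore store) {
        return String.join("\n", store.ids);
    }
}

// Client code depends only on the strategy interface, never on a concrete format.
final class BackupClient {
    static String backup(SimpleStore store, SimpleSerializer strategy) {
        return strategy.serialize(store);
    }
}
```

Swapping `new JsonLikeSerializer()` for `new LineSerializer()` changes the output format while `BackupClient.backup` stays untouched.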

Serialization Scope: Implementations should serialize the embedding store's metadata and structure, including:

  • All embedding vectors and their associated IDs
  • References to embedded content files (chunk file paths)
  • Configuration settings (chunk storage directory, cache size)
  • Any other metadata necessary to fully restore the store's state

Important Note: The actual embedded content (e.g., TextSegment objects) stored in separate chunk files is typically NOT included in the serialized data. Only references to these files are serialized. This design keeps the serialized data compact while requiring that the original chunk files remain accessible for full functionality after deserialization.
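
To make the distinction between references and content concrete, here is a minimal sketch of what one serialized entry might carry. The class `EntrySnapshot`, its field names, and the JSON-like layout are assumptions made for illustration; the actual format is defined by each implementation.

```java
import java.util.Arrays;

// Hypothetical snapshot of one store entry; names and layout are illustrative only.
final class EntrySnapshot {
    final String id;
    final float[] vector;
    final String chunkFile; // path reference to the chunk file, not its content

    EntrySnapshot(String id, float[] vector, String chunkFile) {
        this.id = id;
        this.vector = vector.clone();
        this.chunkFile = chunkFile;
    }

    // Renders a JSON-like object. The embedded text itself never appears in the output.
    String toJsonLike() {
        return "{\"id\":\"" + id + "\","
                + "\"vector\":" + Arrays.toString(vector) + ","
                + "\"chunkFile\":\"" + chunkFile + "\"}";
    }
}
```

The output contains the ID, the vector, and the chunk file path only, which is why the chunk files must still be reachable after deserialization.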

Thread Safety: Implementations should be safe to call from multiple threads concurrently, but a single call is not guaranteed to capture an atomic snapshot of the store. Callers should synchronize externally when serializing a store that is being modified concurrently.
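
One possible way for callers to provide that synchronization is a read/write lock around the store. The sketch below is an assumption about caller-side code, not part of this interface: `GuardedStore` is a hypothetical wrapper, and a plain list of IDs stands in for real store state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

// Hypothetical wrapper: mutations take the write lock, serialization takes the read lock.
final class GuardedStore {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> ids = new ArrayList<>(); // stand-in for real store state

    void add(String id) {
        lock.writeLock().lock();
        try {
            ids.add(id);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Runs the serializer on an immutable copy while no writer can modify the state.
    String serializeWith(Function<List<String>, String> serializer) {
        lock.readLock().lock();
        try {
            return serializer.apply(List.copyOf(ids));
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Several threads may serialize at the same time under the read lock, while any call to `add` waits until they finish.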

Error Handling: All methods may throw RuntimeException or its subclasses to indicate serialization/deserialization failures. Implementations should provide meaningful error messages and preserve stack traces for debugging.
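
The error-handling guidance above could be followed along these lines. This is a sketch only: `ReadHelper` and `readSerialized` are hypothetical names, and the choice of `UncheckedIOException` is one option among the `RuntimeException` subclasses an implementation might use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper showing a descriptive message plus a preserved cause.
final class ReadHelper {
    static String readSerialized(Path file) {
        if (file == null) {
            throw new IllegalArgumentException("file must not be null");
        }
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Passing the original exception as the cause keeps its stack trace available.
            throw new UncheckedIOException("Failed to read serialized store from " + file, e);
        }
    }
}
```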

Example usage:


 // Choose a serialization strategy
 StoreSerializationStrategy<TextSegment> strategy = new JsonStoreSerializationStrategy<>();

 // Create and populate an embedding store
 // (embedding and textSegment are assumed to have been created earlier)
 MemFileEmbeddingStore<TextSegment> store = new MemFileEmbeddingStore<>();
 store.add(embedding, textSegment);

 // Serialize to string
 String serializedData = strategy.serialize(store);

 // Serialize to file
 strategy.serializeToFile(store, Paths.get("backup.json"));

 // Deserialize from string
 MemFileEmbeddingStore<TextSegment> restoredStore = strategy.deserialize(serializedData);

 // Deserialize from file
 MemFileEmbeddingStore<TextSegment> loadedStore = strategy.deserializeFromFile("backup.json");
 
  • Method Details

    • serialize

      String serialize(MemFileEmbeddingStore<T> store)
      Serializes the given embedding store to a string representation.

      This method converts the complete state of the embedding store into a string format that can be persisted, transmitted, or cached. The exact format depends on the implementation (JSON, XML, etc.).

      Serialization Content: The serialized string should include:

      • All embedding entries with their IDs and vector data
      • References to chunk files (not the actual content)
      • Store configuration (chunk directory, cache size)
      • Any metadata required for complete restoration

      Parameters:
      store - the embedding store to serialize; must not be null
      Returns:
      a string representation of the embedding store that can be used with deserialize(String)
      Throws:
      IllegalArgumentException - if the store is null
      RuntimeException - if serialization fails due to I/O errors or format-specific issues
    • serializeToFile

      void serializeToFile(MemFileEmbeddingStore<T> store, Path filePath)
      Serializes the given embedding store directly to a file.

      This method writes the serialized representation of the embedding store directly to the specified file path. The file will be created if it doesn't exist, or overwritten if it does exist. Parent directories will be created as needed.

      File Operations: The implementation should handle:

      • Creating parent directories if they don't exist
      • Overwriting existing files atomically where possible
      • Proper cleanup in case of write failures
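
The three file-handling points above can be sketched with standard `java.nio.file` calls. `AtomicFileWriter` is a hypothetical helper written for this example, not the code used by any implementation of this interface. It writes to a temporary file in the target directory and then moves it into place.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: temp file + move, with parent creation and cleanup.
final class AtomicFileWriter {
    static void write(Path target, String data) {
        if (target == null || data == null) {
            throw new IllegalArgumentException("target and data must not be null");
        }
        Path absolute = target.toAbsolutePath();
        Path parent = absolute.getParent();
        if (parent == null) {
            throw new IllegalArgumentException("target has no parent directory: " + target);
        }
        Path temp = null;
        try {
            Files.createDirectories(parent); // no-op if the directories already exist
            temp = Files.createTempFile(parent, "store-", ".tmp");
            Files.write(temp, data.getBytes(StandardCharsets.UTF_8));
            try {
                // Atomic where the file system supports it; replacing an existing
                // target under ATOMIC_MOVE is implementation specific.
                Files.move(temp, absolute, StandardCopyOption.ATOMIC_MOVE);
            } catch (AtomicMoveNotSupportedException e) {
                // Fallback: a regular replace, which is not guaranteed to be atomic.
                Files.move(temp, absolute, StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            throw new UncheckedIOException("Failed to write " + target, e);
        } finally {
            if (temp != null) {
                try {
                    Files.deleteIfExists(temp); // nothing to delete after a successful move
                } catch (IOException ignored) {
                    // Best-effort cleanup of the temporary file.
                }
            }
        }
    }
}
```

Writing the temporary file into the same directory as the target keeps both on one file system, which is what makes an atomic move possible in the first place.
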
      Parameters:
      store - the embedding store to serialize; must not be null
      filePath - the path where the serialized data should be written; must not be null
      Throws:
      IllegalArgumentException - if either parameter is null
      RuntimeException - if the file cannot be created or written to, or if serialization fails
    • serializeToFile

      default void serializeToFile(MemFileEmbeddingStore<T> store, String filePath)
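
The page gives no description for this default overload. A common way to write such a convenience method is to convert the string to a `Path` and delegate, as in the sketch below. This is an assumption about the general technique, not the documented behavior of this method; `FileSink` and `writeTo` are hypothetical names. The same shape would apply to a string-based `deserializeFromFile` overload.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical reduced interface; the real default method may be implemented differently.
interface FileSink<S> {
    void writeTo(S store, Path filePath);

    // Convenience overload: validates the string, then delegates to the Path variant.
    default void writeTo(S store, String filePath) {
        if (filePath == null || filePath.trim().isEmpty()) {
            throw new IllegalArgumentException("filePath must not be null or blank");
        }
        writeTo(store, Paths.get(filePath));
    }
}
```
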
    • deserialize

      MemFileEmbeddingStore<T> deserialize(String data)
      Deserializes an embedding store from its string representation.

      This method reconstructs a MemFileEmbeddingStore from data that was previously created by serialize(MemFileEmbeddingStore). The deserialized store will have the same configuration and embedding entries as the original.

      Restoration Process: Deserialization typically involves:

      • Parsing the serialized format to extract metadata
      • Recreating the store with original configuration
      • Restoring all embedding entries and their references
      • Setting up internal structures (cache, etc.) but not preloading content

      Dependencies: The deserialized store requires:

      • Access to the original chunk storage directory
      • All referenced chunk files to exist and be readable
      • Proper file permissions for the chunk directory and files
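
Callers who want to verify these dependencies up front could run a check like the one sketched here. `ChunkFileCheck` and `findUnavailable` are hypothetical names, and the idea that chunk references are file names relative to the chunk directory is an assumption; the interface itself does not provide such a check.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-flight check for the chunk files a restored store would need.
final class ChunkFileCheck {
    // Returns the referenced chunk files that are missing or unreadable.
    static List<Path> findUnavailable(Path chunkDir, List<String> chunkFileNames) {
        List<Path> unavailable = new ArrayList<>();
        for (String name : chunkFileNames) {
            Path chunk = chunkDir.resolve(name);
            if (!Files.isRegularFile(chunk) || !Files.isReadable(chunk)) {
                unavailable.add(chunk);
            }
        }
        return unavailable;
    }
}
```

An empty result means every referenced file was found and is readable at the time of the check.
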
      Parameters:
      data - the serialized string representation of an embedding store; must not be null or blank
      Returns:
      a new MemFileEmbeddingStore instance restored from the serialized data
      Throws:
      IllegalArgumentException - if the data is null, blank, or has an invalid format
      RuntimeException - if deserialization fails due to parsing errors or I/O issues
    • deserializeFromFile

      MemFileEmbeddingStore<T> deserializeFromFile(Path filePath)
      Deserializes an embedding store from a file.

      This method reads the serialized data from the specified file and reconstructs the embedding store. The file must contain data that was previously created by serializeToFile(MemFileEmbeddingStore, Path) or a compatible serialization method.

      File Requirements: The file must:

      • Exist and be readable
      • Contain valid serialized store data
      • Be in the format expected by this strategy implementation
      • Not be corrupted or partially written
      Parameters:
      filePath - the path to the file containing serialized store data; must not be null
      Returns:
      a new MemFileEmbeddingStore instance restored from the file data
      Throws:
      IllegalArgumentException - if the file path is null
      RuntimeException - if the file doesn't exist, cannot be read, or contains invalid data
    • deserializeFromFile

      default MemFileEmbeddingStore<T> deserializeFromFile(String filePath)