Class SegmentSchemaCache

java.lang.Object
org.apache.druid.segment.metadata.SegmentSchemaCache
Direct Known Subclasses:
NoopSegmentSchemaCache

public class SegmentSchemaCache extends Object
In-memory cache of segment schema used by CoordinatorSegmentMetadataCache.

The schema for a given segment ID may be present in one of the following data structures:

  • realtimeSegmentSchemas: Schema for realtime segments retrieved from realtime tasks.
  • schemasPendingBackfill: Schema for published segments fetched from data nodes using metadata queries.
  • recentlyBackfilledSchemas: Schema for segments recently persisted to the DB. This is needed only to maintain continuity until the next DB poll.
  • publishedSegmentSchemas: Schema for used segments as polled from the metadata store.
The cache always contains segment schemas with version CentralizedDatasourceSchemaConfig.SCHEMA_VERSION.
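The four structures above can be pictured as four maps that are consulted in turn. The following is only a minimal sketch: plain String stands in for both SegmentId and SchemaPayloadPlus, the class name SchemaLookupSketch is hypothetical, and the lookup order (realtime, pending backfill, recently backfilled, published) is an assumption that this page does not confirm.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the cache; not the Druid implementation.
class SchemaLookupSketch {
    // String replaces SegmentId (keys) and SchemaPayloadPlus (values) for brevity.
    final Map<String, String> realtimeSegmentSchemas = new ConcurrentHashMap<>();
    final Map<String, String> schemasPendingBackfill = new ConcurrentHashMap<>();
    final Map<String, String> recentlyBackfilledSchemas = new ConcurrentHashMap<>();
    final Map<String, String> publishedSegmentSchemas = new ConcurrentHashMap<>();

    // Returns the first schema found; the order used here is an assumption.
    Optional<String> getSchemaForSegment(String segmentId) {
        List<Map<String, String>> lookupOrder = List.of(
            realtimeSegmentSchemas,
            schemasPendingBackfill,
            recentlyBackfilledSchemas,
            publishedSegmentSchemas
        );
        for (Map<String, String> schemas : lookupOrder) {
            String schema = schemas.get(segmentId);
            if (schema != null) {
                return Optional.of(schema);
            }
        }
        return Optional.empty();
    }

    boolean isSchemaCached(String segmentId) {
        return getSchemaForSegment(segmentId).isPresent();
    }

    public static void main(String[] args) {
        SchemaLookupSketch cache = new SchemaLookupSketch();
        cache.publishedSegmentSchemas.put("seg_1", "published-schema");
        System.out.println(cache.getSchemaForSegment("seg_1").orElse("<none>"));
        System.out.println(cache.isSchemaCached("seg_2"));
    }
}
```

Because every entry in the cache shares one schema version, a lookup like this needs no version check; it only has to decide which structure currently holds the segment.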
  • Constructor Details

    • SegmentSchemaCache

      public SegmentSchemaCache()
  • Method Details

    • isEnabled

      public boolean isEnabled()
      Returns:
      true if schema caching is enabled.
    • setInitialized

      public void setInitialized()
    • onLeaderStop

      public void onLeaderStop()
      Called when the current node is no longer the leader. All cached schemas are cleared except those in realtimeSegmentSchemas, because realtime schemas continue to be updated on both the leader and follower nodes.
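The effect of losing leadership can be sketched as clearing every structure except the realtime one. This is an illustration only: LeaderStopSketch is a hypothetical class, String stands in for SegmentId and SchemaPayloadPlus, and any other state the real method may reset is not modeled.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the cache; not the Druid implementation.
class LeaderStopSketch {
    final Map<String, String> realtimeSegmentSchemas = new ConcurrentHashMap<>();
    final Map<String, String> schemasPendingBackfill = new ConcurrentHashMap<>();
    final Map<String, String> recentlyBackfilledSchemas = new ConcurrentHashMap<>();
    final Map<String, String> publishedSegmentSchemas = new ConcurrentHashMap<>();

    // Drops everything a follower does not maintain; realtime schemas are kept.
    void onLeaderStop() {
        schemasPendingBackfill.clear();
        recentlyBackfilledSchemas.clear();
        publishedSegmentSchemas.clear();
    }

    public static void main(String[] args) {
        LeaderStopSketch cache = new LeaderStopSketch();
        cache.realtimeSegmentSchemas.put("rt_seg", "realtime-schema");
        cache.publishedSegmentSchemas.put("pub_seg", "published-schema");
        cache.onLeaderStop();
        System.out.println("realtime entries: " + cache.realtimeSegmentSchemas.size());
        System.out.println("published entries: " + cache.publishedSegmentSchemas.size());
    }
}
```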
    • isInitialized

      public boolean isInitialized()
    • awaitInitialization

      public void awaitInitialization() throws InterruptedException
      Blocks until the cache is initialized. CoordinatorSegmentMetadataCache waits on this method during startup so that it does not issue segment metadata queries for segments whose schema is already present in the DB.
      Throws:
      InterruptedException
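One common way to build this kind of blocking gate in Java is a java.util.concurrent.CountDownLatch. The sketch below shows how setInitialized(), isInitialized() and awaitInitialization() could fit together; the use of a latch and the class name InitializationLatchSketch are assumptions, not a description of how SegmentSchemaCache is actually written.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for the initialization gate; not the Druid implementation.
class InitializationLatchSketch {
    // A count of 1 means "not yet initialized"; 0 means "initialized".
    private final CountDownLatch initialized = new CountDownLatch(1);

    // Called once the first poll has populated the cache.
    void setInitialized() {
        initialized.countDown();
    }

    boolean isInitialized() {
        return initialized.getCount() == 0;
    }

    // Blocks the caller until setInitialized() has been invoked.
    void awaitInitialization() throws InterruptedException {
        initialized.await();
    }

    public static void main(String[] args) throws InterruptedException {
        InitializationLatchSketch cache = new InitializationLatchSketch();
        // Simulates the poller finishing on another thread.
        Thread poller = new Thread(cache::setInitialized);
        poller.start();
        cache.awaitInitialization();
        System.out.println("initialized = " + cache.isInitialized()); // prints "initialized = true"
        poller.join();
    }
}
```

A plain latch cannot be reset, so a cache that must become uninitialized again (for example after leadership changes) would need to swap in a fresh latch; that detail is left out here.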
    • resetSchemaForPublishedSegments

      public void resetSchemaForPublishedSegments(Map<SegmentId,SegmentMetadata> usedSegmentIdToMetadata, Map<String,SchemaPayload> schemaFingerprintToPayload)
      Resets the schema in the cache for published (non-realtime) segments. This method is called after each successful poll of used segments and schemas from the metadata store.
      Parameters:
      usedSegmentIdToMetadata - Map from used segment ID to corresponding metadata
      schemaFingerprintToPayload - Map from schema fingerprint to payload
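The two parameters suggest a two-step lookup for published segments: segment ID to metadata, then schema fingerprint to payload. The sketch below illustrates that idea under stated assumptions: String stands in for SegmentId and SchemaPayload, the nested SegmentMetadata class is a hypothetical simplification that carries only a row count and a fingerprint, and getPublishedSchema is an invented helper name.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the published-schema part of the cache.
class PublishedSchemaSketch {
    // Simplified stand-in for SegmentMetadata (assumed to reference a fingerprint).
    static final class SegmentMetadata {
        final long numRows;
        final String schemaFingerprint;

        SegmentMetadata(long numRows, String schemaFingerprint) {
            this.numRows = numRows;
            this.schemaFingerprint = schemaFingerprint;
        }
    }

    private volatile Map<String, SegmentMetadata> usedSegmentIdToMetadata = Map.of();
    private volatile Map<String, String> schemaFingerprintToPayload = Map.of();

    // Replaces the previous snapshot with immutable copies of the polled maps.
    void resetSchemaForPublishedSegments(
        Map<String, SegmentMetadata> polledSegments,
        Map<String, String> polledPayloads
    ) {
        this.usedSegmentIdToMetadata = Map.copyOf(polledSegments);
        this.schemaFingerprintToPayload = Map.copyOf(polledPayloads);
    }

    // Resolves segment ID -> fingerprint -> payload; empty if either step misses.
    Optional<String> getPublishedSchema(String segmentId) {
        SegmentMetadata metadata = usedSegmentIdToMetadata.get(segmentId);
        if (metadata == null || metadata.schemaFingerprint == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(schemaFingerprintToPayload.get(metadata.schemaFingerprint));
    }

    public static void main(String[] args) {
        PublishedSchemaSketch cache = new PublishedSchemaSketch();
        cache.resetSchemaForPublishedSegments(
            Map.of("seg_1", new SegmentMetadata(100L, "fp_a")),
            Map.of("fp_a", "payload-a")
        );
        System.out.println(cache.getPublishedSchema("seg_1").orElse("<none>"));
    }
}
```

Keying payloads by fingerprint lets many segments with identical schemas share a single payload entry.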
    • addRealtimeSegmentSchema

      public void addRealtimeSegmentSchema(SegmentId segmentId, SchemaPayloadPlus schema)
      Adds schema for a realtime segment to the cache.
    • addSchemaPendingBackfill

      public void addSchemaPendingBackfill(SegmentId segmentId, SchemaPayloadPlus schema)
      Adds a temporary schema for the given segment ID to the cache. This schema is typically fetched from data nodes by issuing segment metadata queries. Once this schema has been persisted to the DB, call markSchemaPersisted(org.apache.druid.timeline.SegmentId).
    • markSchemaPersisted

      public void markSchemaPersisted(SegmentId segmentId)
      Marks the schema for the given segment ID as persisted to the DB.
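The backfill flow described by addSchemaPendingBackfill and markSchemaPersisted can be sketched as moving an entry between two maps. This is an assumption based on the structure names in the class description: BackfillLifecycleSketch is a hypothetical class, String stands in for SegmentId and SchemaPayloadPlus, and the real method may do more or behave differently.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the backfill part of the cache.
class BackfillLifecycleSketch {
    final Map<String, String> schemasPendingBackfill = new ConcurrentHashMap<>();
    final Map<String, String> recentlyBackfilledSchemas = new ConcurrentHashMap<>();

    // Schema fetched from data nodes but not yet written to the DB.
    void addSchemaPendingBackfill(String segmentId, String schema) {
        schemasPendingBackfill.put(segmentId, schema);
    }

    // Assumed behavior: move the entry so it stays readable until the next DB poll.
    void markSchemaPersisted(String segmentId) {
        String schema = schemasPendingBackfill.remove(segmentId);
        if (schema != null) {
            recentlyBackfilledSchemas.put(segmentId, schema);
        }
    }

    public static void main(String[] args) {
        BackfillLifecycleSketch cache = new BackfillLifecycleSketch();
        cache.addSchemaPendingBackfill("seg_1", "schema-from-data-node");
        cache.markSchemaPersisted("seg_1");
        System.out.println("pending: " + cache.schemasPendingBackfill.containsKey("seg_1"));
        System.out.println("recently backfilled: " + cache.recentlyBackfilledSchemas.containsKey("seg_1"));
    }
}
```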
    • getSchemaForSegment

      public Optional<SchemaPayloadPlus> getSchemaForSegment(SegmentId segmentId)
      Reads the schema for a given segment ID from the cache.

      Note that this method does not check the schema version, because the cache only ever holds schemas of a single version. Changing the version requires a service restart, after which the cache is rebuilt.

    • isSchemaCached

      public boolean isSchemaCached(SegmentId segmentId)
      Checks whether the cache contains a schema for the given segment ID.
    • getPublishedSegmentMetadataMap

      public Map<SegmentId,SegmentMetadata> getPublishedSegmentMetadataMap()
      Returns:
      Immutable map from segment ID to SegmentMetadata for all published used segments currently present in this cache.
    • getPublishedSchemaPayloadMap

      public Map<String,SchemaPayload> getPublishedSchemaPayloadMap()
      Returns:
      Immutable map from schema fingerprint to SchemaPayload for all schema fingerprints currently present in this cache.
    • segmentRemoved

      public void segmentRemoved(SegmentId segmentId)
      Removes the cached schema for the given segment ID.
    • realtimeSegmentRemoved

      public void realtimeSegmentRemoved(SegmentId segmentId)
      Removes the cached schema for the given realtime segment.
    • getStats

      public Map<String,Integer> getStats()
      Returns:
      Summary stats of the current contents of the cache.