Package org.apache.druid.metadata
Class SqlSegmentsMetadataQuery
java.lang.Object
org.apache.druid.metadata.SqlSegmentsMetadataQuery
An object that is used to query the segments table in the metadata store.
Each instance of this class is scoped to a single Handle and is meant to be short-lived.
Method Summary
Modifier and Type, Method, Description
static void bindColumnValuesToQueryWithInCondition(String columnName, List<String> values, org.skife.jdbi.v2.SQLStatement<?> query)
    Binds the provided list of values to the specified columnName in the given SQL query that contains an IN clause.
static void bindIntervalsToQuery(org.skife.jdbi.v2.Query<Map<String, Object>> query, Collection<org.joda.time.Interval> intervals)
    Bind the supplied intervals to query.
List<DataSegment> findUnusedSegments(String dataSource, org.joda.time.Interval interval, List<String> versions, Integer limit, org.joda.time.DateTime maxUpdatedTime)
    Retrieves unused segments that are fully contained within the given interval.
static SqlSegmentsMetadataQuery forHandle(org.skife.jdbi.v2.Handle handle, SQLMetadataConnector connector, MetadataStorageTablesConfig dbTables, com.fasterxml.jackson.databind.ObjectMapper jsonMapper)
    Create a query object.
static String getConditionForIntervalsAndMatchMode(Collection<org.joda.time.Interval> intervals, org.apache.druid.metadata.SqlSegmentsMetadataQuery.IntervalMode matchMode, String quoteString)
    Get the condition for the interval and match mode.
static String getParameterizedInConditionForColumn(String columnName, List<String> values)
List<DataSegmentPlus> iterateAllUnusedSegmentsForDatasource(String datasource, org.joda.time.Interval interval, Integer limit, String lastSegmentId, SortOrder sortOrder)
    Retrieves segments and their associated metadata for a given datasource that are marked unused and that are *fully contained by* an optionally specified interval.
int markAllNonOvershadowedSegmentsAsUsed(String dataSource, org.joda.time.DateTime updateTime)
int markNonOvershadowedSegmentsAsUsed(String dataSource, Set<SegmentId> segmentIds, org.joda.time.DateTime updateTime)
int markNonOvershadowedSegmentsAsUsed(String dataSource, org.joda.time.Interval interval, List<String> versions, org.joda.time.DateTime updateTime)
boolean markSegmentAsUsed(SegmentId segmentId, org.joda.time.DateTime updateTime)
int markSegmentsAsUnused(Set<SegmentId> segmentIds, org.joda.time.DateTime updateTime)
    Marks the given segment IDs as unused.
int markSegmentsAsUsed(Set<SegmentId> segmentIds, org.joda.time.DateTime updateTime)
    Marks the given segment IDs as used.
int markSegmentsUnused(String dataSource, org.joda.time.Interval interval, List<String> versions, org.joda.time.DateTime updateTime)
    Marks all used segments that are fully contained by a particular interval filtered by an optional list of versions as unused.
static org.joda.time.DateTime nullAndEmptySafeDate(String date)
    Create a DateTime object from a string.
retrieveAllUsedSegmentSchemaFingerprints
    Retrieves all used schema fingerprints present in the metadata store.
retrieveAllUsedSegmentSchemas
    Retrieves all used segment schemas present in the metadata store irrespective of their last updated time.
SegmentId retrieveHighestUnusedSegmentId(String datasource, org.joda.time.Interval interval, String version)
    Retrieves the ID of the unused segment that has the highest partition number amongst all unused segments that exactly match the given interval and version.
List<SegmentIdWithShardSpec> retrievePendingSegmentIds(String dataSource, String sequenceName, String sequencePreviousId)
List<SegmentIdWithShardSpec> retrievePendingSegmentIdsWithExactInterval(String dataSource, String sequenceName, org.joda.time.Interval interval)
List<PendingSegmentRecord> retrievePendingSegmentsForTaskAllocatorId(String dataSource, String taskAllocatorId)
List<PendingSegmentRecord> retrievePendingSegmentsOverlappingInterval(String dataSource, org.joda.time.Interval interval)
    Fetches all the pending segments, whose interval overlaps with the given search interval, from the metadata store.
List<PendingSegmentRecord> retrievePendingSegmentsWithExactInterval(String dataSource, org.joda.time.Interval interval)
retrieveSegmentForId(SegmentId segmentId)
    Retrieve the segment for a given id if it exists in the metadata store and null otherwise.
retrieveSegmentsById(String datasource, Set<SegmentId> segmentIds)
    Retrieves segments for the given segment IDs from the metadata store.
CloseableIterator<DataSegmentPlus> retrieveSegmentsByIdIterator(String datasource, Set<SegmentId> segmentIds, boolean includeSchemaInfo)
    Retrieves segments for the specified IDs in batches of a small size.
List<DataSegmentPlus> retrieveSegmentsWithSchemaById(String datasource, Set<SegmentId> segmentIds)
    Retrieves segments with additional metadata such as number of rows and schema fingerprint.
List<org.joda.time.Interval> retrieveUnusedSegmentIntervals(String dataSource, int limit)
    Gets unused segment intervals for the specified datasource.
List<org.joda.time.Interval> retrieveUnusedSegmentIntervals(String dataSource, org.joda.time.DateTime minStartTime, org.joda.time.DateTime maxEndTime, int limit, org.joda.time.DateTime maxUsedStatusLastUpdatedTime)
CloseableIterator<DataSegment> retrieveUnusedSegments(String dataSource, Collection<org.joda.time.Interval> intervals, List<String> versions, Integer limit, String lastSegmentId, SortOrder sortOrder, org.joda.time.DateTime maxUsedStatusLastUpdatedTime)
    Retrieves segments for a given datasource that are marked unused and that are fully contained by any interval in a particular collection of intervals.
CloseableIterator<DataSegmentPlus> retrieveUnusedSegmentsPlus(String dataSource, Collection<org.joda.time.Interval> intervals, List<String> versions, Integer limit, String lastSegmentId, SortOrder sortOrder, org.joda.time.DateTime maxUsedStatusLastUpdatedTime)
    Similar to retrieveUnusedSegments(java.lang.String, java.util.Collection<org.joda.time.Interval>, java.util.List<java.lang.String>, java.lang.Integer, java.lang.String, org.apache.druid.metadata.SortOrder, org.joda.time.DateTime), but also retrieves associated metadata for the segments for a given datasource that are marked unused and that are fully contained by any interval in a particular collection of intervals.
List<DataSegment> retrieveUnusedSegmentsWithExactInterval(String dataSource, org.joda.time.Interval interval, org.joda.time.DateTime maxUpdatedTime, int limit)
    Retrieves unused segments that exactly match the given interval.
Set<String> retrieveUnusedSegmentVersionsWithInterval(String dataSource, org.joda.time.Interval interval)
    Retrieves the versions of unused segments which are perfectly aligned with the given interval.
retrieveUsedSegmentForId(SegmentId segmentId)
    Retrieve the used segment for a given id if it exists in the metadata store and null otherwise.
retrieveUsedSegmentIds(String dataSource, org.joda.time.Interval interval)
    Retrieves IDs of used segments that belong to the datasource and overlap the given interval.
CloseableIterator<DataSegment> retrieveUsedSegments(String dataSource, Collection<org.joda.time.Interval> intervals)
    Retrieves segments for a given datasource that are marked used (i.e. published) in the metadata store, and that *overlap* any interval in a particular collection of intervals.
CloseableIterator<DataSegment> retrieveUsedSegments(String dataSource, Collection<org.joda.time.Interval> intervals, List<String> versions)
    Similar to retrieveUsedSegments(java.lang.String, java.util.Collection<org.joda.time.Interval>), but with an additional versions argument.
List<SegmentSchemaRecord> retrieveUsedSegmentSchemasForFingerprints(Set<String> schemaFingerprints)
    Retrieves segment schemas from the metadata store for the given fingerprints.
CloseableIterator<DataSegmentPlus> retrieveUsedSegmentsPlus(String dataSource, Collection<org.joda.time.Interval> intervals)
Method Details
forHandle
public static SqlSegmentsMetadataQuery forHandle(org.skife.jdbi.v2.Handle handle, SQLMetadataConnector connector, MetadataStorageTablesConfig dbTables, com.fasterxml.jackson.databind.ObjectMapper jsonMapper)
Create a query object. This instance is scoped to a single handle and is meant to be short-lived. It is okay to use it for more than one query, though.
nullAndEmptySafeDate
Create a DateTime object from a string. If the string is null or empty, return null.
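The null-and-empty-safe behaviour described above can be sketched as follows. This is only an illustration: it uses java.time.Instant as a stand-in for Joda's DateTime so that it runs without extra dependencies, and the class name SafeDateSketch and the ISO-8601 input format are assumptions rather than the Druid implementation.

```java
import java.time.Instant;

class SafeDateSketch {
  // Returns null for null or empty input; otherwise parses an ISO-8601 instant.
  // Instant is a stand-in for org.joda.time.DateTime in this sketch.
  static Instant nullAndEmptySafeDate(String date) {
    if (date == null || date.isEmpty()) {
      return null;
    }
    return Instant.parse(date);
  }

  public static void main(String[] args) {
    System.out.println(nullAndEmptySafeDate(null));
    System.out.println(nullAndEmptySafeDate(""));
    System.out.println(nullAndEmptySafeDate("2024-01-01T00:00:00Z"));
  }
}
```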
retrieveUsedSegments
public CloseableIterator<DataSegment> retrieveUsedSegments(String dataSource, Collection<org.joda.time.Interval> intervals)
Retrieves segments for a given datasource that are marked used (i.e. published) in the metadata store, and that *overlap* any interval in a particular collection of intervals. If the collection of intervals is empty, this method will retrieve all used segments.
You cannot assume that segments returned by this call are actually active. Because there is some delay between new segment publishing and the marking-unused of older segments, it is possible that some segments returned by this call are overshadowed by other segments. To check for this, use SegmentTimeline.forSegments(Iterable).
This call does not return any information about realtime segments.
Returns:
a closeable iterator. You should close it when you are done.
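Because the returned iterator should be closed once you are done, a try-with-resources block is a natural fit. The sketch below shows the pattern only: the nested CloseableIterator interface and ListBackedIterator class are hypothetical stand-ins defined here so the example runs on its own, and they are not Druid's classes.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class CloseableIteratorSketch {
  // Hypothetical stand-in for a closeable iterator type.
  interface CloseableIterator<T> extends Iterator<T>, Closeable {
    @Override
    void close(); // narrowed so that this sketch needs no checked exception handling
  }

  // In-memory iterator that records whether close() was called.
  static final class ListBackedIterator<T> implements CloseableIterator<T> {
    private final Iterator<T> delegate;
    boolean closed = false;

    ListBackedIterator(List<T> items) {
      this.delegate = items.iterator();
    }

    @Override
    public boolean hasNext() {
      return !closed && delegate.hasNext();
    }

    @Override
    public T next() {
      return delegate.next();
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  // Copies all remaining elements into a list and always closes the iterator,
  // even if reading fails part-way through.
  static <T> List<T> drain(CloseableIterator<T> iterator) {
    try (CloseableIterator<T> it = iterator) {
      List<T> result = new ArrayList<>();
      while (it.hasNext()) {
        result.add(it.next());
      }
      return result;
    }
  }

  public static void main(String[] args) {
    ListBackedIterator<String> segments = new ListBackedIterator<>(Arrays.asList("seg_a", "seg_b"));
    System.out.println(drain(segments));
    System.out.println("closed: " + segments.closed);
  }
}
```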
retrieveUsedSegments
public CloseableIterator<DataSegment> retrieveUsedSegments(String dataSource, Collection<org.joda.time.Interval> intervals, List<String> versions)
Similar to retrieveUsedSegments(java.lang.String, java.util.Collection<org.joda.time.Interval>), but with an additional versions argument. When versions is specified, all used segments in the specified intervals and versions are retrieved.
retrieveUsedSegmentsPlus
public CloseableIterator<DataSegmentPlus> retrieveUsedSegmentsPlus(String dataSource, Collection<org.joda.time.Interval> intervals)
retrieveHighestUnusedSegmentId
@Nullable public SegmentId retrieveHighestUnusedSegmentId(String datasource, org.joda.time.Interval interval, String version)
Retrieves the ID of the unused segment that has the highest partition number amongst all unused segments that exactly match the given interval and version.
Returns:
null if no unused segment exists for the given parameters.
iterateAllUnusedSegmentsForDatasource
public List<DataSegmentPlus> iterateAllUnusedSegmentsForDatasource(String datasource, @Nullable org.joda.time.Interval interval, @Nullable Integer limit, @Nullable String lastSegmentId, @Nullable SortOrder sortOrder)
Retrieves segments and their associated metadata for a given datasource that are marked unused and that are *fully contained by* an optionally specified interval. If the interval specified is null, this method will retrieve all unused segments. This call does not return any information about realtime segments.
Parameters:
datasource - The name of the datasource
interval - an optional interval to search over.
limit - an optional maximum number of results to return. If none is specified, the results are not limited.
lastSegmentId - an optional last segment id from which to search for results. All segments returned are > this segment lexicographically if sortOrder is null or SortOrder.ASC, or < this segment lexicographically if sortOrder is SortOrder.DESC. If none is specified, no such filter is used.
sortOrder - an optional order with which to return the matching segments by id, start time, end time. If none is specified, the order of the results is not guaranteed.
Returns an iterable.
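The interplay of lastSegmentId, sortOrder and limit described above amounts to keyset-style paging over segment IDs. The following in-memory sketch illustrates those semantics under stated assumptions: the SegmentPagingSketch class and its local SortOrder enum are hypothetical, segment IDs are plain strings, and a null sortOrder is treated as ascending purely to keep the output deterministic (the documentation above makes no ordering promise in that case).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SegmentPagingSketch {
  // Local stand-in for the sort order argument.
  enum SortOrder { ASC, DESC }

  // Returns up to `limit` IDs that come strictly after `lastSegmentId`
  // in the requested lexicographic direction. Null arguments disable
  // the corresponding filter.
  static List<String> page(List<String> ids, String lastSegmentId, SortOrder sortOrder, Integer limit) {
    boolean descending = sortOrder == SortOrder.DESC;
    Comparator<String> order = descending ? Comparator.reverseOrder() : Comparator.naturalOrder();
    List<String> sorted = new ArrayList<>(ids);
    sorted.sort(order);

    List<String> result = new ArrayList<>();
    for (String id : sorted) {
      if (lastSegmentId != null) {
        int cmp = id.compareTo(lastSegmentId);
        // ASC keeps ids > lastSegmentId, DESC keeps ids < lastSegmentId.
        if (descending ? cmp >= 0 : cmp <= 0) {
          continue;
        }
      }
      if (limit != null && result.size() >= limit) {
        break;
      }
      result.add(id);
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> ids = Arrays.asList("s3", "s1", "s2", "s4");
    System.out.println(page(ids, "s1", SortOrder.ASC, 2));
    System.out.println(page(ids, "s3", SortOrder.DESC, null));
  }
}
```

A caller would pass the last ID of one page as lastSegmentId for the next request, which avoids re-reading rows that were already returned.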
findUnusedSegments
public List<DataSegment> findUnusedSegments(String dataSource, org.joda.time.Interval interval, @Nullable List<String> versions, @Nullable Integer limit, @Nullable org.joda.time.DateTime maxUpdatedTime)
Retrieves unused segments that are fully contained within the given interval.
Parameters:
interval - Returned segments must be fully contained within this interval
versions - Optional list of segment versions. If passed as null, all segment versions are eligible.
limit - Maximum number of segments to return. If passed as null, all segments are returned.
maxUpdatedTime - Returned segments must have a used_status_last_updated which is either null or earlier than this value.
retrieveUnusedSegments
public CloseableIterator<DataSegment> retrieveUnusedSegments(String dataSource, Collection<org.joda.time.Interval> intervals, @Nullable List<String> versions, @Nullable Integer limit, @Nullable String lastSegmentId, @Nullable SortOrder sortOrder, @Nullable org.joda.time.DateTime maxUsedStatusLastUpdatedTime)
Retrieves segments for a given datasource that are marked unused and that are fully contained by any interval in a particular collection of intervals. If the collection of intervals is empty, this method will retrieve all unused segments.
This call does not return any information about realtime segments.
Parameters:
dataSource - The name of the datasource
intervals - The intervals to search over
versions - An optional list of unused segment versions to retrieve in the given intervals. If unspecified, all versions of unused segments in the intervals are retrieved. If an empty list is passed, no segments are retrieved.
limit - The limit of segments to return
lastSegmentId - the last segment id from which to search for results. All segments returned are > this segment lexicographically if sortOrder is null or ASC, or < this segment lexicographically if sortOrder is DESC.
sortOrder - Specifies the order with which to return the matching segments by start time, end time. A null value indicates that order does not matter.
maxUsedStatusLastUpdatedTime - The maximum used_status_last_updated time. Any unused segment in intervals with used_status_last_updated no later than this time will be included in the iterator. Segments without used_status_last_updated time (due to an upgrade from legacy Druid) will have maxUsedStatusLastUpdatedTime ignored.
Returns:
a closeable iterator. You should close it when you are done.
retrieveUnusedSegmentsPlus
public CloseableIterator<DataSegmentPlus> retrieveUnusedSegmentsPlus(String dataSource, Collection<org.joda.time.Interval> intervals, @Nullable List<String> versions, @Nullable Integer limit, @Nullable String lastSegmentId, @Nullable SortOrder sortOrder, @Nullable org.joda.time.DateTime maxUsedStatusLastUpdatedTime)
Similar to retrieveUnusedSegments(java.lang.String, java.util.Collection<org.joda.time.Interval>, java.util.List<java.lang.String>, java.lang.Integer, java.lang.String, org.apache.druid.metadata.SortOrder, org.joda.time.DateTime), but also retrieves associated metadata for the segments for a given datasource that are marked unused and that are fully contained by any interval in a particular collection of intervals. If the collection of intervals is empty, this method will retrieve all unused segments. This call does not return any information about realtime segments.
Parameters:
dataSource - The name of the datasource
intervals - The intervals to search over
limit - The limit of segments to return
lastSegmentId - the last segment id from which to search for results. All segments returned are > this segment lexicographically if sortOrder is null or ASC, or < this segment lexicographically if sortOrder is DESC.
sortOrder - Specifies the order with which to return the matching segments by start time, end time. A null value indicates that order does not matter.
maxUsedStatusLastUpdatedTime - The maximum used_status_last_updated time. Any unused segment in intervals with used_status_last_updated no later than this time will be included in the iterator. Segments without used_status_last_updated time (due to an upgrade from legacy Druid) will have maxUsedStatusLastUpdatedTime ignored.
Returns:
a closeable iterator. You should close it when you are done.
retrieveUsedSegmentIds
Retrieves IDs of used segments that belong to the datasource and overlap the given interval.
retrieveSegmentsById
Retrieves segments for the given segment IDs from the metadata store.
retrieveSegmentsByIdIterator
public CloseableIterator<DataSegmentPlus> retrieveSegmentsByIdIterator(String datasource, Set<SegmentId> segmentIds, boolean includeSchemaInfo)
Retrieves segments for the specified IDs in batches of a small size.
Parameters:
includeSchemaInfo - If true, additional metadata info such as number of rows and schema fingerprint is also retrieved
Returns:
CloseableIterator over the retrieved segments which must be closed once the result has been handled. If the iterator is closed while reading a batch of segments, queries for subsequent batches are not fired.
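The batching behaviour described above, where later batches are only queried on demand and never after the iterator is closed, can be illustrated with a small lazy iterator. This is a sketch under stated assumptions and not Druid's implementation: BatchedFetchSketch, the fetchBatch function, and the batch size are all hypothetical, and string IDs stand in for segment IDs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

class BatchedFetchSketch {
  // Lazily fetches results for `ids` one batch at a time.
  static final class BatchedIterator<T> implements Iterator<T>, AutoCloseable {
    private final List<String> ids;
    private final int batchSize;
    private final Function<List<String>, List<T>> fetchBatch;
    private Iterator<T> current = Collections.emptyIterator();
    private int nextOffset = 0;
    private boolean closed = false;
    int batchesFetched = 0;

    BatchedIterator(List<String> ids, int batchSize, Function<List<String>, List<T>> fetchBatch) {
      this.ids = ids;
      this.batchSize = batchSize;
      this.fetchBatch = fetchBatch;
    }

    @Override
    public boolean hasNext() {
      // Fetch the next batch only when the current one is exhausted and
      // the iterator has not been closed.
      while (!current.hasNext() && !closed && nextOffset < ids.size()) {
        int end = Math.min(nextOffset + batchSize, ids.size());
        current = fetchBatch.apply(ids.subList(nextOffset, end)).iterator();
        nextOffset = end;
        batchesFetched++;
      }
      return current.hasNext();
    }

    @Override
    public T next() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      return current.next();
    }

    @Override
    public void close() {
      // Rows of the batch already read remain available; no new batch is fetched.
      closed = true;
    }
  }

  public static void main(String[] args) {
    List<String> ids = Arrays.asList("a", "b", "c", "d", "e");
    try (BatchedIterator<String> it =
             new BatchedIterator<String>(ids, 2, batch -> new ArrayList<String>(batch))) {
      while (it.hasNext()) {
        System.out.println(it.next());
      }
      System.out.println("batches fetched: " + it.batchesFetched);
    }
  }
}
```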
retrieveSegmentsWithSchemaById
public List<DataSegmentPlus> retrieveSegmentsWithSchemaById(String datasource, Set<SegmentId> segmentIds)
Retrieves segments with additional metadata such as number of rows and schema fingerprint.
retrieveAllUsedSegmentSchemaFingerprints
Retrieves all used schema fingerprints present in the metadata store.
retrieveAllUsedSegmentSchemas
Retrieves all used segment schemas present in the metadata store irrespective of their last updated time.
retrieveUsedSegmentSchemasForFingerprints
public List<SegmentSchemaRecord> retrieveUsedSegmentSchemasForFingerprints(Set<String> schemaFingerprints)
Retrieves segment schemas from the metadata store for the given fingerprints.
markSegmentsAsUsed
Marks the given segment IDs as used.
Parameters:
segmentIds - Segment IDs to update. For better performance, ensure that these segment IDs are not already marked as used.
updateTime - Updated segments will have their used_status_last_updated column set to this value
Returns:
Number of segments updated in the metadata store.
markSegmentsAsUnused
Marks the given segment IDs as unused.
Parameters:
segmentIds - Segment IDs to update. For better performance, ensure that these segment IDs are not already marked as unused.
updateTime - Updated segments will have their used_status_last_updated column set to this value
Returns:
Number of segments updated in the metadata store.
markSegmentsUnused
public int markSegmentsUnused(String dataSource, org.joda.time.Interval interval, @Nullable List<String> versions, org.joda.time.DateTime updateTime)
Marks all used segments that are fully contained by a particular interval filtered by an optional list of versions as unused.
Parameters:
interval - Only used segments fully contained within this interval are eligible to be marked as unused
versions - List of eligible segment versions. If null or empty, all versions are considered eligible to be marked as unused.
updateTime - Updated segments will have their used_status_last_updated column set to this value
Returns:
Number of segments updated.
markSegmentAsUsed
markAllNonOvershadowedSegmentsAsUsed
public int markAllNonOvershadowedSegmentsAsUsed(String dataSource, org.joda.time.DateTime updateTime)
markNonOvershadowedSegmentsAsUsed
markNonOvershadowedSegmentsAsUsed
retrieveUnusedSegmentIntervals
retrieveUnusedSegmentIntervals
Gets unused segment intervals for the specified datasource. There is no guarantee on the order of intervals in the list or on whether the limited list contains the earliest or latest intervals present in the datasource.
Returns:
List of unused segment intervals containing up to limit entries.
retrieveUnusedSegmentsWithExactInterval
public List<DataSegment> retrieveUnusedSegmentsWithExactInterval(String dataSource, org.joda.time.Interval interval, org.joda.time.DateTime maxUpdatedTime, int limit)
Retrieves unused segments that exactly match the given interval.
Parameters:
interval - Returned segments must exactly match this interval.
maxUpdatedTime - Returned segments must have a used_status_last_updated which is either null or earlier than this value.
limit - Maximum number of segments to return
retrieveUnusedSegmentVersionsWithInterval
public Set<String> retrieveUnusedSegmentVersionsWithInterval(String dataSource, org.joda.time.Interval interval)
Retrieves the versions of unused segments which are perfectly aligned with the given interval.
retrieveUsedSegmentForId
Retrieves the used segment for the given ID if it exists in the metadata store, or null otherwise.
retrieveSegmentForId
Retrieves the segment for the given ID if it exists in the metadata store, or null otherwise.
retrievePendingSegmentIds
public List<SegmentIdWithShardSpec> retrievePendingSegmentIds(String dataSource, String sequenceName, String sequencePreviousId)
retrievePendingSegmentIdsWithExactInterval
public List<SegmentIdWithShardSpec> retrievePendingSegmentIdsWithExactInterval(String dataSource, String sequenceName, org.joda.time.Interval interval)
retrievePendingSegmentsWithExactInterval
public List<PendingSegmentRecord> retrievePendingSegmentsWithExactInterval(String dataSource, org.joda.time.Interval interval)
retrievePendingSegmentsOverlappingInterval
public List<PendingSegmentRecord> retrievePendingSegmentsOverlappingInterval(String dataSource, org.joda.time.Interval interval)
Fetches all the pending segments, whose interval overlaps with the given search interval, from the metadata store.
retrievePendingSegmentsForTaskAllocatorId
public List<PendingSegmentRecord> retrievePendingSegmentsForTaskAllocatorId(String dataSource, String taskAllocatorId)
getConditionForIntervalsAndMatchMode
public static String getConditionForIntervalsAndMatchMode(Collection<org.joda.time.Interval> intervals, org.apache.druid.metadata.SqlSegmentsMetadataQuery.IntervalMode matchMode, String quoteString)
Get the condition for the interval and match mode.
Parameters:
intervals - intervals to fetch the segments for
matchMode - interval match mode: overlaps or contains
quoteString - the connector-specific quote string
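The difference between the two match modes can be made concrete with a small predicate. The sketch below is an illustration only and does not reproduce the SQL condition this method generates: IntervalMatchSketch and its local IntervalMode enum are hypothetical, and intervals are modelled as half-open [start, end) ranges of epoch milliseconds, which is how Joda-Time intervals behave.

```java
class IntervalMatchSketch {
  // Local stand-in for the match mode argument.
  enum IntervalMode { CONTAINS, OVERLAPS }

  // Tests a segment interval [segStart, segEnd) against a search interval
  // [searchStart, searchEnd).
  static boolean matches(long segStart, long segEnd, long searchStart, long searchEnd, IntervalMode mode) {
    switch (mode) {
      case CONTAINS:
        // The segment lies entirely inside the search interval.
        return segStart >= searchStart && segEnd <= searchEnd;
      case OVERLAPS:
        // The two ranges share at least one instant; abutting ranges do not.
        return segStart < searchEnd && segEnd > searchStart;
      default:
        throw new IllegalArgumentException("Unknown mode: " + mode);
    }
  }

  public static void main(String[] args) {
    // Segment [10, 20) against search interval [15, 30).
    System.out.println(matches(10, 20, 15, 30, IntervalMode.CONTAINS));
    System.out.println(matches(10, 20, 15, 30, IntervalMode.OVERLAPS));
  }
}
```

This distinction is why the "used" retrieval methods on this page, which match on overlap, can return segments that extend beyond the requested interval, while the "unused" retrieval methods return only fully contained segments.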
bindIntervalsToQuery
public static void bindIntervalsToQuery(org.skife.jdbi.v2.Query<Map<String, Object>> query, Collection<org.joda.time.Interval> intervals)
Bind the supplied intervals to query.
getParameterizedInConditionForColumn
Returns:
a parameterized IN clause for the specified columnName. The column values need to be bound to a query by calling bindColumnValuesToQueryWithInCondition(String, List, SQLStatement).
bindColumnValuesToQueryWithInCondition
public static void bindColumnValuesToQueryWithInCondition(String columnName, List<String> values, org.skife.jdbi.v2.SQLStatement<?> query)
Binds the provided list of values to the specified columnName in the given SQL query that contains an IN clause.
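The two-step pattern of first generating a parameterized IN clause and then binding one value per placeholder can be sketched in plain Java. Everything specific here is an assumption made for illustration: the class InConditionSketch, the placeholder naming scheme (column name followed by an index), the rejection of empty value lists, and the use of a map in place of a JDBI statement. The actual placeholder format used by Druid may differ.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class InConditionSketch {
  // Hypothetical naming scheme: one named placeholder per value.
  static String placeholder(String columnName, int index) {
    return columnName + index;
  }

  // Builds e.g. "version IN (:version0, :version1)" without embedding values.
  static String parameterizedInCondition(String columnName, List<String> values) {
    if (values.isEmpty()) {
      throw new IllegalArgumentException("values must not be empty");
    }
    StringBuilder sql = new StringBuilder(columnName).append(" IN (");
    for (int i = 0; i < values.size(); i++) {
      if (i > 0) {
        sql.append(", ");
      }
      sql.append(':').append(placeholder(columnName, i));
    }
    return sql.append(')').toString();
  }

  // Produces the placeholder-to-value bindings that a real query object
  // would receive, using the same naming scheme as the clause above.
  static Map<String, String> bindings(String columnName, List<String> values) {
    Map<String, String> bound = new LinkedHashMap<>();
    for (int i = 0; i < values.size(); i++) {
      bound.put(placeholder(columnName, i), values.get(i));
    }
    return bound;
  }

  public static void main(String[] args) {
    List<String> versions = Arrays.asList("v1", "v2");
    System.out.println(parameterizedInCondition("version", versions));
    System.out.println(bindings("version", versions));
  }
}
```

Keeping clause generation and binding as separate steps means the values never become part of the SQL text, so both steps must agree on the placeholder names.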