@NotThreadSafe public class HoodieMergedLogRecordScanner extends BaseHoodieMergedLogRecordScanner<String>
NOTE: If readBlockLazily is turned on, the scanner does not merge as it goes; instead it keeps reading log blocks and merges everything at once. This is an optimization that avoids seeking back and forth to read each new block (forward seek()) and to lazily read the content of an already-seen block (reverse and forward seek()) during the merge.

| I/O Pass 1 | I/O Pass 2 |
|---|---|
| Read Block 1 Metadata | Read Block 1 Data |
| Read Block 2 Metadata | Read Block 2 Data |
| ..................... | ................. |
| Read Block N Metadata | Read Block N Data |

This results in two I/O passes over the log file.
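
The two passes described in the note can be pictured with a small, self-contained sketch. It is not Hudi code: LogBlock, TwoPassScanSketch, and the ioTrace list are hypothetical names introduced only to show the access order (all metadata first, then all data), under the assumption that each block consists of a header followed by a payload.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: LogBlock and the trace are illustrative, not Hudi types.
final class TwoPassScanSketch {

    // A log block modeled as a header (id) plus a payload (records).
    static final class LogBlock {
        final int id;
        final List<String> records;

        LogBlock(int id, List<String> records) {
            this.id = id;
            this.records = records;
        }
    }

    // Records the order in which the simulated log file is touched.
    final List<String> ioTrace = new ArrayList<>();

    // Pass 1: read only the metadata of every block, moving strictly forward.
    List<LogBlock> readAllMetadata(List<LogBlock> log) {
        List<LogBlock> seen = new ArrayList<>();
        for (LogBlock block : log) {
            ioTrace.add("metadata:" + block.id);
            seen.add(block);
        }
        return seen;
    }

    // Pass 2: read the data of the blocks seen in pass 1, again moving forward.
    List<String> readAllData(List<LogBlock> seen) {
        List<String> out = new ArrayList<>();
        for (LogBlock block : seen) {
            ioTrace.add("data:" + block.id);
            out.addAll(block.records);
        }
        return out;
    }

    // Runs both passes over a two-block log and returns the access order.
    static List<String> demoTrace() {
        TwoPassScanSketch sketch = new TwoPassScanSketch();
        List<LogBlock> log = Arrays.asList(
            new LogBlock(1, Arrays.asList("a")),
            new LogBlock(2, Arrays.asList("b")));
        sketch.readAllData(sketch.readAllMetadata(log));
        return sketch.ioTrace;
    }

    public static void main(String[] args) {
        System.out.println(demoTrace());
    }
}
```

In this sketch neither pass ever moves backwards, which is the property the note attributes to deferring the merge until all blocks have been read.
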
| Modifier and Type | Class and Description |
|---|---|
| static class | HoodieMergedLogRecordScanner.Builder: Builder used to build HoodieMergedLogRecordScanner. |


Nested classes inherited from class AbstractHoodieLogRecordReader: AbstractHoodieLogRecordReader.KeySpec

Fields inherited from class BaseHoodieMergedLogRecordScanner: records, timer

Fields inherited from class AbstractHoodieLogRecordReader: forceFullScan, hoodieTableMetaClient, logFilePaths, preCombineField, readerSchema, recordMerger, recordType

| Modifier | Constructor and Description |
|---|---|
| protected | HoodieMergedLogRecordScanner(HoodieStorage storage, String basePath, List<String> logFilePaths, org.apache.avro.Schema readerSchema, String latestInstantTime, Long maxMemorySizeInBytes, boolean reverseReader, int bufferSize, String spillableMapBasePath, Option<InstantRange> instantRange, ExternalSpillableMap.DiskMapType diskMapType, boolean isBitCaskDiskMapCompressionEnabled, boolean withOperationField, boolean forceFullScan, Option<String> partitionName, InternalSchema internalSchema, Option<String> keyFieldOverride, boolean enableOptimizedLogBlocksScan, HoodieRecordMerger recordMerger, Option<HoodieTableMetaClient> hoodieTableMetaClientOption) |

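
Because the constructor is protected and the class exposes a static newBuilder() together with a nested Builder, instances are presumably obtained through the builder rather than by calling the constructor directly. The sketch below shows that general pattern on a reduced, hypothetical class. ScannerSketch, its four fields, the withX setter names, and the default buffer size are illustrative assumptions and are not taken from this page.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the builder pattern; ScannerSketch is not a Hudi class.
class ScannerSketch {
    final String basePath;
    final List<String> logFilePaths;
    final int bufferSize;
    final boolean reverseReader;

    // Protected, like the constructor documented above: callers use the builder.
    protected ScannerSketch(String basePath, List<String> logFilePaths,
                            int bufferSize, boolean reverseReader) {
        this.basePath = basePath;
        this.logFilePaths = logFilePaths;
        this.bufferSize = bufferSize;
        this.reverseReader = reverseReader;
    }

    // Static entry point mirroring newBuilder().
    static Builder newBuilder() {
        return new Builder();
    }

    static class Builder {
        private String basePath;
        private List<String> logFilePaths = Collections.emptyList();
        private int bufferSize = 16 * 1024 * 1024; // illustrative default only
        private boolean reverseReader = false;

        Builder withBasePath(String basePath) {
            this.basePath = basePath;
            return this;
        }

        Builder withLogFilePaths(List<String> logFilePaths) {
            this.logFilePaths = logFilePaths;
            return this;
        }

        Builder withBufferSize(int bufferSize) {
            this.bufferSize = bufferSize;
            return this;
        }

        Builder withReverseReader(boolean reverseReader) {
            this.reverseReader = reverseReader;
            return this;
        }

        // Validates required values, then delegates to the protected constructor.
        ScannerSketch build() {
            if (basePath == null) {
                throw new IllegalStateException("basePath is required");
            }
            return new ScannerSketch(basePath, logFilePaths, bufferSize, reverseReader);
        }
    }

    public static void main(String[] args) {
        ScannerSketch scanner = ScannerSketch.newBuilder()
            .withBasePath("/tmp/table")
            .withLogFilePaths(Arrays.asList("log.1", "log.2"))
            .build();
        System.out.println(scanner.logFilePaths.size());
    }
}
```

With twenty constructor parameters, several of them booleans, named builder calls keep call sites readable and let optional values fall back to defaults; the real Builder may of course differ in names and validation.
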

| Modifier and Type | Method and Description |
|---|---|
| Map<String,HoodieRecord> | getRecords() |
| static HoodieMergedLogRecordScanner.Builder | newBuilder(): Returns the builder for HoodieMergedLogRecordScanner. |
| protected <T> void | processNextRecord(HoodieRecord<T> newRecord): Process next record. |

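
The pairing of processNextRecord(...) with a getRecords() map keyed by String suggests a keyed merge: each incoming record either becomes the first entry for its key or is combined with the entry already stored. The sketch below illustrates that idea only. Rec, MergedRecordsSketch, and the rule "the record with the greater or equal ordering value wins" are assumptions made for the example; the actual merge is delegated to the configured HoodieRecordMerger and may behave differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a keyed merge; Rec is not a Hudi type.
final class MergedRecordsSketch {

    // A record reduced to a key, an ordering value, and a payload.
    static final class Rec {
        final String key;
        final long orderingValue;
        final String payload;

        Rec(String key, long orderingValue, String payload) {
            this.key = key;
            this.orderingValue = orderingValue;
            this.payload = payload;
        }
    }

    private final Map<String, Rec> records = new LinkedHashMap<>();

    // Assumed merge rule: keep whichever record has the greater ordering
    // value; on a tie the newer record replaces the stored one.
    void processNextRecord(Rec newRecord) {
        Rec existing = records.get(newRecord.key);
        if (existing == null || newRecord.orderingValue >= existing.orderingValue) {
            records.put(newRecord.key, newRecord);
        }
    }

    // One merged record per key, analogous to a lookup table.
    Map<String, Rec> getRecords() {
        return records;
    }

    // Feeds four records for two keys through the merge and returns the result.
    static Map<String, Rec> demo() {
        MergedRecordsSketch sketch = new MergedRecordsSketch();
        sketch.processNextRecord(new Rec("k1", 1L, "old"));
        sketch.processNextRecord(new Rec("k2", 5L, "kept"));
        sketch.processNextRecord(new Rec("k1", 2L, "new"));
        sketch.processNextRecord(new Rec("k2", 3L, "stale"));
        return sketch.getRecords();
    }

    public static void main(String[] args) {
        for (Rec r : demo().values()) {
            System.out.println(r.key + " -> " + r.payload);
        }
    }
}
```

After the four calls in demo(), the map holds one entry per key: k1 carries the later payload and k2 keeps the entry with the higher ordering value.
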

Methods inherited from class BaseHoodieMergedLogRecordScanner: close, getLatestHoodieRecord, getNumMergedRecordsInLog, getRecordType, getTotalTimeTakenToReadAndMergeBlocks, iterator, processNextDeletedRecord, scan, scan, scanByFullKeys, scanByKeyPrefixes

Methods inherited from class AbstractHoodieLogRecordReader: getCurrentInstantLogBlocks, getPartitionNameOverride, getPayloadClassFQN, getPayloadProps, getProgress, getTotalCorruptBlocks, getTotalLogBlocks, getTotalLogFiles, getTotalLogRecords, getTotalRollbacks, getValidBlockInstants, isWithOperationField, scanInternal

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface java.lang.Iterable: forEach, spliterator

protected HoodieMergedLogRecordScanner(HoodieStorage storage, String basePath, List<String> logFilePaths, org.apache.avro.Schema readerSchema, String latestInstantTime, Long maxMemorySizeInBytes, boolean reverseReader, int bufferSize, String spillableMapBasePath, Option<InstantRange> instantRange, ExternalSpillableMap.DiskMapType diskMapType, boolean isBitCaskDiskMapCompressionEnabled, boolean withOperationField, boolean forceFullScan, Option<String> partitionName, InternalSchema internalSchema, Option<String> keyFieldOverride, boolean enableOptimizedLogBlocksScan, HoodieRecordMerger recordMerger, Option<HoodieTableMetaClient> hoodieTableMetaClientOption)

public Map<String,HoodieRecord> getRecords()

Superclass method: getRecords in class BaseHoodieMergedLogRecordScanner<String>

protected <T> void processNextRecord(HoodieRecord<T> newRecord) throws IOException

Description copied from class: AbstractHoodieLogRecordReader

Process next record.

Superclass method: processNextRecord in class AbstractHoodieLogRecordReader

Parameters: newRecord - Hoodie Record to process

Throws: IOException

public static HoodieMergedLogRecordScanner.Builder newBuilder()

Returns the builder for HoodieMergedLogRecordScanner.

Copyright © 2024 The Apache Software Foundation. All rights reserved.