Class ParquetUtil

java.lang.Object
org.apache.iceberg.parquet.ParquetUtil

public class ParquetUtil extends Object
  • Method Summary

    Modifier and Type
    Method
    Description
    static long
    extractTimestampInt96(ByteBuffer buffer)
    Reads a timestamp (Parquet INT96) from a byte buffer.
    static org.apache.iceberg.Metrics
    fileMetrics(org.apache.iceberg.io.InputFile file, org.apache.iceberg.MetricsConfig metricsConfig)
     
    static org.apache.iceberg.Metrics
    fileMetrics(org.apache.iceberg.io.InputFile file, org.apache.iceberg.MetricsConfig metricsConfig, org.apache.iceberg.mapping.NameMapping nameMapping)
     
    static org.apache.iceberg.Metrics
    footerMetrics(org.apache.parquet.hadoop.metadata.ParquetMetadata metadata, Stream<org.apache.iceberg.FieldMetrics<?>> fieldMetrics, org.apache.iceberg.MetricsConfig metricsConfig)
     
    static org.apache.iceberg.Metrics
    footerMetrics(org.apache.parquet.hadoop.metadata.ParquetMetadata metadata, Stream<org.apache.iceberg.FieldMetrics<?>> fieldMetrics, org.apache.iceberg.MetricsConfig metricsConfig, org.apache.iceberg.mapping.NameMapping nameMapping)
     
    static List<Long>
    getSplitOffsets(org.apache.parquet.hadoop.metadata.ParquetMetadata md)
    Returns a list of offsets in ascending order determined by the starting position of the row groups.
    static boolean
    hasNoBloomFilterPages(org.apache.parquet.hadoop.metadata.ColumnChunkMetaData meta)
     
    static boolean
    hasNonDictionaryPages(org.apache.parquet.hadoop.metadata.ColumnChunkMetaData meta)
     
    static boolean
    isIntType(org.apache.parquet.schema.PrimitiveType primitiveType)
     
    static org.apache.parquet.column.Dictionary
    readDictionary(org.apache.parquet.column.ColumnDescriptor desc, org.apache.parquet.column.page.PageReader pageSource)
     

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Method Details

    • fileMetrics

      public static org.apache.iceberg.Metrics fileMetrics(org.apache.iceberg.io.InputFile file, org.apache.iceberg.MetricsConfig metricsConfig)
    • fileMetrics

      public static org.apache.iceberg.Metrics fileMetrics(org.apache.iceberg.io.InputFile file, org.apache.iceberg.MetricsConfig metricsConfig, org.apache.iceberg.mapping.NameMapping nameMapping)
    • footerMetrics

      public static org.apache.iceberg.Metrics footerMetrics(org.apache.parquet.hadoop.metadata.ParquetMetadata metadata, Stream<org.apache.iceberg.FieldMetrics<?>> fieldMetrics, org.apache.iceberg.MetricsConfig metricsConfig)
    • footerMetrics

      public static org.apache.iceberg.Metrics footerMetrics(org.apache.parquet.hadoop.metadata.ParquetMetadata metadata, Stream<org.apache.iceberg.FieldMetrics<?>> fieldMetrics, org.apache.iceberg.MetricsConfig metricsConfig, org.apache.iceberg.mapping.NameMapping nameMapping)
    • getSplitOffsets

      public static List<Long> getSplitOffsets(org.apache.parquet.hadoop.metadata.ParquetMetadata md)
      Returns a list of offsets in ascending order determined by the starting position of the row groups.
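      The behavior described above can be illustrated with a minimal, self-contained sketch. It does not use the Parquet classes: RowGroup is a hypothetical stand-in for a row group's footer entry, and splitOffsets is an illustrative name, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SplitOffsetsSketch {
  // Hypothetical stand-in for a row group entry in the file footer.
  // Only the starting byte position of the row group matters here.
  static final class RowGroup {
    final long startingPos;

    RowGroup(long startingPos) {
      this.startingPos = startingPos;
    }
  }

  // Collects the starting position of every row group and returns
  // the positions sorted in ascending order.
  static List<Long> splitOffsets(List<RowGroup> rowGroups) {
    List<Long> offsets = new ArrayList<>(rowGroups.size());
    for (RowGroup rowGroup : rowGroups) {
      offsets.add(rowGroup.startingPos);
    }
    Collections.sort(offsets);
    return offsets;
  }
}
```

      Sorting makes the result independent of the order in which the row groups are listed in the footer.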
    • hasNonDictionaryPages

      public static boolean hasNonDictionaryPages(org.apache.parquet.hadoop.metadata.ColumnChunkMetaData meta)
    • hasNoBloomFilterPages

      public static boolean hasNoBloomFilterPages(org.apache.parquet.hadoop.metadata.ColumnChunkMetaData meta)
    • readDictionary

      public static org.apache.parquet.column.Dictionary readDictionary(org.apache.parquet.column.ColumnDescriptor desc, org.apache.parquet.column.page.PageReader pageSource)
    • isIntType

      public static boolean isIntType(org.apache.parquet.schema.PrimitiveType primitiveType)
    • extractTimestampInt96

      public static long extractTimestampInt96(ByteBuffer buffer)
      Reads a timestamp (Parquet INT96) from a ByteBuffer. Reads 12 bytes from the buffer: 8 bytes (time-of-day nanoseconds) followed by 4 bytes (Julian day).
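
      The 12-byte layout can be decoded with a sketch like the following. Several details are assumptions not stated in the description: the bytes are taken to be little-endian, the result is expressed as microseconds since the Unix epoch, and the class, method, and constant names are illustrative only. The Julian day number of 1970-01-01 is 2440588.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.concurrent.TimeUnit;

final class Int96TimestampSketch {
  // Julian day number of the Unix epoch (1970-01-01).
  private static final long UNIX_EPOCH_JULIAN_DAY = 2_440_588L;

  // Decodes 12 bytes: 8 bytes of time-of-day nanoseconds, then 4 bytes of Julian day.
  // Assumption: little-endian input, result in microseconds since the Unix epoch.
  static long toEpochMicros(ByteBuffer buffer) {
    // Work on a duplicate so the caller's position and byte order are untouched.
    ByteBuffer le = buffer.duplicate().order(ByteOrder.LITTLE_ENDIAN);
    long timeOfDayNanos = le.getLong();
    int julianDay = le.getInt();
    return TimeUnit.DAYS.toMicros(julianDay - UNIX_EPOCH_JULIAN_DAY)
        + TimeUnit.NANOSECONDS.toMicros(timeOfDayNanos);
  }

  // Helper for building test input in the same 12-byte layout.
  static ByteBuffer encode(long timeOfDayNanos, int julianDay) {
    ByteBuffer buf = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN);
    buf.putLong(timeOfDayNanos).putInt(julianDay);
    buf.flip();
    return buf;
  }
}
```

      Converting nanoseconds to microseconds truncates sub-microsecond precision, which is a consequence of the unit chosen for this sketch.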