@Experimental
public abstract class HBaseInputFormat<T extends org.apache.flink.api.java.tuple.Tuple>
extends org.apache.flink.api.common.io.RichInputFormat<T,org.apache.flink.connector.hbase.source.TableInputSplit>
InputFormat subclass that wraps the access for HTables.

| Modifier and Type | Field and Description |
|---|---|
| protected byte[] | currentRow |
| protected boolean | endReached |
| protected static org.slf4j.Logger | LOG |
| protected org.apache.hadoop.hbase.client.ResultScanner | resultScanner HBase iterator wrapper. |
| protected org.apache.hadoop.hbase.client.Scan | scan |
| protected long | scannedRows |
| protected byte[] | serializedConfig |
| protected org.apache.hadoop.hbase.client.HTable | table |
| Constructor and Description |
|---|
| HBaseInputFormat(org.apache.hadoop.conf.Configuration hConf) Constructs an InputFormat with an HBase configuration to read data from HBase. |
| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| void | closeInputFormat() |
| void | configure(org.apache.flink.configuration.Configuration parameters) Creates a Scan object and opens the HTable connection. |
| org.apache.flink.connector.hbase.source.TableInputSplit[] | createInputSplits(int minNumSplits) |
| protected org.apache.hadoop.conf.Configuration | getHadoopConfiguration() |
| org.apache.flink.core.io.InputSplitAssigner | getInputSplitAssigner(org.apache.flink.connector.hbase.source.TableInputSplit[] inputSplits) |
| protected abstract org.apache.hadoop.hbase.client.Scan | getScanner() Returns an instance of Scan that retrieves the required subset of records from the HBase table. |
| org.apache.flink.api.common.io.statistics.BaseStatistics | getStatistics(org.apache.flink.api.common.io.statistics.BaseStatistics cachedStatistics) |
| protected abstract String | getTableName() Returns the name of the table to be read. |
| protected boolean | includeRegionInScan(byte[] startKey, byte[] endKey) Tests whether the given region is to be included in the scan while splitting the regions of a table. |
| protected T | mapResultToOutType(org.apache.hadoop.hbase.client.Result r) Maps the returned Result instance into the output type T. |
| protected abstract T | mapResultToTuple(org.apache.hadoop.hbase.client.Result r) The output from HBase is always an instance of Result; this method copies its data into the required Tuple. |
| T | nextRecord(T reuse) |
| void | open(org.apache.flink.connector.hbase.source.TableInputSplit split) |
| boolean | reachedEnd() |
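The three abstract methods above (getScanner, getTableName, mapResultToTuple) are all a concrete subclass has to supply. The following is a minimal sketch of that pattern; the table and column names are hypothetical, and the HBase Scan and Result types are reduced to tiny stand-ins so the snippet is self-contained. In a real project you would extend HBaseInputFormat itself and use the real org.apache.hadoop.hbase.client classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Stand-ins for org.apache.hadoop.hbase.client.Scan / Result, reduced to the
// calls used below (a hypothetical simplification, not the real HBase API).
class Scan {
    byte[] family;
    byte[] qualifier;

    Scan addColumn(byte[] family, byte[] qualifier) {
        this.family = family;
        this.qualifier = qualifier;
        return this;
    }
}

class Result {
    private final byte[] row;
    private final Map<String, byte[]> cells; // "family:qualifier" -> value

    Result(byte[] row, Map<String, byte[]> cells) {
        this.row = row;
        this.cells = cells;
    }

    byte[] getRow() {
        return row;
    }

    byte[] getValue(byte[] family, byte[] qualifier) {
        return cells.get(new String(family, StandardCharsets.UTF_8)
                + ":" + new String(qualifier, StandardCharsets.UTF_8));
    }
}

// Mirrors the three abstract methods of HBaseInputFormat<T>.
abstract class SketchInputFormat<T> {
    protected abstract Scan getScanner();            // which rows/columns to read
    protected abstract String getTableName();        // which table to read
    protected abstract T mapResultToTuple(Result r); // HBase Result -> output tuple
}

// A concrete subclass reading (rowKey, cf:count) pairs as a String[2].
class CountInputFormat extends SketchInputFormat<String[]> {
    private static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    protected Scan getScanner() {
        return new Scan().addColumn(b("cf"), b("count"));
    }

    @Override
    protected String getTableName() {
        return "words"; // hypothetical table name
    }

    @Override
    protected String[] mapResultToTuple(Result r) {
        return new String[] {
            new String(r.getRow(), StandardCharsets.UTF_8),
            new String(r.getValue(b("cf"), b("count")), StandardCharsets.UTF_8)
        };
    }
}
```

When extending the real class, the constructor additionally takes an org.apache.hadoop.conf.Configuration in which at least hbase.zookeeper.quorum and zookeeper.znode.parent are set.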
protected static final org.slf4j.Logger LOG
protected boolean endReached
protected transient org.apache.hadoop.hbase.client.HTable table
protected transient org.apache.hadoop.hbase.client.Scan scan
protected org.apache.hadoop.hbase.client.ResultScanner resultScanner
protected byte[] currentRow
protected long scannedRows
protected byte[] serializedConfig
public HBaseInputFormat(org.apache.hadoop.conf.Configuration hConf)
Constructs an InputFormat with an HBase configuration to read data from HBase.
Parameters:
hConf - The configuration used to connect to HBase. At least hbase.zookeeper.quorum and zookeeper.znode.parent need to be set.

protected abstract org.apache.hadoop.hbase.client.Scan getScanner()
Returns an instance of Scan that retrieves the required subset of records from the HBase table.

protected abstract String getTableName()
Returns the name of the table to be read.

protected abstract T mapResultToTuple(org.apache.hadoop.hbase.client.Result r)
The output from HBase is always an instance of Result. This method copies the data in the Result instance into the required Tuple.
Parameters:
r - The Result instance from HBase that needs to be converted
Returns:
The Tuple that contains the needed information.

public void configure(org.apache.flink.configuration.Configuration parameters)
Creates a Scan object and opens the HTable connection. These are opened here because they are needed in createInputSplits, which is called before the openInputFormat method. So the connection is opened in configure(Configuration) and closed in closeInputFormat().
Specified by:
configure in interface org.apache.flink.api.common.io.InputFormat<T extends org.apache.flink.api.java.tuple.Tuple,org.apache.flink.connector.hbase.source.TableInputSplit>
Parameters:
parameters - The configuration that is to be used

protected T mapResultToOutType(org.apache.hadoop.hbase.client.Result r)
HBase returns an instance of Result. This method maps the returned Result instance into the output type T.
Parameters:
r - The Result instance from HBase that needs to be converted
Returns:
The T that contains the data of Result.

protected org.apache.hadoop.conf.Configuration getHadoopConfiguration()
public void open(org.apache.flink.connector.hbase.source.TableInputSplit split)
throws IOException
Throws:
IOException

public T nextRecord(T reuse)
throws IOException
Throws:
IOException

public boolean reachedEnd()
throws IOException
Throws:
IOException

public void close()
throws IOException
Throws:
IOException

public void closeInputFormat()
throws IOException
Overrides:
closeInputFormat in class org.apache.flink.api.common.io.RichInputFormat<T,org.apache.flink.connector.hbase.source.TableInputSplit>
Throws:
IOException

public org.apache.flink.connector.hbase.source.TableInputSplit[] createInputSplits(int minNumSplits)
throws IOException
Throws:
IOException

protected boolean includeRegionInScan(byte[] startKey,
byte[] endKey)
Tests whether the given region is to be included in the scan while splitting the regions of a table.
Parameters:
startKey - Start key of the region
endKey - End key of the region

public org.apache.flink.core.io.InputSplitAssigner getInputSplitAssigner(org.apache.flink.connector.hbase.source.TableInputSplit[] inputSplits)

public org.apache.flink.api.common.io.statistics.BaseStatistics getStatistics(org.apache.flink.api.common.io.statistics.BaseStatistics cachedStatistics)
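includeRegionInScan lets a subclass drop regions whose key range cannot contain rows the scan wants. Conceptually this is an interval-overlap test on the row-key space. The sketch below shows that logic in a self-contained form, assuming (as in HBase) unsigned lexicographic key order, a half-open [start, end) range per region, and empty keys meaning "unbounded"; the method and parameter names here are illustrative, not the real signature.

```java
final class RegionOverlap {
    // Unsigned lexicographic comparison of row keys, like HBase's Bytes.compareTo.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // True when the region [regionStart, regionEnd) intersects the scan's
    // [scanStart, scanStop). An empty key means the range is unbounded on that side.
    static boolean includeRegionInScan(byte[] scanStart, byte[] scanStop,
                                       byte[] regionStart, byte[] regionEnd) {
        boolean startsBeforeScanEnds =
                scanStop.length == 0 || compare(regionStart, scanStop) < 0;
        boolean endsAfterScanStarts =
                regionEnd.length == 0 || compare(scanStart, regionEnd) < 0;
        return startsBeforeScanEnds && endsAfterScanStarts;
    }
}
```

In the real class the scan boundaries would come from the Scan instance returned by getScanner(), and the region boundaries from the table's region metadata during createInputSplits.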
Copyright © 2014–2020 The Apache Software Foundation. All rights reserved.