public interface Bucketer<T> extends Serializable
BucketingSink
to put emitted elements into rolling files.
The BucketingSink can be writing to many buckets at a time, and it is responsible for managing
a set of active buckets. Whenever a new element arrives it will ask the Bucketer for the bucket
path the element should fall in. The Bucketer can, for example, determine buckets based on
system time.
| 限定符和类型 | 方法和说明 |
|---|---|
org.apache.hadoop.fs.Path |
getBucketPath(Clock clock,
org.apache.hadoop.fs.Path basePath,
T element)
Returns the
Path of a bucket file. |
org.apache.hadoop.fs.Path getBucketPath(Clock clock, org.apache.hadoop.fs.Path basePath, T element)
Path of a bucket file.basePath - The base path containing all the buckets.element - The current element being processed.Path of the bucket which the provided element should fall in. This
should include the basePath and also the subtaskIndex to avoid clashes with
parallel sinks.Copyright © 2014–2019 The Apache Software Foundation. All rights reserved.