Class HgkvFileSorter
java.lang.Object
    org.apache.hugegraph.computer.core.sort.HgkvFileSorter
-
Constructor Summary
Constructors
    HgkvFileSorter(org.apache.hugegraph.computer.core.config.Config config)
-
Method Summary
Modifier and Type, Method, Description

PeekableIterator<KvEntry> iterator(java.util.List<java.lang.String> inputs, boolean withSubKv)
    Get the iterator of key-value pairs in increasing order of key.
void mergeBuffers(java.util.List<org.apache.hugegraph.computer.core.io.RandomAccessInput> inputs, OuterSortFlusher flusher, java.lang.String output, boolean withSubKv)
    Merge the buffers in increasing order of key.
void mergeInputs(java.util.List<java.lang.String> inputs, OuterSortFlusher flusher, java.util.List<java.lang.String> outputs, boolean withSubKv)
    Merge the n inputs into m outputs.
void sortBuffer(org.apache.hugegraph.computer.core.io.RandomAccessInput input, InnerSortFlusher flusher, boolean withSubKv)
    Sort the buffer in increasing order of key.
-
Method Detail
-
sortBuffer
public void sortBuffer(org.apache.hugegraph.computer.core.io.RandomAccessInput input, InnerSortFlusher flusher, boolean withSubKv) throws java.lang.Exception
Description copied from interface: Sorter
Sort the buffer in increasing order of key. Every key appears only once in the output buffer. The input buffer format is:
    | key1 length | key1 | value1 length | value1 |
    | key2 length | key2 | value2 length | value2 |
    | key1 length | key1 | value3 length | value3 |
and so on. If a key appears several times, its values are combined.
Specified by:
    sortBuffer in interface Sorter
Parameters:
    input - The input buffer.
    flusher - The flusher that handles entries with the same key.
    withSubKv - True if the subKv entries need to be sorted.
Throws:
    java.lang.Exception
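The sort-and-combine behavior described for sortBuffer can be illustrated with a minimal, self-contained sketch. It does not use the HugeGraph classes. The class name SortBufferSketch, the 4-byte length fields, the UTF-8 string keys, the int values, and combining by summation are all assumptions made for illustration; the actual encoding and the combine logic of InnerSortFlusher may differ.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not the HugeGraph implementation.
class SortBufferSketch {

    // Append one entry as | key length | key | value length | value |.
    // Assumption: lengths are 4-byte ints and every value is a 4-byte int.
    static void putEntry(ByteBuffer buf, String key, int value) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        buf.putInt(k.length).put(k).putInt(Integer.BYTES).putInt(value);
    }

    // Read every entry, order by key, and combine duplicate keys by summing.
    // Assumption: keys compare as Java strings; a real sorter would likely
    // compare the raw key bytes.
    static Map<String, Integer> sortAndCombine(ByteBuffer buf) {
        TreeMap<String, Integer> sorted = new TreeMap<>();
        while (buf.hasRemaining()) {
            byte[] k = new byte[buf.getInt()];
            buf.get(k);
            buf.getInt(); // value length, always 4 in this sketch
            int value = buf.getInt();
            sorted.merge(new String(k, StandardCharsets.UTF_8), value, Integer::sum);
        }
        return sorted;
    }

    // Build the three-entry buffer from the description: key1, key2, key1.
    static ByteBuffer sample() {
        ByteBuffer buf = ByteBuffer.allocate(256);
        putEntry(buf, "key1", 1);
        putEntry(buf, "key2", 2);
        putEntry(buf, "key1", 3);
        buf.flip(); // switch from writing to reading
        return buf;
    }

    public static void main(String[] args) {
        System.out.println(sortAndCombine(sample())); // prints {key1=4, key2=2}
    }
}
```

In the output each key appears once, and the two values recorded under key1 have been combined into one.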
-
mergeBuffers
public void mergeBuffers(java.util.List<org.apache.hugegraph.computer.core.io.RandomAccessInput> inputs, OuterSortFlusher flusher, java.lang.String output, boolean withSubKv) throws java.lang.Exception
Description copied from interface: Sorter
Merge the buffers in increasing order of key. Each input buffer in the list is already sorted in increasing order of key. There are two formats for the input buffer:
1. | key1 length | key1 | value1 length | value1 |
   | key2 length | key2 | value2 length | value2 |
   and so on. Keys are in increasing order in each buffer.
2. | key1 length | key1 | value1 length | sub-entry count |
   | sub-key1 length | sub-key1 | sub-value1 length | sub-value1 |
   | sub-key2 length | sub-key2 | sub-value2 length | sub-value2 |
   and so on. Keys are in increasing order in each buffer. Sub-keys are in increasing order within a key-value pair.
The merged result of the buffers is written to the location given by output.
Specified by:
    mergeBuffers in interface Sorter
Parameters:
    inputs - The input buffer list.
    flusher - The flusher that handles entries with the same key.
    output - The output location of the sort result.
    withSubKv - True if the subKv entries need to be sorted.
Throws:
    java.lang.Exception
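Merging several buffers that are each already sorted by key is commonly done as a k-way merge over a priority queue. The sketch below shows that general technique on in-memory lists; it is not taken from HgkvFileSorter, whose internals are not shown here. The class MergeBuffersSketch, the helper kv, the string keys, and summing the values of equal keys are hypothetical stand-ins for the buffers and for whatever OuterSortFlusher does.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch of a k-way merge, not the HugeGraph implementation.
class MergeBuffersSketch {

    static Map.Entry<String, Integer> kv(String key, int value) {
        return new AbstractMap.SimpleImmutableEntry<>(key, value);
    }

    // Each inner list stands in for one input buffer, sorted by key.
    static List<Map.Entry<String, Integer>> merge(
            List<List<Map.Entry<String, Integer>>> inputs) {
        // Heap of cursors {input index, position}, ordered by the key under the cursor.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
                Comparator.comparing((int[] c) -> inputs.get(c[0]).get(c[1]).getKey()));
        for (int i = 0; i < inputs.size(); i++) {
            if (!inputs.get(i).isEmpty()) {
                heap.add(new int[] {i, 0});
            }
        }
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] c = heap.poll();
            Map.Entry<String, Integer> e = inputs.get(c[0]).get(c[1]);
            int last = out.size() - 1;
            if (last >= 0 && out.get(last).getKey().equals(e.getKey())) {
                // Same key as the previous output entry: combine (assumed: sum).
                out.set(last, kv(e.getKey(), out.get(last).getValue() + e.getValue()));
            } else {
                out.add(kv(e.getKey(), e.getValue()));
            }
            // Advance the cursor of the input the entry came from.
            if (c[1] + 1 < inputs.get(c[0]).size()) {
                heap.add(new int[] {c[0], c[1] + 1});
            }
        }
        return out;
    }

    static List<List<Map.Entry<String, Integer>>> sample() {
        return Arrays.asList(
                Arrays.asList(kv("a", 1), kv("c", 4)),
                Arrays.asList(kv("b", 2)),
                Arrays.asList(kv("b", 3), kv("d", 5)));
    }

    public static void main(String[] args) {
        System.out.println(merge(sample())); // prints [a=1, b=5, c=4, d=5]
    }
}
```

Because every input is sorted, equal keys always reach the output consecutively, so comparing against the last emitted entry is enough to detect duplicates.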
-
mergeInputs
public void mergeInputs(java.util.List<java.lang.String> inputs, OuterSortFlusher flusher, java.util.List<java.lang.String> outputs, boolean withSubKv) throws java.lang.Exception
Description copied from interface: Sorter
Merge the n inputs into m outputs, where n is the size of inputs and m is the size of outputs. Each input file in the list is already sorted in increasing order of key. There are two formats for the input buffer:
1. | key1 length | key1 | value1 length | value1 |
   | key2 length | key2 | value2 length | value2 |
   and so on. Keys are in increasing order in each buffer.
2. | key1 length | key1 | value1 length | sub-entry count |
   | sub-key1 length | sub-key1 | sub-value1 length | sub-value1 |
   | sub-key2 length | sub-key2 | sub-value2 length | sub-value2 |
   and so on. Sub-keys are in increasing order within a key-value pair.
The format of the outputs is the same as that of the inputs. For example, if there are 100 inputs and 10 outputs, this method merges the 100 inputs into 10 outputs. The outputs need to be as evenly distributed as possible. One way to achieve this is to sort the inputs in descending order, then assign them one by one to the output that currently has the least input. The aim is to keep the difference between outputs below the smallest input.
Specified by:
    mergeInputs in interface Sorter
Parameters:
    inputs - The input file list.
    flusher - The flusher that handles entries with the same key.
    outputs - The output locations of the sort result.
    withSubKv - True if the subKv entries need to be sorted.
Throws:
    java.lang.Exception
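The balancing strategy described for mergeInputs (sort the inputs in descending order, then give each one to the output with the least so far) can be sketched as a simple greedy assignment. This is only an illustration: the class MergeInputsPlanSketch and the method assign are hypothetical, and it assumes that "least" is measured by total input size in bytes, which the description does not state explicitly.

```java
import java.util.Arrays;

// Hypothetical sketch of the greedy distribution, not the HugeGraph implementation.
class MergeInputsPlanSketch {

    // Returns target[i] = index of the output that input i is assigned to.
    // Assumption: sizes[i] is the size of input i, and m >= 1.
    static int[] assign(long[] sizes, int m) {
        Integer[] order = new Integer[sizes.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Visit inputs from largest to smallest.
        Arrays.sort(order, (a, b) -> Long.compare(sizes[b], sizes[a]));

        long[] load = new long[m];
        int[] target = new int[sizes.length];
        for (int idx : order) {
            // Pick the output with the smallest total so far (lowest index on ties).
            int best = 0;
            for (int j = 1; j < m; j++) {
                if (load[j] < load[best]) {
                    best = j;
                }
            }
            target[idx] = best;
            load[best] += sizes[idx];
        }
        return target;
    }

    public static void main(String[] args) {
        long[] sizes = {5, 9, 3, 7, 2, 8};
        // Output 0 receives 9 + 5 + 3 = 17, output 1 receives 8 + 7 + 2 = 17.
        System.out.println(Arrays.toString(assign(sizes, 2))); // prints [0, 0, 0, 1, 1, 1]
    }
}
```

Placing the largest inputs first leaves the small ones to even out the totals at the end, which is why the descending sort comes before the assignment.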
-
iterator
public PeekableIterator<KvEntry> iterator(java.util.List<java.lang.String> inputs, boolean withSubKv) throws java.io.IOException
Description copied from interface: Sorter
Get the iterator of key-value pairs in increasing order of key.
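A peekable iterator lets the caller inspect the next entry without consuming it, which is useful when merging sorted streams. The sketch below shows the general idea with a small wrapper around java.util.Iterator. The class PeekingSketch and its peek method are hypothetical; the methods of the real PeekableIterator interface are not listed on this page and may differ.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch of a peekable iterator, not the HugeGraph interface.
class PeekingSketch<T> implements Iterator<T> {

    private final Iterator<T> source;
    private T head;          // the buffered next element, if any
    private boolean hasHead; // whether head currently holds an element

    PeekingSketch(Iterator<T> source) {
        this.source = source;
        advance();
    }

    // Pull the next element from the source into the one-element buffer.
    private void advance() {
        hasHead = source.hasNext();
        head = hasHead ? source.next() : null;
    }

    // Return the next element without consuming it.
    T peek() {
        if (!hasHead) {
            throw new NoSuchElementException();
        }
        return head;
    }

    @Override
    public boolean hasNext() {
        return hasHead;
    }

    @Override
    public T next() {
        T result = peek();
        advance();
        return result;
    }

    public static void main(String[] args) {
        PeekingSketch<String> it = new PeekingSketch<>(Arrays.asList("a", "b").iterator());
        System.out.println(it.peek());    // prints a
        System.out.println(it.next());    // prints a
        System.out.println(it.next());    // prints b
        System.out.println(it.hasNext()); // prints false
    }
}
```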