Class BufferFileSorter

  • All Implemented Interfaces:
    Sorter

    public class BufferFileSorter
    extends java.lang.Object
    implements Sorter
    • Constructor Summary

      Constructors 
      Constructor Description
      BufferFileSorter​(org.apache.hugegraph.computer.core.config.Config config)  
    • Method Summary

      Modifier and Type Method Description
      PeekableIterator<KvEntry> iterator​(java.util.List<java.lang.String> inputs, boolean withSubKv)
      Get an iterator over key-value pairs in increasing order of key.
      void mergeBuffers​(java.util.List<org.apache.hugegraph.computer.core.io.RandomAccessInput> inputs, OuterSortFlusher flusher, java.lang.String output, boolean withSubKv)
      Merge the buffers by increasing order of key.
      void mergeInputs​(java.util.List<java.lang.String> inputs, OuterSortFlusher flusher, java.util.List<java.lang.String> outputs, boolean withSubKv)
      Merge the n inputs into m outputs.
      void sortBuffer​(org.apache.hugegraph.computer.core.io.RandomAccessInput input, InnerSortFlusher flusher, boolean withSubKv)
      Sort the buffer by increasing order of key.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • BufferFileSorter

        public BufferFileSorter​(org.apache.hugegraph.computer.core.config.Config config)
    • Method Detail

      • sortBuffer

        public void sortBuffer​(org.apache.hugegraph.computer.core.io.RandomAccessInput input,
                               InnerSortFlusher flusher,
                               boolean withSubKv)
                        throws java.lang.Exception
        Description copied from interface: Sorter
        Sort the buffer by increasing order of key. Each key appears only once in the output buffer. The input buffer format is:
          | key1 length | key1 | value1 length | value1 |
          | key2 length | key2 | value2 length | value2 |
          | key1 length | key1 | value3 length | value3 |
          and so on.
        If a key appears several times, its values are combined.
        Specified by:
        sortBuffer in interface Sorter
        Parameters:
        input - The input buffer.
        flusher - The flusher for the same key.
        withSubKv - True if the sub-key-value entries (subKv) also need to be sorted.
        Throws:
        java.lang.Exception
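
To make the buffer layout above concrete, here is a minimal sketch that writes the three example entries and reads them back grouped by key. It is not the BufferFileSorter implementation. The class `SortBufferSketch`, its helper methods, the 4-byte length prefix, and the use of UTF-8 strings are all assumptions made for illustration; how values for the same key are actually combined is decided by the flusher.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a hypothetical class, not part of hugegraph-computer.
class SortBufferSketch {

    // Append one entry as | key length | key | value length | value |.
    // Assumption: each length is a 4-byte int; the real encoding may differ.
    static void putEntry(ByteBuffer buffer, String key, String value) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        buffer.putInt(k.length).put(k).putInt(v.length).put(v);
    }

    // Read one length-prefixed field and decode it as a string.
    static String readString(ByteBuffer buffer) {
        byte[] bytes = new byte[buffer.getInt()];
        buffer.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Read every entry and collect the values of each key.
    // TreeMap iterates keys in increasing (String.compareTo) order,
    // so each key appears once and in sorted position.
    static Map<String, List<String>> sortAndGroup(ByteBuffer buffer) {
        Map<String, List<String>> sorted = new TreeMap<>();
        while (buffer.hasRemaining()) {
            String key = readString(buffer);
            String value = readString(buffer);
            sorted.computeIfAbsent(key, unused -> new ArrayList<>()).add(value);
        }
        return sorted;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(256);
        putEntry(buffer, "key1", "value1");
        putEntry(buffer, "key2", "value2");
        putEntry(buffer, "key1", "value3");
        buffer.flip(); // switch from writing to reading
        System.out.println(sortAndGroup(buffer));
    }
}
```

Running `main` prints `{key1=[value1, value3], key2=[value2]}`: key1 occurs once in the result even though it occurs twice in the input.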
      • mergeBuffers

        public void mergeBuffers​(java.util.List<org.apache.hugegraph.computer.core.io.RandomAccessInput> inputs,
                                 OuterSortFlusher flusher,
                                 java.lang.String output,
                                 boolean withSubKv)
                          throws java.lang.Exception
        Description copied from interface: Sorter
        Merge the buffers by increasing order of key. Each input buffer in the list is sorted by increasing order of key. There are two formats for the input buffer:
          1. | key1 length | key1 | value1 length | value1 |
             | key2 length | key2 | value2 length | value2 |
             and so on. Keys are in increasing order in each buffer.
          2. | key1 length | key1 | value1 length | sub-entry count |
             | sub-key1 length | sub-key1 | sub-value1 length | sub-value1 |
             | sub-key2 length | sub-key2 | sub-value2 length | sub-value2 |
             and so on. Keys are in increasing order in each buffer. Sub-keys are in increasing order within a key-value pair.
        The merged result of all input buffers is written to output.
        Specified by:
        mergeBuffers in interface Sorter
        Parameters:
        inputs - The input buffer list.
        flusher - The flusher for the same key.
        output - Sort result output location.
        withSubKv - True if the sub-key-value entries (subKv) also need to be sorted.
        Throws:
        java.lang.Exception
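
Merging several buffers that are each already sorted by key is commonly done with a heap-based k-way merge. The sketch below shows that general technique on in-memory lists; it is not taken from the BufferFileSorter source. `MergeBuffersSketch`, `Cursor`, and the `+` join used to combine values of equal keys are hypothetical stand-ins (in the real API the combining is the flusher's job).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative only: a hypothetical class, not part of hugegraph-computer.
class MergeBuffersSketch {

    // Hypothetical read position inside one sorted input.
    private static final class Cursor {
        final List<String[]> entries; // each entry is {key, value}
        final int source;             // index of the input, used to break ties
        int index;

        Cursor(List<String[]> entries, int source) {
            this.entries = entries;
            this.source = source;
        }

        String key() { return entries.get(index)[0]; }
        String value() { return entries.get(index)[1]; }
    }

    // Merge inputs that are each sorted by key into "key=value" strings sorted by key.
    // Values of equal keys are joined with '+' as a stand-in for the flusher's combine.
    static List<String> merge(List<List<String[]>> inputs) {
        Comparator<Cursor> order = (a, b) -> {
            int byKey = a.key().compareTo(b.key());
            return byKey != 0 ? byKey : Integer.compare(a.source, b.source);
        };
        PriorityQueue<Cursor> heap = new PriorityQueue<>(order);
        for (int i = 0; i < inputs.size(); i++) {
            if (!inputs.get(i).isEmpty()) {
                heap.add(new Cursor(inputs.get(i), i));
            }
        }
        List<String> output = new ArrayList<>();
        String currentKey = null;
        StringBuilder currentValue = new StringBuilder();
        while (!heap.isEmpty()) {
            Cursor cursor = heap.poll(); // smallest remaining key
            if (!cursor.key().equals(currentKey)) {
                if (currentKey != null) {
                    output.add(currentKey + "=" + currentValue);
                }
                currentKey = cursor.key();
                currentValue.setLength(0);
            } else {
                currentValue.append('+');
            }
            currentValue.append(cursor.value());
            // Advance this input and put it back if it has more entries.
            if (++cursor.index < cursor.entries.size()) {
                heap.add(cursor);
            }
        }
        if (currentKey != null) {
            output.add(currentKey + "=" + currentValue);
        }
        return output;
    }

    public static void main(String[] args) {
        List<List<String[]>> inputs = Arrays.asList(
                Arrays.asList(new String[] {"a", "1"}, new String[] {"c", "3"}),
                Arrays.asList(new String[] {"a", "2"}, new String[] {"b", "4"}));
        System.out.println(merge(inputs));
    }
}
```

The heap holds at most one cursor per input, so each step costs O(log k) for k inputs, and no input has to be fully loaded before merging starts.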
      • mergeInputs

        public void mergeInputs​(java.util.List<java.lang.String> inputs,
                                OuterSortFlusher flusher,
                                java.util.List<java.lang.String> outputs,
                                boolean withSubKv)
                         throws java.lang.Exception
        Description copied from interface: Sorter
        Merge the n inputs into m outputs, where 'n' is the size of inputs and 'm' is the size of outputs. Each input file in the list is sorted by increasing order of key. There are two formats for the input buffer:
          1. | key1 length | key1 | value1 length | value1 |
             | key2 length | key2 | value2 length | value2 |
             and so on. Keys are in increasing order in each buffer.
          2. | key1 length | key1 | value1 length | sub-entry count |
             | sub-key1 length | sub-key1 | sub-value1 length | sub-value1 |
             | sub-key2 length | sub-key2 | sub-value2 length | sub-value2 |
             and so on. Sub-keys are in increasing order within a key-value pair.
        The format of the outputs is the same as that of the inputs. For example, if the number of inputs is 100 and the number of outputs is 10, this method merges the 100 inputs into 10 outputs. The outputs need to be as evenly distributed as possible. An implementation might sort the inputs in descending order, then take them one by one and assign each to the output that has the least input so far. This keeps the difference between the outputs small.
        Specified by:
        mergeInputs in interface Sorter
        Parameters:
        inputs - The input file list.
        flusher - The flusher for the same key.
        outputs - Sort result output locations.
        withSubKv - True if the sub-key-value entries (subKv) also need to be sorted.
        Throws:
        java.lang.Exception
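
The balancing strategy described for mergeInputs can be sketched as a greedy assignment. This example reads "descending order" as descending input size and "least inputs" as the smallest total size assigned so far; both readings are assumptions, as are the class `MergeInputsSketch` and the method `distribute`. It only decides which inputs go to which output and performs no actual merging.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative only: a hypothetical class, not part of hugegraph-computer.
class MergeInputsSketch {

    // Split input sizes into m groups with totals as even as the greedy rule allows.
    // Returns one list of sizes per output.
    static List<List<Long>> distribute(List<Long> inputSizes, int m) {
        List<Long> sorted = new ArrayList<>(inputSizes);
        sorted.sort(Comparator.reverseOrder()); // largest input first

        List<List<Long>> groups = new ArrayList<>();
        for (int i = 0; i < m; i++) {
            groups.add(new ArrayList<>());
        }
        long[] totals = new long[m];

        for (long size : sorted) {
            // Find the output with the smallest total so far.
            int smallest = 0;
            for (int i = 1; i < m; i++) {
                if (totals[i] < totals[smallest]) {
                    smallest = i;
                }
            }
            groups.get(smallest).add(size);
            totals[smallest] += size;
        }
        return groups;
    }

    public static void main(String[] args) {
        // Five inputs of sizes 7, 5, 4, 3, 1 spread over two outputs.
        System.out.println(distribute(Arrays.asList(7L, 5L, 4L, 3L, 1L), 2));
    }
}
```

For the sizes in `main`, the result is `[[7, 3], [5, 4, 1]]`, so both outputs receive a total of 10. Placing the largest inputs first leaves the small ones to even out the remaining gaps.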
      • iterator

        public PeekableIterator<KvEntry> iterator​(java.util.List<java.lang.String> inputs,
                                                  boolean withSubKv)
                                           throws java.io.IOException
        Description copied from interface: Sorter
        Get an iterator over key-value pairs in increasing order of key.
        Specified by:
        iterator in interface Sorter
        Throws:
        java.io.IOException