org.apache.hadoop.tools.mapred
Class CopyOutputFormat<K,V>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.OutputFormat<K,V>
      extended by org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<K,V>
          extended by org.apache.hadoop.mapreduce.lib.output.TextOutputFormat<K,V>
              extended by org.apache.hadoop.tools.mapred.CopyOutputFormat<K,V>
Type Parameters:
K - the output key type
V - the output value type

public class CopyOutputFormat<K,V>
extends org.apache.hadoop.mapreduce.lib.output.TextOutputFormat<K,V>

CopyOutputFormat is the Hadoop OutputFormat used by DistCp. It stores DistCp's output settings, such as the working directory and the final commit-directory, in the Job's Configuration (reached through the JobContext). It also supplies the output committer that DistCp requires.


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.LineRecordWriter<K,V>
 
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.Counter
 
Field Summary
 
Fields inherited from class org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
SEPERATOR
 
Fields inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
BASE_OUTPUT_NAME, COMPRESS, COMPRESS_CODEC, COMPRESS_TYPE, OUTDIR, PART
 
Constructor Summary
CopyOutputFormat()
           
 
Method Summary
 void checkOutputSpecs(org.apache.hadoop.mapreduce.JobContext context)
           
static org.apache.hadoop.fs.Path getCommitDirectory(org.apache.hadoop.mapreduce.Job job)
          Getter for the final commit-directory.
 org.apache.hadoop.mapreduce.OutputCommitter getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
           
static org.apache.hadoop.fs.Path getWorkingDirectory(org.apache.hadoop.mapreduce.Job job)
          Getter for the working directory.
static void setCommitDirectory(org.apache.hadoop.mapreduce.Job job, org.apache.hadoop.fs.Path commitDirectory)
          Setter for the final commit-directory for DistCp (where copied files are moved atomically).
static void setWorkingDirectory(org.apache.hadoop.mapreduce.Job job, org.apache.hadoop.fs.Path workingDirectory)
          Setter for the working directory for DistCp (where files are copied before they are moved to the final commit-directory).
 
Methods inherited from class org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
getRecordWriter
 
Methods inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
getCompressOutput, getDefaultWorkFile, getOutputCompressorClass, getOutputName, getOutputPath, getPathForWorkFile, getUniqueFile, getWorkOutputPath, setCompressOutput, setOutputCompressorClass, setOutputName, setOutputPath
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CopyOutputFormat

public CopyOutputFormat()
Method Detail

setWorkingDirectory

public static void setWorkingDirectory(org.apache.hadoop.mapreduce.Job job,
                                       org.apache.hadoop.fs.Path workingDirectory)
Setter for the working directory for DistCp (where files are copied before they are moved to the final commit-directory).

Parameters:
job - The Job on whose configuration the working-directory is to be set.
workingDirectory - The path to use as the working directory.

setCommitDirectory

public static void setCommitDirectory(org.apache.hadoop.mapreduce.Job job,
                                      org.apache.hadoop.fs.Path commitDirectory)
Setter for the final commit-directory for DistCp (where copied files are moved atomically).

Parameters:
job - The Job on whose configuration the commit-directory is to be set.
commitDirectory - The path to use for final commit.
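
Taken together, the working directory and the commit-directory describe a stage-then-commit pattern: data is written under the working directory and only moved to the commit-directory at the end. The following is a minimal sketch of that pattern on a local filesystem using java.nio.file. It is an analogy, not DistCp's implementation: the class StagedCommitSketch, its method names, and the file name part-0 are hypothetical, and a real job would work with org.apache.hadoop.fs.Path values through a Hadoop FileSystem.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical local-filesystem analogy; not part of Hadoop or DistCp.
class StagedCommitSketch {

    // Writes one file under workingDir, then renames workingDir to commitDir in one step.
    static void stageAndCommit(Path workingDir, Path commitDir, String content) {
        try {
            Files.createDirectories(workingDir);
            Files.write(workingDir.resolve("part-0"),
                        content.getBytes(StandardCharsets.UTF_8));
            // Assumes commitDir does not exist yet and lives on the same file store,
            // so the move can be carried out as a single rename.
            Files.move(workingDir, commitDir, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs the pattern in a fresh temporary directory and reports whether the
    // staged file ended up under the commit directory and the working directory is gone.
    static boolean demo() {
        try {
            Path base = Files.createTempDirectory("staged-commit");
            Path work = base.resolve("work");
            Path commit = base.resolve("final");
            stageAndCommit(work, commit, "copied bytes");
            return Files.exists(commit.resolve("part-0")) && Files.notExists(work);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("committed: " + demo());
    }
}
```

Because readers of the commit-directory only ever see it after the rename, they never observe a partially copied file set; that is the property the word "atomically" refers to in the description above.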

getWorkingDirectory

public static org.apache.hadoop.fs.Path getWorkingDirectory(org.apache.hadoop.mapreduce.Job job)
Getter for the working directory.

Parameters:
job - The Job from whose configuration the working-directory is to be retrieved.
Returns:
The working-directory Path.

getCommitDirectory

public static org.apache.hadoop.fs.Path getCommitDirectory(org.apache.hadoop.mapreduce.Job job)
Getter for the final commit-directory.

Parameters:
job - The Job from whose configuration the commit-directory is to be retrieved.
Returns:
The commit-directory Path.
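
The four static accessors above form a set/get round trip over the Job's configuration. The sketch below models that round trip with a plain Map standing in for the configuration, so it runs without Hadoop on the classpath. Everything in it is an assumption made for illustration: the class DirectorySettingsSketch, the two key names, the use of String instead of org.apache.hadoop.fs.Path, and the choice to return null for an unset value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Job's Configuration; not the Hadoop API.
class DirectorySettingsSketch {

    // Assumed key names, chosen for illustration only.
    static final String WORK_KEY = "sketch.work.path";
    static final String COMMIT_KEY = "sketch.commit.path";

    static void setWorkingDirectory(Map<String, String> conf, String dir) {
        conf.put(WORK_KEY, dir);
    }

    static void setCommitDirectory(Map<String, String> conf, String dir) {
        conf.put(COMMIT_KEY, dir);
    }

    static String getWorkingDirectory(Map<String, String> conf) {
        return read(conf, WORK_KEY);
    }

    static String getCommitDirectory(Map<String, String> conf) {
        return read(conf, COMMIT_KEY);
    }

    // Assumption: an unset or empty value reads back as null.
    private static String read(Map<String, String> conf, String key) {
        String value = conf.get(key);
        return (value == null || value.isEmpty()) ? null : value;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        setWorkingDirectory(conf, "/tmp/work");
        setCommitDirectory(conf, "/data/final");
        System.out.println(getWorkingDirectory(conf) + " -> " + getCommitDirectory(conf));
    }
}
```

With Hadoop available, the corresponding calls would be CopyOutputFormat.setWorkingDirectory(job, path) and CopyOutputFormat.getWorkingDirectory(job), with the same shape for the commit-directory, as given by the signatures on this page.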

getOutputCommitter

public org.apache.hadoop.mapreduce.OutputCommitter getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
                                                               throws IOException
Overrides:
getOutputCommitter in class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<K,V>
Throws:
IOException

checkOutputSpecs

public void checkOutputSpecs(org.apache.hadoop.mapreduce.JobContext context)
                      throws IOException
Overrides:
checkOutputSpecs in class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<K,V>
Throws:
IOException
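
This page gives no description for checkOutputSpecs. One plausible reading, offered here as an assumption rather than documented behaviour, is that the method fails fast when the commit-directory or the working directory has not been configured. The sketch below shows that kind of check against a Map standing in for the job configuration; the class OutputSpecCheckSketch, the key names, the exception type, and the messages are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical validation sketch; not the Hadoop implementation.
class OutputSpecCheckSketch {

    // Assumed key names, chosen for illustration only.
    static final String WORK_KEY = "sketch.work.path";
    static final String COMMIT_KEY = "sketch.commit.path";

    // Throws if either directory setting is missing or empty.
    static void checkOutputSpecs(Map<String, String> conf) {
        if (isUnset(conf.get(COMMIT_KEY))) {
            throw new IllegalStateException("Commit directory not configured");
        }
        if (isUnset(conf.get(WORK_KEY))) {
            throw new IllegalStateException("Working directory not configured");
        }
    }

    private static boolean isUnset(String value) {
        return value == null || value.isEmpty();
    }

    // Convenience wrapper: true when the configuration is rejected.
    static boolean rejects(Map<String, String> conf) {
        try {
            checkOutputSpecs(conf);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println("empty configuration rejected: " + rejects(conf));
    }
}
```

Validating before any task starts means a misconfigured copy is reported at submission time instead of after data has already been written to a working directory.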


Copyright © 2014 Apache Software Foundation. All Rights Reserved.