org.apache.hadoop.tools
Class DistCp

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.hadoop.tools.DistCp
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool

public class DistCp
extends org.apache.hadoop.conf.Configured
implements org.apache.hadoop.util.Tool

DistCp is the main driver-class for DistCpV2. For command-line use, DistCp::main() orchestrates the parsing of command-line parameters and the launch of the DistCp job. For programmatic use, a DistCp object can be constructed by specifying options (in a DistCpOptions object), and DistCp::execute() may be used to launch the copy-job. DistCp may alternatively be sub-classed to fine-tune behaviour.


Field Summary
static Random rand
           
static int SHUTDOWN_HOOK_PRIORITY
          Priority of the DistCp shutdown hook.
 
Constructor Summary
DistCp()
          To be used with the ToolRunner.
DistCp(org.apache.hadoop.conf.Configuration configuration, DistCpOptions inputOptions)
          Public Constructor.
 
Method Summary
protected  org.apache.hadoop.fs.Path createInputFileListing(org.apache.hadoop.mapreduce.Job job)
          Create input listing by invoking an appropriate copy listing implementation.
 org.apache.hadoop.mapreduce.Job execute()
          Implements the core-execution.
protected  org.apache.hadoop.fs.Path getFileListingPath()
          Get default name of the copy listing file.
static void main(String[] argv)
          Main function of the DistCp program.
 int run(String[] argv)
          Implementation of Tool::run().
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Field Detail

SHUTDOWN_HOOK_PRIORITY

public static final int SHUTDOWN_HOOK_PRIORITY
Priority of the DistCp shutdown hook.

See Also:
Constant Field Values

rand

public static final Random rand

Constructor Detail

DistCp

public DistCp(org.apache.hadoop.conf.Configuration configuration,
              DistCpOptions inputOptions)
       throws Exception
Public Constructor. Creates a DistCp object with the specified input parameters (e.g. source paths, target location).

Parameters:
configuration - The Hadoop configuration against which the Copy-mapper must run.
inputOptions - Options (indicating source paths, target location, etc.).
Throws:
Exception - on failure.

DistCp

public DistCp()
To be used with the ToolRunner. Not for public consumption.

Method Detail

run

public int run(String[] argv)
Implementation of Tool::run(). Orchestrates the copy of source file(s) to the target location by:
1. Creating a list of files to be copied to the target.
2. Launching a Map-only job to copy the files. (Delegates to execute().)

Specified by:
run in interface org.apache.hadoop.util.Tool
Parameters:
argv - List of arguments passed to DistCp, from the ToolRunner.
Returns:
0 on success; -1 otherwise.
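
The control flow described for run() can be modeled without Hadoop on the classpath. The following is a minimal sketch, not the actual DistCp source: RunSketch, CopyJob and createListing are hypothetical stand-ins, and the convention that the last argument is the target is an assumption made for illustration. Only the overall shape (build a listing, launch the copy through execute(), map any exception to -1) is taken from the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a Tool-style driver; this is NOT the Hadoop DistCp class. */
final class RunSketch {

    /** Hypothetical job handle: it only records what would be copied. */
    static final class CopyJob {
        final List<String> listing;

        CopyJob(List<String> listing) {
            this.listing = listing;
        }
    }

    /** Step 1: build the list of files to copy (assumption: the last argument is the target). */
    static List<String> createListing(String[] argv) {
        if (argv.length < 2) {
            throw new IllegalArgumentException("need at least one source and a target");
        }
        return new ArrayList<>(Arrays.asList(argv).subList(0, argv.length - 1));
    }

    /** Step 2: "launch" the copy; like execute(), it reports failure by throwing. */
    static CopyJob execute(String[] argv) throws Exception {
        return new CopyJob(createListing(argv));
    }

    /** Mirrors the documented contract of run(): 0 on success, -1 otherwise. */
    static int run(String[] argv) {
        try {
            execute(argv);
            return 0;
        } catch (Exception e) {
            System.err.println("copy failed: " + e.getMessage());
            return -1;
        }
    }
}
```

A caller would typically pass the value returned by run() to System.exit(), so that the process exit status reflects the outcome of the copy.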

execute

public org.apache.hadoop.mapreduce.Job execute()
                                        throws Exception
Implements the core execution. Creates the file list for the copy and launches the Hadoop job that performs it.

Returns:
Job handle
Throws:
Exception - on failure.

createInputFileListing

protected org.apache.hadoop.fs.Path createInputFileListing(org.apache.hadoop.mapreduce.Job job)
                                                    throws IOException
Creates the input listing by invoking an appropriate copy listing implementation. Also adds delegation tokens for each path to the job's credential store.

Parameters:
job - Handle to the job.
Returns:
The path where the copy listing is created.
Throws:
IOException - on I/O failure.

getFileListingPath

protected org.apache.hadoop.fs.Path getFileListingPath()
                                                throws IOException
Gets the default name of the copy listing file. The copy listing file is created inside the meta folder.

Returns:
Path where the copy listing file is to be saved.
Throws:
IOException - on I/O failure.
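
Because createInputFileListing and getFileListingPath are protected, they are the natural extension points when DistCp is sub-classed. The sketch below models that arrangement with plain JDK types so that it runs standalone. ListingDriver, FilteredDriver, the fileList.txt name and the ".tmp" filter are all hypothetical; they do not reflect Hadoop's actual classes, signatures or file names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical base driver exposing two protected hooks; this is NOT Hadoop code. */
class ListingDriver {
    protected final Path metaFolder;
    protected final List<String> sources;

    ListingDriver(Path metaFolder, List<String> sources) {
        this.metaFolder = metaFolder;
        this.sources = sources;
    }

    /** Hook: location of the listing file, placed inside the meta folder. */
    protected Path getFileListingPath() throws IOException {
        Files.createDirectories(metaFolder);
        return metaFolder.resolve("fileList.txt"); // assumed file name
    }

    /** Hook: write one source path per line and return the listing path. */
    protected Path createInputFileListing() throws IOException {
        Path listing = getFileListingPath();
        Files.write(listing, sources);
        return listing;
    }

    /** Fixed driver logic that delegates to the hooks. */
    public Path execute() throws IOException {
        return createInputFileListing();
    }
}

/** Subclass that fine-tunes behaviour by leaving temporary files out of the listing. */
class FilteredDriver extends ListingDriver {
    FilteredDriver(Path metaFolder, List<String> sources) {
        super(metaFolder, sources);
    }

    @Override
    protected Path createInputFileListing() throws IOException {
        Path listing = getFileListingPath();
        List<String> kept = sources.stream()
                .filter(s -> !s.endsWith(".tmp"))
                .collect(Collectors.toList());
        Files.write(listing, kept);
        return listing;
    }
}
```

Overriding only the listing hook leaves the rest of the driver untouched, which is the kind of fine-tuning the class description refers to.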

main

public static void main(String[] argv)
Main function of the DistCp program. Parses the input arguments (via OptionsParser), and invokes the DistCp::run() method, via the ToolRunner.

Parameters:
argv - Command-line arguments sent to DistCp.


Copyright © 2014 Apache Software Foundation. All Rights Reserved.