Class AbstractHadoopProcessor
java.lang.Object
org.apache.nifi.components.AbstractConfigurableComponent
org.apache.nifi.processor.AbstractSessionFactoryProcessor
org.apache.nifi.processor.AbstractProcessor
org.apache.nifi.processors.hadoop.AbstractHadoopProcessor
- All Implemented Interfaces:
org.apache.nifi.components.ClassloaderIsolationKeyProvider, org.apache.nifi.components.ConfigurableComponent, org.apache.nifi.processor.Processor
@RequiresInstanceClassLoading(cloneAncestorResources=true)
public abstract class AbstractHadoopProcessor
extends org.apache.nifi.processor.AbstractProcessor
implements org.apache.nifi.components.ClassloaderIsolationKeyProvider
This base class simplifies building processors that interact with HDFS.
As of Apache NiFi 1.5.0, the Relogin Period property is no longer used when configuring a Hadoop processor.
Because of changes to SecurityUtil.loginKerberos(Configuration, String, String), which this class uses to authenticate a principal with Kerberos, Hadoop components no longer attempt relogins explicitly. For details, see the documentation for SecurityUtil.loginKerberos(Configuration, String, String).
-
Nested Class Summary
Nested Classes
protected static class AbstractHadoopProcessor.ValidationResources
-
Field Summary
Fields
static final String ABSOLUTE_HDFS_PATH_ATTRIBUTE
static final org.apache.nifi.components.PropertyDescriptor ADDITIONAL_CLASSPATH_RESOURCES
static final org.apache.nifi.components.PropertyDescriptor COMPRESSION_CODEC
private static final String DENY_LFS_ACCESS
private static final String DENY_LFS_EXPLANATION
static final org.apache.nifi.components.PropertyDescriptor DIRECTORY
private static final HdfsResources EMPTY_HDFS_RESOURCES
static final org.apache.nifi.components.PropertyDescriptor HADOOP_CONFIGURATION_RESOURCES
static final String HADOOP_FILE_URL_ATTRIBUTE
(package private) final AtomicReference<HdfsResources> hdfsResources
static final org.apache.nifi.components.PropertyDescriptor KERBEROS_USER_SERVICE
private static final Pattern LOCAL_FILE_SYSTEM_URI
private static final String NORMALIZE_ERROR_WITH_PROPERTY
private static final String NORMALIZE_ERROR_WITHOUT_PROPERTY
protected List<org.apache.nifi.components.PropertyDescriptor> properties
private static final Object RESOURCES_LOCK
protected static final String TARGET_HDFS_DIR_CREATED_ATTRIBUTE
private final AtomicReference<AbstractHadoopProcessor.ValidationResources> validationResourceHolder
-
Constructor Summary
Constructors
AbstractHadoopProcessor()
-
Method Summary
Methods
final void abstractOnScheduled(org.apache.nifi.processor.ProcessContext context)
    If your subclass also has an @OnScheduled annotated method and you need hdfsResources in that method, then be sure to call super.abstractOnScheduled(context).
final void abstractOnStopped()
protected void checkHdfsUriForTimeout(org.apache.hadoop.conf.Configuration config)
protected Collection<org.apache.nifi.components.ValidationResult> customValidate(org.apache.nifi.components.ValidationContext validationContext)
protected <T extends Throwable> Optional<T> findCause(Throwable t, Class<T> expectedCauseType, Predicate<T> causePredicate)
    Returns an optional with the first throwable in the causal chain that is assignable to the provided cause type and satisfies the provided cause predicate, Optional.empty() otherwise.
getClassloaderIsolationKey(org.apache.nifi.context.PropertyContext context)
protected org.apache.hadoop.io.compress.CompressionCodec getCompressionCodec(org.apache.nifi.processor.ProcessContext context, org.apache.hadoop.conf.Configuration configuration)
    Returns the configured CompressionCodec, or null if none is configured.
getConfigLocations(org.apache.nifi.context.PropertyContext context)
protected org.apache.hadoop.conf.Configuration getConfiguration()
private static org.apache.hadoop.conf.Configuration getConfigurationFromResources(org.apache.hadoop.conf.Configuration config, List<String> locations)
protected org.apache.hadoop.fs.FileSystem getFileSystem()
protected org.apache.hadoop.fs.FileSystem getFileSystem(org.apache.hadoop.conf.Configuration config)
    This exists in order to allow unit tests to override it so that they don't take several minutes waiting for UDP packets to be received.
protected org.apache.hadoop.fs.FileSystem getFileSystemAsUser(org.apache.hadoop.conf.Configuration config, org.apache.hadoop.security.UserGroupInformation ugi)
protected org.apache.hadoop.conf.Configuration getHadoopConfigurationForValidation(List<String> locations)
private KerberosUser getKerberosUser(org.apache.nifi.processor.ProcessContext context)
protected org.apache.hadoop.fs.Path getNormalizedPath(String rawPath)
private org.apache.hadoop.fs.Path getNormalizedPath(String rawPath, Optional<String> propertyName)
protected org.apache.hadoop.fs.Path getNormalizedPath(org.apache.nifi.processor.ProcessContext context, org.apache.nifi.components.PropertyDescriptor property)
protected org.apache.hadoop.fs.Path getNormalizedPath(org.apache.nifi.processor.ProcessContext context, org.apache.nifi.components.PropertyDescriptor property, org.apache.nifi.flowfile.FlowFile flowFile)
static String getPathDifference(org.apache.hadoop.fs.Path root, org.apache.hadoop.fs.Path child)
    Returns the relative path of the child that does not include the filename or the root path.
protected List<org.apache.nifi.components.PropertyDescriptor> getSupportedPropertyDescriptors()
protected org.apache.hadoop.security.UserGroupInformation getUserGroupInformation()
protected boolean handleAuthErrors(Throwable t, org.apache.nifi.processor.ProcessSession session, org.apache.nifi.processor.ProcessContext context, BiConsumer<org.apache.nifi.processor.ProcessSession, org.apache.nifi.processor.ProcessContext> sessionHandler)
protected void init(org.apache.nifi.processor.ProcessorInitializationContext context)
protected boolean isFileSystemAccessDenied(URI fileSystemUri)
(package private) boolean isLocalFileSystemAccessDenied()
void migrateProperties(org.apache.nifi.migration.PropertyConfiguration config)
protected void preProcessConfiguration(org.apache.hadoop.conf.Configuration config, org.apache.nifi.processor.ProcessContext context)
    This method will be called after the Configuration has been created, but before the FileSystem is created, allowing sub-classes to take further action on the Configuration before creating the FileSystem.
(package private) HdfsResources resetHDFSResources(List<String> resourceLocations, org.apache.nifi.processor.ProcessContext context)
protected Collection<org.apache.nifi.components.ValidationResult> validateFileSystem(org.apache.hadoop.conf.Configuration configuration)
Methods inherited from class org.apache.nifi.processor.AbstractProcessor
onTrigger, onTrigger
Methods inherited from class org.apache.nifi.processor.AbstractSessionFactoryProcessor
getControllerServiceLookup, getIdentifier, getLogger, getNodeTypeProvider, getRelationships, initialize, isConfigurationRestored, isScheduled, toString, updateConfiguredRestoredTrue, updateScheduledFalse, updateScheduledTrue
Methods inherited from class org.apache.nifi.components.AbstractConfigurableComponent
equals, getPropertyDescriptor, getPropertyDescriptors, getSupportedDynamicPropertyDescriptor, hashCode, onPropertyModified, validate
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.nifi.components.ConfigurableComponent
getPropertyDescriptor, getPropertyDescriptors, onPropertyModified, validate
Methods inherited from interface org.apache.nifi.processor.Processor
isStateful, migrateRelationships
-
Field Details
-
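DENY_LFS_ACCESS
private static final String DENY_LFS_ACCESS
-
DENY_LFS_EXPLANATION
private static final String DENY_LFS_EXPLANATION
-
LOCAL_FILE_SYSTEM_URI
private static final Pattern LOCAL_FILE_SYSTEM_URI
-
NORMALIZE_ERROR_WITH_PROPERTY
private static final String NORMALIZE_ERROR_WITH_PROPERTY
-
NORMALIZE_ERROR_WITHOUT_PROPERTY
private static final String NORMALIZE_ERROR_WITHOUT_PROPERTY
-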
HADOOP_CONFIGURATION_RESOURCES
public static final org.apache.nifi.components.PropertyDescriptor HADOOP_CONFIGURATION_RESOURCES
-
DIRECTORY
public static final org.apache.nifi.components.PropertyDescriptor DIRECTORY
-
COMPRESSION_CODEC
public static final org.apache.nifi.components.PropertyDescriptor COMPRESSION_CODEC
-
ADDITIONAL_CLASSPATH_RESOURCES
public static final org.apache.nifi.components.PropertyDescriptor ADDITIONAL_CLASSPATH_RESOURCES
-
KERBEROS_USER_SERVICE
public static final org.apache.nifi.components.PropertyDescriptor KERBEROS_USER_SERVICE
-
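ABSOLUTE_HDFS_PATH_ATTRIBUTE
public static final String ABSOLUTE_HDFS_PATH_ATTRIBUTE
-
HADOOP_FILE_URL_ATTRIBUTE
public static final String HADOOP_FILE_URL_ATTRIBUTE
-
TARGET_HDFS_DIR_CREATED_ATTRIBUTE
protected static final String TARGET_HDFS_DIR_CREATED_ATTRIBUTE
-
RESOURCES_LOCK
private static final Object RESOURCES_LOCK
-
EMPTY_HDFS_RESOURCES
private static final HdfsResources EMPTY_HDFS_RESOURCES
-
properties
protected List<org.apache.nifi.components.PropertyDescriptor> properties
-
hdfsResources
final AtomicReference<HdfsResources> hdfsResources
-
validationResourceHolder
private final AtomicReference<AbstractHadoopProcessor.ValidationResources> validationResourceHolder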
-
-
Constructor Details
-
AbstractHadoopProcessor
public AbstractHadoopProcessor()
-
-
Method Details
-
init
protected void init(org.apache.nifi.processor.ProcessorInitializationContext context)
- Overrides:
init in class org.apache.nifi.processor.AbstractSessionFactoryProcessor
-
migrateProperties
public void migrateProperties(org.apache.nifi.migration.PropertyConfiguration config)
- Specified by:
migrateProperties in interface org.apache.nifi.processor.Processor
-
getSupportedPropertyDescriptors
protected List<org.apache.nifi.components.PropertyDescriptor> getSupportedPropertyDescriptors()
- Overrides:
getSupportedPropertyDescriptors in class org.apache.nifi.components.AbstractConfigurableComponent
-
getClassloaderIsolationKey
- Specified by:
getClassloaderIsolationKey in interface org.apache.nifi.components.ClassloaderIsolationKeyProvider
-
customValidate
protected Collection<org.apache.nifi.components.ValidationResult> customValidate(org.apache.nifi.components.ValidationContext validationContext)
- Overrides:
customValidate in class org.apache.nifi.components.AbstractConfigurableComponent
-
validateFileSystem
protected Collection<org.apache.nifi.components.ValidationResult> validateFileSystem(org.apache.hadoop.conf.Configuration configuration)
-
getHadoopConfigurationForValidation
protected org.apache.hadoop.conf.Configuration getHadoopConfigurationForValidation(List<String> locations) throws IOException
- Throws:
IOException
-
abstractOnScheduled
@OnScheduled public final void abstractOnScheduled(org.apache.nifi.processor.ProcessContext context) throws IOException
If your subclass also has an @OnScheduled annotated method that needs hdfsResources, call super.abstractOnScheduled(context) in that method before using them.
- Throws:
IOException
-
getConfigLocations
-
abstractOnStopped
@OnStopped public final void abstractOnStopped()
-
getConfigurationFromResources
private static org.apache.hadoop.conf.Configuration getConfigurationFromResources(org.apache.hadoop.conf.Configuration config, List<String> locations) throws IOException
- Throws:
IOException
-
resetHDFSResources
HdfsResources resetHDFSResources(List<String> resourceLocations, org.apache.nifi.processor.ProcessContext context) throws IOException
- Throws:
IOException
-
getKerberosUser
private KerberosUser getKerberosUser(org.apache.nifi.processor.ProcessContext context)
-
preProcessConfiguration
protected void preProcessConfiguration(org.apache.hadoop.conf.Configuration config, org.apache.nifi.processor.ProcessContext context)
Called after the Configuration has been created but before the FileSystem is created, so that subclasses can take further action on the Configuration before the FileSystem is created from it.
- Parameters:
config - the Configuration that will be used to create the FileSystem
context - the context that can be used to retrieve additional values
-
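The ordering described for preProcessConfiguration can be illustrated with a small, self-contained sketch. It does not use the NiFi or Hadoop classes: BaseSketch, CustomSketch, setUp and the events list are hypothetical stand-ins, a Map plays the role of the Configuration, and a String plays the role of the FileSystem. Only the call order (configuration created, hook invoked, file system created) is taken from the description above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PreProcessHookSketch {

    // Hypothetical stand-in for the base class, not the real AbstractHadoopProcessor.
    static class BaseSketch {
        final List<String> events = new ArrayList<>();

        // Creates the "configuration", gives subclasses a chance to adjust it,
        // and only then derives the "file system" from it.
        final String setUp() {
            Map<String, String> config = new HashMap<>();
            events.add("configuration created");
            preProcessConfiguration(config);
            events.add("file system created");
            return "fs:" + config.getOrDefault("fs.defaultFS", "unset");
        }

        // Hook: does nothing unless a subclass overrides it.
        protected void preProcessConfiguration(Map<String, String> config) {
        }
    }

    // Hypothetical subclass that adjusts the configuration in the hook.
    static class CustomSketch extends BaseSketch {
        @Override
        protected void preProcessConfiguration(Map<String, String> config) {
            events.add("hook ran");
            config.put("fs.defaultFS", "hdfs://example:8020");
        }
    }

    public static void main(String[] args) {
        CustomSketch sketch = new CustomSketch();
        System.out.println(sketch.setUp());
        System.out.println(sketch.events);
    }
}
```

Because the hook runs before the file system stand-in is built, the value set in CustomSketch is already visible when setUp returns.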
getFileSystem
protected org.apache.hadoop.fs.FileSystem getFileSystem(org.apache.hadoop.conf.Configuration config) throws IOException
This method exists so that unit tests can override it and avoid waiting several minutes for UDP packets to be received.
- Parameters:
config - the configuration to use
- Returns:
the FileSystem that is created for the given Configuration
- Throws:
IOException - if unable to create the FileSystem
-
getFileSystemAsUser
protected org.apache.hadoop.fs.FileSystem getFileSystemAsUser(org.apache.hadoop.conf.Configuration config, org.apache.hadoop.security.UserGroupInformation ugi) throws IOException
- Throws:
IOException
-
checkHdfsUriForTimeout
protected void checkHdfsUriForTimeout(org.apache.hadoop.conf.Configuration config) throws IOException
- Throws:
IOException
-
getCompressionCodec
protected org.apache.hadoop.io.compress.CompressionCodec getCompressionCodec(org.apache.nifi.processor.ProcessContext context, org.apache.hadoop.conf.Configuration configuration)
Returns the configured CompressionCodec, or null if none is configured.
- Parameters:
context - the ProcessContext
configuration - the Hadoop Configuration
- Returns:
CompressionCodec or null
-
getPathDifference
public static String getPathDifference(org.apache.hadoop.fs.Path root, org.apache.hadoop.fs.Path child)
Returns the path of the child relative to the root, excluding both the root path and the child's file name.
- Parameters:
root - the path to relativize from
child - the path to relativize
- Returns:
the relative path
-
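As a rough illustration of what getPathDifference describes, the sketch below computes the same kind of result with java.nio.file.Path instead of org.apache.hadoop.fs.Path. This is an analogy under stated assumptions, not the method's implementation: pathDifference is a hypothetical helper, and returning an empty string when the child sits directly under the root (or outside it) is a choice made for this sketch.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class PathDifferenceSketch {

    // Returns the directories between root and child's file name, joined with "/".
    // Assumption of this sketch: an empty string is returned when there are none.
    static String pathDifference(Path root, Path child) {
        Path parent = child.getParent();
        if (parent == null || !parent.startsWith(root)) {
            return "";
        }
        Path relative = root.relativize(parent);
        List<String> names = new ArrayList<>();
        for (Path element : relative) {
            names.add(element.toString());
        }
        // Join explicitly so the separator does not depend on the platform.
        return String.join("/", names);
    }

    public static void main(String[] args) {
        Path root = Paths.get("/data/in");
        Path child = Paths.get("/data/in/2024/01/part-0.txt");
        System.out.println(pathDifference(root, child)); // prints 2024/01
    }
}
```

Neither the root (/data/in) nor the file name (part-0.txt) appears in the result, which matches the description above.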
getConfiguration
protected org.apache.hadoop.conf.Configuration getConfiguration()
-
getFileSystem
protected org.apache.hadoop.fs.FileSystem getFileSystem()
-
getUserGroupInformation
protected org.apache.hadoop.security.UserGroupInformation getUserGroupInformation()
-
isLocalFileSystemAccessDenied
boolean isLocalFileSystemAccessDenied()
-
isFileSystemAccessDenied
protected boolean isFileSystemAccessDenied(URI fileSystemUri)
-
getNormalizedPath
protected org.apache.hadoop.fs.Path getNormalizedPath(org.apache.nifi.processor.ProcessContext context, org.apache.nifi.components.PropertyDescriptor property)
-
getNormalizedPath
protected org.apache.hadoop.fs.Path getNormalizedPath(String rawPath)
-
getNormalizedPath
protected org.apache.hadoop.fs.Path getNormalizedPath(org.apache.nifi.processor.ProcessContext context, org.apache.nifi.components.PropertyDescriptor property, org.apache.nifi.flowfile.FlowFile flowFile)
-
getNormalizedPath
private org.apache.hadoop.fs.Path getNormalizedPath(String rawPath, Optional<String> propertyName)
-
findCause
protected <T extends Throwable> Optional<T> findCause(Throwable t, Class<T> expectedCauseType, Predicate<T> causePredicate)
Returns an Optional containing the first throwable in the causal chain that is assignable to the provided cause type and satisfies the provided cause predicate, or Optional.empty() otherwise.
- Parameters:
t - the throwable to inspect for the cause
- Returns:
the matching cause, if any
-
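The lookup that findCause describes can be sketched with the JDK alone. The following is a minimal illustration, not the class's actual implementation: FindCauseSketch is a hypothetical name, the chain is assumed to start with t itself, and the hop limit is an assumption added here to guard against cyclic cause chains.

```java
import java.io.IOException;
import java.util.Optional;
import java.util.function.Predicate;

final class FindCauseSketch {

    // Walks t and its causes in order and returns the first throwable that is an
    // instance of expectedCauseType and also satisfies causePredicate.
    static <T extends Throwable> Optional<T> findCause(
            Throwable t, Class<T> expectedCauseType, Predicate<T> causePredicate) {
        int hops = 0;
        // The hop limit is an assumption of this sketch to avoid looping forever
        // on a cyclic cause chain.
        for (Throwable current = t; current != null && hops < 64; current = current.getCause(), hops++) {
            if (expectedCauseType.isInstance(current)) {
                T candidate = expectedCauseType.cast(current);
                if (causePredicate.test(candidate)) {
                    return Optional.of(candidate);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Throwable wrapped = new RuntimeException("outer", new IOException("login failed"));
        Optional<IOException> match =
                findCause(wrapped, IOException.class, e -> e.getMessage().contains("login"));
        System.out.println(match.isPresent()); // prints true
    }
}
```

Passing both a type and a predicate lets a caller tell apart exceptions of the same class, for example by inspecting the message or an error code.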
handleAuthErrors
protected boolean handleAuthErrors(Throwable t, org.apache.nifi.processor.ProcessSession session, org.apache.nifi.processor.ProcessContext context, BiConsumer<org.apache.nifi.processor.ProcessSession, org.apache.nifi.processor.ProcessContext> sessionHandler)
-