Class FileCrawler


  • public class FileCrawler
    extends Crawler
    Class that crawls a given file URL.
    Author:
    mcayanan
    • Constructor Detail

      • FileCrawler

        public FileCrawler()
    • Method Detail

      • crawl

        public List<Target> crawl(URL fileUrl,
                                  boolean getDirectories,
                                  org.apache.commons.io.filefilter.IOFileFilter fileFilter)
                           throws IOException
        Crawls a given directory URL.
        Specified by:
        crawl in class Crawler
        Parameters:
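        fileUrl - File URL.
        getDirectories - If true, sub-directories are also returned.
        fileFilter - Filter used to select the files to return.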
        Returns:
        A list of files and sub-directories (sub-directories are included only if found and if getDirectories is 'true').
        Throws:
        IOException
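
The filter-based crawl described above can be sketched as follows. This is a minimal illustration, not the actual FileCrawler implementation: the class name FilterCrawlSketch is hypothetical, java.io.FileFilter stands in for org.apache.commons.io.filefilter.IOFileFilter (which extends it), plain File objects stand in for Target, and it assumes that only one directory level is listed and that directories bypass the file filter.

```java
import java.io.File;
import java.io.FileFilter;
import java.io.IOException;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual FileCrawler implementation.
class FilterCrawlSketch {

    // Lists one directory level: files accepted by the filter, plus
    // sub-directories when getDirectories is true (assumed behavior).
    static List<File> crawl(URL fileUrl, boolean getDirectories, FileFilter fileFilter)
            throws IOException {
        File dir;
        try {
            // Convert the file: URL into a local File.
            dir = new File(fileUrl.toURI());
        } catch (URISyntaxException | IllegalArgumentException e) {
            throw new IOException("Not a usable file URL: " + fileUrl, e);
        }
        File[] children = dir.listFiles();
        if (children == null) {
            // listFiles() returns null when the path is not a readable directory.
            throw new IOException("Cannot list directory: " + dir);
        }
        List<File> results = new ArrayList<>();
        for (File child : children) {
            if (child.isDirectory()) {
                if (getDirectories) {
                    results.add(child);
                }
            } else if (fileFilter.accept(child)) {
                results.add(child);
            }
        }
        return results;
    }
}
```

With this sketch, a call such as FilterCrawlSketch.crawl(url, true, f -> f.getName().endsWith(".xml")) would return the matching files together with any sub-directories of the directory that url points to.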
      • crawl

        public List<Target> crawl(URL fileUrl,
                                  String[] extensions,
                                  boolean getDirectories,
                                  boolean ignoreCaseFlag)
                           throws IOException
        Crawls a given directory URL.
        Parameters:
        fileUrl - File URL.
        extensions - A list of file extensions used to match files.
        getDirectories - If true, sub-directories are also returned.
        nameToken - The substring to search for in the file names. Note that the search is done in lowercase if ignoreCaseFlag is true.
        ignoreCaseFlag - If true, case is ignored when comparing a file name with the nameToken.
        Returns:
        A list of files and sub-directories (sub-directories are included only if found and if getDirectories is 'true').
        Throws:
        IOException
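
One way the extensions and ignoreCaseFlag parameters could interact is sketched below. This is an illustration only, not FileCrawler's actual logic: the class ExtensionMatchSketch and its methods are hypothetical, and it assumes that extensions are given without a leading dot (for example "xml") and that ignoring case means lowercasing both sides before comparing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; the matching rules here are assumptions.
class ExtensionMatchSketch {

    // Returns true when fileName ends with "." followed by one of the extensions.
    static boolean matches(String fileName, String[] extensions, boolean ignoreCase) {
        String name = ignoreCase ? fileName.toLowerCase(Locale.ROOT) : fileName;
        for (String ext : extensions) {
            String suffix = "." + (ignoreCase ? ext.toLowerCase(Locale.ROOT) : ext);
            if (name.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }

    // Keeps only the file names that match one of the extensions.
    static List<String> select(List<String> fileNames, String[] extensions, boolean ignoreCase) {
        List<String> selected = new ArrayList<>();
        for (String fileName : fileNames) {
            if (matches(fileName, extensions, ignoreCase)) {
                selected.add(fileName);
            }
        }
        return selected;
    }
}
```

Under these assumptions, a file named LABEL.XML matches the extension "xml" only when the ignore-case flag is true.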
      • crawl

        public List<Target> crawl(URL fileUrl,
                                  String[] extensions,
                                  boolean getDirectories)
                           throws IOException
        Crawls a given directory URL.
        Overrides:
        crawl in class Crawler
        Parameters:
        fileUrl - File URL.
        extensions - A list of file extensions used to match files.
        getDirectories - If true, sub-directories are also returned.
        Returns:
        A list of files and sub-directories (sub-directories are included only if found and if getDirectories is 'true').
        Throws:
        IOException