Class PDSProductCrawler
- java.lang.Object
  - gov.nasa.jpl.oodt.cas.crawl.config.ProductCrawlerBean
    - gov.nasa.jpl.oodt.cas.crawl.ProductCrawler
      - gov.nasa.pds.harvest.search.crawler.PDSProductCrawler
- All Implemented Interfaces:
gov.nasa.jpl.oodt.cas.commons.spring.SpringSetIdInjectionType, gov.nasa.jpl.oodt.cas.filemgr.metadata.CoreMetKeys
- Direct Known Subclasses:
CollectionCrawler, PDS3ProductCrawler
public class PDSProductCrawler extends gov.nasa.jpl.oodt.cas.crawl.ProductCrawler
Class that extends the Cas-Crawler to crawl a directory or PDS inventory file and register products to the PDS Registry Service.
- Author:
mcayanan
-
-
Field Summary
Fields
| Modifier and Type | Field | Description |
|---|---|---|
| protected boolean | inPersistanceMode | Flag for crawler persistence. |
| protected Map<File,Long> | touchedFiles | A map of files that were touched during crawler persistence. |
-
Constructor Summary
Constructors
| Constructor | Description |
|---|---|
| PDSProductCrawler() | Default constructor. |
| PDSProductCrawler(Pds4MetExtractorConfig extractorConfig) | Constructor. |
-
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| void | addAction(gov.nasa.jpl.oodt.cas.crawl.action.CrawlerAction action) | Adds a crawler action. |
| void | addActions(List<gov.nasa.jpl.oodt.cas.crawl.action.CrawlerAction> actions) | Adds a list of crawler actions. |
| protected void | addKnownMetadata(File product, gov.nasa.jpl.oodt.cas.metadata.Metadata productMetadata) | Method not implemented at the moment. |
| void | crawl(File dir) | Crawls the given directory. |
| List<gov.nasa.jpl.oodt.cas.crawl.action.CrawlerAction> | getActions() | Gets a list of crawler actions defined for the crawler. |
| protected gov.nasa.jpl.oodt.cas.metadata.Metadata | getMetadataForProduct(File product) | Extracts metadata from the given product. |
| Pds4MetExtractorConfig | getMetExtractorConfig() | Get the MetExtractor configuration object. |
| protected boolean | passesPreconditions(File product) | Determines whether the supplied file passes the necessary pre-conditions for the file to be registered. |
| void | setCounter(SearchDocState searchDocState) | |
| void | setDirectoryFilter(DirectoryFilter filter) | Sets the directory filter for the crawler. |
| void | setFileFilter(FileFilter filter) | Sets the file filter for the crawler. |
| void | setInPersistanceMode(boolean value) | |
| void | setMetExtractorConfig(Pds4MetExtractorConfig config) | |
| void | setSearchUrl(String url) | Sets the Search Service URL location. |
-
Methods inherited from class gov.nasa.jpl.oodt.cas.crawl.ProductCrawler
clearIngestStatus, crawl, getIngestStatus, handleFile, setActionRepo
-
Methods inherited from class gov.nasa.jpl.oodt.cas.crawl.config.ProductCrawlerBean
addRequiredMetadata, getActionIds, getApplicationContext, getDaemonPort, getDaemonWait, getFilemgrUrl, getGlobalMetadata, getId, getIngester, getProductPath, getRequiredMetadata, isCrawlForDirs, isNoRecur, isSkipIngest, setActionIds, setApplicationContext, setCrawlForDirs, setDaemonPort, setDaemonWait, setFilemgrUrl, setGlobalMetadata, setId, setIngester, setNoRecur, setProductPath, setRequiredMetadata, setSkipIngest
-
-
-
-
Constructor Detail
-
PDSProductCrawler
public PDSProductCrawler()
Default constructor.
-
PDSProductCrawler
public PDSProductCrawler(Pds4MetExtractorConfig extractorConfig)
Constructor.
- Parameters:
extractorConfig - A configuration class that tells the crawler what data product types to look for and what metadata to extract.
-
-
Method Detail
-
getMetExtractorConfig
public Pds4MetExtractorConfig getMetExtractorConfig()
Gets the MetExtractor configuration object.
- Returns:
The Pds4MetExtractorConfig object.
-
setMetExtractorConfig
public void setMetExtractorConfig(Pds4MetExtractorConfig config)
-
setInPersistanceMode
public void setInPersistanceMode(boolean value)
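The source does not describe how persistence mode uses the touchedFiles map. As a minimal sketch only, assuming the Long value holds a file's last-modified timestamp and that a repeated crawl should skip files that have not changed, the bookkeeping could look like the following. The class TouchedFilesSketch and the method shouldProcess are hypothetical names for illustration and are not part of the Harvest API.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in, not the actual PDSProductCrawler logic. */
public class TouchedFilesSketch {
    // Mirrors the documented field type Map<File,Long>.
    // Assumption: the value is the last-modified time seen on a previous pass.
    private final Map<File, Long> touchedFiles = new HashMap<>();

    /**
     * Returns true if the file is new or its timestamp differs from the one
     * recorded earlier, and records the new timestamp. Returns false if the
     * file was already seen with the same timestamp.
     */
    public boolean shouldProcess(File file, long lastModified) {
        Long previous = touchedFiles.get(file);
        if (previous != null && previous.longValue() == lastModified) {
            return false; // unchanged since the previous pass
        }
        touchedFiles.put(file, lastModified);
        return true;
    }
}
```

In a real crawl the timestamp would typically come from file.lastModified(); it is passed in explicitly here so the sketch does not depend on the file system.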
-
setFileFilter
public void setFileFilter(FileFilter filter)
Sets the file filter for the crawler.
- Parameters:
filter - A File Filter defined in the Harvest policy config.
-
setDirectoryFilter
public void setDirectoryFilter(DirectoryFilter filter)
Sets the directory filter for the crawler.
- Parameters:
filter - A Directory Filter defined in the Harvest policy config.
-
addKnownMetadata
protected void addKnownMetadata(File product, gov.nasa.jpl.oodt.cas.metadata.Metadata productMetadata)
Method not implemented at the moment.
- Overrides:
addKnownMetadata in class gov.nasa.jpl.oodt.cas.crawl.ProductCrawler
- Parameters:
product - The product file.
productMetadata - The metadata associated with the product.
-
crawl
public void crawl(File dir)
Crawls the given directory.
- Overrides:
crawl in class gov.nasa.jpl.oodt.cas.crawl.ProductCrawler
- Parameters:
dir - The directory to crawl.
-
addAction
public void addAction(gov.nasa.jpl.oodt.cas.crawl.action.CrawlerAction action)
Adds a crawler action.
- Parameters:
action - A crawler action.
-
addActions
public void addActions(List<gov.nasa.jpl.oodt.cas.crawl.action.CrawlerAction> actions)
Adds a list of crawler actions.
- Parameters:
actions - A list of crawler actions.
-
getActions
public List<gov.nasa.jpl.oodt.cas.crawl.action.CrawlerAction> getActions()
Gets a list of crawler actions defined for the crawler.
- Returns:
A list of crawler actions that will be performed during crawling.
-
getMetadataForProduct
protected gov.nasa.jpl.oodt.cas.metadata.Metadata getMetadataForProduct(File product)
Extracts metadata from the given product.
- Specified by:
getMetadataForProduct in class gov.nasa.jpl.oodt.cas.crawl.ProductCrawler
- Parameters:
product - A PDS file.
- Returns:
A Metadata object, which holds metadata from the product.
-
passesPreconditions
protected boolean passesPreconditions(File product)
Determines whether the supplied file passes the necessary pre-conditions for the file to be registered.
- Specified by:
passesPreconditions in class gov.nasa.jpl.oodt.cas.crawl.ProductCrawler
- Parameters:
product - A file.
- Returns:
true if the file passes.
-
setSearchUrl
public void setSearchUrl(String url) throws MalformedURLException
Sets the Search Service URL location.
- Parameters:
url - A URL of the Search Service location.
- Throws:
MalformedURLException - If the given URL is malformed.
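
Because setSearchUrl declares MalformedURLException, a caller may want to check a URL string before handing it to the crawler. The sketch below uses only java.net.URL from the standard library. It assumes, without confirmation from the source, that setSearchUrl rejects the same strings that java.net.URL rejects. The class SearchUrlCheck, the method isWellFormed, and the example URLs are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URL;

/** Hypothetical helper, not part of the Harvest API. */
public class SearchUrlCheck {

    /** Returns true if the string parses as a URL, false otherwise. */
    @SuppressWarnings("deprecation") // URL(String) is deprecated in newer JDKs but still available
    public static boolean isWellFormed(String candidate) {
        try {
            new URL(candidate); // parsing only, no network access
            return true;
        } catch (MalformedURLException e) {
            return false; // e.g. no protocol, or an unknown protocol
        }
    }

    public static void main(String[] args) {
        // Example values are illustrative, not real service locations.
        System.out.println(isWellFormed("http://localhost:8080/search-service")); // prints true
        System.out.println(isWellFormed("search-service/without/scheme"));        // prints false
    }
}
```

A string without a protocol, such as the second example, fails to parse, which is the kind of input the documented exception most plausibly covers.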
-
setCounter
public void setCounter(SearchDocState searchDocState)
-
-