Class PDSProductCrawler
- java.lang.Object
  - gov.nasa.jpl.oodt.cas.crawl.config.ProductCrawlerBean
    - gov.nasa.jpl.oodt.cas.crawl.ProductCrawler
      - gov.nasa.pds.harvest.search.crawler.PDSProductCrawler
- All Implemented Interfaces: gov.nasa.jpl.oodt.cas.commons.spring.SpringSetIdInjectionType, gov.nasa.jpl.oodt.cas.filemgr.metadata.CoreMetKeys
- Direct Known Subclasses: CollectionCrawler, PDS3ProductCrawler
public class PDSProductCrawler extends gov.nasa.jpl.oodt.cas.crawl.ProductCrawler
Class that extends the Cas-Crawler to crawl a directory or PDS inventory file and register products to the PDS Registry Service.
- Author: mcayanan
-
Field Summary
| Modifier and Type | Field | Description |
|---|---|---|
| protected boolean | inPersistanceMode | Flag for crawler persistence. |
| protected Map<File,Long> | touchedFiles | A map of files that were touched during crawler persistence. |
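As a rough illustration of how a Map<File,Long> such as touchedFiles could support persistence mode, the sketch below records each file's last-modified timestamp and flags a file for processing only when it is new or its timestamp has changed. This is an assumption about usage, not the crawler's confirmed logic; the class TouchedFileTracker and the method shouldProcess are hypothetical names.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of the Harvest API.
final class TouchedFileTracker {
    // Mirrors the shape of the touchedFiles field: file -> last-modified time seen.
    private final Map<File, Long> touchedFiles = new HashMap<>();

    // Returns true when the file has not been seen before or has changed since
    // the last visit. The current timestamp is recorded either way.
    boolean shouldProcess(File file) {
        long modified = file.lastModified();
        Long previous = touchedFiles.put(file, modified);
        return previous == null || previous != modified;
    }
}
```

Under this assumption, a crawler that runs repeatedly over the same directory would skip files whose timestamps are unchanged between passes.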
-
Constructor Summary
| Constructor | Description |
|---|---|
| PDSProductCrawler() | Default constructor. |
| PDSProductCrawler(Pds4MetExtractorConfig extractorConfig) | Constructor. |
-
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| void | addAction(gov.nasa.jpl.oodt.cas.crawl.action.CrawlerAction action) | Adds a crawler action. |
| void | addActions(List<gov.nasa.jpl.oodt.cas.crawl.action.CrawlerAction> actions) | Adds a list of crawler actions. |
| protected void | addKnownMetadata(File product, gov.nasa.jpl.oodt.cas.metadata.Metadata productMetadata) | Method not implemented at the moment. |
| void | crawl(File dir) | Crawls the given directory. |
| List<gov.nasa.jpl.oodt.cas.crawl.action.CrawlerAction> | getActions() | Gets a list of crawler actions defined for the crawler. |
| protected gov.nasa.jpl.oodt.cas.metadata.Metadata | getMetadataForProduct(File product) | Extracts metadata from the given product. |
| Pds4MetExtractorConfig | getMetExtractorConfig() | Gets the MetExtractor configuration object. |
| protected boolean | passesPreconditions(File product) | Determines whether the supplied file passes the necessary pre-conditions for the file to be registered. |
| void | setCounter(SearchDocState searchDocState) | |
| void | setDirectoryFilter(DirectoryFilter filter) | Sets the directory filter for the crawler. |
| void | setFileFilter(FileFilter filter) | Sets the file filter for the crawler. |
| void | setInPersistanceMode(boolean value) | |
| void | setMetExtractorConfig(Pds4MetExtractorConfig config) | |
| void | setSearchUrl(String url) | Sets the Search Service URL location. |
-
Methods inherited from class gov.nasa.jpl.oodt.cas.crawl.ProductCrawler
clearIngestStatus, crawl, getIngestStatus, handleFile, setActionRepo
-
Methods inherited from class gov.nasa.jpl.oodt.cas.crawl.config.ProductCrawlerBean
addRequiredMetadata, getActionIds, getApplicationContext, getDaemonPort, getDaemonWait, getFilemgrUrl, getGlobalMetadata, getId, getIngester, getProductPath, getRequiredMetadata, isCrawlForDirs, isNoRecur, isSkipIngest, setActionIds, setApplicationContext, setCrawlForDirs, setDaemonPort, setDaemonWait, setFilemgrUrl, setGlobalMetadata, setId, setIngester, setNoRecur, setProductPath, setRequiredMetadata, setSkipIngest
-
Constructor Detail
-
PDSProductCrawler
public PDSProductCrawler()
Default constructor.
-
PDSProductCrawler
public PDSProductCrawler(Pds4MetExtractorConfig extractorConfig)
Constructor.
- Parameters:
  extractorConfig - A configuration class that tells the crawler what data product types to look for and what metadata to extract.
-
Method Detail
-
getMetExtractorConfig
public Pds4MetExtractorConfig getMetExtractorConfig()
Gets the MetExtractor configuration object.
- Returns: The Pds4MetExtractorConfig object.
-
setMetExtractorConfig
public void setMetExtractorConfig(Pds4MetExtractorConfig config)
-
setInPersistanceMode
public void setInPersistanceMode(boolean value)
-
setFileFilter
public void setFileFilter(FileFilter filter)
Sets the file filter for the crawler.
- Parameters:
  filter - A File Filter defined in the Harvest policy config.
-
setDirectoryFilter
public void setDirectoryFilter(DirectoryFilter filter)
Sets the directory filter for the crawler.
- Parameters:
  filter - A Directory Filter defined in the Harvest policy config.
-
addKnownMetadata
protected void addKnownMetadata(File product, gov.nasa.jpl.oodt.cas.metadata.Metadata productMetadata)
Method not implemented at the moment.
- Overrides: addKnownMetadata in class gov.nasa.jpl.oodt.cas.crawl.ProductCrawler
- Parameters:
  product - The product file.
  productMetadata - The metadata associated with the product.
-
crawl
public void crawl(File dir)
Crawls the given directory.
- Overrides: crawl in class gov.nasa.jpl.oodt.cas.crawl.ProductCrawler
- Parameters:
  dir - The directory to crawl.
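To make the idea of crawling a directory concrete, here is a minimal sketch of a recursive directory walk that applies a file filter. It covers traversal only and is not the actual crawl(File) implementation; the class DirectoryWalkSketch and the method collectFiles are hypothetical, and the filter used is the standard java.io.FileFilter rather than the Harvest policy FileFilter.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of directory traversal only; not the Harvest implementation.
final class DirectoryWalkSketch {
    // Recursively collects regular files under dir that the filter accepts.
    static List<File> collectFiles(File dir, FileFilter fileFilter) {
        List<File> found = new ArrayList<>();
        File[] entries = dir.listFiles();
        if (entries == null) {
            return found; // dir is not a directory, or an I/O error occurred
        }
        Arrays.sort(entries); // visit entries in a deterministic order
        for (File entry : entries) {
            if (entry.isDirectory()) {
                found.addAll(collectFiles(entry, fileFilter));
            } else if (fileFilter.accept(entry)) {
                found.add(entry);
            }
        }
        return found;
    }
}
```

A call such as collectFiles(dir, f -> f.getName().endsWith(".xml")) would gather only XML files, which is one plausible way a label-driven crawl might narrow its candidates.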
-
addAction
public void addAction(gov.nasa.jpl.oodt.cas.crawl.action.CrawlerAction action)
Adds a crawler action.
- Parameters:
  action - A crawler action.
-
addActions
public void addActions(List<gov.nasa.jpl.oodt.cas.crawl.action.CrawlerAction> actions)
Adds a list of crawler actions.
- Parameters:
  actions - A list of crawler actions.
-
getActions
public List<gov.nasa.jpl.oodt.cas.crawl.action.CrawlerAction> getActions()
Gets a list of crawler actions defined for the crawler.
- Returns: A list of crawler actions that will be performed during crawling.
-
getMetadataForProduct
protected gov.nasa.jpl.oodt.cas.metadata.Metadata getMetadataForProduct(File product)
Extracts metadata from the given product.
- Specified by: getMetadataForProduct in class gov.nasa.jpl.oodt.cas.crawl.ProductCrawler
- Parameters:
  product - A PDS file.
- Returns: A Metadata object, which holds metadata from the product.
-
passesPreconditions
protected boolean passesPreconditions(File product)
Determines whether the supplied file passes the necessary pre-conditions for the file to be registered.
- Specified by: passesPreconditions in class gov.nasa.jpl.oodt.cas.crawl.ProductCrawler
- Parameters:
  product - A file.
- Returns: true if the file passes.
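The page does not say which pre-conditions are checked. As a purely illustrative sketch, a check of this kind might require that the file exists as a regular file, is readable, and is not empty. The class PreconditionSketch and the specific conditions below are assumptions, not the crawler's documented behavior.

```java
import java.io.File;

// Hypothetical sketch; the real pre-conditions are not documented on this page.
final class PreconditionSketch {
    // Returns true only for a readable, non-empty regular file.
    static boolean passesPreconditions(File product) {
        return product != null
                && product.isFile()      // exists and is not a directory
                && product.canRead()     // readable by the current process
                && product.length() > 0; // has some content to parse
    }
}
```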
-
setSearchUrl
public void setSearchUrl(String url) throws MalformedURLException
Sets the Search Service URL location.
- Parameters:
  url - A URL of the Search Service location.
- Throws: MalformedURLException - If the given URL is malformed.
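The signature suggests the string is parsed into a java.net.URL, which is where a MalformedURLException would typically originate. The sketch below shows that pattern in isolation; SearchUrlSketch and getSearchUrl are hypothetical, and the real method may do more than parse the string.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical sketch; shows only the parse-and-store pattern.
final class SearchUrlSketch {
    private URL searchUrl;

    // Parses the string eagerly so that a bad value fails at configuration time.
    // Note: the URL(String) constructor is deprecated in recent JDKs but still works.
    void setSearchUrl(String url) throws MalformedURLException {
        this.searchUrl = new URL(url);
    }

    URL getSearchUrl() {
        return searchUrl;
    }
}
```

With this pattern, a caller passing a string without a protocol (for example "not a url") gets the exception immediately, while a value like "http://localhost:8080/search-service" (an illustrative address, not a documented default) is accepted.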
-
setCounter
public void setCounter(SearchDocState searchDocState)