Class OntologyClassNameExtractor


  • public class OntologyClassNameExtractor
    extends Object
    When OBO ontologies are loaded, the message "[Fatal Error] :1:1: Content is not allowed in prolog." may appear on STDERR. It is only output leaked by one parser before the OWL API tries the next parser and can be ignored. See https://github.com/owlcs/owlapi/issues/550
    Author:
    faessler
    • Constructor Detail

      • OntologyClassNameExtractor

        public OntologyClassNameExtractor()
        Constructs an OntologyClassNameExtractor with a fixed thread pool of size 4 and no reasoning.
      • OntologyClassNameExtractor

        public OntologyClassNameExtractor​(ExecutorService executor,
                                          boolean applyReasoning)
        Constructs an OntologyClassNameExtractor with the given ExecutorService for multithreading and prepares a HermiT OWLReasonerFactory if applyReasoning is set to true.
        Parameters:
        executor - An ExecutorService for parallel name extraction in case of multiple input ontologies.
        applyReasoning - If set to true, a reasoner will be used to determine super- and subclass relationships. This is employed to find the parent concepts of ontology classes. Should be switched off if the parent information is not required by the application since reasoning takes substantial time and space.
      • OntologyClassNameExtractor

        public OntologyClassNameExtractor​(ExecutorService executor,
                                          boolean applyReasoning,
                                          boolean filterDeprecated)
        Constructs an OntologyClassNameExtractor with the given ExecutorService for multithreading and prepares a HermiT OWLReasonerFactory if applyReasoning is set to true. The filterDeprecated flag controls whether classes marked as deprecated are removed.
        Parameters:
        executor - An ExecutorService for parallel name extraction in case of multiple input ontologies.
        applyReasoning - If set to true, a reasoner will be used to determine super- and subclass relationships. This is employed to find the parent concepts of ontology classes. Should be switched off if the parent information is not required by the application since reasoning takes substantial time and space.
        filterDeprecated - If set to true, classes marked as deprecated are removed.
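
As a rough illustration of the executor parameter, the following sketch creates a fixed thread pool of size 4 (the size the no-argument constructor is documented to use) with the JDK's Executors factory and runs placeholder tasks on it. The class FixedPoolSketch, the method processAll and the upper-casing task are hypothetical stand-ins for per-ontology work; they are not part of this API. In real use, a pool built this way would presumably be passed to the two- or three-argument constructor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration class; not part of the documented API.
final class FixedPoolSketch {

    // Runs one placeholder task per ontology name on a pool of four threads
    // and returns the number of tasks that were run.
    static int processAll(List<String> ontologyNames) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String name : ontologyNames) {
                // Placeholder work; a real task would extract class names.
                tasks.add(() -> name.toUpperCase());
            }
            // invokeAll blocks until every submitted task has finished.
            List<Future<String>> results = executor.invokeAll(tasks);
            return results.size();
        } catch (InterruptedException e) {
            // Restore the interrupt flag so the caller can react to it.
            Thread.currentThread().interrupt();
            return 0;
        } finally {
            // Release the worker threads once the work is done.
            executor.shutdown();
        }
    }
}
```

Creating the pool outside the extractor lets an application choose the degree of parallelism, or reuse a pool it already has, instead of relying on the default of four threads.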
    • Method Detail

      • run

        public int run​(File input,
                       File submissionsDirectory,
                       File output)
                throws InterruptedException,
                       ExecutionException
        Starts the extraction of ontology class names of ontologies in the input directory. The results are written in JSON format into the output directory.
        Parameters:
        input - A directory of ontologies or a single ontology file.
        submissionsDirectory - The directory that holds the submission information about each ontology, as downloaded via OntologyDownloader. It is used to determine the correct properties for preferred name, synonyms and description for classes.
        output - The directory in which to store the extracted class names.
        Returns:
        The number of processed ontologies.
        Throws:
        IOException - If reading or writing goes wrong.
        org.semanticweb.owlapi.model.OWLOntologyCreationException - If an ontology cannot be loaded.
        InterruptedException - Ontology name extraction is done using an ExecutorService. This exception may be thrown if a worker thread is interrupted.
        ExecutionException - If the execution of a worker thread fails.
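
The two declared exceptions are the ones that Future.get raises when work is handed to an ExecutorService. The sketch below shows one common way of handling them, using only JDK classes. The class WorkerFailureSketch and its method outcomeOf are hypothetical, and the sketch does not claim to mirror how run is implemented internally.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration class; not part of the documented API.
final class WorkerFailureSketch {

    // Runs the task on a single worker thread and reports how it ended.
    static String outcomeOf(Callable<Integer> task) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Integer> future = executor.submit(task);
            return "ok: " + future.get();
        } catch (ExecutionException e) {
            // The worker threw; the original exception is available as the cause.
            return "failed: " + e.getCause().getMessage();
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can react to it.
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Inspecting getCause() is usually the quickest way to find out which worker failed and why, since the ExecutionException itself only wraps the original error.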
      • run

        public int run​(File input,
                       File submissionsDirectory,
                       File outputDir,
                       Set<String> ontologiesToExtract)
                throws InterruptedException,
                       ExecutionException
        Starts the extraction of ontology class names of ontologies in the input directory. The results are written in JSON format into the outputDir directory. The output contains the preferred name or label of ontology classes, their synonyms, their definition or description and their super classes / parents. Classes marked as obsolete are omitted. The ontologiesToExtract set is used to restrict name extraction only to those ontologies where the ontology file name without extension (i.e. the filename up to the first dot) is contained in the set. For BioPortal ontologies that have been downloaded using the OntologyDownloader, this is just the acronym.
        Parameters:
        input - A directory of ontologies or a single ontology file.
        submissionsDirectory - The directory that holds the submission information about each ontology, as downloaded via OntologyDownloader. It is used to determine the correct properties for preferred name, synonyms and description for classes.
        outputDir - The directory in which to store the extracted class names.
        ontologiesToExtract - A set of ontology identifiers, derived from the ontologies' file names, for which name extraction should be performed. An empty or null set means that all ontologies will be processed.
        Returns:
        The number of processed ontologies.
        Throws:
        IOException - If reading or writing goes wrong.
        org.semanticweb.owlapi.model.OWLOntologyCreationException - If an ontology cannot be loaded.
        InterruptedException - Ontology name extraction is done using an ExecutorService. This exception may be thrown if a worker thread is interrupted.
        ExecutionException - If the execution of a worker thread fails.
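
The selection rule for ontologiesToExtract described above (file name up to the first dot; a null or empty set selects everything) can be sketched as follows. The class OntologyFilterSketch and its methods identifierOf and select are hypothetical names chosen for illustration; the actual implementation may differ in its details.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; not part of the documented API.
final class OntologyFilterSketch {

    // Returns the file name up to (not including) the first dot,
    // e.g. "GO.obo.gz" yields "GO". Names without a dot are returned unchanged.
    static String identifierOf(File ontologyFile) {
        String name = ontologyFile.getName();
        int firstDot = name.indexOf('.');
        return firstDot < 0 ? name : name.substring(0, firstDot);
    }

    // Keeps only the files whose identifier is in the set.
    // A null or empty set selects every file.
    static List<File> select(List<File> files, Set<String> ontologiesToExtract) {
        boolean selectAll = ontologiesToExtract == null || ontologiesToExtract.isEmpty();
        List<File> selected = new ArrayList<>();
        for (File file : files) {
            if (selectAll || ontologiesToExtract.contains(identifierOf(file))) {
                selected.add(file);
            }
        }
        return selected;
    }
}
```

Because the identifier stops at the first dot, a compressed file such as GO.obo.gz and an uncompressed GO.obo map to the same identifier.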
      • shutDown

        public void shutDown()