Class TransformerDisambiguationDataUtils
- java.lang.Object
  - de.julielab.genemapper.classification.TransformerDisambiguationDataUtils
public class TransformerDisambiguationDataUtils extends Object
-
-
Field Summary
Fields (Modifier and Type, Field)
static boolean ADD_DESC
static boolean ADD_GENERIF
static boolean ADD_INTERACTIONS
static boolean ADD_NAME_TYPES
static boolean ADD_SUMMARY
static de.julielab.geneexpbase.configuration.Parameters CANDIDATE_SETTER_PARAMS
static boolean EXCLUDE_FP_GM
static int MAX_DOC_CONTEXT_SIZE
static boolean NORMALIZE_CONTEXT_GENES
static boolean ONLY_APPROX_MATCHES
static boolean ONLY_EXACT_MATCHES
static boolean USE_GOLD_TAX_FOR_CANDIDATE_RETRIEVAL
static boolean USE_ORIGINAL_QUERY_NAMES
static int VERSION
-
Constructor Summary
Constructors
TransformerDisambiguationDataUtils()
-
Method Summary
Methods (Modifier and Type, Method, Description)
static void addDocumentLevelGeneAnnotations(de.julielab.geneexpbase.genemodel.GeneDocument document, com.google.common.collect.Multimap<String,String> docid2geneid)
static String getCandidateQueryString(de.julielab.geneexpbase.candidateretrieval.SynHit sh, CandidateRetrieval candidateRetrieval)
    Creates a single string describing the given gene database candidate.
static String getGmMarkedDocumentText(de.julielab.geneexpbase.genemodel.GeneMention gm, int maxContextTokens, boolean onlyGenes, boolean uniqueGenes)
static void writeData(GeneMapper mapper, File outputFile, Stream<de.julielab.geneexpbase.genemodel.GeneDocument> geneDocumentStream)
static void writeData(BufferedWriter bw, GeneMapper mapper, de.julielab.geneexpbase.genemodel.GeneDocument doc)
-
-
-
Field Detail
-
USE_GOLD_TAX_FOR_CANDIDATE_RETRIEVAL
public static final boolean USE_GOLD_TAX_FOR_CANDIDATE_RETRIEVAL
- See Also:
- Constant Field Values
-
ADD_GENERIF
public static final boolean ADD_GENERIF
- See Also:
- Constant Field Values
-
ADD_INTERACTIONS
public static final boolean ADD_INTERACTIONS
- See Also:
- Constant Field Values
-
ADD_SUMMARY
public static final boolean ADD_SUMMARY
- See Also:
- Constant Field Values
-
ADD_DESC
public static final boolean ADD_DESC
- See Also:
- Constant Field Values
-
MAX_DOC_CONTEXT_SIZE
public static final int MAX_DOC_CONTEXT_SIZE
- See Also:
- Constant Field Values
-
ADD_NAME_TYPES
public static final boolean ADD_NAME_TYPES
- See Also:
- Constant Field Values
-
USE_ORIGINAL_QUERY_NAMES
public static final boolean USE_ORIGINAL_QUERY_NAMES
- See Also:
- Constant Field Values
-
NORMALIZE_CONTEXT_GENES
public static final boolean NORMALIZE_CONTEXT_GENES
- See Also:
- Constant Field Values
-
EXCLUDE_FP_GM
public static final boolean EXCLUDE_FP_GM
- See Also:
- Constant Field Values
-
ONLY_EXACT_MATCHES
public static final boolean ONLY_EXACT_MATCHES
- See Also:
- Constant Field Values
-
ONLY_APPROX_MATCHES
public static final boolean ONLY_APPROX_MATCHES
- See Also:
- Constant Field Values
-
VERSION
public static final int VERSION
- See Also:
- Constant Field Values
-
CANDIDATE_SETTER_PARAMS
public static final de.julielab.geneexpbase.configuration.Parameters CANDIDATE_SETTER_PARAMS
-
-
Method Detail
-
writeData
public static void writeData(GeneMapper mapper, File outputFile, Stream<de.julielab.geneexpbase.genemodel.GeneDocument> geneDocumentStream) throws IOException, ExecutionException, GeneMapperException
-
writeData
public static void writeData(BufferedWriter bw, GeneMapper mapper, de.julielab.geneexpbase.genemodel.GeneDocument doc) throws IOException, ExecutionException, GeneMapperException
-
getGmMarkedDocumentText
public static String getGmMarkedDocumentText(de.julielab.geneexpbase.genemodel.GeneMention gm, int maxContextTokens, boolean onlyGenes, boolean uniqueGenes)
-
getCandidateQueryString
public static String getCandidateQueryString(de.julielab.geneexpbase.candidateretrieval.SynHit sh, CandidateRetrieval candidateRetrieval) throws ExecutionException
Creates a single string describing the given gene database candidate. The string consists of the gene's names and synonyms and, optionally, textual descriptions of the gene. It is meant to serve as a "query" or "question" for the transformer: given part of the document context of the current gene as well, the transformer should decide whether the gene description belongs to the current gene or not.
- Parameters:
sh - The gene candidate to create a query string for.
- Returns:
- The candidate "query" to compare to the document context.
- Throws:
ExecutionException
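The idea behind the query string can be illustrated with a minimal, self-contained sketch. It does not use the library's classes: CandidateQuerySketch, buildQueryString and buildInputPair are hypothetical names, and the separators (", ", ". " and a tab) and the sample gene data are assumptions made for illustration only. The actual format produced by getCandidateQueryString may differ.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not part of TransformerDisambiguationDataUtils. */
final class CandidateQuerySketch {

    /**
     * Joins the candidate's names and synonyms into one string and, if requested,
     * appends a textual description. The separators are assumed for this sketch.
     */
    static String buildQueryString(List<String> names, String description, boolean addDescription) {
        StringBuilder sb = new StringBuilder(String.join(", ", names));
        if (addDescription && description != null && !description.isEmpty()) {
            sb.append(". ").append(description);
        }
        return sb.toString();
    }

    /**
     * Pairs the candidate query with a piece of document context so that a
     * classifier can judge whether both refer to the same gene. The tab-separated
     * layout is an assumption of this sketch.
     */
    static String buildInputPair(String query, String context) {
        return query + "\t" + context;
    }

    public static void main(String[] args) {
        // Illustrative candidate data, not taken from the library.
        List<String> names = Arrays.asList("BRCA1", "RNF53");
        String description = "breast cancer type 1 susceptibility protein";

        String query = buildQueryString(names, description, true);
        String context = "Mutations in BRCA1 were found in several patients.";

        System.out.println(buildInputPair(query, context));
    }
}
```

In this sketch the description is optional, mirroring the "optionally textual descriptions" wording above; passing false for addDescription yields a query made of names and synonyms only.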
-
-