Package edu.uchsc.ccp.nlp.ei.mutation
Class MutationExtractor
java.lang.Object
  edu.uchsc.ccp.nlp.ei.mutation.MutationExtractor
Direct Known Subclasses:
MutationFinder
public abstract class MutationExtractor extends Object
A base class for extracting Mutations from text.
Version: 1.0
Author: William A. Baumgartner, Jr., william.baumgartner@uchsc.edu
Constructor Summary
MutationExtractor()
Method Summary
Modifier and Type, Method, Description:
protected void error(String message)
abstract Map<Mutation,Set<int[]>> extractMutations(String rawText)
    Extracts point mutation mentions from rawText and returns them in a map.
protected static Map<String,String> populateAminoAcidNameToOneLookupMap()
    Builds the mapping from amino acid full name to one-letter code.
protected static Map<String,String> populateAminoAcidThreeToOneLookupMap()
    Builds the mapping from amino acid three-letter code to one-letter code.
protected void warn(String message)
Method Detail
populateAminoAcidThreeToOneLookupMap
protected static Map<String,String> populateAminoAcidThreeToOneLookupMap()
Builds the mapping from amino acid three-letter code to one-letter code.
Returns:
a mapping from three-letter code to one-letter code
populateAminoAcidNameToOneLookupMap
protected static Map<String,String> populateAminoAcidNameToOneLookupMap()
Builds the mapping from amino acid full name to one-letter code.
Returns:
a mapping from amino acid full name to one-letter code
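The two lookup maps described above could be populated along these lines. This is a minimal sketch, not the class's actual code: the class name AminoAcidLookupSketch, the helper toOneLetter, the upper-case keys, and the choice of "ASPARTIC ACID" and "GLUTAMIC ACID" as full names are all assumptions made for illustration. Only the 20 standard amino acids are listed.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in; not part of edu.uchsc.ccp.nlp.ei.mutation.
class AminoAcidLookupSketch {
    // Columns: three-letter code, full name, one-letter code.
    // Upper-case keys are an assumption of this sketch.
    private static final String[][] AMINO_ACIDS = {
        {"ALA", "ALANINE", "A"}, {"ARG", "ARGININE", "R"},
        {"ASN", "ASPARAGINE", "N"}, {"ASP", "ASPARTIC ACID", "D"},
        {"CYS", "CYSTEINE", "C"}, {"GLN", "GLUTAMINE", "Q"},
        {"GLU", "GLUTAMIC ACID", "E"}, {"GLY", "GLYCINE", "G"},
        {"HIS", "HISTIDINE", "H"}, {"ILE", "ISOLEUCINE", "I"},
        {"LEU", "LEUCINE", "L"}, {"LYS", "LYSINE", "K"},
        {"MET", "METHIONINE", "M"}, {"PHE", "PHENYLALANINE", "F"},
        {"PRO", "PROLINE", "P"}, {"SER", "SERINE", "S"},
        {"THR", "THREONINE", "T"}, {"TRP", "TRYPTOPHAN", "W"},
        {"TYR", "TYROSINE", "Y"}, {"VAL", "VALINE", "V"}
    };

    // Mapping from three-letter code to one-letter code.
    public static Map<String, String> threeToOne() {
        return buildMap(0);
    }

    // Mapping from full name to one-letter code.
    public static Map<String, String> nameToOne() {
        return buildMap(1);
    }

    // Builds a read-only map keyed on the given column of AMINO_ACIDS.
    private static Map<String, String> buildMap(int keyColumn) {
        Map<String, String> map = new HashMap<>();
        for (String[] row : AMINO_ACIDS) {
            map.put(row[keyColumn], row[2]);
        }
        return Collections.unmodifiableMap(map);
    }

    // Normalizes case, then tries the three-letter map before the name map.
    // Returns null when the residue is not recognized.
    public static String toOneLetter(String residue) {
        String key = residue.trim().toUpperCase(Locale.ROOT);
        String code = threeToOne().get(key);
        return code != null ? code : nameToOne().get(key);
    }

    public static void main(String[] args) {
        System.out.println(toOneLetter("Ala"));
        System.out.println(toOneLetter("glycine"));
    }
}
```

Keeping both maps derived from one table avoids the two lookups drifting apart; whether the real methods share data this way is not stated in this documentation.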
extractMutations
public abstract Map<Mutation,Set<int[]>> extractMutations(String rawText) throws MutationException
Extracts point mutation mentions from rawText and returns them in a map. The result is a mapping from PointMutation objects to the set of spans (int arrays of size 2) where each mutation was identified. Spans are character offsets into the text; in the example, the end offset is exclusive. Example result:
rawText: 'We constructed A42G and L22G, and crystalized A42G.'
result = {PointMutation(42,'A','G'):[(15,19),(46,50)],
PointMutation(22,'L','G'):[(24,28)]}
Note that the spans are not necessarily in increasing order, because of the order in which the regular expressions are processed.
Parameters:
rawText - the text to be processed
Returns:
a mapping from each extracted mutation to the set of spans where it was identified
Throws:
MutationException
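The shape of the result described above can be illustrated with a small self-contained sketch. It is not MutationFinder's implementation: the class SpanSketch, the single regular expression, and the use of the matched string as the map key (instead of a Mutation object) are assumptions chosen so the example runs without the library. It also shows one way to sort spans by start offset, since the returned sets carry no ordering guarantee.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in; not part of edu.uchsc.ccp.nlp.ei.mutation.
class SpanSketch {
    // Assumed pattern: wild-type residue, position, mutant residue (e.g. A42G).
    private static final Pattern POINT_MUTATION =
            Pattern.compile("\\b([A-Z])(\\d+)([A-Z])\\b");

    // Maps each matched mention to the set of {start, end} character offsets
    // where it occurs; end is exclusive, as returned by Matcher.end().
    public static Map<String, Set<int[]>> findMentions(String rawText) {
        Map<String, Set<int[]>> result = new LinkedHashMap<>();
        Matcher m = POINT_MUTATION.matcher(rawText);
        while (m.find()) {
            result.computeIfAbsent(m.group(), k -> new HashSet<>())
                  .add(new int[] { m.start(), m.end() });
        }
        return result;
    }

    // Copies the spans into a list ordered by start offset.
    public static List<int[]> sortedSpans(Set<int[]> spans) {
        List<int[]> ordered = new ArrayList<>(spans);
        ordered.sort(Comparator.comparingInt((int[] span) -> span[0]));
        return ordered;
    }

    public static void main(String[] args) {
        String rawText = "We constructed A42G and L22G, and crystalized A42G.";
        for (Map.Entry<String, Set<int[]>> e : findMentions(rawText).entrySet()) {
            StringBuilder line = new StringBuilder(e.getKey()).append(":");
            for (int[] span : sortedSpans(e.getValue())) {
                line.append(" (").append(span[0]).append(",").append(span[1]).append(")");
            }
            System.out.println(line);
        }
    }
}
```

One point worth keeping in mind when consuming a Set<int[]>: Java arrays use identity-based equals and hashCode, so two arrays holding the same offsets count as distinct elements, and contains(new int[] {15, 19}) returns false even when an equal span is present. Comparing element values, for instance with Arrays.equals, avoids that pitfall.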
warn
protected void warn(String message)
error
protected void error(String message)