public class ExtractAbbrev
extends Object
This class was adapted from the BioText ExtractAbbrev.java software by Ariel S. Schwartz. See
http://biotext.berkeley.edu/software.html.
The ExtractAbbrev class implements a simple algorithm for extracting abbreviations and their definitions from
biomedical text. Abbreviations (short forms) are extracted from the input file. Each abbreviation for which a
definition (long form) is found is printed together with that definition, one pair per line. To evaluate the
algorithm, a file of tab-separated short-form/long-form pairs can be supplied together with the -testlist
option.
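
The matching step can be illustrated with a minimal sketch based on the published description of the Schwartz and Hearst algorithm. It is not necessarily the code of this class: the class name AbbrevSketch, the method name findBestLongForm, and its parameters are assumptions chosen for illustration. The sketch scans the short form and a candidate phrase from right to left, requires the first short-form character to begin a word in the candidate, and returns the matched phrase starting at a word boundary, or null if no match exists.

```java
// Hypothetical illustration; names are assumptions, not the API of ExtractAbbrev.
class AbbrevSketch {

    /**
     * Matches shortForm against candidate from right to left.
     * Returns the trailing part of candidate that defines shortForm,
     * or null if some short-form character cannot be matched.
     */
    static String findBestLongForm(String shortForm, String candidate) {
        int lIndex = candidate.length() - 1;
        for (int sIndex = shortForm.length() - 1; sIndex >= 0; sIndex--) {
            char c = Character.toLowerCase(shortForm.charAt(sIndex));
            if (!Character.isLetterOrDigit(c)) {
                continue; // punctuation in the short form is ignored
            }
            // Move left until the character matches. For the first short-form
            // character, additionally require that the match starts a word.
            while ((lIndex >= 0 && Character.toLowerCase(candidate.charAt(lIndex)) != c)
                    || (sIndex == 0 && lIndex > 0
                        && Character.isLetterOrDigit(candidate.charAt(lIndex - 1)))) {
                lIndex--;
            }
            if (lIndex < 0) {
                return null; // ran out of candidate text
            }
            lIndex--;
        }
        // Extend the match back to the start of the word it falls in.
        int start = candidate.lastIndexOf(' ', lIndex) + 1;
        return candidate.substring(start);
    }

    public static void main(String[] args) {
        String shortForm = "PCR";
        String longForm = findBestLongForm(shortForm, "the polymerase chain reaction");
        // Output uses one tab-separated pair per line, mirroring the format described above.
        System.out.println(shortForm + "\t" + longForm);
    }
}
```

In this sketch the candidate phrase is assumed to be the text immediately preceding a parenthesized short form. How candidates are selected from the input file is outside the scope of the example.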
- Version:
- 03/12/03
- Author:
- Ariel Schwartz
- See Also:
- "A Simple Algorithm for Identifying Abbreviation Definitions in Biomedical Text", A.S. Schwartz, M.A. Hearst;
Pacific Symposium on Biocomputing 8:451-462 (2003), for a detailed description of the algorithm.