A B C D G H I L N O P R S T U
All Classes All Packages
A
- addLanguageMap(LanguageMap) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Adds a language map to this document.
- addLanguageRule(String, ArrayList<Rule>) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Adds a language rule to this SRX document.
- addRule(CompiledRule) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
-
Adds a compiled rule to this segmenter.
- after - Variable in class net.sf.okapi.lib.segmentation.Rule
-
Pattern for after the break point.
- ANYCODE - Static variable in class net.sf.okapi.lib.segmentation.SRXDocument
-
Marker for INLINECODE_PATTERN in the given pattern.
B
C
- cascade() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Indicates if cascading must be applied when selecting the rules for a given language pattern.
- cascade() - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
-
Indicates if cascading must be applied when selecting the rules for a given language pattern.
- comment - Variable in class net.sf.okapi.lib.segmentation.Rule
-
Optional comment placed just before the rule.
- compileLanguageRules(LocaleId, ISegmenter) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Compiles all the language rules applicable for a given language code and assigns them to a segmenter.
- compileSingleLanguageRule(String, ISegmenter) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Compiles a single language rule group and assigns it to a segmenter.
- computeSegments(String) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- computeSegments(TextContainer) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
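The cascade() and compileLanguageRules(LocaleId, ISegmenter) entries above refer to selecting rule groups for a language. The sketch below illustrates that selection using only the Java standard library. It is an illustration of the SRX cascade idea, not Okapi's implementation: the class CascadeSketch, its method, and the map-based data shape (regex pattern to rule group name, in document order) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of net.sf.okapi.lib.segmentation.
final class CascadeSketch {

    // languageMaps: ordered entries of (regex pattern -> rule group name).
    // Returns the rule group names that apply to languageCode.
    static List<String> selectRuleGroups(Map<String, String> languageMaps,
                                         String languageCode,
                                         boolean cascade) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, String> map : languageMaps.entrySet()) {
            // Assumption: the pattern must match the whole language code.
            if (Pattern.matches(map.getKey(), languageCode)) {
                selected.add(map.getValue());
                if (!cascade) {
                    break; // without cascading, only the first matching map is used
                }
            }
        }
        return selected;
    }
}
```

Passing a java.util.LinkedHashMap keeps the language maps in insertion order, which matters here because the first match wins when cascading is off.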
D
- DEFAULT_SRX_RULES - Static variable in class net.sf.okapi.lib.segmentation.SRXDocument
G
- generateRuleRegex(Rule) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
- getAfter() - Method in class net.sf.okapi.lib.segmentation.Rule
-
Gets the pattern after the break point for this rule.
- getAllLanguageRules() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Gets a map of all the language rules in this document.
- getAllLanguagesMaps() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Gets the list of all the language maps in this document.
- getBefore() - Method in class net.sf.okapi.lib.segmentation.Rule
-
Gets the pattern before the break point for this rule.
- getComment() - Method in class net.sf.okapi.lib.segmentation.Rule
-
Gets the optional comment for this rule.
- getComments() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Gets the comments associated with this document.
- getHeaderComments() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Gets the comments associated with the header of this document.
- getLanguage() - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- getLanguageRules(String) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Gets the list of rules for a given <languagerule> element.
- getMaskRule() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Gets the current pattern of the mask rule.
- getNextSegmentRange(TextContainer) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- getPattern() - Method in class net.sf.okapi.lib.segmentation.LanguageMap
-
Gets the pattern associated to this language map.
- getRanges() - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- getRuleName() - Method in class net.sf.okapi.lib.segmentation.LanguageMap
-
Gets the name of this language map.
- getSampleLanguage() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Gets the current sample language code.
- getSampleText() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Gets the current sample text.
- getSplitPositions() - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- getVersion() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Gets the version of this SRX document.
- getWarning() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Gets the last warning that was issued while loading a document.
H
- hasWarning() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Indicates if a warning was issued last time a document was read.
I
- includeEndCodes() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Indicates if end codes should be included (See SRX implementation notes).
- includeEndCodes() - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- includeIsolatedCodes() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Indicates if isolated codes should be included (See SRX implementation notes).
- includeIsolatedCodes() - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- includeStartCodes() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Indicates if start codes should be included (See SRX implementation notes).
- includeStartCodes() - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- INLINECODE_PATTERN - Static variable in class net.sf.okapi.lib.segmentation.SRXDocument
-
Represents the pattern for an inline code (both special characters).
- isActive - Variable in class net.sf.okapi.lib.segmentation.Rule
-
Flag indicating if the rule is active.
- isActive() - Method in class net.sf.okapi.lib.segmentation.Rule
-
Indicates if this rule is active.
- isBreak - Variable in class net.sf.okapi.lib.segmentation.Rule
-
Flag indicating if the rule is a breaking rule.
- isBreak() - Method in class net.sf.okapi.lib.segmentation.Rule
-
Indicates if this rule is a breaking rule.
- isModified() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Indicates if the document has been modified since the last load or save.
L
- LanguageMap - Class in net.sf.okapi.lib.segmentation
-
Stores the data for an SRX <languagemap> element.
- LanguageMap() - Constructor for class net.sf.okapi.lib.segmentation.LanguageMap
-
Creates an empty LanguageMap object.
- LanguageMap(String, String) - Constructor for class net.sf.okapi.lib.segmentation.LanguageMap
-
Creates a LanguageMap object with a given pattern and a given name.
- loadRules(InputStream) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Loads an SRX document from an input stream.
- loadRules(CharSequence) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Loads an SRX document from a CharSequence object.
- loadRules(String) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Loads an SRX document from a file.
N
- net.sf.okapi.lib.segmentation - package net.sf.okapi.lib.segmentation
-
Interfaces and classes for segmentation handling.
- NOAUTO - Static variable in class net.sf.okapi.lib.segmentation.SRXDocument
-
Placed at the end of the 'after' expression, this marker indicates the given pattern should not have auto-insertion of AUTO_INLINECODES.
O
- oneSegmentIncludesAll() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Indicates if, when there is a single segment in a text, it should include the whole text (no trimming of spaces or codes on the left or right).
- oneSegmentIncludesAll() - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
P
- pattern - Variable in class net.sf.okapi.lib.segmentation.LanguageMap
-
The pattern of this language map.
R
- reset() - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- resetAll() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Resets the document to its default empty initial state.
- Rule - Class in net.sf.okapi.lib.segmentation
-
Stores the data for an SRX <rule> element.
- Rule() - Constructor for class net.sf.okapi.lib.segmentation.Rule
-
Creates an empty breaking and active Rule object.
- Rule(String, String, boolean) - Constructor for class net.sf.okapi.lib.segmentation.Rule
-
Creates a Rule object with given patterns and a flag indicating if the rule is a breaking one or a breaking exception.
- ruleName - Variable in class net.sf.okapi.lib.segmentation.LanguageMap
-
The name of this language map.
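The Rule entries above describe a pattern before the break point, a pattern after the break point, and a flag that distinguishes a breaking rule from a breaking exception. The sketch below shows one way those three pieces can decide split positions, using only java.util.regex. It is a simplified model under stated assumptions, not the Okapi code: SketchRule and RuleSketch are hypothetical names, inline codes and masking are ignored, and the first rule that matches at a position is assumed to decide the outcome there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for an SRX rule; NOT the Okapi Rule class.
final class SketchRule {
    final Pattern before;   // must match text ending at the candidate position
    final Pattern after;    // must match text starting at the candidate position
    final boolean isBreak;  // true = breaking rule, false = breaking exception

    SketchRule(String before, String after, boolean isBreak) {
        // Anchor the 'before' pattern to the end of the searched region.
        this.before = Pattern.compile("(?:" + before + ")$");
        this.after = Pattern.compile(after);
        this.isBreak = isBreak;
    }

    // True if 'before' matches right up to pos and 'after' matches from pos.
    boolean matchesAt(String text, int pos) {
        Matcher b = before.matcher(text).region(0, pos);
        Matcher a = after.matcher(text).region(pos, text.length());
        return b.find() && a.lookingAt();
    }
}

final class RuleSketch {
    // Returns the interior offsets at which the text would be split.
    static List<Integer> splitPositions(String text, List<SketchRule> rules) {
        List<Integer> positions = new ArrayList<>();
        for (int pos = 1; pos < text.length(); pos++) {
            for (SketchRule rule : rules) {
                if (rule.matchesAt(text, pos)) {
                    if (rule.isBreak) {
                        positions.add(pos);
                    }
                    break; // the first matching rule decides this position
                }
            }
        }
        return positions;
    }
}
```

In this model, exceptions such as an abbreviation rule for "Mr." need to be listed before the general end-of-sentence rule; otherwise the general rule matches first and the exception never takes effect.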
S
- saveRules(String, boolean, boolean) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Saves the current rules to an SRX rules document.
- saveRulesToString(boolean, boolean) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Saves the current rules to an SRX string.
- SegmentationRuleException - Exception in net.sf.okapi.lib.segmentation
-
Signals that a severe error related to segmentation has occurred.
- SegmentationRuleException(String) - Constructor for exception net.sf.okapi.lib.segmentation.SegmentationRuleException
-
Creates a new SegmentationRuleException object with a given message.
- SegmentationRuleException(Throwable) - Constructor for exception net.sf.okapi.lib.segmentation.SegmentationRuleException
-
Creates a new SegmentationRuleException object with a given parent exception.
- segmentSubFlows() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Indicates if sub-flows must be segmented.
- segmentSubFlows() - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- setActive(boolean) - Method in class net.sf.okapi.lib.segmentation.Rule
-
Sets the flag indicating if this rule is active.
- setAfter(String) - Method in class net.sf.okapi.lib.segmentation.Rule
-
Sets the pattern after the break point for this rule.
- setBefore(String) - Method in class net.sf.okapi.lib.segmentation.Rule
-
Sets the pattern before the break point for this rule.
- setBreak(boolean) - Method in class net.sf.okapi.lib.segmentation.Rule
-
Sets the flag indicating if this rule is a breaking rule.
- setCascade(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the flag indicating if cascading must be applied when selecting the rules for a given language pattern.
- setCascade(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
-
Sets the flag indicating if cascading must be applied when selecting the rules for a given language pattern.
- setComment(String) - Method in class net.sf.okapi.lib.segmentation.Rule
-
Sets the comment for this rule.
- setComments(String) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the comments for this document.
- setHeaderComments(String) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the comments for the header of this document.
- setIncludeEndCodes(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the indicator that tells if end codes should be included or not.
- setIncludeEndCodes(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- setIncludeIsolatedCodes(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the indicator that tells if isolated codes should be included or not.
- setIncludeIsolatedCodes(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- setIncludeStartCodes(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the indicator that tells if start codes should be included or not.
- setIncludeStartCodes(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- setLanguage(LocaleId) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- setMaskRule(String) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the pattern for the mask rule.
- setMaskRule(String) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
-
Sets the pattern for the mask rule.
- setModified(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the flag indicating if the document has been modified since the last load or save.
- setOneSegmentIncludesAll(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the indicator that tells if, when there is a single segment in a text, it should include the whole text (no trimming of spaces or codes on the left or right).
- setOneSegmentIncludesAll(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- setOptions(boolean, boolean, boolean, boolean, boolean, boolean, boolean) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- setOptions(boolean, boolean, boolean, boolean, boolean, boolean, boolean, boolean, boolean, boolean) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
-
Sets the options for this segmenter.
- setSampleLanguage(String) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the sample language code.
- setSampleText(String) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the sample text.
- setSegmentSubFlows(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the flag indicating if sub-flows must be segmented.
- setSegmentSubFlows(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- setTestOnSelectedGroup(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the indicator on how to apply rules for samples.
- setTreatIsolatedCodesAsWhitespace(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the indicator if this document should treat isolated codes as whitespace when matching SRX rules.
- setTreatIsolatedCodesAsWhitespace(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- setTrimCodes(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- setTrimLeadingWhitespaces(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the indicator that tells if leading white-spaces should be left outside the segments.
- setTrimLeadingWS(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- setTrimTrailingWhitespaces(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the indicator that tells if trailing white-spaces should be left outside the segments.
- setTrimTrailingWS(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- setUseICU4JBreakRules(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Sets the indicator that tells if this document uses ICU4J BreakIterator rules.
- setUseJavaRegex(boolean) - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
-
Sets the indicator that tells if this document has rules that are defined for the Java regular expression engine (vs ICU).
- SRXDocument - Class in net.sf.okapi.lib.segmentation
-
Provides facilities to load, save, and manage segmentation rules in SRX format.
- SRXDocument() - Constructor for class net.sf.okapi.lib.segmentation.SRXDocument
-
Creates an empty SRX document.
- SRXSegmenter - Class in net.sf.okapi.lib.segmentation
-
Implements the ISegmenter interface for SRX rules.
- SRXSegmenter() - Constructor for class net.sf.okapi.lib.segmentation.SRXSegmenter
-
Creates a new SRXSegmenter object.
T
- testOnSelectedGroup() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Indicates that, when sampling the rules, the sample should be computed using only a selected group of rules.
- treatIsolatedCodesAsWhitespace() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Indicates if this document should treat isolated codes as whitespace when matching SRX rules.
- treatIsolatedCodesAsWhitespace() - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- trimLeadingWhitespaces() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Indicates if leading white-spaces should be left outside the segments.
- trimLeadingWhitespaces() - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
- trimTrailingWhitespaces() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Indicates if trailing white-spaces should be left outside the segments.
- trimTrailingWhitespaces() - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
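The trimLeadingWhitespaces() and trimTrailingWhitespaces() entries above describe leaving white-space outside the segments. As a rough illustration of what that means for a segment range, the sketch below shrinks a start/end pair so that surrounding white-space falls outside it. TrimSketch and trimRange are hypothetical names, and the sketch only handles plain characters; the handling of inline codes in the real segmenter is not modeled.

```java
// Hypothetical helper, NOT part of net.sf.okapi.lib.segmentation.
final class TrimSketch {

    // Returns {start, end} (end exclusive) for the segment, with leading
    // and/or trailing white-space left outside when the flags are set.
    static int[] trimRange(String text, int start, int end,
                           boolean trimLeading, boolean trimTrailing) {
        if (trimLeading) {
            while (start < end && Character.isWhitespace(text.charAt(start))) {
                start++;
            }
        }
        if (trimTrailing) {
            while (end > start && Character.isWhitespace(text.charAt(end - 1))) {
                end--;
            }
        }
        return new int[] {start, end};
    }
}
```

With both flags off the range is returned unchanged, which corresponds to keeping the white-space inside the segment.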
U
- useIcu4JBreakRules() - Method in class net.sf.okapi.lib.segmentation.SRXDocument
-
Indicates if this document uses ICU4J break rules.
- useJavaRegex() - Method in class net.sf.okapi.lib.segmentation.SRXSegmenter
-
Indicates if this document has rules that are defined for the Java regular expression engine (vs ICU).