Class GMX


  • public class GMX
    extends Object
    Implementation of the GMX-V specification, v. 2.0
    Version:
    0.2 08.26.2015
    See Also:
    http://www.xtm-intl.com/manuals/gmx-v/GMX-V-2.0.html, http://www.etsi.org/deliver/etsi_gs/LIS/001_099/004/02.00.00_60/gs_LIS004v020000p.pdf
    • Field Detail

      • TotalWordCount

        public static final String TotalWordCount
        Total word count - an accumulation of the word counts, both translatable and non-translatable, from the individual text units that make up the document.
        See Also:
        Constant Field Values
      • ProtectedWordCount

        public static final String ProtectedWordCount
        An accumulation of the word count for text that has been marked as 'protected', or otherwise not translatable (XLIFF text enclosed in <mrk mtype="protected"> elements).
        See Also:
        Constant Field Values
      • ExactMatchedWordCount

        public static final String ExactMatchedWordCount
        An accumulation of the word count for text units that have been matched unambiguously with a prior translation and thus require no translator input.
        See Also:
        Constant Field Values
      • LeveragedMatchedWordCount

        public static final String LeveragedMatchedWordCount
        An accumulation of the word count for text units that have been matched against a leveraged translation memory database.
        See Also:
        Constant Field Values
      • RepetitionMatchedWordCount

        public static final String RepetitionMatchedWordCount
        An accumulation of the word count for repeating text units that have not been matched in any other form. Repetition matching is deemed to take precedence over fuzzy matching.
        See Also:
        Constant Field Values
      • FuzzyMatchedWordCount

        public static final String FuzzyMatchedWordCount
        An accumulation of the word count for text units that have been fuzzy matched against a leveraged translation memory database.
        See Also:
        Constant Field Values
      • AlphanumericOnlyTextUnitWordCount

        public static final String AlphanumericOnlyTextUnitWordCount
        An accumulation of the word count for text units that have been identified as containing only alphanumeric words.
        See Also:
        Constant Field Values
      • NumericOnlyTextUnitWordCount

        public static final String NumericOnlyTextUnitWordCount
        An accumulation of the word count for text units that have been identified as containing only numeric words.
        See Also:
        Constant Field Values
      • MeasurementOnlyTextUnitWordCount

        public static final String MeasurementOnlyTextUnitWordCount
        An accumulation of the word count from measurement-only text units.
        See Also:
        Constant Field Values
      • SimpleNumericAutoTextWordCount

        public static final String SimpleNumericAutoTextWordCount
        An accumulation of the word count for simple numeric values, e.g. 10.
        See Also:
        Constant Field Values
      • ComplexNumericAutoTextWordCount

        public static final String ComplexNumericAutoTextWordCount
        An accumulation of the word count for complex numeric values which include decimal and/or thousands separators, e.g. 10,000.00.
        See Also:
        Constant Field Values
      • MeasurementAutoTextWordCount

        public static final String MeasurementAutoTextWordCount
        An accumulation of the word count for identifiable measurement values, e.g. 10.50 mm. Measurement values take precedence over the above numeric categories. No double counting of these categories is allowed.
        See Also:
        Constant Field Values
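
The precedence rule for the numeric auto-text categories can be sketched as an ordered classification: a value is tested as a measurement first, and only then as a simple or complex number, so each value lands in exactly one category. The class name, the regular expressions, and the short unit list below are illustrative assumptions, not the patterns defined by GMX-V or used by this class.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: assigns one auto-text category per value.
class AutoTextCategorySketch {
    // Assumed, simplified patterns (not taken from the GMX-V specification).
    static final Pattern SIMPLE = Pattern.compile("\\d+");
    static final Pattern COMPLEX =
        Pattern.compile("\\d{1,3}(,\\d{3})*(\\.\\d+)?|\\d+\\.\\d+");
    static final Pattern MEASUREMENT =
        Pattern.compile("(\\d{1,3}(,\\d{3})*(\\.\\d+)?|\\d+(\\.\\d+)?) ?(mm|cm|m|km|kg|g)");

    // Returns the name of the single category the value falls into, or null.
    static String classify(String value) {
        // Measurements are checked first so they are never also counted as numbers.
        if (MEASUREMENT.matcher(value).matches()) {
            return "MeasurementAutoTextWordCount";
        }
        if (SIMPLE.matcher(value).matches()) {
            return "SimpleNumericAutoTextWordCount";
        }
        if (COMPLEX.matcher(value).matches()) {
            return "ComplexNumericAutoTextWordCount";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(classify("10"));        // SimpleNumericAutoTextWordCount
        System.out.println(classify("10,000.00")); // ComplexNumericAutoTextWordCount
        System.out.println(classify("10.50 mm"));  // MeasurementAutoTextWordCount
    }
}
```

Returning a single category from the first matching test is what prevents double counting in this sketch.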
      • AlphaNumericAutoTextWordCount

        public static final String AlphaNumericAutoTextWordCount
        An accumulation of the word count for identifiable alphanumeric words, e.g. AEG321.
        See Also:
        Constant Field Values
      • DateAutoTextWordCount

        public static final String DateAutoTextWordCount
        An accumulation of the word count for identifiable dates, e.g. 25 June 1992.
        See Also:
        Constant Field Values
      • TMAutoTextWordCount

        public static final String TMAutoTextWordCount
        An accumulation of the word count for identifiable trade marks, e.g. "Weapons of Mass Destruction...".
        See Also:
        Constant Field Values
      • TotalCharacterCount

        public static final String TotalCharacterCount
        An accumulation of the character counts, both translatable and non-translatable, from the individual text units that make up the document. This count includes all non white space characters in the document (please refer to Section 2.7. White Space Characters for details of what constitutes white space characters), excluding inline markup and punctuation characters (please refer to Section 2.10. Punctuation Characters for details of what constitutes punctuation characters).
        See Also:
        Constant Field Values
      • PunctuationCharacterCount

        public static final String PunctuationCharacterCount
        The total of all punctuation characters in the canonical form of text in the document that DO NOT form part of the character count as per section 2.10. Punctuation Characters.
        See Also:
        Constant Field Values
      • WhiteSpaceCharacterCount

        public static final String WhiteSpaceCharacterCount
        The total of all white space characters in the canonical form of the text units in the document. Please refer to section 2.7. White Space Characters for a detailed explanation of how white space characters are identified and counted.
        See Also:
        Constant Field Values
      • OverallCharacterCount

        public static final String OverallCharacterCount
        The total of all of the three main character counts (TotalCharacterCount + PunctuationCharacterCount + WhiteSpaceCharacterCount) in the canonical form of the text units in the document. (Added in GMX-V 2.0)
        See Also:
        Constant Field Values
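
As a rough illustration of the sum described for OverallCharacterCount, the sketch below tallies the three component counts for a single string and adds them. The class name is hypothetical, and the classification rules (Character.isWhitespace plus the Unicode punctuation categories) are simplifying assumptions rather than the definitions in GMX-V sections 2.7 and 2.10. Inline markup is not handled.

```java
import java.util.Arrays;

// Hypothetical sketch of TotalCharacterCount + PunctuationCharacterCount
// + WhiteSpaceCharacterCount = OverallCharacterCount for one string.
class OverallCountSketch {

    // Returns {total, punctuation, whiteSpace, overall}.
    static int[] count(String text) {
        int total = 0;
        int punctuation = 0;
        int whiteSpace = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isWhitespace(cp)) {
                whiteSpace++;
            } else if (isPunctuation(cp)) {
                punctuation++;
            } else {
                total++;
            }
        }
        return new int[] { total, punctuation, whiteSpace,
                           total + punctuation + whiteSpace };
    }

    // Assumption: punctuation means any Unicode "P*" general category.
    static boolean isPunctuation(int cp) {
        switch (Character.getType(cp)) {
            case Character.CONNECTOR_PUNCTUATION:
            case Character.DASH_PUNCTUATION:
            case Character.START_PUNCTUATION:
            case Character.END_PUNCTUATION:
            case Character.INITIAL_QUOTE_PUNCTUATION:
            case Character.FINAL_QUOTE_PUNCTUATION:
            case Character.OTHER_PUNCTUATION:
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        // Prints [10, 2, 1, 13]
        System.out.println(Arrays.toString(count("Hello, world!")));
    }
}
```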
      • ProtectedCharacterCount

        public static final String ProtectedCharacterCount
        An accumulation of the character count for text that has been marked as 'protected', or otherwise not translatable (XLIFF text enclosed in <mrk mtype="protected"> elements).
        See Also:
        Constant Field Values
      • ExactMatchedCharacterCount

        public static final String ExactMatchedCharacterCount
        An accumulation of the character count for text units that have been matched unambiguously with a prior translation and require no translator input.
        See Also:
        Constant Field Values
      • LeveragedMatchedCharacterCount

        public static final String LeveragedMatchedCharacterCount
        An accumulation of the character count for text units that have been matched against a leveraged translation memory database.
        See Also:
        Constant Field Values
      • RepetitionMatchedCharacterCount

        public static final String RepetitionMatchedCharacterCount
        An accumulation of the character count for repeating text units that have not been matched in any other form. Repetition matching is deemed to take precedence over fuzzy matching.
        See Also:
        Constant Field Values
      • FuzzyMatchedCharacterCount

        public static final String FuzzyMatchedCharacterCount
        An accumulation of the character count for text units that have a fuzzy match against a leveraged translation memory database.
        See Also:
        Constant Field Values
      • AlphanumericOnlyTextUnitCharacterCount

        public static final String AlphanumericOnlyTextUnitCharacterCount
        An accumulation of the character count for text units that have been identified as containing only alphanumeric words.
        See Also:
        Constant Field Values
      • NumericOnlyTextUnitCharacterCount

        public static final String NumericOnlyTextUnitCharacterCount
        An accumulation of the character count for text units that have been identified as containing only numeric words.
        See Also:
        Constant Field Values
      • MeasurementOnlyTextUnitCharacterCount

        public static final String MeasurementOnlyTextUnitCharacterCount
        An accumulation of the character count from measurement-only text units.
        See Also:
        Constant Field Values
      • SimpleNumericAutoTextCharacterCount

        public static final String SimpleNumericAutoTextCharacterCount
        An accumulation of the character count for simple numeric values, e.g. 10.
        See Also:
        Constant Field Values
      • ComplexNumericAutoTextCharacterCount

        public static final String ComplexNumericAutoTextCharacterCount
        An accumulation of the character count for complex numeric values which include decimal and/or thousands separators, e.g. 10,000.00.
        See Also:
        Constant Field Values
      • MeasurementAutoTextCharacterCount

        public static final String MeasurementAutoTextCharacterCount
        An accumulation of the character count for identifiable measurement values, e.g. 10.50 mm. Measurement values take precedence over the above numeric categories. No double counting of these categories is allowed.
        See Also:
        Constant Field Values
      • AlphaNumericAutoTextCharacterCount

        public static final String AlphaNumericAutoTextCharacterCount
        An accumulation of the character count for identifiable alphanumeric words, e.g. AEG321.
        See Also:
        Constant Field Values
      • DateAutoTextCharacterCount

        public static final String DateAutoTextCharacterCount
        An accumulation of the character count for identifiable dates, e.g. 25 June 1992.
        See Also:
        Constant Field Values
      • TMAutoTextCharacterCount

        public static final String TMAutoTextCharacterCount
        An accumulation of the character count for identifiable trade marks, e.g. "Weapons of Mass Destruction...".
        See Also:
        Constant Field Values
      • TranslatableInlineCount

        public static final String TranslatableInlineCount
        The actual non-linking inline element count for unqualified (see Section 2.14.2 Unqualified Text Units) text units. Please refer to Section 2.11. Inline Element Counts for a detailed explanation and examples for this category.
        See Also:
        Constant Field Values
      • TranslatableLinkingInlineCount

        public static final String TranslatableLinkingInlineCount
        The actual linking inline element count for unqualified (see Section 2.14.2 Unqualified Text Units) text units. Please refer to Section 2.12. Linking Inline Elements for a detailed explanation and examples for this category.
        See Also:
        Constant Field Values
      • ProjectRepetionMatchedWordCount

        public static final String ProjectRepetionMatchedWordCount
        The word count for text units that are identical across all files in a given project. The word count for the primary occurrence is not included in this count, only that of subsequent matches.
        See Also:
        Constant Field Values
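
The "primary occurrence is not included" rule can be sketched as follows: the first time a text unit is seen anywhere in the project it contributes nothing, and every later identical unit contributes its full word count. The class and method names are hypothetical, and counting words by splitting on white space is a simplifying assumption that differs from the GMX-V word-boundary rules.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a project-level repetition word count.
class ProjectRepetitionSketch {

    // Each inner list holds the (non-empty) text units of one file.
    static int repetitionWordCount(List<List<String>> files) {
        Set<String> seen = new HashSet<>();
        int repeatedWords = 0;
        for (List<String> file : files) {
            for (String unit : file) {
                // Set.add returns false when the unit was already seen,
                // i.e. this is a subsequent match, not the primary occurrence.
                if (!seen.add(unit)) {
                    // Assumption: words are separated by white space.
                    repeatedWords += unit.trim().split("\\s+").length;
                }
            }
        }
        return repeatedWords;
    }

    public static void main(String[] args) {
        List<List<String>> project = Arrays.asList(
            Arrays.asList("Save the file", "Open"),
            Arrays.asList("Save the file", "Close", "Open"));
        // "Save the file" (3 words) and "Open" (1 word) each repeat once.
        System.out.println(repetitionWordCount(project)); // 4
    }
}
```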
      • ProjectFuzzyMatchedWordCount

        public static final String ProjectFuzzyMatchedWordCount
        The word count for fuzzy matched text units across all files in a given project. The word count for the primary occurrence is not included in this count, only that of subsequent matches.
        See Also:
        Constant Field Values
      • ProjectRepetionMatchedCharacterCount

        public static final String ProjectRepetionMatchedCharacterCount
        The character count for text that is identical across all files in a given project. The character count for the primary occurrence is not included in this count, only that of subsequent matches.
        See Also:
        Constant Field Values
      • ProjectFuzzyMatchedCharacterCount

        public static final String ProjectFuzzyMatchedCharacterCount
        The character count for fuzzy matched text across all files in a given project. The character count for the primary occurrence is not included in this count, only that of subsequent matches.
        See Also:
        Constant Field Values
    • Constructor Detail

      • GMX

        public GMX()
    • Method Detail

      • isLogographicScript

        public static boolean isLogographicScript(LocaleId locId)
        Indicates whether the language is considered "logographic" per the GMX-V 2.0 specification. If true, the word count for this language is defined as (character count / getCharacterCountFactor(LocaleId)), unless the character count factor is -1d, in which case word counts are not meaningful for the language.
        See Also:
        http://www.xtm-intl.com/manuals/gmx-v/GMX-V-2.0.html#LogographicScripts
      • getCharacterCountFactor

        public static double getCharacterCountFactor(LocaleId language)
        For "logographic" languages, GMX-V 2.0 defines factors by which the character count should be divided in order to yield the word count.

        Returns -1d if the language does not have a factor. If this method returns -1d and isLogographicScript(LocaleId) returns true, then word counts are not meaningful for this language.

        See Also:
        http://www.xtm-intl.com/manuals/gmx-v/GMX-V-2.0.html#LogographicScripts
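
A minimal sketch of how the two methods above might be combined to derive a word count for a logographic language. To stay self-contained it does not call GMX or LocaleId: the factor parameter stands in for the result of getCharacterCountFactor(LocaleId), and the helper is assumed to be used only when isLogographicScript(LocaleId) has returned true. The class name, the helper, the NaN return convention, and the factor 3.0 in the example are all illustrative assumptions, not values or behavior taken from the specification.

```java
// Hypothetical sketch: word count = character count / factor.
class LogographicWordCountSketch {

    // Sentinel described above: the language has no factor.
    static final double NO_FACTOR = -1d;

    // Returns the derived word count, or NaN when the factor is -1d,
    // meaning word counts are not meaningful for the language.
    static double wordCount(long characterCount, double factor) {
        if (factor == NO_FACTOR) {
            return Double.NaN;
        }
        return characterCount / factor;
    }

    public static void main(String[] args) {
        // 3.0 is a placeholder factor chosen for the example only.
        System.out.println(wordCount(300, 3.0));       // 100.0
        System.out.println(wordCount(300, NO_FACTOR)); // NaN
    }
}
```

Returning NaN is just one way to signal the "not meaningful" case; a caller could equally check the factor for -1d before dividing.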