Class GeniaTreebankToOpenNLPChunkFormat


  • public class GeniaTreebankToOpenNLPChunkFormat
    extends Object
    Copyright (c) 2015, JULIE Lab. All rights reserved. This program and the accompanying materials are made available under the terms of the BSD-2-Clause License.
    This Java class converts GENIA Treebank version 1.0 XML files into a single file in the OpenNLP 1.6 chunker training format.
    Comment from faessler, 09/2017: This conversion does not account for embedded phrases and has some other issues. As a result, a chunker trained on the resulting data produces rather odd chunkings. Consider using ToIOBConverter instead.
    Author:
    rubruck
    • Constructor Detail

      • GeniaTreebankToOpenNLPChunkFormat

        public GeniaTreebankToOpenNLPChunkFormat()
    • Method Detail

      • main

        public static void main(String[] args)
        Expects two arguments: args[0] is the input directory and args[1] is the output file.
        Parameters:
        args - the input directory, followed by the output file
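
        The target format named in the class description can be illustrated with a minimal sketch. It assumes the usual layout of OpenNLP chunker training data: one token per line with word, POS tag and chunk tag separated by spaces, and an empty line after each sentence. The names ChunkFormatSketch, Token and formatSentence are hypothetical and are not part of this class; the sketch does not reproduce how the converter reads the treebank XML.

```java
import java.util.Arrays;
import java.util.List;

public class ChunkFormatSketch {

    // Hypothetical holder for one token: surface form, POS tag and chunk tag.
    static final class Token {
        final String word;
        final String pos;
        final String chunk;

        Token(String word, String pos, String chunk) {
            this.word = word;
            this.pos = pos;
            this.chunk = chunk;
        }
    }

    // Renders one sentence as "word POS chunkTag" lines followed by an empty line.
    static String formatSentence(List<Token> sentence) {
        StringBuilder sb = new StringBuilder();
        for (Token t : sentence) {
            sb.append(t.word).append(' ')
              .append(t.pos).append(' ')
              .append(t.chunk).append('\n');
        }
        sb.append('\n'); // sentence separator
        return sb.toString();
    }

    public static void main(String[] args) {
        // Flat, non-embedded chunks: B- opens a chunk, I- continues it.
        List<Token> sentence = Arrays.asList(
                new Token("The", "DT", "B-NP"),
                new Token("protein", "NN", "I-NP"),
                new Token("binds", "VBZ", "B-VP"),
                new Token("DNA", "NN", "B-NP"));
        System.out.print(formatSentence(sentence));
    }
}
```

        Each token carries exactly one chunk tag in this layout, so embedded treebank phrases have to be flattened to a single level before they can be written out. This is the limitation the comment in the class description refers to.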