YamCha

YamCha
Domain Text Retrieval
Media
Task
Creation Date
Copyright open source
URL http://chasen.org/~taku/software/yamcha/


Description

  • YamCha (Yet Another Multipurpose CHunk Annotator) is a generic, customizable, open source text chunker aimed at a wide range of NLP tasks, such as POS tagging, Named Entity Recognition, base NP chunking, and text chunking.
  • YamCha uses Support Vector Machines (SVMs), a state-of-the-art machine learning algorithm first introduced by Vapnik in 1995. It is exactly the same system that performed best in the CoNLL2000 Shared Task (Chunking and BaseNP Chunking).
  • Features:
    • Moderately high performance chunker based on Support Vector Machines
    • Task-independent: can be trained and tested on any data that can be treated as a "generic" text chunking task
    • Uses PKE/PKI, which make classification (chunking) faster than with the original SVMs
    • Can redefine feature sets (window size), parsing direction (forward/backward), and the algorithm used for the multi-class problem (pairwise/one vs. rest)
    • Practical chunking time (1 or 2 seconds per sentence; this depends heavily on the task)
    • Can perform partial chunking
    • C/C++ library
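
The configurable options listed above (window size, parsing direction, and multi-class strategy) can be illustrated with a minimal Python sketch. This is not YamCha's API: every name here (window_features, chunk, toy_classify, pairwise_classifiers, one_vs_rest_classifiers) is hypothetical, the feature layout is an assumption, and a hand-written rule stands in for the trained SVM.

```python
from itertools import combinations


def window_features(tokens, tags, i, window=2):
    """Features for token i: words and POS tags within +/- `window` positions,
    plus the chunk tags already assigned to the `window` preceding tokens."""
    feats = {}
    for off in range(-window, window + 1):
        j = i + off
        if 0 <= j < len(tokens):
            word, pos = tokens[j]
            feats[f"w[{off}]"] = word
            feats[f"p[{off}]"] = pos
    for off in range(1, window + 1):
        if i - off >= 0:
            feats[f"t[-{off}]"] = tags[i - off]
    return feats


def chunk(tokens, classify, window=2, backward=False):
    """Tag tokens one at a time. Backward parsing processes the sentence
    right to left, so "preceding" tags come from the tokens to the right."""
    seq = list(reversed(tokens)) if backward else list(tokens)
    tags = []
    for i in range(len(seq)):
        tags.append(classify(window_features(seq, tags, i, window)))
    return list(reversed(tags)) if backward else tags


def toy_classify(feats):
    """Stand-in for a trained SVM: a hand-written noun-phrase rule (assumption)."""
    if feats["p[0]"] not in {"DT", "JJ", "NN"}:
        return "O"
    prev = feats.get("t[-1]", "O")
    return "I-NP" if prev in {"B-NP", "I-NP"} else "B-NP"


def pairwise_classifiers(labels):
    """Pairwise strategy: one binary classifier per pair of labels."""
    return list(combinations(sorted(labels), 2))


def one_vs_rest_classifiers(labels):
    """One vs. rest strategy: one binary classifier per label."""
    return [(label, "rest") for label in sorted(labels)]


sentence = [("the", "DT"), ("quick", "JJ"), ("fox", "NN"), ("jumps", "VBZ")]
forward_tags = chunk(sentence, toy_classify, window=2)
print(forward_tags)  # ['B-NP', 'I-NP', 'I-NP', 'O']
```

With K chunk labels, the pairwise strategy in this sketch needs K(K-1)/2 binary classifiers and the one vs. rest strategy needs K, which is one reason the choice can affect training and chunking time.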


Copyright Remarks

  • open source