YamCha
From Chorus
Revision as of 16:52, 10 January 2011 by Cimpaniulia
Domain | Text Retrieval |
Media | |
Task | |
Creation Date | |
Copyright | open source |
URL | http://chasen.org/~taku/software/yamcha/ |
Description
- YamCha (Yet Another Multipurpose CHunk Annotator) is a generic, customizable, open source text chunker aimed at a wide range of NLP tasks, such as POS tagging, Named Entity Recognition, base NP chunking, and text chunking.
- YamCha uses Support Vector Machines (SVMs), a state-of-the-art machine learning algorithm first introduced by Vapnik in 1995. It is the same system that performed best in the CoNLL-2000 shared task on chunking and base NP chunking.
- Features:
- Moderately high performance chunker based on Support Vector Machines
- Task-independent: can be trained and tested on any data that can be framed as a "generic" text chunking task
- Uses PKE/PKI, which make classification (chunking) faster than with the original SVMs
- Feature sets (window size), parsing direction (forward/backward), and the multi-class method (pairwise/one-vs-rest) can be redefined
- Practical chunking time (1 or 2 seconds per sentence, depending heavily on the task)
- Can perform partial chunking
- C/C++ library
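The configurable options listed above (window size, parsing direction, multi-class method) can be illustrated with a small Python sketch. This is not YamCha's code or API. The function names (`window_features`, `sentence_features`, `num_binary_classifiers`), the `<PAD>` marker, and the feature string format are all hypothetical and were chosen only for illustration. It assumes each token is a (word, POS tag) pair.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, POS tag); an assumed input format


def window_features(tokens: List[Token], i: int, window: int = 2) -> List[str]:
    """Collect word/POS features from positions i-window .. i+window."""
    feats = []
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(tokens):
            word, pos = tokens[j]
        else:
            # Positions outside the sentence get a hypothetical padding marker.
            word, pos = "<PAD>", "<PAD>"
        feats.append(f"w[{offset:+d}]={word}")
        feats.append(f"p[{offset:+d}]={pos}")
    return feats


def sentence_features(tokens: List[Token], window: int = 2,
                      backward: bool = False) -> List[Tuple[int, List[str]]]:
    """Visit tokens left-to-right (forward) or right-to-left (backward)."""
    if backward:
        order = range(len(tokens) - 1, -1, -1)
    else:
        order = range(len(tokens))
    return [(i, window_features(tokens, i, window)) for i in order]


def num_binary_classifiers(n_classes: int, method: str) -> int:
    """Number of binary SVMs needed to cover n_classes chunk labels."""
    if method == "pairwise":
        return n_classes * (n_classes - 1) // 2  # one per pair of classes
    if method == "one-vs-rest":
        return n_classes  # one per class
    raise ValueError(f"unknown method: {method}")


if __name__ == "__main__":
    sent = [("He", "PRP"), ("reckons", "VBZ"), ("the", "DT")]
    for i, feats in sentence_features(sent, window=1, backward=True):
        print(i, feats)
    print(num_binary_classifiers(5, "pairwise"))
    print(num_binary_classifiers(5, "one-vs-rest"))
```

In this sketch the direction only changes the order in which tokens are visited. Direction affects the result only when earlier tagging decisions are fed back in as features for later tokens, which the sketch does not model. The classifier count shows the trade-off between the two multi-class methods: pairwise trains more, but smaller, binary problems than one-vs-rest.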
Copyright Remarks
- open source