Metadata-Version: 2.4
Name: python-ucto
Version: 0.6.10
Summary: This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial a task as it appears to be. This binding makes the power of the ucto tokeniser available to Python. Ucto itself is a regular-expression-based, extensible, and advanced tokeniser written in C++ (https://languagemachines.github.io/ucto).
Home-page: https://github.com/proycon/python-ucto
Author: Maarten van Gompel
Author-email: proycon@anaproy.nl
License: GPL-3.0-only
Keywords: tokenizer tokenization tokeniser tokenisation nlp computational_linguistics ucto
Classifier: Development Status :: 5 - Production/Stable
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Programming Language :: Cython
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: POSIX
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Requires: ucto (>=0.36)
Requires-Dist: Cython
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: requires
Dynamic: requires-dist
Dynamic: summary
