CPT Word Lists

CPT Word Lists is a collection of tools for processing word
lists and text files. It supports Unicode and any encoding
available through the Java converters. Its main goal is
to create dictionaries for the other CPT programs,
but it can also be used as a completely independent program.
Features:
The set of operations over textual (plain text or HTML)
files includes:
- browsing and searching in any standard encoding, with
decomposition and bidi support;
- extracting words and calculating letter and word frequencies;
- flexible 'word' definition and filtering;
- changing the encoding or letter case, and transliterating;
- standard Unicode and custom normalizations;
- visual/logical order conversion for RTL scripts;
- simple spell checking and tagging.
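
As an illustration of the word extraction and frequency operations listed above, here is a minimal Java sketch. It is not CPT's own code: the class name `WordFrequencySketch`, the method `countWords`, and the particular 'word' definition (a run of letters with optional inner apostrophes or hyphens) are assumptions made for this example. It decodes bytes with a standard Java charset, applies Unicode NFC normalization, and counts lowercased words.

```java
import java.nio.charset.Charset;
import java.text.Normalizer;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordFrequencySketch {
    // Hypothetical 'word' definition: letters, optionally joined by ' or -.
    static final Pattern WORD = Pattern.compile("\\p{L}+(?:['-]\\p{L}+)*");

    // Decodes raw bytes in the named encoding and counts each distinct word.
    static Map<String, Integer> countWords(byte[] raw, String encoding) {
        String text = new String(raw, Charset.forName(encoding));
        // Compose characters so that equivalent spellings compare equal.
        text = Normalizer.normalize(text, Normalizer.Form.NFC);
        Map<String, Integer> freq = new TreeMap<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            freq.merge(m.group().toLowerCase(Locale.ROOT), 1, Integer::sum);
        }
        return freq;
    }

    public static void main(String[] args) {
        byte[] raw = "Caf\u00e9 caf\u00e9 tea".getBytes(Charset.forName("ISO-8859-1"));
        System.out.println(countWords(raw, "ISO-8859-1"));
    }
}
```

Changing the `WORD` pattern is the simplest way to experiment with a different 'word' definition, for example to admit digits.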
Operations on word lists (text or dictionary format)
include the above, plus:
- creating highly compressed dictionaries optionally
including tags, definitions and pictures;
- several types of sorting, including user-defined
order and alphabets (90 alphabets supplied);
- compare/add/delete functions over dictionaries;
- global assignment of tags and extracting
subsets via selected tags and word length;
- automatic or user-defined suffix packing;
- via user definitions: creating and expanding
munched lists, creating and filtering tagged lists,
translating tags in tagged lists, tagging;
- searching and extracting word patterns;
- creating inverted indexes;
- three levels of dictionary protection.
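
To make the idea of expanding a munched list concrete, here is a small Java sketch. It assumes the common "stem/FLAGS" entry layout used by ispell-style munched lists; the flag letters, the suffix table, and the names `MunchedListSketch` and `expand` are hypothetical and do not reflect CPT's actual rule format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class MunchedListSketch {
    // Hypothetical user-defined rules: flag letter -> suffix appended to the stem.
    static final Map<Character, String> SUFFIXES =
            Map.of('S', "s", 'D', "ed", 'G', "ing");

    // Expands one entry such as "walk/SDG" into the stem plus one form per known flag.
    static List<String> expand(String entry) {
        int slash = entry.indexOf('/');
        if (slash < 0) {
            return List.of(entry); // no flags: the entry is already a plain word
        }
        String stem = entry.substring(0, slash);
        List<String> forms = new ArrayList<>();
        forms.add(stem);
        for (char flag : entry.substring(slash + 1).toCharArray()) {
            String suffix = SUFFIXES.get(flag);
            if (suffix != null) { // unknown flags are skipped in this sketch
                forms.add(stem + suffix);
            }
        }
        return forms;
    }

    public static void main(String[] args) {
        System.out.println(expand("walk/SDG")); // prints [walk, walks, walked, walking]
    }
}
```

Creating a munched list is the reverse process: grouping word forms that share a stem and recording the flags that regenerate them.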

