Fast, consistent tokenization of natural language text
