
Pre-processing data using tokenization

Pre-processing converts raw text into a form that the learning algorithm can accept.

Tokenization is the process of dividing text into a set of meaningful pieces. These pieces are called tokens.
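As an illustration, here is a minimal sketch of tokenization in Python using the NLTK library; the sample sentence and the choice of the sent_tokenize, word_tokenize, and WordPunctTokenizer helpers are assumptions for this example rather than a prescription from the recipe text above.

# A minimal sketch of tokenization with NLTK (assumption: NLTK is installed,
# e.g. via pip install nltk).
import nltk
from nltk.tokenize import sent_tokenize, word_tokenize, WordPunctTokenizer

nltk.download('punkt')  # one-off download of the tokenizer models

text = "Are you curious about tokenization? Let's see how it works!"

# Sentence tokenization: split the text into individual sentences.
print(sent_tokenize(text))

# Word tokenization: split the text into individual words.
print(word_tokenize(text))

# WordPunct tokenization: additionally split punctuation into separate tokens.
print(WordPunctTokenizer().tokenize(text))

Each tokenizer yields a different set of tokens from the same input: sentence tokenization returns whole sentences, word tokenization returns words with punctuation attached as separate tokens where appropriate, and the WordPunct tokenizer splits every run of punctuation from the adjacent words.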