Machine Learning for Developers

Feature engineering

Feature engineering is one of the most underrated parts of the machine learning process, even though many prominent figures in the community consider it the cornerstone of that process.

What's the purpose of this process? In short, it takes raw data from databases, sensors, archives, and so on, and transforms it so that the model can generalize more easily. The discipline draws its criteria from many sources, including common sense, and is more of an art than a rigid science. It is largely a manual process, although parts of it can be automated through the group of techniques known as feature extraction.
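As a small illustration of this transformation, the following sketch turns one raw record into features a model can use more directly. The record schema (`timestamp`, `price`, `area_m2`) and the function name `engineer_features` are hypothetical, chosen only for this example; the choice of derived features is the kind of common-sense judgment described above, not a fixed recipe.

```python
from datetime import datetime


def engineer_features(record):
    """Derive model-friendly features from one raw record.

    The input schema is an assumption made for this example:
    a timestamp string, a price, and an area in square meters.
    """
    ts = datetime.strptime(record["timestamp"], "%Y-%m-%d %H:%M:%S")
    return {
        # Hour of day often carries more signal than a raw timestamp.
        "hour": ts.hour,
        # weekday() returns 5 for Saturday and 6 for Sunday.
        "is_weekend": int(ts.weekday() >= 5),
        # A ratio can express what two raw columns only imply together.
        "price_per_m2": record["price"] / record["area_m2"],
    }


raw = {"timestamp": "2018-03-10 14:30:00", "price": 250000.0, "area_m2": 100.0}
features = engineer_features(raw)
print(features)
```

None of the derived values adds new information; each one restates what the raw record already contains in a form that is easier for a model to exploit.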

This process also draws on powerful mathematical tools and dimensionality reduction techniques, such as Principal Component Analysis (PCA) and autoencoders, which allow data scientists to drop or compress features that don't enrich the representation of the data in useful ways.
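To make the idea of dimensionality reduction concrete, here is a minimal sketch of PCA restricted to two-dimensional data, written with the standard library only. It covers PCA alone, not autoencoders, and the function name `pca_2d_to_1d` and the sample points are assumptions made for illustration. In practice a library implementation would normally be used instead.

```python
import math


def pca_2d_to_1d(points):
    """Project 2D points onto their first principal component.

    Returns the 1D coordinates, the unit direction vector, and the
    fraction of total variance that the single component retains.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are needed")

    # Center the data so the covariance is computed around the mean.
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix.
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    trace = sxx + syy
    largest = (trace + math.sqrt((sxx - syy) ** 2 + 4 * sxy ** 2)) / 2

    # Angle of the major axis gives the first principal direction.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))

    projected = [x * direction[0] + y * direction[1] for x, y in centered]
    explained = largest / trace if trace > 0 else 0.0
    return projected, direction, explained


# Two perfectly correlated features: the second is always twice the first.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
projected, direction, explained = pca_2d_to_1d(points)
print(projected, direction, explained)
```

Because the second feature in the sample data is an exact multiple of the first, one component retains essentially all of the variance, which is the situation in which a redundant feature can be dropped without losing information.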