Machine Learning for Developers

Last updated: 2021-07-02 15:47:29

Latest chapter: Summary

Cover page

Title Page

Credits

Foreword

About the Author

About the Reviewers

www.PacktPub.com

Why subscribe?

Customer Feedback

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

Introduction - Machine Learning and Statistical Science

Machine learning in the bigger picture

Types of machine learning

Grades of supervision

Supervised learning strategies - regression versus classification

Unsupervised problem solving - clustering

Tools of the trade - programming language and libraries

The Python language

The NumPy library

The matplotlib library

What's matplotlib?

Pandas

SciPy

Jupyter notebook

Basic mathematical concepts

Statistics - the basic pillar of modeling uncertainty

Descriptive statistics - main operations

Mean

Variance

Standard deviation

Probability and random variables

Events

Probability

Random variables and distributions

Useful probability distributions

Bernoulli distributions

Uniform distribution

Normal distribution

Logistic distribution

Statistical measures for probability functions

Skewness

Kurtosis

Differential calculus elements

Preliminary knowledge

In search of changes - derivatives

Sliding on the slope

Chain rule

Partial derivatives

Summary

The Learning Process

Understanding the problem

Dataset definition and retrieval

The ETL process

Loading datasets and doing exploratory analysis with SciPy and pandas

Working interactively with IPython

Working on 2D data

Feature engineering

Imputation of missing data

One-hot encoding

Dataset preprocessing

Normalization and feature scaling

Normalization or standardization

Model definition

Asking ourselves the right questions

Loss function definition

Model fitting and evaluation

Dataset partitioning

Common training terms - iteration, batch, and epoch

Types of training - online and batch processing

Parameter initialization

Model implementation and results interpretation

Regression metrics

Mean absolute error

Median absolute error

Mean squared error

Classification metrics

Accuracy

Precision score, recall, and F-measure

Confusion matrix

Clustering quality measurements

Silhouette coefficient

Homogeneity, completeness, and V-measure

Summary

References

Clustering

Grouping as a human activity

Automating the clustering process

Finding a common center - K-means