Building Machine Learning Systems with Python

Learning SciPy

On top of NumPy's efficient data structures, SciPy offers a multitude of algorithms for working on those arrays. Whatever numerically heavy algorithm you take from current books on numerical recipes, you will most likely find support for it in SciPy in one way or another. Whether it is matrix manipulation, linear algebra, optimization, clustering, spatial operations, or even fast Fourier transforms, the toolbox is well stocked. It is therefore a good habit to inspect the scipy module before you start implementing a numerical algorithm yourself.
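As a minimal sketch of what "inspecting the scipy module" can look like in practice, the following lists the public subpackages of whichever SciPy version is installed. It relies only on the standard library's pkgutil module; the variable name subpackages is ours, and the exact list depends on your SciPy release.

```python
import pkgutil

import scipy

# Walk the installed scipy package directory and collect the names of its
# public subpackages. Nothing is imported here; only names are listed.
subpackages = sorted(
    info.name
    for info in pkgutil.iter_modules(scipy.__path__)
    if info.ispkg and not info.name.startswith("_")
)

print(subpackages)
```

Once a promising subpackage turns up, calling help() on it (for example, help(scipy.optimize) after importing scipy.optimize) shows its documentation and the routines it provides.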

For convenience, the complete NumPy namespace is also accessible via SciPy, so from now on we will use NumPy's machinery through the SciPy namespace. You can easily check this by comparing the function references of any base function, as follows:

>>> import scipy, numpy
>>> scipy.version.full_version
'1.0.0'

>>> scipy.dot is numpy.dot
True
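The same identity check can be wrapped in a small helper so that it can be tried on several names at once. This is only a sketch: is_numpy_alias is a hypothetical helper name, not part of either library. Later SciPy releases deprecated these top-level aliases, so the result depends on the installed version; the helper therefore treats a missing name as "not an alias" instead of raising an error.

```python
import numpy
import scipy


def is_numpy_alias(name):
    """Return True if scipy.<name> exists and is the very same object as numpy.<name>."""
    # getattr with a default avoids an AttributeError when the name is absent.
    scipy_obj = getattr(scipy, name, None)
    numpy_obj = getattr(numpy, name, None)
    return scipy_obj is not None and scipy_obj is numpy_obj


# Try a few base functions; the printed values depend on the SciPy version.
for name in ("dot", "array", "mean"):
    print(name, is_numpy_alias(name))
```

If the check prints False on your installation, import the function from NumPy directly; the algorithms in the SciPy subpackages are unaffected.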

The diverse algorithms are grouped into the following toolboxes:

   
          
SciPy package    Functionality
cluster          Hierarchical clustering (cluster.hierarchy); vector quantization/K-means (cluster.vq)
constants        Physical and mathematical constants; conversion methods
fftpack          Discrete Fourier transform algorithms
integrate        Integration routines
interpolate      Interpolation (linear, cubic, and so on)
io               Data input and output
linalg           Linear algebra routines using the optimized BLAS and LAPACK libraries
ndimage          n-dimensional image package
odr              Orthogonal distance regression
optimize         Optimization (finding minima and roots)
signal           Signal processing
sparse           Sparse matrices
spatial          Spatial data structures and algorithms
special          Special mathematical functions, such as Bessel or Jacobian elliptic functions
stats            Statistics toolkit
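To give a feel for how these toolboxes are used, here is a short sketch that touches two of them: scipy.linalg to solve a small linear system and scipy.optimize to minimize a one-dimensional function. The matrix, the right-hand side, and the quadratic are made-up illustration values, not data from this book.

```python
import numpy as np
from scipy import linalg, optimize

# Solve the linear system A @ x = b with the LAPACK-backed solver.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)
print(x)

# Find the minimum of a simple quadratic whose minimum lies at t = 2.
result = optimize.minimize_scalar(lambda t: (t - 2.0) ** 2)
print(result.x)
```

Both calls accept and return plain NumPy arrays or scalars, which is why the toolboxes combine so easily with one another.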

 

The toolboxes most pertinent to our goals are scipy.stats, scipy.interpolate, scipy.cluster, and scipy.signal. For the sake of brevity, we will explore only some features of the stats package here and explain the others as they come up in the individual chapters.