Python Machine Learning Blueprints

map

We'll begin with the map function. The map function works on a Series, so in our case we will use it to transform a column of our DataFrame, which, as you will recall, is just a pandas Series. Suppose we decide that the species numbers are not suitable for our needs. We'll use the map function with a Python dictionary as the argument to replace them, passing in a replacement for each of the unique iris types.
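A minimal sketch of such a mapping follows. It assumes the species column holds the integer codes 0, 1, and 2; the tiny stand-in DataFrame, the measurement column name, and the three-letter replacement strings are illustrative assumptions rather than values taken from the text.

```python
import pandas as pd

# Small stand-in for the iris DataFrame; the numeric species codes are assumed.
df = pd.DataFrame({
    "sepal length (cm)": [5.1, 7.0, 6.3],
    "species": [0, 1, 2],
})

# Hypothetical replacement codes, one entry per unique species value.
species_codes = {0: "SET", 1: "VER", 2: "VIR"}

# map looks up each value in the dictionary and returns a new Series,
# which we assign back to the same column name.
df["species"] = df["species"].map(species_codes)

print(df)
```

Because the result is assigned back to df["species"], the numeric codes are overwritten by the string codes.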

Let's look at what we have done here. We have run the map function over each of the values of the existing species column. Each value was looked up in the Python dictionary, and its replacement was added to the return series; a value with no matching key would have come back as NaN. We assigned this return series to the same species name, so it replaced our original species column. Had we chosen a different name, say short code, that column would have been appended to the DataFrame, and we would then have the original species column plus the new short code column.
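The sketch below illustrates assigning the result to a new column instead, and also shows what map does when a value has no dictionary entry: it yields NaN for that row. The data and the deliberately incomplete dictionary are assumptions made for illustration.

```python
import pandas as pd

# Stand-in species column with assumed integer codes.
df = pd.DataFrame({"species": [0, 1, 2, 1]})

# Hypothetical dictionary that deliberately omits the value 2.
partial_codes = {0: "SET", 1: "VER"}

# Assigning to a new name appends a column and leaves species untouched.
df["short code"] = df["species"].map(partial_codes)

print(df)
```

The row whose species value is 2 ends up with NaN in the short code column, which makes an incomplete dictionary easy to spot with isna().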

We could have instead passed the map function a Series or a function to perform this transformation on a column, but a function-based transformation is also available through the apply function, which we'll look at next. Dictionary lookup is what sets map apart from apply, and it is the most common reason to choose map over apply for a single-column transformation. Let's now take a look at the apply function.
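As a quick illustration of the argument types discussed above, the following sketch passes map a dictionary, a Series, and a function in turn, and then performs the function-based version with apply. The species values and codes are assumed for the example.

```python
import pandas as pd

# Assumed species values and hypothetical replacement codes.
species = pd.Series([0, 1, 2, 1], name="species")
codes = {0: "SET", 1: "VER", 2: "VIR"}

# 1. Dictionary: each value is used as a key.
by_dict = species.map(codes)

# 2. Series: each value is looked up in the Series index.
by_series = species.map(pd.Series(codes))

# 3. Function: called once per value.
by_function = species.map(lambda value: codes[value])

# The function form works the same way with apply.
by_apply = species.apply(lambda value: codes[value])

print(by_dict.tolist())
```

All four results contain the same values; only the dictionary and Series forms rely on lookup behavior that map provides.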