Python Machine Learning Blueprints
上QQ阅读APP看书,第一时间看更新

The Jupyter Notebook

There are a number of libraries that will make the data inspection process easier. The first is Jupyter Notebook with IPython (http://ipython.org/). This is a fully-fledged, interactive computing environment, and it is ideal for data exploration. Unlike most development environments, Jupyter Notebook is a web-based frontend (to the IPython kernel) that is divided into individual code blocks or cells. Cells can be run individually or all at once, depending on the need. This allows the developer to run a scenario, see the output, then step back through the code, make adjustments, and see the resulting changes—all without leaving the notebook. Here is a sample interaction in the Jupyter Notebook:

 Sample interaction in the Jupyter Notebook

You will notice that we have done a number of things here and have interacted with not only the IPython backend, but the terminal shell as well. Here, I have imported the Python os library and made a call to find the current working directory (cell #2), which you can see is the output below my input code cell. I then changed directories using the os library in cell #3, but stopped utilizing the os library and began using Linux-based commands in cell #4. This is done by adding the ! prepend to the cell. In cell #6, you can see that I was even able to save the shell output to a Python variable (file_two). This is a great feature that makes file operations a simple task.

Note that the results would obviously differ slightly on your machine, since this displays information on the user under which it runs.

Now, let's take a look at some simple data operations using the notebook. This will also be our first introduction to another indispensable library, pandas.