Using the latest features of Python 3
The last version of the Python 2.x branch, Python 2.7, was released in 2010. It will reach its end of life in 2020. The first version of the Python 3.x branch, Python 3.0, was released earlier, in 2008. The decade-long transition period between Python 2 and Python 3, which are slightly incompatible, has been somewhat chaotic.
Choosing between Python 2 (also known as Legacy Python) and Python 3 used to be tricky since many Python users had not transitioned to Python 3 yet, and many libraries were only compatible with Python 2. Those times are gone and it is now safe to stick with Python 3 in virtually all cases. The only exceptions are when you have to support old unmaintained libraries, or when your users cannot transition to Python 3 for whatever reason.
In addition to fixing the bugs and annoyances of Python 2 (for example, related to Unicode support), Python 3 brings many interesting features in terms of syntax, capabilities of the language, and new built-in libraries.
How to do it...
- In Python 3, print() is a function, whereas it was a statement in Python 2 (it was a historical oddity). This function may accept multiple arguments as well as a few options. Let's create a list:
>>> my_list = list(range(10))
We can print the my_list object:
>>> print(my_list)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
But we can also print the ten numbers in the list:
>>> print(*my_list)
0 1 2 3 4 5 6 7 8 9
Finally we can customize the separator and the end of the string to print:
>>> print(*my_list, sep=" + ", end=" = %d" % sum(my_list))
0 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 = 45
- Python 3 supports more advanced variable unpacking features:
>>> first, second, *rest, last = my_list
>>> print(first, second, last)
0 1 9
>>> rest
[2, 3, 4, 5, 6, 7, 8]
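As a further sketch of starred unpacking (the records below are made up for illustration), the starred target can also appear in a for loop, and since Python 3.5 the same star syntax can unpack several iterables into one list literal:

```python
# Hypothetical records: a name followed by a variable number of scores.
records = [("alice", 12, 15, 9), ("bob", 7), ("carol", 10, 11)]

# A starred target collects the remaining items into a list (possibly empty).
totals = {}
for name, *scores in records:
    totals[name] = sum(scores)

# Splitting a sequence into its head and its tail.
head, *tail = range(5)

# Since Python 3.5, several iterables can be unpacked into one list literal.
merged = [*range(3), *"ab"]

print(totals)      # {'alice': 36, 'bob': 7, 'carol': 21}
print(head, tail)  # 0 [1, 2, 3, 4]
print(merged)      # [0, 1, 2, 'a', 'b']
```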
- In Python 3, variable names can contain Unicode characters. This technique may be useful when writing mathematical code. To type mathematical symbols in the Jupyter Notebook, write LaTeX code and press Tab. For example, to create a variable π, type \pi and then Tab:
>>> from math import pi, cos
    α = 2
    π = pi
    cos(α * π)
1.000
- Python 3.6 brings new string literals called f-strings. They offer a convenient syntax to define strings depending on existing variables:
>>> a, b = 1, 2
    f"The sum of {a} and {b} is {a + b}"
'The sum of 1 and 2 is 3'
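An expression inside an f-string can be followed by a colon and a format specification, using the same mini-language as str.format(). The following sketch, with illustrative values only, shows rounding, padding, and a thousands separator:

```python
from math import pi

name, value = "pi", pi

# A format specification follows the colon, as with str.format().
rounded = f"{name} = {value:.3f}"

# Width and alignment: right-align the integer 42 in a field of 6 characters.
padded = f"[{42:>6d}]"

# Thousands separator.
grouped = f"{1234567:,}"

print(rounded)  # pi = 3.142
print(padded)   # [    42]
print(grouped)  # 1,234,567
```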
- We can add custom annotations to function arguments and output. These function annotations convey no semantics, but they can be used in the code as we like. Here is an example coming from https://stackoverflow.com/a/7811344/1595060:
>>> def kinetic_energy(mass: 'kg', velocity: 'm/s') -> 'J':
        """The annotations serve here as documentation."""
        return .5 * mass * velocity ** 2
These annotations are stored in the __annotations__ attribute of the function, and they can be used as follows:
>>> annotations = kinetic_energy.__annotations__
    print(*(f"{key} is in {value}"
            for key, value in annotations.items()),
          sep=", ")
mass is in kg, velocity is in m/s, return is in J
The typing module, which has been included in Python 3.5 on a provisional basis, implements several annotations that can be used to specify typing information in functions.
- Python 3.5 brings a new operator @ for matrix multiplication. It is supported by NumPy 1.10 and later:
>>> import numpy as np
    M = np.array([[0, 1], [1, 0]])
The * operator is the element-wise multiplication:
>>> M * M
array([[0, 1],
       [1, 0]])
Previously, matrix multiplication could be performed with np.dot(). The new syntax is clearer:
>>> M @ M
array([[1, 0],
       [0, 1]])
- Python 3.3 brings the new yield from syntax that allows you, among other things, to compose multiple generators. For example, the two following functions are equivalent:
>>> def gen1():
        for i in range(5):
            for j in range(i):
                yield j
>>> def gen2():
        for i in range(5):
            yield from range(i)
>>> list(gen1())
[0, 0, 1, 0, 1, 2, 0, 1, 2, 3]
>>> list(gen2())
[0, 0, 1, 0, 1, 2, 0, 1, 2, 3]
- The functools module provides a @lru_cache decorator to implement a simple in-memory caching system for Python functions:
>>> import time
    def f1(x):
        time.sleep(1)
        return x
>>> %timeit -n1 -r1 f1(0)
1 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
>>> %timeit -n1 -r1 f1(0)
1 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
Here, the two successive identical calls to f1(0) each take one second. Now, let's define a cached version of this function:
>>> from functools import lru_cache
    @lru_cache(maxsize=32)  # keep the 32 most recently used results
    def f2(x):
        time.sleep(1)
        return x
>>> %timeit -n1 -r1 f2(0)
1 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
>>> %timeit -n1 -r1 f2(0)
6.14 µs ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
The first call takes one second, whereas the next one returns immediately. In the second case, the function is not called; instead, the output corresponding to the argument 0 is retrieved from the cache and returned.
- The new pathlib module offers filesystem facilities that are more convenient to use than the os.path functions of Python 2. The main class is Path:
>>> from pathlib import Path
We instantiate a Path object representing the current directory:
>>> p = Path('.')
Let's list all Markdown files in the directory:
>>> sorted(p.glob('*.md'))
[PosixPath('00_intro.md'),
 PosixPath('01_py3.md'),
 PosixPath('02_workflows.md'),
 PosixPath('03_git.md'),
 PosixPath('04_git_advanced.md'),
 PosixPath('05_tips.md'),
 PosixPath('06_high_quality.md'),
 PosixPath('07_test.md'),
 PosixPath('08_debugging.md')]
We can easily get the contents of a text file:
>>> _[0].read_text()
'# Introduction\n\n...\n'
Let's obtain the list of subdirectories:
>>> [d for d in p.iterdir() if d.is_dir()]
[PosixPath('images'),
 PosixPath('.ipynb_checkpoints'),
 PosixPath('__pycache__'),
Finally, we list all files in the images subfolder (note the slash / operator on Path instances):
>>> list((p / 'images').iterdir())
[PosixPath('images/github_new.png'),
 PosixPath('images/folder.png')]
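Beyond listing directories, Path objects can create directories, write files, and manipulate file names. The following sketch works in a temporary directory so that it can run anywhere; the docs folder and notes.md file are hypothetical names chosen for the example:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # The / operator joins path components.
    notes = root / "docs" / "notes.md"

    # Create the parent directory, then write and read back a text file.
    notes.parent.mkdir(parents=True, exist_ok=True)
    notes.write_text("# Notes\n")
    content = notes.read_text()

    # Pure path manipulations do not touch the filesystem.
    parts = (notes.name, notes.stem, notes.suffix)
    html_name = notes.with_suffix(".html").name

    # rglob() searches subdirectories recursively.
    found = [f.relative_to(root).as_posix() for f in root.rglob("*.md")]

print(repr(content))  # '# Notes\n'
print(parts)          # ('notes.md', 'notes', '.md')
print(html_name)      # notes.html
print(found)          # ['docs/notes.md']
```

The temporary directory and its contents are deleted when the with block exits, which is why the results are collected into plain variables first.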
- Python 3.4 provides a new statistics module which implements basic statistical routines. This module may be useful when depending on NumPy or SciPy is not desirable. Let's import the module:
>>> import random as r
    import statistics as st
We create a list of normally-distributed random variables:
>>> my_list = [r.normalvariate(0, 1) for _ in range(100000)]
We compute the mean, median, and standard deviation:
>>> print(st.mean(my_list),
          st.median(my_list),
          st.stdev(my_list),
          )
0.00073 -0.00052 1.00050
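Since the values above are random, the results change from run to run. As a deterministic sketch, here are the same routines on a small hand-made data set (chosen only for illustration) whose statistics can be checked by hand. Note that stdev() computes the sample standard deviation, whereas pstdev() computes the population standard deviation:

```python
import statistics as st

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = st.mean(data)      # sum is 40, so the mean is 5
median = st.median(data)  # average of the two middle values, 4 and 5
mode = st.mode(data)      # 4 is the most frequent value
pstdev = st.pstdev(data)  # population: sqrt(32 / 8) = 2
stdev = st.stdev(data)    # sample: sqrt(32 / 7), about 2.14

print(mean, median, mode, pstdev, stdev)
```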
There's more...
Other interesting features of Python 3 include coroutines with the asyncio module and asynchronous operations with the new await and async keywords.
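As a minimal sketch of these keywords, the following example runs two coroutines concurrently. The fetch() coroutine is a hypothetical stand-in for an I/O-bound task, and asyncio.run() is assumed to be available, which requires Python 3.7 or later:

```python
import asyncio


async def fetch(name, delay):
    """Hypothetical task: pretend to wait for I/O, then return a result."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # Run both coroutines concurrently; results come back in argument order,
    # regardless of which coroutine finishes first.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


results = asyncio.run(main())  # asyncio.run() requires Python 3.7 or later
print(results)  # ['a done', 'b done']
```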
Here are a few references:
- What's New In Python 3.6? at https://docs.python.org/3/whatsnew/3.6.html
- f-strings, at https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- The yield from syntax, at https://docs.python.org/3/whatsnew/3.3.html#pep-380
- functools, at https://docs.python.org/3/library/functools.html
- pathlib, at https://docs.python.org/3/library/pathlib.html
- statistics, at https://docs.python.org/3/library/statistics.html
- 10 awesome features of Python that you can't use because you refuse to upgrade to Python 3, at http://www.asmeurer.com/python3-presentation/slides.html
- Python 3 for Scientists, at http://python-3-for-scientists.readthedocs.io/en/latest/
- Cool New Features in Python 3.6, at https://www.youtube.com/watch?v=klKdMxjDaa0
- Python Cookbook, 3rd Edition, Brian Jones and David Beazley, O'Reilly Media, at http://shop.oreilly.com/product/0636920027072.do
- Find the best Python books, at http://pythonbooks.org/
- Buggy Python Code: The 10 Most Common Mistakes That Python Developers Make, at https://www.toptal.com/python/top-10-mistakes-that-python-programmers-make
- Python 3 statement, to promote the deprecation of Python 2 support by 2020, at http://www.python3statement.org