My Notes of the Readability Counts PyCon 2017 Talk by Trey Hunner

Tina Bu
Dec 10, 2019
Readability Counts by Trey Hunner at PyCon 2017

These are my notes on Trey Hunner’s talk on Python readability from PyCon 2017. The slides are available here, but I think the content here will be easier to consume because it includes many before-and-after comparison examples. All credit goes to @treyhunner, and a big shoutout to his Python Morsels training.

In this 30-minute talk, Trey talked about how to make your code easier to read, including better use of whitespace and line breaks, code structure, naming concepts, choosing the right construct, and more.

First, why is readability important?

  • code is more often read than written
  • easier to change or maintain your code
  • easier on-boarding
  • PEP 8 is only a starting point for your project’s style guide

1. Whitespace, Line Breaks, and Code Structure

1.1 Line Length

Line length is the number of characters in one line of text. It used to be a technical limitation (small screens) but now it is a human one. Longer lines are generally harder to read so use line breaks to improve readability and do not rely on automatic line wrapping.

Also, not all code with short lines is easy to read. When you think about wrapping lines, focus on readability instead of purely reducing the line length. Short lines are not the end goal; readability is. (Trey made an analogy to poets, who break their lines with purpose, not by line length.)

not readable: random line wrapping

before: random line wrapping

readable: use line breaks to distinguish logic blocks

after: line wrapped according to logic blocks
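Here is a small sketch of my own to show the difference (it is not the code from Trey's slides, and the `employees` data is made up for illustration). Both versions build the same list; only the line breaks differ.

```python
# Hypothetical data, used only for illustration.
employees = [
    {"name": "Ada", "department": "Engineering"},
    {"name": "Grace", "department": "Engineering"},
    {"name": "Linus", "department": "Support"},
]

# Wrapped wherever the line happened to reach the width limit.
names_wrapped_by_width = [employee["name"].upper() for employee in
                          employees if employee["department"] ==
                          "Engineering"]

# Wrapped so that each line holds one logical part:
# the result expression, the loop, and the filter.
names_wrapped_by_logic = [
    employee["name"].upper()
    for employee in employees
    if employee["department"] == "Engineering"
]
```

The second version is one line longer, but each line can be read on its own.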

1.2 Regular Expressions

Regular expressions are code themselves and can be very hard to read. Trey recommends always turning on verbose mode, which lets you wrap your regular expressions over multiple lines and insert comments (and whitespace).

not readable: regular expression all in one line

before: regular expression in one line

readable: regular expression with whitespace and comments (I don’t think regex will ever be readable for me tho)

after: use VERBOSE mode to add comments to your regular expression
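As a rough illustration (my own example, not the pattern from the talk), here is the same date pattern written on one line and then with `re.VERBOSE`. In verbose mode, whitespace inside the pattern is ignored and `#` starts a comment.

```python
import re

# One-line pattern: matches a date such as 2017-05-19.
date_one_line = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")

# The same pattern in verbose mode, one piece per line.
date_verbose = re.compile(r"""
    ^
    (\d{4})     # year, four digits
    -
    (\d{2})     # month, two digits
    -
    (\d{2})     # day, two digits
    $
""", re.VERBOSE)
```

Both compiled patterns accept and reject the same strings.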

1.3 Function Calls

Say we are trying to create a Django model with a bunch of arguments passed to it. There are multiple styles (I used to be #2 but changed to #3 recently, which one are you?). Trey recommends #3 because #1 can be difficult to read and #2 is tricky if your lines are really long.

style 1: insert line breaks when the line reaches the max width
style 2: align all parameters
style 3: new line for each parameter
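To make the three styles concrete, here is a sketch that uses a made-up `make_field` function as a stand-in for a model field constructor (the function and its arguments are my assumption, not the example from the slides).

```python
def make_field(max_length, default, blank, help_text):
    """Hypothetical stand-in for a model field constructor."""
    return {
        "max_length": max_length,
        "default": default,
        "blank": blank,
        "help_text": help_text,
    }


# Style 1: break only when the line reaches the maximum width.
title_1 = make_field(max_length=200, default="Untitled", blank=True,
    help_text="Shown at the top of the page")

# Style 2: align every argument with the opening parenthesis.
title_2 = make_field(max_length=200,
                     default="Untitled",
                     blank=True,
                     help_text="Shown at the top of the page")

# Style 3: one argument per line, indented one level.
title_3 = make_field(
    max_length=200,
    default="Untitled",
    blank=True,
    help_text="Shown at the top of the page")
```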

But even #3 can have different variations.

style 3.1: closing parenthesis separate line
style 3.2: function call: trailing comma for last parameter
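A quick sketch of the two variations, using the built-in `dict` so the example runs on its own (the argument names are only for illustration):

```python
# Style 3.1: closing parenthesis on its own line.
options_a = dict(
    max_length=200,
    blank=True
)

# Style 3.2: also put a trailing comma after the last argument,
# so adding another argument later touches only one line in a diff.
options_b = dict(
    max_length=200,
    blank=True,
)
```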

You can choose whichever style you prefer, but Trey recommends that every project have a documented style guide to enforce consistency.

2. Naming

Important concepts deserve names. Naming things is hard because describing things is hard.

2.1 Variable Names

A name should fully and accurately describe its subject. Don’t be afraid to use long variable names. Optimize for maximum accuracy and completeness, not for short names.

Bad variable names

bad naming: two-letter variable names that no one understands

Make it more descriptive with whole words

better naming: use descriptive names

The code still uses index access (i[0]), which can usually be replaced with tuple unpacking. Unpacking is preferred because it lets you give a name to every meaningful value.

even better: use tuple unpacking to avoid index access
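The three steps could look roughly like this. This is my own reconstruction with made-up data (`capital_rows`), not the exact code from the slides.

```python
# Hypothetical rows of (state, capital), used only for illustration.
capital_rows = [("Oregon", "Salem"), ("Texas", "Austin")]

# Bad naming: two-letter names and index access.
sc = {}
for i in capital_rows:
    sc[i[0]] = i[1]

# Better naming: whole words, but still index access.
state_capitals = {}
for row in capital_rows:
    state_capitals[row[0]] = row[1]

# Even better: tuple unpacking names each value in the row.
capitals_by_state = {}
for state, capital in capital_rows:
    capitals_by_state[state] = capital
```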

2.2 Missing Names

Sometimes bad naming exists in the form of no naming. Say we have this function with a too-long if clause:

function naming before: if clause too long and can be abstracted

To make this function more readable, abstract the if logic into its own function and give it a descriptive name.

To further optimize this code, we can remove code duplication in the is_anagram function by giving the variables a name.

function naming: remove code duplication by creating new variables

Make the return logic more descriptive with comments.

function naming: add comments to return clause

As soon as you realize you need comments to describe things, see if you can create a variable instead. In this example, we can turn the two conditions into boolean variables, which makes the logic explicit and easier to read.

function naming: name boolean conditions to make the logic more clear

The main idea here is we should always try to convey the intent of the code, not just the math logic. Let’s look at the final before and after comparison:

before:

after:
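Since the screenshots do not carry over as text, here is my reconstruction of the comparison. The function names follow the is_anagram example above, but treat the details as an approximation rather than the exact slide code.

```python
# Before: the if condition is long and has no name.
def detect_anagrams_before(word, candidates):
    anagrams = []
    for candidate in candidates:
        if (sorted(word.upper()) == sorted(candidate.upper())
                and word.upper() != candidate.upper()):
            anagrams.append(candidate)
    return anagrams


# After: the condition lives in a named function,
# and each half of the condition has its own name.
def is_anagram(word1, word2):
    word1, word2 = word1.upper(), word2.upper()
    are_different_words = word1 != word2
    have_same_letters = sorted(word1) == sorted(word2)
    return have_same_letters and are_different_words


def detect_anagrams(word, candidates):
    anagrams = []
    for candidate in candidates:
        if is_anagram(word, candidate):
            anagrams.append(candidate)
    return anagrams
```

Both versions return the same result; the second one just says what it is checking.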

2.3 More Readable Functions

Say you have a very complex Django method that requires detailed comments to make it understandable.

before: a big function

A better way is to abstract the details into helper functions with descriptive names.

after: use helper functions
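The idea can be sketched with a much smaller, made-up example (an order total instead of a Django method; every name here is hypothetical). Each commented step in the big function becomes a helper whose name replaces the comment.

```python
# Before: one function, with comments marking each step.
def order_total_before(prices, coupon_rate, tax_rate):
    # add up the item prices
    subtotal = 0
    for price in prices:
        subtotal += price
    # take off the coupon discount
    discounted = subtotal - subtotal * coupon_rate
    # add sales tax
    return discounted + discounted * tax_rate


# After: each commented step becomes a named helper.
def subtotal_of(prices):
    return sum(prices)


def apply_coupon(amount, coupon_rate):
    return amount - amount * coupon_rate


def add_sales_tax(amount, tax_rate):
    return amount + amount * tax_rate


def order_total(prices, coupon_rate, tax_rate):
    subtotal = subtotal_of(prices)
    discounted = apply_coupon(subtotal, coupon_rate)
    return add_sales_tax(discounted, tax_rate)
```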

Trey recommends reading your code out loud to ensure you are describing the intent of the function in detail. Using comments is good, but there are times that the first step towards readability is a missing or better variable name.

So far we have seen 2 examples of how creating names for unnamed concepts/code can improve readability. In general, we should aim at making our code self-documenting instead of relying on extensive comments.

3. Construct

3.1 Context Manager

(my biggest takeaway from this talk.)

Imagine you are writing a try-finally block for a database read attempt. We need this to ensure that we close the DB connection even if an exception happens.

But you can also use a context manager to handle the cleanup implicitly.
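Here is a minimal sketch of both approaches using `sqlite3` from the standard library. The helper `open_connection` and the query are my own assumptions for illustration, not the code from the talk.

```python
import sqlite3
from contextlib import contextmanager


def count_objects_with_try_finally(database_path):
    connection = sqlite3.connect(database_path)
    try:
        cursor = connection.execute("SELECT COUNT(*) FROM sqlite_master")
        return cursor.fetchone()[0]
    finally:
        connection.close()  # runs even if the query raises


@contextmanager
def open_connection(database_path):
    """Hypothetical helper: yields a connection and always closes it."""
    connection = sqlite3.connect(database_path)
    try:
        yield connection
    finally:
        connection.close()


def count_objects_with_context_manager(database_path):
    with open_connection(database_path) as connection:
        cursor = connection.execute("SELECT COUNT(*) FROM sqlite_master")
        return cursor.fetchone()[0]
```

The try-finally still exists, but it is written once inside the helper, and every caller only needs a `with` block.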

Special-purpose constructs can reduce complexity. When possible, use constructs with specific intent.

Of course, you can use the closing context manager from the standard library.
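For example, `contextlib.closing` wraps any object that has a `close()` method and calls it when the `with` block ends. A small sketch with `sqlite3` (the `table_names` function is my own example):

```python
import sqlite3
from contextlib import closing


def table_names(database_path):
    # closing() calls connection.close() when the block exits,
    # whether or not an exception was raised inside it.
    with closing(sqlite3.connect(database_path)) as connection:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    return [name for (name,) in rows]
```

Note that using a `sqlite3` connection directly in a `with` statement manages the transaction, not the closing, which is why `closing` is useful here.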

3.2 List Comprehension

Say you have a for loop that creates a list by looping over another list with some conditions.

You should always use a list comprehension in this case because it contains less unnecessary information. Think of it as transforming one list into another instead of looping.
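A minimal example of my own, with made-up numbers:

```python
numbers = [1, 2, 3, 4, 5, 6]

# for loop: create an empty list, loop, check, append.
doubled_evens_loop = []
for number in numbers:
    if number % 2 == 0:
        doubled_evens_loop.append(number * 2)

# List comprehension: the same transformation in one expression.
doubled_evens = [number * 2 for number in numbers if number % 2 == 0]
```

The comprehension drops the empty list and the `append` call, so what is left is the transformation and the condition.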

3.3 Operator Overloading

Say you created a class to represent items in a shopping cart, and most of its methods interact with the underlying dictionary object in some fancy customized way.

But if you take a close look, each method actually corresponds to some native operation of Python objects like dictionaries and lists.

cart.contains(item) vs. item in cart
cart.set(item, q) vs. cart[item] = q
cart.add(item, q) vs. cart[item] += q
cart.remove(item) vs. del cart[item]
cart.count vs. len(cart)
cart.is_empty vs. not cart

We can use the built-in Python operators so the user won’t need to learn our customized methods. The Pythonic way is to use dunder methods for operator overloading.
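Here is a sketch of how the cart could support those operators. The class is hypothetical and much simpler than a real cart; the comment next to each dunder method shows the syntax it enables.

```python
class ShoppingCart:
    """Hypothetical cart that maps each item to a quantity."""

    def __init__(self):
        self._quantities = {}

    def __contains__(self, item):           # item in cart
        return item in self._quantities

    def __getitem__(self, item):            # cart[item]
        return self._quantities[item]

    def __setitem__(self, item, quantity):  # cart[item] = quantity
        self._quantities[item] = quantity

    def __delitem__(self, item):            # del cart[item]
        del self._quantities[item]

    def __len__(self):                      # len(cart) and "not cart"
        return len(self._quantities)
```

In this sketch `cart[item] += q` works through `__getitem__` followed by `__setitem__`, so the item has to be in the cart already; otherwise it raises a `KeyError`.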

If the custom object you are creating can build on an existing base class, you should do that. For example, use collections.UserList to make a custom list, collections.UserDict to make a custom dictionary, and collections.UserString to make a custom string.
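For instance, a cart built on `collections.UserDict` gets `in`, `len()`, `del`, and item assignment for free. In this sketch of mine (the `CountingCart` name and the "missing items count as zero" rule are assumptions), only the custom behavior needs to be written.

```python
from collections import UserDict


class CountingCart(UserDict):
    """Hypothetical cart: a dictionary where missing items count as zero."""

    def __missing__(self, item):
        # Called by UserDict when the item is not in the cart.
        return 0
```

With this, `cart[item] += q` also works for items that are not in the cart yet.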

3.4 Shared Data

Say this is some code you wrote: one function returns a server object and the other three use it.

If you ever find yourself passing the same data to different functions, create a class. A class is meant to bundle together functionality and data.
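A rough sketch of the refactoring, with a plain dictionary standing in for the server object (all names here are made up for illustration and are not from the talk):

```python
# Before: every function takes the same `server` argument.
def connect():
    # Stand-in for a real connection: message id -> subject.
    return {1: "Hello", 2: "Invoice"}


def message_ids(server):
    return sorted(server)


def subject_of(server, message_id):
    return server[message_id]


def delete_message(server, message_id):
    del server[message_id]


# After: the class holds the server, so callers never pass it around.
class Mailbox:
    def __init__(self):
        self.server = connect()

    def message_ids(self):
        return sorted(self.server)

    def subject_of(self, message_id):
        return self.server[message_id]

    def delete_message(self, message_id):
        del self.server[message_id]
```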

To recap, in this section on using constructs, Trey recommended using a context manager instead of wrapping code in redundant try-finally/except blocks, using a list comprehension when making one list out of another, using operator overloading through dunder methods, and using classes to bundle code and data.

Readability checklist

  1. Can I modify line breaks to improve clarity?
  2. Can I create a variable name for unnamed code?
  3. Can I add a comment to improve clarity?
  4. Can I turn a comment into a better variable name?
  5. Can I use a more specific programming construct?
  6. Have I stated detailed preferences in a style guide?

Trey’s style guide

Who am I?

My name is Tina Bu and I write ML tutorials and productivity tips. Read my tips on AWS re:Invent.
