Lesson 6, Bit 5: Combining Data Structures

Dictionaries and Tuples

Dictionaries have a method called items that returns a list of tuples, where each tuple is a key-value pair.

Code Output
d = {'a':10, 'b':1, 'c':22}

t = list(d.items())

print(t)
[('a', 10), ('c', 22), ('b', 1)]

As you should expect from a dictionary, the items are in no particular order.

However, since the list of tuples is a list, and tuples are comparable, we can now sort the list of tuples. Converting a dictionary to a list of tuples is a way for us to output the contents of a dictionary sorted by key:

Code Output
d = {'a':10, 'b':1, 'c':22}

t = list(d.items())

print(t)
[('a', 10), ('c', 22), ('b', 1)]
t.sort()

print(t)
[('a', 10), ('b', 1), ('c', 22)]

The new list is sorted in ascending alphabetical order by the key value.

Launch Exercise

Multiple Assignment with Dictionaries

Combining items, tuple assignment, and for, you can see a nice code pattern for traversing the keys and values of a dictionary in a single loop:

for key, val in d.items():
    print val, key

This loop has two iteration variables because items returns a list of tuples and key, val is a tuple assignment that successively iterates through each of the key-value pairs in the dictionary.

For each iteration through the loop, both key and value are advanced to the next key-value pair in the dictionary (still in hash order).

The output of this loop is:

10 a
22 c
1 b

Again, it is in hash key order (i.e., no particular order).

If we combine these two techniques, we can print out the contents of a dictionary sorted by the value stored in each key-value pair.

To do this, we first make a list of tuples where each tuple is (value, key). The items method would give us a list of (key, value) tuples—but this time we want to sort by value, not key. Once we have constructed the list with the value-key tuples, it is a simple matter to sort the list in reverse order and print out the new, sorted list.

Code Output Notes
d = {'a':10, 'b':1, 'c':22} None

We create the dictionary d.

l = list() None

We create an empty list called l.

for key, val in d.items():
    l.append((val, key))
None

Iterate through the dictionary d.  Append to the list l a tuple which contains the value and key of each dictionary entry.

print(l) [(10, 'a'), (22, 'c'), (1, 'b')]

We display the resulting list of tuples.

l.sort(reverse=True) None

sort compares the first element (value) first, and only considers the second element to break ties. The keyword argument reverse=True tells sort to go in decreasing order.

print(l) [(22, 'c'), (10, 'a'), (1, 'b')]

We display the resulting, newly-sorted, list of tuples.

By carefully constructing the list of tuples to have the value as the first element of each tuple, we can sort the list of tuples and get our dictionary contents sorted by value.

Using Tuples as Keys in Dictionaries

Because tuples are hashable and lists are not, if we want to create a composite key to use in a dictionary we must use a tuple as the key.

We would encounter a composite key if we wanted to create a telephone directory that maps from last-name, first-name pairs to telephone numbers. Assuming that we have defined the variables last, first, and number, we could write a dictionary assignment statement as follows:

directory[last, first] = number

The expression in brackets is a tuple. We could use tuple assignment in a for loop to traverse this dictionary.

for last, first in directory:
    print(first, last, directory[last,first])

This loop traverses the keys in directory, which are tuples. It assigns the elements of each tuple to last and first, then prints the name and corresponding telephone number.

Keeping Them All Straight:

Quotes, parentheses, brackets, curly braces: these are important.  Here is a quick chart so you know which is used for which type:

Code Notes
a = ''

String

b = ()

Tuple

c = {}

Dictionary

d = []

List

When you access the individual elements within the sequence, you always use square bracket notation with the index or key:

Code Output
my_string = 'hello!'
print(my_string[0])
'h'
my_tuple = (1, 'two')
print(my_tuple[1])
two
my_dict = {'firstname':'Fred', 'lastname':'Weasley'}
print(my_dict['firstname'])
Fred
my_list = ['one', 2]
print(my_list[0])
one

Sequences: Strings, Lists, and Tuples – Oh my!

Several examples focused on lists of tuples, but almost all of the examples in this lesson also work with lists of lists, tuples of tuples, and tuples of lists. To avoid enumerating the possible combinations, it is sometimes easier to talk about sequences of sequences.

In many contexts, the different kinds of sequences (strings, lists, and tuples) can be used interchangeably. So how and why do you choose one over the others?

To start with the obvious, strings are more limited than other sequences because the elements have to be characters. They are also immutable. If you need the ability to change the characters in a string (as opposed to creating a new string), you might want to use a list of characters instead.

Lists are more common than tuples, mostly because they are mutable. But there are a few cases where you might prefer tuples:

  • In some contexts, like a return statement, it is syntactically simpler to create a tuple than a list. In other contexts, you might prefer a list.
  • If you want to use a sequence as a dictionary key, you have to use an immutable type like a tuple or string.
  • If you are passing a sequence as an argument to a function, using tuples reduces the potential for unexpected behavior due to aliasing.

Because tuples are immutable, they don't provide methods like sort and reverse, which modify existing lists. However Python provides the built-in functions sorted and reversed, which take any sequence as a parameter and return a new sequence with the same elements in a different order.

Debugging Data Structures

Lists, dictionaries and tuples are known generically as data structures; in this chapter we are starting to see compound data structures, like lists of tuples, and dictionaries that contain tuples as keys and lists as values. Compound data structures are useful, but they are prone to what I call shape errors; that is, errors caused when a data structure has the wrong type, size, or composition, or perhaps you write some code and forget the shape of your data and introduce an error.

For example, if you are expecting a list with one integer and I give you a plain old integer (not in a list), it won't work.

When you are debugging a program, and especially if you are working on a hard bug, there are four things to try:

reading: Examine your code, read it back to yourself, and check that it says what you meant to say.

running: Experiment by making changes and running different versions. Often if you display the right thing at the right place in the program, the problem becomes obvious, but sometimes you have to spend some time to build scaffolding.

ruminating: Take some time to think! What kind of error is it: syntax, runtime, semantic? What information can you get from the error messages, or from the output of the program? What kind of error could cause the problem you're seeing? What did you change last, before the problem appeared?

retreating: At some point, the best thing to do is back off, undoing recent changes, until you get back to a program that works and that you understand. Then you can start rebuilding.

Beginning programmers sometimes get stuck on one of these activities and forget the others. Each activity comes with its own failure mode.

For example, reading your code might help if the problem is a typographical error, but not if the problem is a conceptual misunderstanding. If you don't understand what your program does, you can read it 100 times and never see the error, because the error is in your head.

Running experiments can help, especially if you run small, simple tests. But if you run experiments without thinking or reading your code, you might fall into a pattern I call "random walk programming", which is the process of making random changes until the program does the right thing. Needless to say, random walk programming can take a long time.

You have to take time to think. Debugging is like an experimental science. You should have at least one hypothesis about what the problem is. If there are two or more possibilities, try to think of a test that would eliminate one of them.

Taking a break helps with the thinking. So does talking. If you explain the problem to someone else (or even to yourself), you will sometimes find the answer before you finish asking the question.

But even the best debugging techniques will fail if there are too many errors, or if the code you are trying to fix is too big and complicated. Sometimes the best option is to retreat, simplifying the program until you get to something that works and that you understand.

Beginning programmers are often reluctant to retreat because they can't stand to delete a line of code (even if it's wrong). If it makes you feel better, copy your program into another file before you start stripping it down. Then you can paste the pieces back in a little bit at a time.

Finding a hard bug requires reading, running, ruminating, and sometimes retreating. If you get stuck on one of these activities, try the others.