Lesson 5: String Manipulation and Error Handling
A string is a Sequence
A string is a sequence of characters. You can access the characters one at a time with the bracket operator:
fruit = 'banana'
letter = fruit[1]
The second statement extracts the character at index position 1 from the fruit variable and assigns it to the letter variable.
The expression in brackets is called an index. The index indicates which character in the sequence you want (hence the name).
But you might not get what you expect:
| Code | Result |
|---|---|
| fruit = 'banana' | |
| letter = fruit[1] | |
| print(letter) | a |
For most people, the first letter of 'banana' is b, not a. But in Python, the index is an offset from the beginning of the string, and the offset of the first letter is zero.
| Code | Result |
|---|---|
| fruit = 'banana' | |
| letter = fruit[0] | |
| print(letter) | b |
So b is the 0th letter ("zero-eth") of 'banana', a is the 1th letter ("one-eth"), and n is the 2th ("two-eth") letter. Here are all the indices of the characters in the word 'banana':
| Index | Character | Using the Bracket Operator |
|---|---|---|
| 0 | b | fruit[0] |
| 1 | a | fruit[1] |
| 2 | n | fruit[2] |
| 3 | a | fruit[3] |
| 4 | n | fruit[4] |
| 5 | a | fruit[5] |
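The same table can be generated with the bracket operator. The following is a minimal sketch; the helper name `index_table` is made up for illustration and is not part of the lesson.

```python
def index_table(text):
    """Return a list of (index, character) pairs for every position in text."""
    pairs = []
    for position in range(len(text)):
        # The bracket operator extracts the character at this offset.
        pairs.append((position, text[position]))
    return pairs


fruit = 'banana'
for position, character in index_table(fruit):
    print(position, character)
```

Running it prints one line per character, starting with `0 b` and ending with `5 a`.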
You can use any expression, including variables and operators, as an index, but the value of the index has to be an integer. Otherwise you get:
| Code | Result |
|---|---|
| fruit = 'banana' | |
| letter = fruit[1.5] | TypeError: string indices must be integers |
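As a rough illustration of both cases, the sketch below uses an integer expression as an index and then catches the TypeError raised by a non-integer index. The variable names `i` and `caught` and the value 1.5 are assumptions chosen for the example. The exact wording of the error message can vary slightly between Python versions.

```python
fruit = 'banana'
i = 1

# Any expression that evaluates to an integer is a valid index.
letter = fruit[i + 1]  # same as fruit[2]
print(letter)

# A float is not a valid index, so this raises TypeError.
try:
    letter = fruit[1.5]
except TypeError as error:
    caught = str(error)
    print('TypeError:', caught)
```

Because the failed assignment never completes, `letter` still holds the value from the earlier successful lookup.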
Getting the length of a String
len is a built-in function that returns the number of characters
in a string:
| Code | Result |
|---|---|
| fruit = 'banana' | |
| print(len(fruit)) | 6 |
To get the last letter of a string, you might be tempted to try something like this:
| Code | Result |
|---|---|
| fruit = 'banana' | |
| length = len(fruit) | |
| last = fruit[length] | IndexError: string index out of range |
The reason for the IndexError is that there is no letter in
'banana' with the index 6. Since we started counting at zero, the six letters
are numbered 0 to 5. To get the last character, you have to subtract 1 from
length:
| Code | Result |
|---|---|
| fruit = 'banana' | |
| length = len(fruit) | |
| last = fruit[length-1] | |
| print(last) | a |
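The two attempts can be combined into one short script. This is only a sketch of how the IndexError might be handled with try/except; the variable names `length` and `last` are assumptions for the example.

```python
fruit = 'banana'
length = len(fruit)

try:
    # Index 6 does not exist, because valid indices run from 0 to 5.
    last = fruit[length]
except IndexError as error:
    print('IndexError:', error)
    # Subtracting 1 gives the index of the last character.
    last = fruit[length - 1]

print(last)
```

The script first reports the out-of-range index and then prints `a`.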
Alternatively, you can use negative indices, which count backward from the
end of the string. The expression fruit[-1] yields the last letter,
fruit[-2] yields the second to last, and so on.
| Code | Result |
|---|---|
| fruit = 'banana' | |
| print(fruit[-1]) | a |
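One way to see how negative indices relate to the length of the string is to compare `fruit[-k]` with `fruit[length - k]` for every valid `k`. The sketch below does this; the names `k` and `all_match` are illustrative assumptions.

```python
fruit = 'banana'
length = len(fruit)

all_match = True
for k in range(1, length + 1):
    # fruit[-k] and fruit[length - k] should be the same character.
    if fruit[-k] != fruit[length - k]:
        all_match = False
    print(-k, length - k, fruit[-k])

print(all_match)
```

Each line shows a negative index, the equivalent non-negative index, and the character they share, from `-1 5 a` down to `-6 0 b`.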
Traversal through a string with a Loop
A lot of computations involve processing a string one character at a time. Often they start at the beginning, select each character in turn, do something to it, and continue until the end. This pattern of processing is called a traversal.
One way to write a traversal is with a while loop:
| Code | Output |
|---|---|
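| fruit = 'banana' | |
| index = 0 | |
| while index < len(fruit): | |
|     letter = fruit[index] | |
|     print(letter) | b, a, n, a, n, a (one letter per line) |
|     index = index + 1 | |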
This loop traverses the string and displays each letter on a line by itself.
The loop condition is index < len(fruit), so when
index is equal to the length of the string, the condition is false,
and the body of the loop is not executed. The last character accessed is the one
with the index len(fruit)-1, which is the last character in the
string.
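To check the claim about the last index, the traversal can be wrapped in a small function that records every index it visits. This is a sketch only; `visited_indices` is a hypothetical helper, not something defined in the lesson.

```python
def visited_indices(text):
    """Traverse text with a while loop and return the indices visited."""
    visited = []
    index = 0
    # The loop stops as soon as index equals len(text).
    while index < len(text):
        visited.append(index)
        index = index + 1
    return visited


fruit = 'banana'
indices = visited_indices(fruit)
print(indices)
print(indices[-1] == len(fruit) - 1)
```

For 'banana' the visited indices are 0 through 5, so the final comparison is True. For an empty string the condition is false on the first check and the loop body never runs.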
