Common Errors


We've learned almost everything you will need to know to get started writing code. But, let's be reasonable, we all make mistakes. In reality, the number of bug-free programs you're going to write in your programming lives will approach a very reasonable approximation of zero. You should assume your code will not run, or worse will run and then three months later turn out not to be reporting anything like what you thought it was. Debugging is the rule, not the exception.

A few commonly encountered sources of errors include:

  • Mis-typed variable names

  • Mis-typed variables (for example, a string where you expected a number)

  • Non-existent variables (or indices in a variable)

  • Edge effects on lists

  • Unexpected input


Some of these tend to cause Python to give up right away, while others will cause insidious bugs that sneak through unnoticed until you present your work in lab meeting and someone calls you on your exciting, but seemingly impossible (and ultimately bogus) result.
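
To make the second kind concrete, here is a small sketch of three bugs that Python will happily run without a word of complaint. All the values are made up for illustration, and 'mis-typed variables' is read here as variables holding the wrong type. Each print call takes a single argument, so the snippet should behave the same under Python 2 and Python 3.

```python
# Three bugs that run without any error message (all values are made up).

# 1) Mis-typed variable name: 'totl' silently becomes a brand-new variable.
counts = [3, 5, 2]
total = 0
for c in counts:
    totl = total + c      # typo: should be 'total'
print('total = %d (expected 10)' % total)

# 2) Wrong type: a number read from a file is still a string.
copies = '3'
doubled = copies * 2      # string repetition, not multiplication
print('doubled = %r (expected 6)' % doubled)

# 3) Edge effect: the range stops one short, so the last base is skipped.
seq = 'ATGC'
visited = []
for i in range(len(seq) - 1):
    visited.append(seq[i])
print('visited = %r (expected all four bases)' % visited)
```

None of these produces an error message. The only way to catch them is to look at the values.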

RESPONSE 1: An ounce of prevention


Every time you introduce a new piece of logic to your code, test it

Let me say this again (warning, this is a pet peeve of mine),

Every time you introduce a new piece of logic to your code, TEST IT

There are three main ways to accomplish this:
1) Each time you finish a few lines (3-5, at most) of code, print any variables you changed or created in that time.
2) Consider creating small exemplars of input datasets to benchmark your code.
3) Take a firm hand, and start explicitly setting variables to the value you want them to have (often [], '' or 0) just before you use them for the first time.

You will save yourself a ton of time and effort if you ensure your code is doing exactly what you think it is all the way through the program. It is far more difficult to debug 100 lines of code than 10, and far more difficult to debug 10 than 1. Writing a ton of code that generates output without checking if each component works individually and for the right reason does NOT make you a coding rock star--it makes you sloppy.
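
As a rough sketch of what this looks like in practice, here is a tiny function checked against inputs small enough to verify by hand. The function name gc_fraction and the test sequences are invented for this example; they are not part of the course code.

```python
def gc_fraction(seq):
    """Return the fraction of G and C characters in seq."""
    gc_count = 0                      # explicitly initialized before first use
    for base in seq:
        if base in 'GC':
            gc_count += 1
    if len(seq) == 0:                 # edge case: empty input
        return 0.0
    return float(gc_count) / len(seq)

# Tiny exemplar inputs whose answers can be worked out by hand.
for test_seq in ['GGCC', 'ATAT', 'ATGC', '']:
    print('%r -> %.2f' % (test_seq, gc_fraction(test_seq)))
```

If any of the printed values disagrees with what you worked out on paper, you have found the bug while the code is still only a few lines long.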

OK. Done with lecture.

RESPONSE 2: Get LOUD!


Regardless of where your problem lies, and regardless of when and how it messes up your script, the first thing you should do is start shouting.

Feel better?

Ok, now let's fix your code. Getting loud with your code means using print statements. The key is to get used to reporting the values that your variables have at the point things start going wrong. Often, you won't know where this is without doing a little debugging first, so you should approach debugging with a divide-and-conquer attitude. In other words, add lots of print statements in lots of places to narrow down where things are going wrong and then add additional print statements to figure out what exactly the problem is.
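
Here is a minimal sketch of getting loud, assuming a made-up list of lines standing in for a data file. The DEBUG label and the checkpoint numbering are just one convention, not a rule. Printing with %r is handy because it shows quotes and hidden characters such as '\n'.

```python
# Hypothetical data: one measurement per line, as it might come from a file.
lines = ['4.0\n', '6.0\n', '\n', '5.0\n']

values = []
for line in lines:
    stripped = line.strip()
    # Checkpoint 1: what does each line look like before and after stripping?
    print('DEBUG line=%r stripped=%r' % (line, stripped))
    if stripped == '':
        continue                      # skip blank lines
    values.append(float(stripped))

# Checkpoint 2: did we collect the values we expected?
print('DEBUG values=%r' % values)

mean = sum(values) / len(values)
# Checkpoint 3: is the final answer sensible?
print('DEBUG mean=%r' % mean)
```

If checkpoint 2 looks right but checkpoint 3 does not, the problem is in the lines between them, and that is where the next round of print statements goes.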

RESPONSE 3: K.I.S.S.


As you get better at coding, you will start to take shortcuts and combine lines. As soon as something doesn't behave as you expect, you should decompose your compound statements, as this is a common source of error.

Compound:
helix_aa.extend(range(int(line[21:25].strip()),int(line[33:37].strip())+1))

Expanded:
start_str = line[21:25]
end_str = line[33:37]
start_stripped = start_str.strip()
end_stripped = end_str.strip()
start = int(start_stripped)
end = int(end_stripped)+1
helix_res = range(start,end)
helix_aa.extend(helix_res)

These two code snippets do the same thing, and Python doesn't care which you use. But if you start with the former and something doesn't work, you should immediately switch to the latter so that you can print the value of each intermediate step. This is the easiest way to figure out which step is misbehaving.
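
A sketch of the expanded form with print statements added might look like this. The input line is fabricated: it is blank except for the two column ranges being sliced, so it is not a real PDB record. The list() call is there only so that the code behaves the same under Python 2 and Python 3.

```python
# Made-up fixed-width line: only the two sliced column ranges are filled in.
line = ' ' * 21 + '  12' + ' ' * 8 + '  25'

helix_aa = []
start_str = line[21:25]
end_str = line[33:37]
print('start_str=%r end_str=%r' % (start_str, end_str))

start = int(start_str.strip())
end = int(end_str.strip()) + 1        # +1 so the last residue is included
print('start=%d end=%d' % (start, end))

helix_res = list(range(start, end))
helix_aa.extend(helix_res)
print('first=%d last=%d count=%d' % (helix_aa[0], helix_aa[-1], len(helix_aa)))
```

If the slices were off by a column, the first print statement would show it immediately, long before int() had a chance to choke.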

Error messages from Python


The good news is that Python often barfs when we make a mistake.

###
ASIDE: Python chokes when you make a mistake in coding LANGUAGE--in other words, if it can't understand what you are asking it to do for you. It does NOT complain when you make a mistake in the logic of your program. You're on your own there, which is why the practices described above are really useful.
###

The first type of error that Python recognizes is a syntax error:
#!/usr/bin/python
 
while True print 'Bioinformatics is AWESOME!'
giving:
$./errors.py
  File "./errors.py", line 3
    while True print 'Bioinformatics is AWESOME!'
                   ^
SyntaxError: invalid syntax

Here, Python tells you the following:
1) The problem is on line 3 of the file
2) There is a problem between 'True' and 'print'

What is the problem and how is it fixable?

The second type of error covers everything else that could go wrong. You have already been exposed to a few of these error messages in our lecture on data structures.
TAs = ['Rose', 'Alan', 'Angela', 'Nathan']
print TAs[10]
gives:
$./errors.py
Traceback (most recent call last):
File "./errors.py", line 5, in <module>
print TAs[10]
IndexError: list index out of range

THIS MESSAGE IS NOT THAT SCARY!
READ IT OUT LOUD WITH ME!

Here, Python tells you the following:
1) The problem is on line 5
2) It seems to be choking on the expression 'print TAs[10]'
3) The type of error is an 'IndexError'

Here, the problem should now be obvious: we are trying to print something that does not exist. It can either be fixed by extending the list or modifying the print statement.
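
One way to modify the print statement, sketched here as an illustration, is to check the length of the list before indexing into it. The variable names index and chosen are invented for this example.

```python
TAs = ['Rose', 'Alan', 'Angela', 'Nathan']
index = 10

print('The list has %d items; valid indices run from 0 to %d'
      % (len(TAs), len(TAs) - 1))

# Only index into the list if the position actually exists.
if 0 <= index < len(TAs):
    chosen = TAs[index]
    print(chosen)
else:
    chosen = None
    print('Index %d is out of range' % index)
```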

Python treats the ability to identify errors as a useful tool, which is why these are called exceptions rather than errors. There are almost 50 different built-in exceptions in the current release of Python. The good news is that the exception message includes a short description of the error, so you don't need to remember them all!

Alright. It's cool that Python can identify when and (in most cases) where something goes wrong and then gives us some useful information on how to fix things. However, it's kind of lame that the program always quits after an error message. What if I want it to keep going?

Try and Except

Try/except asks Python to first give something a go. If there is no error, all is well and the code continues. However, if an error occurs, Python does something reasonable that you have specified.

For example, when we open files, we might want to check if they are actually there first.
f1 = open('CharlyCat.txt', 'r')
giving:
$./errors.py
Traceback (most recent call last):
File "./errors.py", line 9, in <module>
f1 = open('CharlyCat.txt', 'r')
IOError: [Errno 2] No such file or directory: 'CharlyCat.txt'

We can use a try/except block here to do something more informative:
try:
    f1 = open('CharlyCat.txt', 'r')
except IOError:
    print 'CharlyCat.txt is not available.',
    print 'He is climbing my curtains.'
 
print 'BAD CAT'
giving:
$./errors.py
CharlyCat.txt is not available. He is climbing my curtains.
BAD CAT
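
The same idea can be wrapped in a small helper so the rest of the script has something sensible to work with when the file is missing. This is only a sketch: the function name read_lines and the choice to return an empty list are assumptions made for this example. It catches IOError as above, which should also work under Python 3.

```python
def read_lines(filename):
    """Return the lines of filename, or an empty list if it cannot be opened."""
    try:
        f1 = open(filename, 'r')
    except IOError:
        # The file is missing or unreadable: report it and carry on.
        print('%s is not available.' % filename)
        return []
    lines = f1.readlines()
    f1.close()
    return lines

cat_lines = read_lines('CharlyCat.txt')
print('Read %d lines' % len(cat_lines))
```

Returning an empty list means any later loop over the lines simply does nothing, rather than crashing on a file that was never opened.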

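You can account for multiple exceptions in the same try statement in two ways: write a separate except clause for each one, or list them as a tuple in a single except clause like this: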
except (RuntimeError, TypeError, NameError):
    print 'Now he is sleeping on his scratching post'
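
Both forms are sketched here side by side. The function names to_number and describe are made up for illustration. The tuple form fits when every exception gets the same treatment; separate clauses fit when each needs its own response.

```python
def to_number(value):
    """Convert value to a float, returning None if that is not possible."""
    # Way 1: one except clause with a tuple of exception types.
    try:
        return float(value)
    except (TypeError, ValueError):
        return None

def describe(value):
    """Say why value could not be converted to a float."""
    # Way 2: a separate except clause for each exception type.
    try:
        float(value)
    except TypeError:
        return 'wrong type'
    except ValueError:
        return 'not a number'
    return 'ok'

for item in ['3.5', 'abc', None]:
    print('%r -> %r (%s)' % (item, to_number(item), describe(item)))
```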
 

Finally

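The finally clause runs 'on the way out'. If no error occurs, Python runs the try block, then the finally block, and then moves on. If an exception occurs and an except clause handles it, Python runs that except clause, then the finally block, and then moves on. If an exception occurs that nothing handles, Python runs the finally block and then raises the exception.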
def divide(x,y):
    try:
        result = x/y
    except ZeroDivisionError:
        print "divide by zero!"
    else:
        print "result is", result
    finally:
        print "executing final clause"
 
divide(2,1)
print
divide(2,0)
print
divide('Terry','CharlyCat')
giving:
$./errors.py
result is 2
executing final clause

divide by zero!
executing final clause

executing final clause
Traceback (most recent call last):
File "./errors.py", line 31, in <module>
divide('Terry','CharlyCat')
File "./errors.py", line 19, in divide
result = x/y
TypeError: unsupported operand type(s) for /: 'str' and 'str'

In the first case, we make it through the try block just fine.

In the second case, we throw and catch an exception, deal with it gracefully, and move along our merry way.

In the third case, we have found an exception that we are not handling. Here, we get to the finally clause and then print the exception message. How might we address this problem?
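
One possible answer, sketched under the assumption that you simply want to report the problem and keep going, is to add a second except clause for TypeError. The return value and the initial result = None are additions for this sketch and are not part of the original function.

```python
def divide(x, y):
    result = None                     # default if the division fails
    try:
        result = x / y
    except ZeroDivisionError:
        print('divide by zero!')
    except TypeError:
        # Handles the case that previously crashed the script.
        print('cannot divide %r by %r' % (x, y))
    else:
        print('result is %s' % result)
    finally:
        print('executing final clause')
    return result

divide(2, 1)
divide(2, 0)
divide('Terry', 'CharlyCat')
```

With this change all three calls run to completion, and the finally clause still executes every time.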

Any questions?

Exercises

1) Exception handling we have known.
In lesson 2.1, we learned about two functions that can be applied to remove an item from a set: remove() and discard(). We looked at the functions using the following example:
#!/bin/python
 
list_of_letters = ['a', 'a', 'b', 'c','c','c','d','e']
print 'ORIGINAL'
set_of_letters = set(list_of_letters)
print set_of_letters
 
print 'DISCARD'
set_of_letters.discard('q')
print set_of_letters
 
print 'REMOVE'
set_of_letters.remove('q')
print set_of_letters
 
a) Now that we've learned more about exception handling, explain what is happening here.
b) Create a script that contains a list of 5 of your favorite beers or wines or soft drinks (depending on your preference) for a tasting party stored as a set. Each time someone drinks a beverage, it is removed from your fridge and cannot be drunk again. Matt drinks a beverage. Adjust the set as appropriate. Rich sees Matt drinking his beverage and wants the same one. Tell him you're out of that beverage.

2) Improving the teacher's code.
Go back and look at the code you used to do exercise 3.2-3 Bonus (Doing something interesting). Now that you have learned about file and string processing, you should be able to understand the wrapper script that was supplied to help you parse the pdb files.
a) The code is very poorly commented. Figure out what is going on at each step and add comments.
b) There are two locations where exception handling is applied. Why is the exception handling necessary? The implementation is very sloppy. Can you rewrite the exception handling? Can you rewrite the code again to avoid exception handling altogether? (HINT: PDB files are formatted according to characters, not whitespace. Go back to the documentation cited in exercise 3.2-2 for additional information on PDB formatting.)

3) All code is bug-free until your first user.
You have another coworker who heard about your AMAZING secondary structure analysis code. She asks if you will analyze her protein, interleukin-19, as well (HINT: use PDB code 1N1F). Crud! This protein breaks your code. Why? Rewrite your code to work on both interleukin-19 and on the original H1N1 neuraminidase example.