Exceptions

Reference: Exceptions in Python tutorial

Exceptions in Python (and other languages) are used to handle situations during program execution when something goes wrong (exceptional cases) in a controlled way.

Let’s start with a motivating example from MP1, finding complementary base pairs:

def complement(c):
    if c == 'A':
        return 'T'
    if c == 'T':
        return 'A'
    if c == 'C':
        return 'G'
    if c == 'G':
        return 'C'

Remember that functions return None if they reach then end without hitting a return statement, so this is equivalent to:

def complement(c):
    if c == 'A':
        return 'T'
    if c == 'T':
        return 'A'
    if c == 'C':
        return 'G'
    if c == 'G':
        return 'C'
    return None

complement_seq is a client of complement (that is to say, it calls complement):

def complement_seq(dna_seq):
    return ''.join(complement(b) for b in reversed(dna_seq))

This compact form is a “generator expression”, which you can read about in the “Goodies” chapter of Think Python. Let’s unpack it to make debugging easier:

def complement_seq(dna_seq):
    result = ''
    for b in reversed(dna_seq):
        c = complement(b)
        result += c
    return result

Passing an invalid argument to complement_seq passes an invalid argument to complement, which raises an exception. The exception is downstream from the call to complement, and has an unrevealing name and message. This makes this difficult to debug.

complement_seq('CAXT')

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-5-37d3d1313b61> in <module>
----> 1 complement_seq('CAXT')


<ipython-input-4-8f9fc6967dcc> in complement_seq(dna_seq)
      3     for b in reversed(dna_seq):
      4         c = complement(b)
----> 5         result += c
      6     return result


TypeError: can only concatenate str (not "NoneType") to str

Return-value-as-error

One technique (frowned on in Python) is to represent an error by an “out-of-band” value. “Out-of-band” means not in the set of valid return values for the function.

def complement(c):
    if c == 'A':
        return 'T'
    if c == 'T':
        return 'A'
    if c == 'C':
        return 'G'
    if c == 'G':
        return 'C'
    return 'error'

complement callers need to know about this. If they don’t know how to recover from the error, they should return an out-of-band value too. Then their callers need to follow this convention as well.

def complement_seq(dna_seq):
    result = ''
    for b in dna_seq[::-1]:
        c = complement(b)
        if c == 'error':
            return 'error'
        result += c
    return result

def function_that_uses_complement_seq():
    # do some stuff that computes dna_seq
    # ...
    comp_seq = complement_seq(dna_seq)
    if comp_seq == 'error':
        return 'error'
    # now the case where comp_seq didn't return an error

And so on, all the way up the call stack. This is getting to be a mess - lots of repeated code and opportunities to make a mistake. Let’s see how we can do better.

Exceptions

Rather than returning a value intended to indicate an error, we can raise an exception.

def complement(c):
    if c == 'A':
        return 'T'
    if c == 'T':
        return 'A'
    if c == 'C':
        return 'G'
    if c == 'G':
        return 'C'
    raise Exception('Invalid nucleobase {!r}'.format(c))

complement('X')

---------------------------------------------------------------------------

Exception                                 Traceback (most recent call last)

<ipython-input-3-1bde63f9f552> in <module>()
     10     raise Exception('Invalid nucleobase {!r}'.format(c))
     11 
---> 12 complement('X')


<ipython-input-3-1bde63f9f552> in complement(c)
      8     if c == 'G':
      9         return 'C'
---> 10     raise Exception('Invalid nucleobase {!r}'.format(c))
     11 
     12 complement('X')


Exception: Invalid nucleobase 'X'

The exception is thrown straight through complement’s callers – even if they don’t know about exceptions. This makes for easier debugging, since we can trace the error back to its original source.

def complement_seq(dna_seq):
    result = ''
    for b in dna_seq[::-1]:
        c = complement(b)
        result += c
    return result

complement_seq('CAXT')

---------------------------------------------------------------------------

Exception                                 Traceback (most recent call last)

<ipython-input-4-962e256e1398> in <module>()
      6     return result
      7 
----> 8 complement_seq('CAXT')


<ipython-input-4-962e256e1398> in complement_seq(dna_seq)
      2     result = ''
      3     for b in dna_seq[::-1]:
----> 4         c = complement(b)
      5         result += c
      6     return result


<ipython-input-3-1bde63f9f552> in complement(c)
      8     if c == 'G':
      9         return 'C'
---> 10     raise Exception('Invalid nucleobase {!r}'.format(c))
     11 
     12 complement('X')


Exception: Invalid nucleobase 'X'

Catching (or handling) exceptions

pay_me_a_complement is a client of complement_seq.

By default, Python will display a stack trace when the user enters an invalid sequence.

def pay_me_a_complement():
    seq = input()
    print('The complement is', complement_seq(seq))

pay_me_a_complement()

ACTXG



---------------------------------------------------------------------------

Exception                                 Traceback (most recent call last)

<ipython-input-5-285ae89e71a5> in <module>()
      3     print('The complement is', complement_seq(seq))
      4 
----> 5 pay_me_a_complement()


<ipython-input-5-285ae89e71a5> in pay_me_a_complement()
      1 def pay_me_a_complement():
      2     seq = input()
----> 3     print('The complement is', complement_seq(seq))
      4 
      5 pay_me_a_complement()


<ipython-input-4-962e256e1398> in complement_seq(dna_seq)
      2     result = ''
      3     for b in dna_seq[::-1]:
----> 4         c = complement(b)
      5         result += c
      6     return result


<ipython-input-3-1bde63f9f552> in complement(c)
      8     if c == 'G':
      9         return 'C'
---> 10     raise Exception('Invalid nucleobase {!r}'.format(c))
     11 
     12 complement('X')


Exception: Invalid nucleobase 'X'

We can however deal with exceptions programatically, and try to do something smart when an error occurs. In Python we use the try…except pattern to handle exceptions.

The following code normally acts exactly the same as the implementation above if the code in the try block runs without causing an exception.

However, if there is an exception within the try block, then the program skips the rest of that block and picks up at the start of the except block instead.

def pay_me_a_complement():
    seq = input()
    try:
        print('The complement is', complement_seq(seq))
    except:
        print('Invalid DNA sequence: {}'.format(seq))
    print('done')

pay_me_a_complement()

CATXC
Invalid DNA sequence: CATXC
done

Be Specific With Exceptions

One problem with the previous implementation is that any error in the try block (or any function it calls) will be caught by the except statement, thereby indiscrimately turning all program errors into an “Invalid DNA sequence” message.

A naked except statement is equivalent to saying except Exception. This will catch any exception that is an instance of the class Exception, which is every exception in Python.

Catching overly broad exceptions like this can mask other problems in your program, and can be incredibly misleading as you’re trying to debug.

In general, you should raise an exception of the appropriate type for the problem that you’ve encountered, being as specific as possible. Consult the full list of built-in Exceptions to choose an appropriate type (ValueError would be a reasonable choice in this situation).

You can then write a more specific except ValueError clause, which will catch those errors but let others pass through.

Custom Exceptions

If you want to be extremely specific with exceptions, you can create your own specific to your application by inheriting from the base Exception class. This isn’t always necessary (the built-in set along with debugging messages is pretty good), but can be useful for larger programs.

class InvalidNucleobaseException(Exception):
    pass

def complement(c):
    if c == 'A':
        return 'T'
    if c == 'T':
        return 'A'
    if c == 'C':
        return 'G'
    if c == 'G':
        return 'C'
    raise InvalidNucleobaseException('Invalid nucleobase {!r}'.format(c))

def pay_me_a_complement():
    seq = input()
    try:
        print('The complement is', complement_seq(seq))
    except InvalidNucleobaseException:
        print('Invalid DNA sequence: {}'.format(seq))

pay_me_a_complement()

CATXT
Invalid DNA sequence: CATXT

One more example

Using try...catch, we can rewrite our complement function more compactly as follows:

def complement(c):
    pairs = {'A':'T', 'T':'A', 'C':'G', 'G':'C'}
    try:  # Better to ask forgiveness than permission...
        comp = pairs[c]
    except KeyError:
        raise InvalidNucleobaseException('Invalid nucleobase {!r}'.format(c))
    return comp

print("A ->", complement("A"))
print("X ->", complement("X"))

A -> T

---------------------------------------------------------------------------

KeyError                                  Traceback (most recent call last)

<ipython-input-19-d20d7dad2ea9> in complement(c)
      3     try:  # Better to ask forgiveness than permission...
----> 4         comp = pairs[c]
      5     except KeyError:

KeyError: 'X'

During handling of the above exception, another exception occurred:

InvalidNucleobaseException                Traceback (most recent call last)

<ipython-input-19-d20d7dad2ea9> in <module>()
      8 
      9 print("A ->", complement("A"))
---> 10 print("X ->", complement("X"))

<ipython-input-19-d20d7dad2ea9> in complement(c)
      4         comp = pairs[c]
      5     except KeyError:
----> 6         raise InvalidNucleobaseException('Invalid nucleobase {!r}'.format(c))
      7     return comp
      8 

InvalidNucleobaseException: Invalid nucleobase 'X'

In this implementation, we try to look up the argument c in the pairs dictionary. If c is a valid nucleobase then this works fine, but if not then the key will not be present in the dictionary, causing a KeyError. We then catch that KeyError and handle it (in this case by raising a more specific/descriptive error).

Note that we’ve included as little code as possible in the try...except block. We don’t expect the creation of the pairs dictionary to raise an error and we’re not prepared to handle it if it does, so it’s not included as part of the try block.

This software pattern is sometimes known as “better to ask forgiveness than permission”, since it plows ahead assuming the dictionary lookup will succeed (hopefully the common case) and deals with the failure if it occurs. The contrasting approach (checking that the input is valid first) is sometimes called “look before you leap”:

def complement(c):
    pairs = {'A':'T', 'T':'A', 'C':'G', 'G':'C'}
    if c not in pairs:  # Look before you leap...
        raise InvalidNucleobaseException('Invalid nucleobase {!r}'.format(c))
    return pairs[c]
    
print("A ->", complement("A"))
print("X ->", complement("X"))

Unit Testing

We can (and should) also write unit tests to make sure exceptions are raised properly. Recall that the doctest framework simply checks to see if the printed output matches, so the detailed execution trace that is printed whenever an exception is raised would be cumbersome to deal with. Fortunately, doctest can omit parts of exception output to simplify testing:

def complement(c):
    """
    Return complementary nucleobase of 'c'.
    
    >>> complement('A')
    'T'
    >>> complement('G')
    'C'
    >>> complement('You look nice today')
    Traceback (most recent call last):
    ...
    InvalidNucleobaseException: Invalid nucleobase 'You look nice today'
    >>> complement('C')
    'G'
    
    """
    pairs = {'A':'T', 'T':'A', 'C':'G', 'G':'C'}
    if c not in pairs:
        raise InvalidNucleobaseException('Invalid nucleobase {!r}'.format(c))
    return pairs[c]
    
import doctest
doctest.testmod()
#doctest.run_docstring_examples(complement, globals(), verbose=True)

TestResults(failed=0, attempted=4)

Summary Guidelines

Use exceptions (not special return values) to deal with runtime errors in your program
Catch exceptions and try to do something to correct them (even if it is just presenting a helpful error message to your user)
Raise the most specific type of exception possible, and catch a specific type of exception (not all exceptions) to avoid masking errors
Wrap as little of your code in the try...except clause as possible, so that you don’t catch more than you intended