Getting Along with Python

Use a debugger

Debuggers are an essential basic tool for finding bugs and learning about how programs work. A debugger lets you observe your program as it runs. You can control the program’s execution by running code one piece at a time, or by setting “breakpoints” where execution will pause. When the debugger has paused your program, you can look at its state (e.g., the values of its variables).

If you use debuggers and logging as appropriate, you will find bugs more efficiently and with fewer unpleasant side effects than with alternative techniques like scattering print statements throughout your code or randomly changing things until the program seems to work. These bad habits might seem easier at the time, but they make things harder than necessary.
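
As a rough illustration of the logging alternative, here is a minimal sketch using the standard logging module. The logger name "example" and the message format are arbitrary choices made for this sketch; nothing else in this tutorial depends on them.

```python
import logging

# A named logger; "example" is an arbitrary name chosen for this sketch.
logger = logging.getLogger("example")


def foo(x):
    y = x + 2
    # Unlike print, this message only appears when DEBUG logging is enabled.
    logger.debug("foo(%r) -> %r", x, y)
    return y


if __name__ == "__main__":
    # Send DEBUG and higher messages to the console.
    logging.basicConfig(level=logging.DEBUG)
    foo(4)
```

The debug message can be silenced later by changing the level passed to basicConfig, without touching foo.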

Which debugger?

Though their interfaces differ in subtle ways, Python debuggers all do more or less the same things. It probably isn’t critical which one you use, as long as it works for you. If you don’t know or care which one to learn, just learn pdb: although its interface lacks polish, pdb comes with Python and serves the purpose.

Here’s a short list of commonly used Python debuggers:

  • pdb comes with Python and offers a very basic text interface driven by typed commands.
  • ipdb is similar to pdb, but a little fancier because it uses IPython. Check it out if you use IPython.
  • pudb offers a full-screen, color console (text mode) interface and is otherwise pretty similar to pdb.
  • pdbpp is also similar to pdb and ipdb, with features like color and tab completion.
  • winpdb is a full-blown GUI debugger for any platform, not just Windows.

When debugging web apps, you’d usually use a debugger tailored for whatever framework you are using.

Feel free to use any debugger’s own docs to learn how to install and use it, or use any tutorial you want. But if you want to be effective in Python, please make sure you know how to use a debugger. If you want a little more introduction, read on.

Pseudo-debugging with python -i

This trick is good for when you want to run a program, then inspect the state that it leaves behind.

python -i myfile.py

This runs the code inside myfile.py, then drops you at a Python prompt where you can look at the “post mortem,” run the defined functions and so on.

You can add a “-c” flag if you want to run a snippet from the shell rather than running a file:

python -i -c "from foo import bar; x = bar(2)"

If you use IPython, the same tricks work there in the same way.

But if you need to understand what is going on in some code DURING its execution, you are going to have to use a real debugger.

Basic debugging with pdb

Here is a brief introduction to pdb for beginners. pdb is the basic debugger supplied with Python itself. pdb looks plain, but the concepts used in any debugger are essentially the same.

An example program

We need something to debug, so here’s a simple example program. This tutorial assumes the following code is saved in /tmp/example.py.

def foo(x):
    """Add 2 to a number.

    :arg x:
        a number to work with.
    :returns:
        the input value plus 2.
    """
    y = x + 2
    return y


def main():
    x = 4
    y1 = foo(x)
    y2 = foo(y1)
    print("y2 was {0}".format(y2))


if __name__ == "__main__":
    main()

Before doing anything with the debugger, reflect for a minute on how this program works. This will help you understand how pdb tells you about programs. Here is a conceptual overview of what happens during execution of /tmp/example.py by the Python interpreter.

  1. The module is imported, running the def foo, def main, and if __name__ lines. Since running it as a script or with pdb makes __name__ == "__main__", it will then run main() (at line 21).
  2. Inside main(), at line 14, the local name x (local to main, that is) is bound to the value 4.
  3. Then, also inside main() at line 15, foo() is called with the value of x (that is, 4).
  4. Inside foo(), the local name x (local to foo, that is) is bound to the value passed in, in this case 4. Then, at line 9, the name y is bound to the value of 4 + 2, i.e., 6, and this value is returned at line 10.
  5. Then, at the completion of main() line 15, the name y1 is bound to the value just returned by foo().
  6. Then in main() at line 16, foo is called with the value of y1 (that is, 6).
  7. Then, inside foo() again: the local name x (local to foo, that is) is bound to the value passed in, in this case 6. Then the name y is bound to the value of 6 + 2, i.e., 8, and this value is returned.
  8. Then, back in main() at the completion of line 16, the name y2 is bound to the value just returned by foo().
  9. In main() at line 17, a message is printed containing the value of y2.
  10. main() runs out of things to do, and returns.
  11. Since the module has finished running main, and there is nothing left to do, it exits.
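
If you would like to check the order of calls described above for yourself, one option is the standard sys.settrace hook. The sketch below repeats foo and main (with the docstring dropped and the print replaced by a return value) and records each function call and return. The names tracer and events are made up for this illustration.

```python
import sys


def foo(x):
    y = x + 2
    return y


def main():
    x = 4
    y1 = foo(x)
    y2 = foo(y1)
    return y2


events = []


def tracer(frame, event, arg):
    # Record only function entry and exit; ignore per-line events.
    if event in ("call", "return"):
        events.append((event, frame.f_code.co_name))
    return tracer  # keep tracing inside the new frame


previous = sys.gettrace()  # remember any tracer that was already installed
sys.settrace(tracer)
try:
    result = main()
finally:
    sys.settrace(previous)  # always restore the earlier state

for event, name in events:
    print(event, name)
```

The printed events should show main being called, foo being called and returning twice, and finally main returning, which matches steps 2 through 10 above.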

In order to understand tracebacks and debuggers, you need to know about something called the “stack.” The stack represents the current state of a running program in memory. It keeps track of the different functions or methods which are currently running - one running inside another, each having its own “stack frame” to contain its state.

For example, when Python runs a module like example.py, it makes a “stack frame” for that module’s running state, such as the names bound to values at the top level of the module. This stack frame is labeled <module> by pdb. Then inside that, the module runs main(), which has its own stack frame; and main() runs foo(), which has yet another stack frame of its own. When foo() is running, the top stack frame is <module>, and the bottom one is foo(). Just before foo() runs and after it returns, the bottom stack frame is main(). And so on. The contents of the stack change as the program runs.
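
To see the stack from inside a running program, without a debugger, you can ask the standard traceback module for it. This sketch is only an illustration: it reuses foo and main from the example (with a return in place of the print), and snapshots is a name invented here.

```python
import traceback

snapshots = []


def foo(x):
    # Names of the functions on the stack right now, oldest frame first.
    snapshots.append([frame.name for frame in traceback.extract_stack()])
    return x + 2


def main():
    x = 4
    y1 = foo(x)
    y2 = foo(y1)
    return y2


result = main()
for names in snapshots:
    # Show only the innermost three frames to keep the output short.
    print(" -> ".join(names[-3:]))
```

Each printed line should end with main -> foo, oldest frame first, which is the same order pdb uses when it displays the stack.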

Running the example in pdb

You can run example.py inside pdb with this command from the command line:

python -m pdb example.py

On Unix-like OSes where pdb is on $PATH, you can alternatively use

pdb example.py

Here’s an example of what you might see first after running this:

> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb)

This display may be confusing at first, so here is a little explanation to break it down.

  • The first line tells you which file is running (/tmp/example.py), and the line number pdb is paused at (1). The current selected stack frame is <module>, or in other words the one with the global scope for example.py.
  • The second line -> def foo(x): is the content of the current line of Python code, which is just about to run as soon as you tell pdb to continue.
  • Finally, the (Pdb) line is a prompt - it means the program is ‘paused’ and pdb is waiting for you to type in a command, which will run after you hit enter.

Getting help in pdb

Now what can you do from the (Pdb) prompt? One thing you can do is list available commands by typing help, which should show you something like this:

Documented commands (type help <topic>):
========================================
EOF    bt         cont      enable  jump  pp       run      unt
a      c          continue  exit    l     q        s        until
alias  cl         d         h       list  quit     step     up
args   clear      debug     help    n     r        tbreak   w
b      commands   disable   ignore  next  restart  u        whatis
break  condition  down      j       p     return   unalias  where

To get help on a specific command – continue for example – use help continue:

(Pdb) help continue
c(ont(inue))
Continue execution, only stop when a breakpoint is encountered.

In other words, the commands c or cont or continue will run until a breakpoint, and then stop. The curious can use the help command to learn about all the other commands available in pdb.

continue

Let’s try out that continue command. To do that at the (Pdb) prompt, type in continue and press the Enter key.

> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb) continue

After typing continue and hitting enter, you’ll see that the whole program runs, printing y2 was 8. (If you don’t understand why, refer back to An example program.) Then, as you’ll see, pdb automatically goes back to the beginning.

> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb) continue
y2 was 8
The program finished and will be restarted
> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb)

You can always use quit to exit pdb when you are done debugging, but let’s not do that yet.

next and step

Now let’s just run one top-level line at a time, using next:

> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb) next
> /tmp/example.py(13)<module>()
-> def main():
(Pdb)

This time the program just runs forward one line each time we type ‘next’. But did you notice that it seemed to skip all the lines in example.py between def foo(x): and def main():? That is because it is stepping through lines at the top level - i.e., the ones that are not inside a function. (If you want to run in smaller steps than that, use step which will stop as soon as possible. Check help step for more information on that.)

For now, let’s keep using next, and pdb will continue to run the program, line-by-line (at the top level).

> /tmp/example.py(13)<module>()
-> def main():
(Pdb) next
> /tmp/example.py(20)<module>()
-> if __name__ == "__main__":
(Pdb) next
> /tmp/example.py(21)<module>()
-> main()
(Pdb) next
y2 was 8
--Return--
> /tmp/example.py(21)<module>()->None
-> main()
(Pdb) next
--Return--
> <string>(1)<module>()->None

Again, the lines between def main(): and if __name__ == "__main__": were not displayed because they are not at the top level; remember that next steps through lines at the currently selected level, so if you really need more detail then use step instead.

break

Suppose we want to see what is going on inside foo when it runs. We can set a breakpoint at the beginning of foo using break foo. Then when we run continue, pdb will keep going until the point when foo begins, where it will stop and wait at the prompt.

> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb) break foo
Breakpoint 1 at /tmp/example.py:1
(Pdb) continue
> /tmp/example.py(9)foo()
-> y = x + 2
(Pdb)

You can set as many breakpoints in your program as you want. Breakpoints can be removed using the clear command. As an exercise, you can read about this by using help clear from inside pdb.

pdb.set_trace()

As an aside, another way to set a breakpoint, from outside pdb, is to temporarily insert this line into your program, at the location where you want execution to stop:

import pdb; pdb.set_trace()

This code actually starts pdb from inside your program. So if you use this, run the script as normal (NOT using pdb), and when this line is hit the debugger should start for you. When using pdb.set_trace(), it isn’t useful to run pdb example.py or python -m pdb example.py (as discussed in Running the example in pdb).

The problem with this technique is that it is very easy to forgetfully leave such lines in the code, which will cause it to hang later. If you use this technique, make sure you remove these lines before committing! (Commit hooks that scan for this special line are sometimes used to ensure it doesn’t find its way into production code.)

This will also have confusing results if done inside a service that does not have access to the console (it will just appear to hang). Debugging web apps often requires using facilities specific to your web framework.
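
As a rough idea of what a commit-hook check for leftover set_trace lines might look like, here is a minimal sketch of a scanner that such a hook could call. The function name find_set_trace and the regular expression are assumptions made for this illustration, not part of pdb or of any particular hook tool, and a real hook would also need to read the files being committed.

```python
import re

# Matches calls such as "pdb.set_trace()" anywhere on a line.
SET_TRACE = re.compile(r"\bpdb\.set_trace\s*\(")


def find_set_trace(source):
    """Return the 1-based line numbers in source that call pdb.set_trace."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # skip lines that are entirely comments
        if SET_TRACE.search(line):
            hits.append(number)
    return hits


sample = (
    "x = 4\n"
    "import pdb; pdb.set_trace()\n"
    "# pdb.set_trace() mentioned in a comment\n"
    "y = x + 2\n"
)
print(find_set_trace(sample))  # prints [2]
```

A hook built on this idea would reject the commit whenever the returned list is not empty.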

Inspecting state

Now suppose we want to see the values of x and y inside foo. You can usually just enter the name of a variable at the (Pdb) prompt to have pdb evaluate it and show you its value in the current stack frame, in this case from foo.

> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb) break foo
Breakpoint 1 at /tmp/example.py:1
(Pdb) continue
> /tmp/example.py(9)foo()
-> y = x + 2
(Pdb) x
4
(Pdb) y
*** NameError: name 'y' is not defined
(Pdb) next
> /tmp/example.py(10)foo()
-> return y
(Pdb) y
6
(Pdb)

Note that when the current line in pdb is y = x + 2, y is not yet defined because that line has not run yet. After we step through that line it has been run, and then y is defined.

(Actually you can have pdb evaluate whatever Python code you want. For example, you can call a test function directly from pdb in order to see what happens during the test, or you can even change the values of variables from the debugger. But doing complicated things in the debugger tends to confuse what is happening in the program with what is happening in the commands you just typed in, so try to keep it simple.)

where

If at any time you want to see where execution currently is in the code, use the where command to see the current stack trace, listing the stack frames that are currently on the stack, from top to bottom.

(Pdb) where
  /usr/lib/python2.7/bdb.py(400)run()
-> exec cmd in globals, locals
  <string>(1)<module>()
  /tmp/example.py(21)<module>()
-> main()
  /tmp/example.py(15)main()
-> y1 = foo(x)
> /tmp/example.py(9)foo()
-> y = x + 2

Let’s break this down. We see a list of stack frames and for each one, the current line. The current selected stack frame (the one you are currently inspecting) is marked with > at the start of the line. In this case, that’s foo(). Each line starting with -> is the line of code that will run next in that stack frame.

  • The first stack frame is inside bdb.py, which is actually used by pdb itself; it is using exec cmd in globals, locals to run our module, at its own line 400. Usually you would ignore this.
  • The next stack frame is for the module scope <module> - that’s the scope of code that’s in example.py but not inside any function or class. Here we are at line 21 in example.py, which calls function main.
  • The next stack frame is for function main, where the running line is y1 = foo(x). That’s line 15 in example.py.
  • The last stack frame is for function foo, where the running line is y = x + 2. That’s line 9 in example.py.

Note the > to the left of /tmp/example.py(9)foo(). That means this stack frame is the currently selected one.

If the currently selected stack frame is for an invocation of foo and you type y, you will see the value of y that is local to that invocation of foo. If you type x, you will see the value of x that is local to foo, and so on. Similarly, if you type next then pdb will single-step its way through foo, for each line in foo.

We ended up with foo selected because we just set a breakpoint for foo, which is where pdb stopped. But it’s easy to change which stack frame is currently selected - see the next section, up and down.

up and down

Suppose you currently have selected the stack frame for foo, but you want to see variables that are inside main. Or you want to have next single-step through lines in main instead of foo. Use up to go from the stack frame for foo (which is at the bottom of the stack) to the stack frame for main (closer to the top of the stack).

(Pdb) u
> /tmp/example.py(15)main()
-> y1 = foo(x)
(Pdb)

Now the current stack frame, marked by >, is for main. The next line is still y1 = foo(x), since pdb has not advanced the execution of the program yet. If you run where again, you will see that > has moved in the stack trace to indicate the changed selection. And now when you have pdb evaluate something like x, you will get its value in the context of main rather than the context of foo.
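
The idea that each stack frame has its own variables can also be seen from plain Python. This sketch uses inspect.currentframe(), which is available in CPython but may return None on other implementations. The name seen is invented for the illustration, and foo and main are trimmed copies of the example.

```python
import inspect

seen = []


def foo(x):
    frame = inspect.currentframe()  # foo's own stack frame
    caller = frame.f_back           # the frame one level up: main's
    # Record x as foo sees it, then x as main sees it.
    seen.append((frame.f_locals["x"], caller.f_locals["x"]))
    y = x + 2
    return y


def main():
    x = 4
    y1 = foo(x)
    y2 = foo(y1)
    return y2


main()
print(seen)  # prints [(4, 4), (6, 4)]
```

On the second call, foo's own x is 6 while main's x is still 4. That is the same difference you see in pdb when you evaluate x after moving between the two frames with up and down.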

args

Use the args command to see what arguments were given to the function running in the currently selected stack frame. In this example, we see that foo received a value of 4 for x as its only argument.

> /tmp/example.py(9)foo()
-> y = x + 2
(Pdb) args
x = 4

list

The list command shows the code context of the current line in the currently selected stack frame. For example, with execution paused at the first line of example.py (the def foo(x): line in the <module> frame), I get this:

> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb) list
  1  -> def foo(x):
  2         """Add 2 to a number.
  3
  4         :arg x:
  5             a number to work with.
  6         :returns:
  7             the input value plus 2.
  8         """
  9         y = x + 2
 10         return y
 11
(Pdb)

Note that the line to be executed is marked with ->. Of course, docstrings and comments are not executed, so execution skips over these and the line marker will not stop on them either.

Further Reading

For more on pdb, check out the official pdb documentation.