Use a debugger¶
Debuggers are an essential basic tool for finding bugs and learning about how programs work. A debugger lets you observe your program as it runs. You can control the program’s execution by running code one piece at a time, or by setting “breakpoints” where execution will pause. When the debugger has paused your program, you can look at its state (e.g., the values of its variables).
If you are using debuggers and logging as appropriate, you will find bugs more efficiently and with fewer unpleasant side effects than alternative techniques like scattering print statements throughout your code, or randomly changing things until the program seems to work. These bad habits might seem easier at the time, but they make things harder than necessary.
Which debugger?¶
Python has several debuggers. Though their interfaces differ in subtle ways, they all do more or less the same things. It probably isn’t critical which one you use, as long as it works for you. If you don’t know or care which one to learn, just learn pdb: although its interface lacks polish, it comes with Python and serves the purpose.
Here’s a short list of commonly used Python debuggers:
- pdb comes with Python and offers a very basic text interface driven by typed commands.
- ipdb is similar to pdb, but a little fancier because it uses IPython. Check it out if you use IPython.
- pudb offers a full-screen, color console (text mode) interface and is otherwise pretty similar to pdb.
- pdbpp is also similar to pdb and ipdb, with features like color and tab completion.
- winpdb is a full-blown GUI debugger for any platform, not just Windows.
When debugging web apps, you’d usually use a debugger tailored for whatever framework you are using.
Feel free to use any debugger’s own docs to learn how to install and use it, or use any tutorial you want. But if you want to be effective in Python, please make sure you know how to use a debugger. If you want a little more introduction, read on.
Pseudo-debugging with python -i¶
This trick is good for when you want to run a program, then inspect the state that it leaves behind.
python -i myfile.py
This runs the code inside myfile.py, then drops you at a python prompt where you can look at the “post mortem,” run the defined functions and so on.
You can add a “-c” flag if you want to run a snippet from the shell rather than running a file:
python -i -c "from foo import bar; x = bar(2)"
If you use IPython, the same tricks work there too.
But if you need to understand what is going on in some code DURING its execution, you are going to have to use a real debugger.
Basic debugging with pdb¶
Here is a brief introduction to pdb for beginners. pdb is the basic debugger supplied with Python itself. pdb looks plain, but the concepts used in any debugger are essentially the same.
An example program¶
We need something to debug, so here’s a simple example program.
This tutorial assumes the following code is saved in /tmp/example.py.

def foo(x):
    """Add 2 to a number.

    :arg x:
        a number to work with.
    :returns:
        the input value plus 2.
    """
    y = x + 2
    return y


def main():
    x = 4
    y1 = foo(x)
    y2 = foo(y1)
    print("y2 was {0}".format(y2))


if __name__ == "__main__":
    main()
Before doing anything with the debugger, reflect for a minute on how this
program works. This will help you understand how pdb tells you about programs.
Here is a conceptual overview of what happens during execution of /tmp/example.py by the Python interpreter.
- The module is loaded and run from the top, executing the def foo, def main, and if __name__ lines. Since running it as a script or with pdb makes __name__ == "__main__", it will then run main() (at line 21).
- Inside main(), at line 14, the local name x (local to main, that is) is bound to the value 4.
- Then, also inside main() at line 15, foo() is called with the value of x (that is, 4).
- Inside foo(), the local name x (local to foo, that is) is bound to the value passed in, in this case 4. Then, at line 9, the name y is bound to the value of 4 + 2, i.e., 6, and this value is returned at line 10.
- Then, at the completion of main() line 15, the name y1 is bound to the value just returned by foo().
- Then in main() at line 16, foo is called with the value of y1 (that is, 6).
- Then, inside foo() again: the local name x (local to foo, that is) is bound to the value passed in, in this case 6. Then the name y is bound to the value of 6 + 2, i.e., 8, and this value is returned.
- Then, back in main() at the completion of line 16, the name y2 is bound to the value just returned by foo().
- In main() at line 17, a message is printed containing the value of y2.
- main() runs out of things to do, and returns.
- Since the module has finished running main, and there is nothing left to do, it exits.
In order to understand tracebacks and debuggers, you need to know about something called the “stack.” The stack represents the current state of a running program in memory. It keeps track of the different functions or methods which are currently running - one running inside another, each having its own “stack frame” to contain its state.
For example, when Python runs a module like example.py, it makes a “stack frame” for that module’s running state, such as the names bound to values at the top level of the module. This stack frame is labeled <module> by pdb. Then inside that, the module runs main(), which has its own stack frame; and main() runs foo(), which has yet another stack frame of its own. When foo() is running, the top stack frame is <module>, and the bottom one is foo(). Just before foo() runs and after it returns, the bottom stack frame is main(). And so on. The contents of the stack change as the program runs.
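To see the stack from inside a running program, here is a minimal sketch using the standard library's traceback.extract_stack(), which lists frames oldest first (the same order pdb uses when it prints the stack). The helper name names_on_stack is made up for this illustration, and foo and main are cut-down copies of the example program.

```python
import traceback


def names_on_stack():
    """Return the function names on the stack, oldest frame first."""
    # Drop the last entry, which is the frame for names_on_stack() itself.
    return [frame.name for frame in traceback.extract_stack()][:-1]


def foo(x):
    # Record which frames exist while foo() is running.
    return x + 2, names_on_stack()


def main():
    y1, names = foo(4)
    return y1, names


if __name__ == "__main__":
    y1, names = main()
    # The last two names are "main" and "foo"; whatever comes before them
    # depends on how the script was started.
    print(names)
```

When run as a script, the list should begin with <module> and end with main and foo, mirroring the frames described above.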
Running the example in pdb¶
You can run example.py inside pdb with this command from the command line:
python -m pdb example.py
On Unix-like OSes where pdb is on $PATH, you can alternatively use
pdb example.py
Here’s an example of what you might see first after running this:
> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb)
This display may be confusing at first, so here is a little explanation to break it down.
- The first line tells you which file is running (/tmp/example.py), and the line number pdb is paused at (1). The current selected stack frame is <module>, or in other words the one with the global scope for example.py.
- The second line -> def foo(x): is the content of the current line of Python code, which is just about to run as soon as you tell pdb to continue.
- Finally, the (Pdb) line is a prompt - it means the program is ‘paused’ and pdb is waiting for you to type in a command, which will run after you hit enter.
Getting help in pdb¶
Now what can you do from the (Pdb) prompt? One thing you can do is list available commands by typing help, which should show you something like this:
Documented commands (type help <topic>):
========================================
EOF    bt         cont      enable  jump  pp       run      unt
a      c          continue  exit    l     q        s        until
alias  cl         d         h       list  quit     step     up
args   clear      debug     help    n     r        tbreak   w
b      commands   disable   ignore  next  restart  u        whatis
break  condition  down      j       p     return   unalias  where
To get help on a specific command – continue for example – use help continue:
(Pdb) help continue
c(ont(inue))
Continue execution, only stop when a breakpoint is encountered.
In other words, the commands c or cont or continue will run until a breakpoint, and then stop. The curious can use the help command to learn about all the other commands available in pdb.
continue¶
Let’s try out that continue command. To do that at the (Pdb) prompt, type in continue and press the Enter key.
> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb) continue
After typing continue and hitting enter, you’ll see that the whole program runs, printing y2 was 8. (If you don’t understand why, refer back to An example program.) Then, as you’ll see, pdb automatically goes back to the beginning.
> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb) continue
y2 was 8
The program finished and will be restarted
> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb)
You can always use quit to exit pdb when you are done debugging, but let’s not do that yet.
next and step¶
Now let’s just run one top-level line at a time, using next:
> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb) next
> /tmp/example.py(13)<module>()
-> def main():
(Pdb)
This time the program runs forward one line each time we type next. But did you notice that it seemed to skip all the lines in example.py between def foo(x): and def main():? That is because next steps through lines at the current level, which here is the top level of the module - i.e., the lines that are not inside a function. Running the def foo(x): line only defines the function; its body does not run until foo is called. (If you want to run in smaller steps than that, use step, which also stops inside any function that gets called. Check help step for more information on that.)
For now, let’s keep using next, and pdb will continue to run the program, line-by-line (at the top level).
> /tmp/example.py(13)<module>()
-> def main():
(Pdb) next
> /tmp/example.py(20)<module>()
-> if __name__ == "__main__":
(Pdb) next
> /tmp/example.py(21)<module>()
-> main()
(Pdb) next
y2 was 8
--Return--
> /tmp/example.py(21)<module>()->None
-> main()
(Pdb) next
--Return--
> <string>(1)<module>()->None
Again, the lines between def main(): and if __name__ == "__main__": were not displayed because they are not at the top level; remember that next steps through lines at the currently selected level, so if you really need more detail then use step instead.
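The difference between next and step can also be seen by driving pdb with scripted input instead of typing at the prompt. The sketch below is only an illustration: run_commands is a hypothetical helper, foo and main are shortened versions of the example program, and the commands are fed to pdb.Pdb through its stdin and stdout arguments.

```python
import io
import pdb


def foo(x):
    y = x + 2
    return y


def main():
    y1 = foo(4)
    return y1


def run_commands(commands):
    """Run main() under pdb, feeding it the given commands; return pdb's output."""
    out = io.StringIO()
    debugger = pdb.Pdb(
        stdin=io.StringIO("\n".join(commands) + "\n"),
        stdout=out,
        nosigint=True,   # do not install a Ctrl-C handler
        readrc=False,    # ignore any personal .pdbrc file
    )
    debugger.runcall(main)
    return out.getvalue()


# pdb first pauses on the line that calls foo(). "next" runs the whole call
# and stops on the following line of main(); "step" stops inside foo().
next_output = run_commands(["next", "continue"])
step_output = run_commands(["step", "continue"])
```

Printing the two strings should show that only step_output contains a --Call-- marker followed by a frame for foo, while next_output stays in main the whole time.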
break¶
Suppose we want to see what is going on inside foo when it runs. We can set a breakpoint at the beginning of foo using break foo. Then when we run continue, pdb will keep going until the point when foo begins, where it will stop and wait at the prompt.
> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb) break foo
Breakpoint 1 at /tmp/example.py:1
(Pdb) continue
> /tmp/example.py(9)foo()
-> y = x + 2
(Pdb)
You can set as many breakpoints in your program as you want. Breakpoints can be removed using the clear command. As an exercise, you can read about this by using help clear from inside pdb.
pdb.set_trace()¶
As an aside, another way to set a breakpoint, from outside pdb, is to temporarily insert this line into your program, at the location where you want execution to stop:
import pdb; pdb.set_trace()
This code actually starts pdb from inside your program. So if you use this, you run the script as normal (NOT using pdb), and when this line is hit the debugger should normally start for you; when using pdb.set_trace(), it isn’t useful to run pdb example.py or python -m pdb example.py (as discussed in Running the example in pdb).
The problem with this technique is that it is very easy to forgetfully leave such lines in the code, which will cause it to hang later. If you use this technique, make sure you remove these lines before committing! (Commit hooks that scan for this special line are sometimes used to ensure it doesn’t find its way into production code.)
This will also have confusing results if done inside a service that does not have access to the console (it will just appear to hang). Debugging web apps often requires using facilities specific to your web framework.
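As a rough idea of what a commit hook that scans for leftover breakpoints might do, here is a minimal sketch built on the standard ast module. The function name find_debugger_calls and the sample source are invented for this example, and checking for the built-in breakpoint() (available in Python 3.7 and later) alongside set_trace() is an assumption about what such a hook would want to catch.

```python
import ast


def find_debugger_calls(source, filename="<source>"):
    """Return (line number, name) pairs for debugger calls found in source."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Matches calls such as pdb.set_trace() or ipdb.set_trace().
        if isinstance(func, ast.Attribute) and func.attr == "set_trace":
            hits.append((node.lineno, "set_trace"))
        # Matches the built-in breakpoint() call.
        elif isinstance(func, ast.Name) and func.id == "breakpoint":
            hits.append((node.lineno, "breakpoint"))
    return sorted(hits)


sample = "import pdb; pdb.set_trace()\nx = 1\nbreakpoint()\n"
print(find_debugger_calls(sample))
```

Parsing the code rather than searching the text means that a mention of set_trace inside a comment or string is not reported. A real hook would also need to read the staged files and exit with a non-zero status when anything is found.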
Inspecting state¶
Now suppose we want to see the values of x and y inside foo. You can usually just enter the name of a variable at the (Pdb) prompt to have pdb evaluate it and show you its value in the current stack frame, in this case from foo.
> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb) break foo
Breakpoint 1 at /tmp/example.py:1
(Pdb) continue
> /tmp/example.py(9)foo()
-> y = x + 2
(Pdb) x
4
(Pdb) y
*** NameError: name 'y' is not defined
(Pdb) next
> /tmp/example.py(10)foo()
-> return y
(Pdb) y
6
(Pdb)
Note that when the current line in pdb is y = x + 2, y is not yet defined because the line has not finished running yet. After we step through that line it has been run, and then y is defined.
(Actually you can have pdb evaluate whatever Python code you want. For example, you can call a test function directly from pdb in order to see what happens during the test, or you can even change the values of variables from the debugger. But doing complicated things in the debugger tends to confuse what is happening in the program with what is happening in the commands you just typed in, so try to keep it simple.)
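To make the timing concrete, the following sketch records the local variables that exist just before each line of foo runs. It uses sys.settrace, the standard hook that pdb builds on through the bdb module; trace_locals is a hypothetical helper written for this illustration, and foo is the example function without its docstring.

```python
import sys


def foo(x):
    y = x + 2
    return y


def trace_locals(func, *args):
    """Call func(*args) and snapshot its locals just before each line runs."""
    snapshots = []

    def tracer(frame, event, arg):
        # Ignore every frame except the one running func.
        if frame.f_code is not func.__code__:
            return None
        if event == "line":
            # A "line" event fires before the line executes.
            snapshots.append(dict(frame.f_locals))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        # Restore whatever trace function was active before.
        sys.settrace(previous)
    return result, snapshots


result, snapshots = trace_locals(foo, 4)
for snapshot in snapshots:
    print(snapshot)
```

The first snapshot contains only x, because y = x + 2 has not run yet; the last one, taken just before return y, contains both x and y.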
where¶
If at any time you want to see where execution is currently at in the code, use the where command to see the current stack trace, listing the stack frames that are currently on the stack, from top to bottom.
(Pdb) where
/usr/lib/python2.7/bdb.py(400)run()
-> exec cmd in globals, locals
<string>(1)<module>()
/tmp/example.py(21)<module>()
-> main()
/tmp/example.py(15)main()
-> y1 = foo(x)
> /tmp/example.py(9)foo()
-> y = x + 2
Let’s break this down.
We see a list of stack frames and for each one, the current line. The current selected stack frame (the one you are currently inspecting) is marked with > at the start of the line. In this case, that’s foo(). Each line starting with -> is the current line of code for that stack frame.
- The first stack frame is inside bdb.py, which is actually used by pdb itself; it is using exec cmd in globals, locals to run our module, at its own line 400. Usually you would ignore this.
- The next stack frame is for the module scope <module> - that’s the scope of code that’s in example.py but not inside any function or class. Here we are at line 21 in example.py, which calls function main.
- The next stack frame is for function main, where the running line is y1 = foo(x). That’s line 15 in example.py.
- The last stack frame is for function foo, where the running line is y = x + 2. That’s line 9 in example.py.
Note the > to the left of /tmp/example.py(9)foo(). That means this stack frame is the currently selected one.
If the currently selected stack frame is for an invocation of foo and you type y, you will see the value of y that is local to that invocation of foo. If you type x, you will see the value of x that is local to foo, and so on. Similarly, if you type next then pdb will single-step its way through foo, for each line in foo.
We ended up with foo selected because we just set a breakpoint for foo, which is where pdb stopped. But it’s easy to change which stack frame is currently selected - see the next section, up and down.
up and down¶
Suppose you currently have selected the stack frame for foo, but you want to see variables that are inside main. Or you want to have next single-step through lines in main instead of foo.
Use up to go from the stack frame for foo (which is at the bottom of the stack) to the stack frame for main (closer to the top of the stack).
(Pdb) u
> /tmp/example.py(15)main()
-> y1 = foo(x)
(Pdb)
Now the current stack frame, marked by >, is for main. The next line is still y1 = foo(x), since pdb has not advanced the execution of the program yet. If you look at where you will see that > has moved in the stack trace to indicate the changed selection. And now when you have pdb evaluate something like x, you will get its value in the context of main rather than the context of foo.
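The idea that the same name can hold different values in different stack frames can be checked from plain Python. In this sketch, which assumes CPython (inspect.currentframe() may return None on other implementations), foo looks at its own frame and then follows f_back one step to its caller, much as up does in pdb. The example differs slightly from example.py so that x has a different value in each frame.

```python
import inspect


def foo(x):
    frame = inspect.currentframe()   # the frame for this call to foo()
    caller = frame.f_back            # one step "up": the frame that called foo()
    try:
        return {
            frame.f_code.co_name: frame.f_locals["x"],
            caller.f_code.co_name: caller.f_locals["x"],
        }
    finally:
        # Drop the frame references so they do not keep the frames alive.
        del frame, caller


def main():
    x = 4
    y1 = x + 2
    return foo(y1)


values = main()
print(values)
```

Here values maps foo to 6 and main to 4: each frame keeps its own x, which is why the answer pdb gives for x depends on the frame you have selected.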
args¶
Use the args command to see what arguments were given to the function running in the currently selected stack frame. In this example, we see that foo received a value of 4 for x as its only argument.
> /tmp/example.py(9)foo()
-> y = x + 2
(Pdb) args
x = 4
list¶
This is a shortcut to show you the code context of the current line in the currently selected stack frame. For example, if execution is paused at the first line of the module, which is the def foo(x): line, I get this:
> /tmp/example.py(1)<module>()
-> def foo(x):
(Pdb) list
  1  -> def foo(x):
  2         """Add 2 to a number.
  3
  4         :arg x:
  5             a number to work with.
  6         :returns:
  7             the input value plus 2.
  8         """
  9         y = x + 2
 10         return y
 11
(Pdb)
Note that the line to be executed is marked with ->. Of course, docstrings and comments are not executed, so execution skips over these and the line marker will not stop on them either.
Further Reading¶
For more on pdb, check out the official pdb documentation.