Python Profiling
Code profiling is a method to understand how fast your code runs. It's the first step to optimising performance. You can get lots of detailed information about which part(s) of your code run, how many times and for how long each time. Using this information guides you towards where to spend your time when seeking to make your code run faster. That is, you are better off spending your time in areas of the code base that overall take longest to execute and working down the list from there. By "longest to execute" I mean the total run time as measured by the number of calls to that part of the code multiplied by the amount of time it takes to run.
You could, for example, have one function that looks really complicated, takes 10 seconds to run and is only called once. There would be little point in starting optimisation with that function if you have a second function that takes 1 second to run but is called 60 seconds to run. Profiling gives you this information and, as I said, is the first step towards optimisation. It's only the first step as the act of profiling alone does not tell you what to do in order to speed your code up, just where to start looking.
I was recently profiling some of my Python code using the standard Python profiler known as cprofile and quickly realised the results weren't giving me the information I needed. A quick bit of reading around later and I realised cprofile does not very well support asynchronous Python as written with the asyncio library. Enter the Yappi profiler that I've not used before. Hence, here's some notes on how to do some basic profiling.
Example Program
Let's take an example asyncio program that's a simple "Hello, World!" implementation...
for i in range(1,4):
await asyncio.sleep(i)
print("Hello")
async def func2():
for i in range(3):
await asyncio.sleep(1)
print("World")
async def main():
await asyncio.gather(func1(), func2())
asyncio.run(main())
CProfile
Starting with cprofile, we can run the above code like this:
This will give a long list of output that, among a lot of other things, shows you some lines like this:
This doesn't seem to help us very much1
Note that you can also save the results to a file (more about that later) by running this:
python -m cProfile -o hello-cprofile.prof hello.py
Yappi
You can't run Yappi as a module in the same way you can with CProfile so you need to make some code modifications. For the simple script in this example, import the yappi module at the top and surround the call to the asyncio main loop as follows:
with yappi.run():
asyncio.run(main())
yappi.get_func_stats().print_all()
yappi.get_func_stats().save("hello-yappi.prof", type="PSTAT")
This will print some similar sorts of information as obtained through CProfile, only this time the results are much more useful:
hello.py:6 func1 1 0.000026 6.006225 6.006225
The clear difference between the two tools is evident here as Yappi is able to give the correct number of calls made to func1 and func2 and precise timings. You can now easily see that func1 and func2 are both called once each but func1 takes 6 seconds to complete (sleep values of 1+2+3) and func2 takes 3 seconds to complete (sleep values of 1+1+1). Hence, when optimising we would now know to concentrate first on func1 rather than func2.
Snakeviz
Both CProfile and Yappi are capable of saving their profiling to a binary file in a format that can be viewed by a neat little tool called Snakeviz. To run snakeviz, install it and point it at one of the profiles dumped to disk, e.g.
It starts your web browser with some visualisations of the profile. The example above is pretty boring to look through as it's so simple but it does allow you to track down something like this:
Comments