Profiling is a technique for figuring out how time is spent in a program. With these statistics, we can find the "hot spots" of a program and think about ways to improve it. Sometimes, a hot spot in an unexpected location may also hint at a bug in the program.
In this tutorial, we will see how we can use the profiling facilities in Python. Specifically, you will learn:
- How to compare small code fragments using the `timeit` module
- How to profile an entire program using the `cProfile` module
- How to invoke a profiler inside an existing program
- What the profiler cannot do
Let’s get started.

Profiling Python Code. Photo by Prashant Saini. Some rights reserved.
Tutorial Overview
This tutorial is in four parts; they are:
- Profiling small fragments
- The profile module
- Using profiler inside code
- Caveats
Profiling small fragments
When you are asked about different ways of doing the same thing in Python, one perspective is to check which one is more efficient. In Python's standard library, we have the `timeit` module that allows us to do some simple profiling.
For example, to concatenate many short strings, we can use the `join()` function from strings or use the `+` operator. So how do we know which is faster? Consider the following Python code:
longstr = "" for x in range(1000): longstr += str(x)
This will produce a long string `012345....` in the variable `longstr`. An alternative way to write this is:
longstr = "".join([str(x) for x in range(1000)])
To compare the two, we can do the following at the command line:
```
python -m timeit 'longstr=""' 'for x in range(1000): longstr += str(x)'
python -m timeit '"".join([str(x) for x in range(1000)])'
```
These two commands will produce the following output:
```
1000 loops, best of 5: 265 usec per loop
2000 loops, best of 5: 160 usec per loop
```
The above commands load the `timeit` module and pass a single line of code to it for measurement. In the first case, we have two statements, and they are passed to the `timeit` module as two separate arguments. By the same rationale, the first command can also be presented as three lines of statements (by breaking the for-loop into two lines), but the indentation of each line needs to be quoted correctly:
python -m timeit 'longstr=""' 'for x in range(1000):' ' longstr += str(x)'
The output of `timeit` reports the best performance among multiple runs (five by default). Each run executes the provided statements a number of times (which is determined dynamically). The time is reported as the average time to execute the statements once in the best run.
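If the defaults do not suit your case, you can also control the measurement explicitly: the `-r` option sets how many runs to repeat, and `-n` sets how many times the statements are executed in each run. A minimal sketch (the counts below are arbitrary, just for illustration):
python -m timeit -r 10 -n 5000 'longstr=""' 'for x in range(1000): longstr += str(x)'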
While it is true that the `join()` function is faster than the `+` operator for string concatenation, the timing above is not a fair comparison because we use `str(x)` to make the short strings on the fly during the loop. The better way to do it is the following:
```
python -m timeit -s 'strings = [str(x) for x in range(1000)]' 'longstr=""' 'for x in strings:' ' longstr += str(x)'
python -m timeit -s 'strings = [str(x) for x in range(1000)]' '"".join(strings)'
```
which produces:
```
2000 loops, best of 5: 173 usec per loop
50000 loops, best of 5: 6.91 usec per loop
```
The `-s` option allows us to provide "setup" code, which is executed before the profiling and not timed. In the above, we create the list of short strings before we start the loop, so the time to create those strings is not measured in the "per loop" timing. From the above, we see that the `join()` function is two orders of magnitude faster than the `+` operator. A more common use of the `-s` option is to import libraries. For example, we can compare the square root function from Python's math module, from numpy, and the exponentiation operator `**` as follows:
```
python -m timeit '[x**0.5 for x in range(1000)]'
python -m timeit -s 'from math import sqrt' '[sqrt(x) for x in range(1000)]'
python -m timeit -s 'from numpy import sqrt' '[sqrt(x) for x in range(1000)]'
```
The above produces the following measurements, in which we see that `math.sqrt()` is the fastest while `numpy.sqrt()` is the slowest in this particular example:
```
5000 loops, best of 5: 93.2 usec per loop
5000 loops, best of 5: 72.3 usec per loop
200 loops, best of 5: 974 usec per loop
```
If you wonder why numpy is the slowest, it is because numpy is optimized for arrays. You will see its exceptional speed in the following alternative:
python -m timeit -s 'import numpy as np; x=np.array(range(1000))' 'np.sqrt(x)'
where the result is:
100000 loops, best of 5: 2.08 usec per loop
If you prefer, you can also run `timeit` from Python code. For example, the following is similar to the above but gives you the raw total timing for each run:
```python
import timeit

measurements = timeit.repeat('[x**0.5 for x in range(1000)]', number=10000)
print(measurements)
```
In the above, each run executes the statement 10,000 times. The result is as follows, in which you can see roughly 98 usec per loop in the best run:
[1.0888952040000106, 0.9799715450000122, 1.0921516899999801, 1.0946189250000202, 1.2792069260000005]
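To reduce these raw timings to the same "per loop" figure that the command-line tool reports, the convention is to take the best (minimum) run and divide by the number of executions. A minimal sketch of that calculation (the format string is only for display):

```python
import timeit

number = 10000
measurements = timeit.repeat('[x**0.5 for x in range(1000)]', number=number)
# the best (smallest) total divided by the number of executions gives
# the same "per loop" figure that the command-line interface reports
print('%.2f usec per loop' % (min(measurements) / number * 1e6))
```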
The profile module
Focusing on a statement or two is a microscopic perspective on performance. Chances are, we have a long program and want to see what is causing it to run slowly. That has to happen before we can consider alternative statements or algorithms.
A program can run slowly for two general reasons: a part is running slowly, or a part is running too many times and that adds up to take too much time. We call these "performance hogs" the hot spots. Let's look at an example. Consider the following program, which uses a hill climbing algorithm to find hyperparameters for a perceptron model:
```python
# manually search perceptron hyperparameters for binary classification
from numpy import mean
from numpy.random import randn
from numpy.random import rand
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.linear_model import Perceptron

# objective function
def objective(X, y, cfg):
    # unpack config
    eta, alpha = cfg
    # define model
    model = Perceptron(penalty='elasticnet', alpha=alpha, eta0=eta)
    # define evaluation procedure
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    # evaluate model
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
    # calculate mean accuracy
    result = mean(scores)
    return result

# take a step in the search space
def step(cfg, step_size):
    # unpack the configuration
    eta, alpha = cfg
    # step eta
    new_eta = eta + randn() * step_size
    # check the bounds of eta
    if new_eta <= 0.0:
        new_eta = 1e-8
    if new_eta > 1.0:
        new_eta = 1.0
    # step alpha
    new_alpha = alpha + randn() * step_size
    # check the bounds of alpha
    if new_alpha < 0.0:
        new_alpha = 0.0
    # return the new configuration
    return [new_eta, new_alpha]

# hill climbing local search algorithm
def hillclimbing(X, y, objective, n_iter, step_size):
    # starting point for the search
    solution = [rand(), rand()]
    # evaluate the initial point
    solution_eval = objective(X, y, solution)
    # run the hill climb
    for i in range(n_iter):
        # take a step
        candidate = step(solution, step_size)
        # evaluate candidate point
        candidate_eval = objective(X, y, candidate)
        # check if we should keep the new point
        if candidate_eval >= solution_eval:
            # store the new point
            solution, solution_eval = candidate, candidate_eval
            # report progress
            print('>%d, cfg=%s %.5f' % (i, solution, solution_eval))
    return [solution, solution_eval]

# define dataset
X, y = make_classification(n_samples=1000, n_features=5, n_informative=2, n_redundant=1, random_state=1)
# define the total iterations
n_iter = 100
# step size in the search space
step_size = 0.1
# perform the hill climbing search
cfg, score = hillclimbing(X, y, objective, n_iter, step_size)
print('Done!')
print('cfg=%s: Mean Accuracy: %f' % (cfg, score))
```
Assuming we saved this program in the file `hillclimb.py`, we can run the profiler on the command line as follows:
python -m cProfile hillclimb.py
and the output will be the following:
```
>10, cfg=[0.3792455490265847, 0.21589566352848377] 0.78400
>17, cfg=[0.49105438202347707, 0.1342150084854657] 0.79833
>26, cfg=[0.5737524712834843, 0.016749795596210315] 0.80033
>47, cfg=[0.5067828976025809, 0.05280380038497864] 0.80133
>48, cfg=[0.5427345321546029, 0.0049895870979695875] 0.81167
Done!
cfg=[0.5427345321546029, 0.0049895870979695875]: Mean Accuracy: 0.811667
         2686451 function calls (2638255 primitive calls) in 5.500 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      101    0.001    0.000    4.892    0.048 hillclimb.py:11(objective)
        1    0.000    0.000    5.501    5.501 hillclimb.py:2(<module>)
      100    0.000    0.000    0.001    0.000 hillclimb.py:25(step)
        1    0.001    0.001    4.894    4.894 hillclimb.py:44(hillclimbing)
        1    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(<module>)
      303    0.000    0.000    0.008    0.000 <__array_function__ internals>:2(all)
      303    0.000    0.000    0.005    0.000 <__array_function__ internals>:2(amin)
        2    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(any)
        4    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(atleast_1d)
     3333    0.003    0.000    0.018    0.000 <__array_function__ internals>:2(bincount)
      103    0.000    0.000    0.001    0.000 <__array_function__ internals>:2(concatenate)
        3    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(copyto)
      606    0.001    0.000    0.010    0.000 <__array_function__ internals>:2(cumsum)
        6    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(dot)
        1    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(empty_like)
        1    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(inv)
        2    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(linspace)
        1    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(lstsq)
      101    0.000    0.000    0.005    0.000 <__array_function__ internals>:2(mean)
        2    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(ndim)
        1    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(outer)
        1    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(polyfit)
        1    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(polyval)
        1    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(prod)
      303    0.000    0.000    0.002    0.000 <__array_function__ internals>:2(ravel)
        2    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(result_type)
      303    0.001    0.000    0.001    0.000 <__array_function__ internals>:2(shape)
      303    0.000    0.000    0.035    0.000 <__array_function__ internals>:2(sort)
        4    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(trim_zeros)
     1617    0.002    0.000    0.112    0.000 <__array_function__ internals>:2(unique)
...
```
The normal output of the program is printed first, and then the profiler's statistics are printed. From the first row, we see that the function `objective()` in our program was run 101 times and took a total of 4.89 seconds. But these 4.89 seconds are mostly spent in the functions it called; the total time spent in the function itself is merely 0.001 second. The functions from dependent modules are also profiled, which is why you see a lot of numpy functions above too.
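To see the distinction between tottime and cumtime in isolation, here is a tiny toy program (not part of the hill climbing example, just an illustrative sketch). When profiled, `outer()` shows a large cumtime but a near-zero tottime, while `inner()` accounts for almost all of the tottime:

```python
import cProfile

def inner():
    # does the actual work, so its tottime is large
    total = 0
    for i in range(1_000_000):
        total += i * i
    return total

def outer():
    # spends almost no time itself; its cumtime includes the calls to inner()
    for _ in range(5):
        inner()

# profile the call to outer() and print the statistics
cProfile.run('outer()')
```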
The profiler output for hillclimb.py above is long and may not be useful to you, as it can be difficult to tell which function is the hot spot. Indeed, we can sort the output. For example, to see which function is called the greatest number of times, we can sort by `ncalls`:
python -m cProfile -s ncalls hillclimb.py
Its output is as follows, which says that the `get()` function of Python dicts is the most used function (but it only consumed 0.03 seconds in total out of the 5.6 seconds it took to finish the program):
```
         2685349 function calls (2637153 primitive calls) in 5.609 seconds

   Ordered by: call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   247588    0.029    0.000    0.029    0.000 {method 'get' of 'dict' objects}
   246196    0.028    0.000    0.028    0.000 inspect.py:2548(name)
   168057    0.018    0.000    0.018    0.000 {method 'append' of 'list' objects}
   161738    0.018    0.000    0.018    0.000 inspect.py:2560(kind)
   144431    0.021    0.000    0.029    0.000 {built-in method builtins.isinstance}
   142213    0.030    0.000    0.031    0.000 {built-in method builtins.getattr}
...
```
The other sort options are as follows:
Sort string | Meaning |
---|---|
calls | Call count |
cumulative | Cumulative time |
cumtime | Cumulative time |
file | File name |
filename | File name |
module | File name |
ncalls | Call count |
pcalls | Primitive call count |
line | Line number |
name | Function name |
nfl | Name/file/line |
stdname | Standard name |
time | Internal time |
tottime | Internal time |
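For example, sorting by cumulative time (cumtime in the table above) is often the quickest way to surface the hot spot, since it ranks functions by the total time spent in them and in everything they call:
python -m cProfile -s cumtime hillclimb.py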
If the program takes some time to finish, it is not reasonable to run the program many times just to find the profiling result in a different sort order. Indeed, we can save the profiler’s statistics for further processing, as follows:
python -m cProfile -o hillclimb.stats hillclimb.py
Similar to the above, this will run the program, but instead of printing the statistics to the screen, it saves them to a file. Afterwards, we can use the `pstats` module as follows to open the statistics file and get a prompt to manipulate the data:
python -m pstats hillclimb.stats
For example, we can use the sort command to change the sort order and the stats command to print what we saw above:
```
Welcome to the profile statistics browser.
hillclimb.stat% help

Documented commands (type help <topic>):
========================================
EOF  add  callees  callers  help  quit  read  reverse  sort  stats  strip

hillclimb.stat% sort ncall
hillclimb.stat% stats hillclimb
Thu Jan 13 16:44:10 2022    hillclimb.stat

         2686227 function calls (2638031 primitive calls) in 5.582 seconds

   Ordered by: call count
   List reduced from 3456 to 4 due to restriction <'hillclimb'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      101    0.001    0.000    4.951    0.049 hillclimb.py:11(objective)
      100    0.000    0.000    0.001    0.000 hillclimb.py:25(step)
        1    0.000    0.000    5.583    5.583 hillclimb.py:2(<module>)
        1    0.000    0.000    4.952    4.952 hillclimb.py:44(hillclimbing)

hillclimb.stat%
```
You will notice that the stats command above allows us to provide an extra argument. The argument can be a regular expression to match against function names, such that only the matched functions are printed. Hence it serves as a search string to filter the output.
This `pstats` browser allows us to see more than just the table above. The callers and callees commands show us which function calls which function, how many times it is called, and how much time is spent there. Hence we can consider it a breakdown of the function-level statistics. It is useful if you have a lot of functions that call one another and want to know how the time is spent in different scenarios. For example, this shows that the `objective()` function is called only by the `hillclimbing()` function, but the `hillclimbing()` function calls several other functions:
```
hillclimb.stat% callers objective
   Ordered by: call count
   List reduced from 3456 to 1 due to restriction <'objective'>

Function                    was called by...
                                ncalls  tottime  cumtime
hillclimb.py:11(objective)  <-     101    0.001    4.951  hillclimb.py:44(hillclimbing)

hillclimb.stat% callees hillclimbing
   Ordered by: call count
   List reduced from 3456 to 1 due to restriction <'hillclimbing'>

Function                       called...
                                   ncalls  tottime  cumtime
hillclimb.py:44(hillclimbing)  ->     101    0.001    4.951  hillclimb.py:11(objective)
                                      100    0.000    0.001  hillclimb.py:25(step)
                                        4    0.000    0.000  {built-in method builtins.print}
                                        2    0.000    0.000  {method 'rand' of 'numpy.random.mtrand.RandomState' objects}

hillclimb.stat%
```
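If you prefer not to use the interactive browser, the same statistics file can also be processed from a short script. Below is a minimal sketch that assumes the `hillclimb.stats` file saved earlier:

```python
import pstats

# load the saved statistics, drop the long directory prefixes,
# sort by cumulative time, and print the top 10 rows
stats = pstats.Stats('hillclimb.stats')
stats.strip_dirs().sort_stats('cumtime').print_stats(10)
```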
Using profiler inside code
The above examples assume you have the complete program saved in a file and profile the entire program. Sometimes, we want to focus on only a part of the program. For example, if we load a large module, it takes time to bootstrap, and we want to exclude this from profiling. In this case, we can invoke the profiler only for certain lines. An example is as follows, which is modified from the program above:
```python
# manually search perceptron hyperparameters for binary classification
import cProfile as profile
import pstats
from numpy import mean
from numpy.random import randn
from numpy.random import rand
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.linear_model import Perceptron

# objective function
def objective(X, y, cfg):
    # unpack config
    eta, alpha = cfg
    # define model
    model = Perceptron(penalty='elasticnet', alpha=alpha, eta0=eta)
    # define evaluation procedure
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    # evaluate model
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
    # calculate mean accuracy
    result = mean(scores)
    return result

# take a step in the search space
def step(cfg, step_size):
    # unpack the configuration
    eta, alpha = cfg
    # step eta
    new_eta = eta + randn() * step_size
    # check the bounds of eta
    if new_eta <= 0.0:
        new_eta = 1e-8
    if new_eta > 1.0:
        new_eta = 1.0
    # step alpha
    new_alpha = alpha + randn() * step_size
    # check the bounds of alpha
    if new_alpha < 0.0:
        new_alpha = 0.0
    # return the new configuration
    return [new_eta, new_alpha]

# hill climbing local search algorithm
def hillclimbing(X, y, objective, n_iter, step_size):
    # starting point for the search
    solution = [rand(), rand()]
    # evaluate the initial point
    solution_eval = objective(X, y, solution)
    # run the hill climb
    for i in range(n_iter):
        # take a step
        candidate = step(solution, step_size)
        # evaluate candidate point
        candidate_eval = objective(X, y, candidate)
        # check if we should keep the new point
        if candidate_eval >= solution_eval:
            # store the new point
            solution, solution_eval = candidate, candidate_eval
            # report progress
            print('>%d, cfg=%s %.5f' % (i, solution, solution_eval))
    return [solution, solution_eval]

# define dataset
X, y = make_classification(n_samples=1000, n_features=5, n_informative=2, n_redundant=1, random_state=1)
# define the total iterations
n_iter = 100
# step size in the search space
step_size = 0.1
# perform the hill climbing search with profiling
prof = profile.Profile()
prof.enable()
cfg, score = hillclimbing(X, y, objective, n_iter, step_size)
prof.disable()
# print program output
print('Done!')
print('cfg=%s: Mean Accuracy: %f' % (cfg, score))
# print profiling output
stats = pstats.Stats(prof).strip_dirs().sort_stats("cumtime")
stats.print_stats(10)  # top 10 rows
```
It will output the following:
```
>0, cfg=[0.3776271076534661, 0.2308364063203663] 0.75700
>3, cfg=[0.35803234662466354, 0.03204434939660264] 0.77567
>8, cfg=[0.3001050823005957, 0.0] 0.78633
>10, cfg=[0.39518618870158934, 0.0] 0.78633
>12, cfg=[0.4291267905390187, 0.0] 0.78633
>13, cfg=[0.4403131521968569, 0.0] 0.78633
>16, cfg=[0.38865272555918756, 0.0] 0.78633
>17, cfg=[0.38871654921891885, 0.0] 0.78633
>18, cfg=[0.4542440671724224, 0.0] 0.78633
>19, cfg=[0.44899743344802734, 0.0] 0.78633
>20, cfg=[0.5855375509507891, 0.0] 0.78633
>21, cfg=[0.5935318064858227, 0.0] 0.78633
>23, cfg=[0.7606367310048543, 0.0] 0.78633
>24, cfg=[0.855444293727846, 0.0] 0.78633
>25, cfg=[0.9505501566826242, 0.0] 0.78633
>26, cfg=[1.0, 0.0244821888204496] 0.79800
Done!
cfg=[1.0, 0.0244821888204496]: Mean Accuracy: 0.798000
         2179559 function calls (2140124 primitive calls) in 4.941 seconds

   Ordered by: cumulative time
   List reduced from 581 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    4.941    4.941 hillclimb.py:46(hillclimbing)
      101    0.001    0.000    4.939    0.049 hillclimb.py:13(objective)
      101    0.001    0.000    4.931    0.049 _validation.py:375(cross_val_score)
      101    0.002    0.000    4.930    0.049 _validation.py:48(cross_validate)
      101    0.005    0.000    4.903    0.049 parallel.py:960(__call__)
      101    0.235    0.002    3.089    0.031 parallel.py:920(retrieve)
     3030    0.004    0.000    2.849    0.001 _parallel_backends.py:537(wrap_future_result)
     3030    0.020    0.000    2.845    0.001 _base.py:417(result)
     2602    0.016    0.000    2.819    0.001 threading.py:280(wait)
    12447    2.796    0.000    2.796    0.000 {method 'acquire' of '_thread.lock' objects}
```
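As a side note, since Python 3.8, the `Profile` object can also be used as a context manager, which saves the explicit `enable()` and `disable()` calls. A minimal sketch of the same idea, reusing the names from the program above:

```python
import cProfile as profile
import pstats

# profile only the hill climbing search, not the imports or the data setup
with profile.Profile() as prof:
    cfg, score = hillclimbing(X, y, objective, n_iter, step_size)

stats = pstats.Stats(prof).strip_dirs().sort_stats("cumtime")
stats.print_stats(10)  # top 10 rows
```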
Caveats
Using the profiler with TensorFlow models may not produce what you would expect, especially if you have written your own custom layer or custom function for the model. If you did it correctly, TensorFlow is supposed to build the computation graph before your model is executed, and hence the logic will be changed. The profiler output will therefore not show your custom classes.
The same applies to some advanced modules that involve binary code. The profiler can see that you called some functions and mark them as "built-in" methods, but it cannot go any further into the compiled code.
Below is a short piece of code for the LeNet5 model on the MNIST classification problem. If you try to profile it and print the top 15 rows, you will see that a wrapper occupies the majority of the time and nothing can be shown beyond that:
```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Dense, AveragePooling2D, Flatten
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.callbacks import EarlyStopping

# Load and reshape data to shape of (n_sample, height, width, n_channel)
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = np.expand_dims(X_train, axis=3).astype('float32')
X_test = np.expand_dims(X_test, axis=3).astype('float32')

# One-hot encode the output
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# LeNet5 model
model = Sequential([
    Conv2D(6, (5,5), input_shape=(28,28,1), padding="same", activation="tanh"),
    AveragePooling2D((2,2), strides=2),
    Conv2D(16, (5,5), activation="tanh"),
    AveragePooling2D((2,2), strides=2),
    Conv2D(120, (5,5), activation="tanh"),
    Flatten(),
    Dense(84, activation="tanh"),
    Dense(10, activation="softmax")
])
model.summary(line_length=100)

# Training
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
earlystopping = EarlyStopping(monitor="val_loss", patience=2, restore_best_weights=True)
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=20, batch_size=32, callbacks=[earlystopping])

# Evaluate
print(model.evaluate(X_test, y_test, verbose=0))
```
In the result below, `TFE_Py_Execute` is marked as a "built-in" method, and it consumes 30.1 sec out of the total run time of 39.6 sec. Note that its tottime is the same as its cumtime, meaning that from the profiler's perspective, all of the time seems to be spent in this function and it does not call any other functions. This illustrates the limitation of Python's profiler.
```
         5962698 function calls (5728324 primitive calls) in 39.674 seconds

   Ordered by: cumulative time
   List reduced from 12295 to 15 due to restriction <15>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   3212/1    0.013    0.000   39.699   39.699 {built-in method builtins.exec}
        1    0.003    0.003   39.699   39.699 mnist.py:4(<module>)
     52/4    0.005    0.000   35.470    8.868 /usr/local/lib/python3.9/site-packages/keras/utils/traceback_utils.py:58(error_handler)
        1    0.089    0.089   34.334   34.334 /usr/local/lib/python3.9/site-packages/keras/engine/training.py:901(fit)
11075/9531    0.032    0.000   33.406    0.004 /usr/local/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py:138(error_handler)
     4689    0.089    0.000   33.017    0.007 /usr/local/lib/python3.9/site-packages/tensorflow/python/eager/def_function.py:882(__call__)
     4689    0.023    0.000   32.771    0.007 /usr/local/lib/python3.9/site-packages/tensorflow/python/eager/def_function.py:929(_call)
     4688    0.042    0.000   32.134    0.007 /usr/local/lib/python3.9/site-packages/tensorflow/python/eager/function.py:3125(__call__)
     4689    0.075    0.000   30.941    0.007 /usr/local/lib/python3.9/site-packages/tensorflow/python/eager/function.py:1888(_call_flat)
     4689    0.158    0.000   30.472    0.006 /usr/local/lib/python3.9/site-packages/tensorflow/python/eager/function.py:553(call)
     4689    0.034    0.000   30.152    0.006 /usr/local/lib/python3.9/site-packages/tensorflow/python/eager/execute.py:33(quick_execute)
     4689   30.105    0.006   30.105    0.006 {built-in method tensorflow.python._pywrap_tfe.TFE_Py_Execute}
  3185/24    0.021    0.000    3.902    0.163 <frozen importlib._bootstrap>:1002(_find_and_load)
  3169/10    0.014    0.000    3.901    0.390 <frozen importlib._bootstrap>:967(_find_and_load_unlocked)
  2885/12    0.009    0.000    3.901    0.325 <frozen importlib._bootstrap_external>:844(exec_module)
```
Finally, Python's profiler gives you only statistics on time, not memory usage. You may need to look for another library or tool for that purpose.
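If memory is your concern, the standard library's tracemalloc module is one option; it reports allocations rather than time. A minimal sketch (the list comprehension below is just a stand-in for the code you want to measure):

```python
import tracemalloc

tracemalloc.start()

# stand-in for the code whose memory usage you want to inspect
data = [str(x) for x in range(100000)]

# report the top 10 source lines by allocated memory
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics('lineno')[:10]:
    print(stat)
```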
Further Readings
The standard library modules `timeit`, `cProfile`, and `pstats` are covered in Python's documentation:
- timeit module: https://docs.python.org/3/library/timeit.html
- cProfile module and pstats module: https://docs.python.org/3/library/profile.html
The standard library's profiler is very powerful, but it is not the only one. If you want something more visual, you can try out the Python Call Graph module. It can produce a picture of how functions call each other using the GraphViz tool:
- Python Call Graph: https://pycallgraph.readthedocs.io/en/master/
The limitation of not being able to dig into compiled code can be addressed by using a profiler for compiled programs instead of Python's profiler. My favorite is Valgrind:
- Valgrind: https://valgrind.org/
But to use it, you may need to recompile your Python interpreter to turn on debugging support.
Summary
In this tutorial, we learned what a profiler is and what it can do. Specifically:
- We know how to compare small code fragments with the `timeit` module
- We saw that Python's `cProfile` module can provide detailed statistics on how time is spent
- We learned to use the `pstats` module on the output of `cProfile` to sort or filter the results