Solutions to the exercises of Missing Semester (2020), Lecture 7: Debugging and Profiling
Debugging
Use `journalctl` on Linux or `log show` on macOS to get the super user accesses and commands in the last day. If there aren't any you can execute some harmless commands such as `sudo ls` and check again.
When I run `log show --last 1d`, it keeps running for a long time. I don't know how long it may take, so I will just execute a harmless `sudo ls` and check the last few seconds of the log instead.
> sudo ls
> log show --last 10s | grep -E "sudo"
# the last lines of the output should show
# that ls was just run with super user access
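On Linux (which I cannot test on my Mac), the equivalent check with `journalctl` should look roughly like this, assuming a systemd-based distribution:
> sudo ls
> journalctl --since "1 day ago" | grep sudo
# look for lines like "sudo: <user> : ... COMMAND=/usr/bin/ls"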
Do this hands-on `pdb` tutorial to familiarize yourself with the commands. For a more in-depth tutorial read this.
I have finished this tutorial. Actually, I forked the repo and added a Chinese translation.
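For reference, the commands the tutorial drills can be practiced on a tiny throwaway script like the one below (the file name and the bug are made up for illustration):
# buggy.py -- a tiny script to practice pdb on
def average(numbers):
    total = 0
    for n in numbers:
        total += n
    return total / len(numbers)   # raises ZeroDivisionError for an empty list

if __name__ == "__main__":
    breakpoint()                  # drop into pdb here (Python 3.7+); or run: python -m pdb buggy.py
    print(average([]))
Inside the debugger, the usual commands are `l` (list), `n` (next), `s` (step), `p <expr>` (print), `c` (continue) and `q` (quit).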
Install `shellcheck` and try checking the following script. What is wrong with the code? Fix it. Install a linter plugin in your editor so you can get your warnings automatically.
#!/bin/sh
## Example: a typical script with several problems
for f in $(ls *.m3u)
do
    grep -qi hq.*mp3 $f \
        && echo -e 'Playlist $f contains a HQ file in mp3 format'
done
ERROR
- SC2045 (line 3): Iterating over ls output is fragile. Use globs.
WARNINGS
- SC2062 (line 5): Quote the grep pattern so the shell won't interpret it.
- SC3037 (line 6): In POSIX sh, echo flags are undefined. I think we can use `bash` instead of `sh`.
INFO
- SC2016 (line 6): Expressions don't expand in single quotes, use double quotes for that.
- SC2086 (line 5): Double quote to prevent globbing and word splitting.
- SC3010: After following the guidance of SC2045, I added the line `[[ -e $f ]] || break` below `do`. SC3010 then says: In POSIX sh, [[ ]] is undefined. So I changed the shebang to `#!/bin/bash`.
So the final version of the script is ⬇️
#!/bin/bash
## Example: a typical script with several problems
for f in *.m3u
do
[[ -e "$f" ]] || break # handle the case of no *.m3u files
grep -qi "hq.*mp3" "$f" \
&& echo -e "Playlist $f contains a HQ file in mp3 format"
done
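With `shellcheck` installed, checking the script from the command line is just (assuming it is saved as `script.sh`, a name I made up):
> shellcheck script.sh
# the original version prints the SC2045/SC2062/SC3037/SC2016/SC2086 findings listed above;
# the fixed version should report no issues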
(Advanced) Read about reversible debugging and get a simple example working using `rr` or `RevPDB`.
PASS :)
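For anyone who wants to reproduce it on Linux, a minimal `rr` session looks roughly like this (`./a.out` is just a placeholder for any binary built with debug info):
> rr record ./a.out        # run the program once and record the execution
> rr replay                # replay the latest recording inside gdb
(rr) break main
(rr) continue
(rr) reverse-continue      # execute backwards to the previous stop
Because the replay is deterministic, gdb's reverse-execution commands (`reverse-continue`, `reverse-step`, ...) work reliably under rr.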
Profiling
Here are some sorting algorithm implementations. Use `cProfile` and `line_profiler` to compare the runtime of insertion sort and quicksort. What is the bottleneck of each algorithm? Then use `memory_profiler` to check the memory consumption; why is insertion sort better? Check now the in-place version of quicksort. Challenge: Use `perf` to look at the cycle counts and cache hits and misses of each algorithm.
🤔 We can type `python -m cProfile -s tottime sorts.py 1000` in the terminal to get the runtime of insertion sort and quicksort. The argument `1000` means that each sorting function is run 1000 times, which matches the `ncalls` column below. Check the documentation for more details.
ncalls tottime percall cumtime percall filename:lineno(function)
...
1000 0.045 0.000 0.045 0.000 sorts.py:12(insertionsort)
33218/1000 0.044 0.000 0.068 0.000 sorts.py:25(quicksort)
...
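cProfile can also be driven from code if you only want to profile a single call. A hypothetical snippet (assuming `sorts.py` keeps its benchmark under `if __name__ == '__main__'` so it can be imported):
import cProfile
import random
from sorts import insertionsort, quicksort   # assumes sorts.py is importable

data = [random.randint(0, 10000) for _ in range(1000)]
cProfile.run('quicksort(data[:])', sort='tottime')       # profile quicksort on a copy of the data
cProfile.run('insertionsort(data[:])', sort='tottime')   # profile insertion sort on a fresh copy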
🤔 If you want to use `line_profiler` to compare the runtime of insertion sort and quicksort, you need to add the `@profile` decorator above both functions (install `line_profiler` first with `pip install line_profiler`). The code will look like ⬇️
@profile
def insertionsort(array):
...
@profile
def quicksort(array):
...
Then type `kernprof -v -l sorts.py` in your terminal. The output will look like ⬇️
Wrote profile results to sorts.py.lprof
Timer unit: 1e-06 s
Total time: 0.382157 s
File: sorts.py
Function: insertionsort at line 11
Line # Hits Time Per Hit % Time Line Contents
==============================================================
11 @profile
12 def insertionsort(array):
13
14 25881 10925.0 0.4 2.9 for i in range(len(array)):
15 24881 10814.0 0.4 2.8 j = i-1
16 24881 11411.0 0.5 3.0 v = array[i]
17 221660 133321.0 0.6 34.9 while j >= 0 and v < array[j]:
18 196779 113095.0 0.6 29.6 array[j+1] = array[j]
19 196779 88824.0 0.5 23.2 j -= 1
20 24881 13333.0 0.5 3.5 array[j+1] = v
21 1000 434.0 0.4 0.1 return array
Total time: 0.162542 s
File: sorts.py
Function: quicksort at line 24
Line # Hits Time Per Hit % Time Line Contents
==============================================================
24 @profile
25 def quicksort(array):
26 33150 24083.0 0.7 14.8 if len(array) <= 1:
27 17075 9918.0 0.6 6.1 return array
28 16075 10775.0 0.7 6.6 pivot = array[0]
29 16075 46333.0 2.9 28.5 left = [i for i in array[1:] if i < pivot]
30 16075 46896.0 2.9 28.9 right = [i for i in array[1:] if i >= pivot]
31 16075 24537.0 1.5 15.1 return quicksort(left) + [pivot] + quicksort(right)
So we can easily spot the bottlenecks (the rows with a high % Time): for insertion sort it is the inner while loop that shifts elements (lines 17-19, almost 90% of the time), and for quicksort it is the list comprehensions that build left and right (lines 29-30). More details can be found on this.
🤔 Now we come to the `memory_profiler` part. If you haven't installed it, you may run `pip install -U memory_profiler`. The usage of `memory_profiler` is quite similar to `line_profiler`: both need the `@profile` decorator, and then you can run `python -m memory_profiler sorts.py 1000`. The output will look like ⬇️
Filename: sorts.py
Line # Mem usage Increment Occurences Line Contents
============================================================
11 38.176 MiB 38.176 MiB 1000 @profile
12 def insertionsort(array):
13
14 38.176 MiB 0.000 MiB 25694 for i in range(len(array)):
15 38.176 MiB 0.000 MiB 24694 j = i-1
16 38.176 MiB 0.000 MiB 24694 v = array[i]
17 38.176 MiB 0.000 MiB 223188 while j >= 0 and v < array[j]:
18 38.176 MiB 0.000 MiB 198494 array[j+1] = array[j]
19 38.176 MiB 0.000 MiB 198494 j -= 1
20 38.176 MiB 0.000 MiB 24694 array[j+1] = v
21 38.176 MiB 0.000 MiB 1000 return array
Filename: sorts.py
Line # Mem usage Increment Occurences Line Contents
============================================================
24 38.172 MiB 38.148 MiB 33522 @profile
25 def quicksort(array):
26 38.172 MiB 0.000 MiB 33522 if len(array) <= 1:
27 38.172 MiB 0.000 MiB 17261 return array
28 38.172 MiB 0.000 MiB 16261 pivot = array[0]
29 38.172 MiB 0.004 MiB 156337 left = [i for i in array[1:] if i < pivot]
30 38.172 MiB 0.000 MiB 156337 right = [i for i in array[1:] if i >= pivot]
31 38.172 MiB 0.020 MiB 16261 return quicksort(left) + [pivot] + quicksort(right)
It is weird: insertion sort is no better than quicksort in memory usage 😿 In principle insertion sort should win, since it sorts in place while this quicksort builds new `left`/`right` lists at every level of recursion; the Increment column above is just too coarse (almost always 0.000 MiB) to show the difference clearly.
🤔 If we use the in-place version of quicksort, we can find that it uses slightly less memory.
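For reference, an in-place quicksort looks roughly like the sketch below (a Lomuto-partition variant; the in-place version that ships with the exercise may differ in details). It partitions within the original list instead of materializing new `left`/`right` lists, which is where the memory saving comes from.
def quicksort_inplace(array, low=0, high=None):
    # sorts array[low:high+1] in place; no intermediate lists are allocated
    if high is None:
        high = len(array) - 1
    if low >= high:
        return array
    pivot = array[high]
    i = low
    for j in range(low, high):            # Lomuto partition around the pivot
        if array[j] < pivot:
            array[i], array[j] = array[j], array[i]
            i += 1
    array[i], array[high] = array[high], array[i]
    quicksort_inplace(array, low, i - 1)
    quicksort_inplace(array, i + 1, high)
    return array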
🤔 On macOS there is no `perf`, so I will just skip this part.
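For readers on Linux, the challenge part would look something like this (a sketch I could not run myself; the available event names vary by CPU):
> perf stat -e cycles,cache-references,cache-misses python sorts.py 1000
# prints total cycle counts and cache reference/miss statistics for the whole run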
Here’s some (arguably convoluted) Python code for computing Fibonacci numbers using a function for each number.
#!/usr/bin/env python
def fib0(): return 0

def fib1(): return 1

s = """def fib{}(): return fib{}() + fib{}()"""

if __name__ == '__main__':
    for n in range(2, 10):
        exec(s.format(n, n-1, n-2))
    # from functools import lru_cache
    # for n in range(10):
    #     exec("fib{} = lru_cache(1)(fib{})".format(n, n))
    print(eval("fib9()"))
Put the code into a file and make it executable. Install prerequisites: `pycallgraph` and `graphviz`. (If you can run `dot`, you already have GraphViz.) Run the code as is with `pycallgraph graphviz -- ./fib.py` and check the `pycallgraph.png` file. How many times is `fib0` called? We can do better than that by memoizing the functions. Uncomment the commented lines and regenerate the images. How many times are we calling each `fibN` function now?
First, we need to install prerequisites ⬇️
> pip install pycallgraph
> brew install graphviz
Then we can run the following command ⬇️
> pycallgraph graphviz -- ./fib.py
Open `pycallgraph.png` (in the current working directory). From the graph we can see that `fib0` was called 21 times.
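For the second question, uncommenting the `lru_cache` lines turns the `__main__` block of `fib.py` into the following, so every `fibN` is rebound to a memoized wrapper before `fib9()` is evaluated. Regenerating the image with the same `pycallgraph` command should then show far fewer calls to each `fibN`.
if __name__ == '__main__':
    for n in range(2, 10):
        exec(s.format(n, n-1, n-2))
    from functools import lru_cache
    for n in range(10):
        exec("fib{} = lru_cache(1)(fib{})".format(n, n))
    print(eval("fib9()"))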
A common issue is that a port you want to listen on is already taken by another process. Let's learn how to discover that process pid. First execute `python -m http.server 4444` to start a minimal web server listening on port `4444`. On a separate terminal run `lsof | grep LISTEN` to print all listening processes and ports. Find that process pid and terminate it by running `kill <PID>`.
> python -m http.server 4444
> lsof | grep LISTEN
> kill <PID>  # <PID> can be found in the output of the last command
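If the unfiltered `lsof` output is too noisy, `lsof` can also filter by port directly:
> lsof -i :4444
# the PID column of the python process is the one to pass to kill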
Limiting a process's resources can be another handy tool in your toolbox. Try running `stress -c 3` and visualize the CPU consumption with `htop`. Now, execute `taskset --cpu-list 0,2 stress -c 3` and visualize it. Is `stress` taking three CPUs? Why not? Read `man taskset`. Challenge: achieve the same using `cgroups`. Try limiting the memory consumption of `stress -m`.
After I ran `stress -c 3`, `htop` showed that three CPUs were fully loaded.
On macOS there is no `taskset` or `cgroups`, so I cannot try that part locally.
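For readers on Linux, the expected behaviour is roughly the following sketch (untested on my side). `taskset` restricts the CPU affinity mask, so the three stress workers only have CPUs 0 and 2 to share, which is why `stress` cannot occupy three CPUs in the second run. The cgroups challenge can be approximated through systemd (assuming cgroup v2 and a recent systemd):
> stress -c 3                                        # htop: three cores near 100%
> taskset --cpu-list 0,2 stress -c 3                 # htop: only CPUs 0 and 2 busy, shared by 3 workers
> sudo systemd-run --scope -p AllowedCPUs=0,2 stress -c 3   # similar pinning via a cgroup
> sudo systemd-run --scope -p MemoryMax=256M stress -m 2    # cap the memory of stress -m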
(Advanced) The command `curl ipinfo.io` performs an HTTP request and fetches information about your public IP. Open Wireshark and try to sniff the request and reply packets that `curl` sent and received. (Hint: Use the `http` filter to just watch HTTP packets.)
PASS :)