1. Profiling your job
There are several ways to measure how long your script runs and to profile it so you can find where the time goes.
1.1. Single script. Using Linux tools.
GNU time reports the run time and memory use once the script finishes:
/usr/bin/time -v python3 run.py
Result:
Command being timed: "python3 run.py"
User time (seconds): 13.96
System time (seconds): 1.05
Percent of CPU this job got: 47%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:31.33
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 206852
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 146494
Voluntary context switches: 9083
Involuntary context switches: 513
Swaps: 0
File system inputs: 96
File system outputs: 76096
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
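The report above can also be post-processed. The following is a minimal Python sketch that parses a few fields of the verbose output and derives the CPU utilisation and peak memory from them. The function names `parse_time_report` and `elapsed_seconds` are illustrative, not part of any tool.

```python
def parse_time_report(text):
    """Map each 'Label: value' line of `time -v` output to its value string."""
    fields = {}
    for line in text.splitlines():
        # Split on the LAST ': ' so labels that contain colons (h:mm:ss) stay intact.
        label, sep, value = line.strip().rpartition(": ")
        if sep:
            fields[label] = value
    return fields

def elapsed_seconds(value):
    """Convert 'h:mm:ss' or 'm:ss.ss' into seconds."""
    seconds = 0.0
    for part in value.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

# Four lines copied from the result shown above.
report = """\
User time (seconds): 13.96
System time (seconds): 1.05
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:31.33
Maximum resident set size (kbytes): 206852
"""

fields = parse_time_report(report)
cpu = float(fields["User time (seconds)"]) + float(fields["System time (seconds)"])
wall = elapsed_seconds(fields["Elapsed (wall clock) time (h:mm:ss or m:ss)"])
cpu_percent = 100 * cpu / wall
peak_mib = int(fields["Maximum resident set size (kbytes)"]) / 1024

print(f"CPU utilisation: {cpu_percent:.1f}%")
print(f"Peak memory: {peak_mib:.1f} MiB")
```

Here the CPU was busy for roughly 15 of the 31 seconds of wall-clock time, in line with the 47% in the report. The rest was probably spent waiting, for example on I/O.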
1.2. Single script. Using Python tools.
For a more detailed timing profile, run the script under cProfile:
python3 -m cProfile -o profile_output.prof run.py
This command writes a profile report to profile_output.prof, which you can view with:
python3 -m pstats profile_output.prof
Then, interactively:
sort cumtime
stats 20
Or, to view a sorted profile output directly:
python3 -m cProfile -s time run.py
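The same report can be produced without the interactive prompt by using the `pstats` module from a script. The sketch below is self-contained: `slow_sum` is a made-up workload standing in for run.py, and the profile is written to a temporary directory rather than the current one.

```python
import cProfile
import io
import os
import pstats
import tempfile

def slow_sum(n):
    """Toy workload standing in for run.py (hypothetical)."""
    return sum(i * i for i in range(n))

# Profile the workload and save the raw stats, as `-o profile_output.prof` does.
profiler = cProfile.Profile()
profiler.enable()
slow_sum(200_000)
profiler.disable()

prof_path = os.path.join(tempfile.mkdtemp(), "profile_output.prof")
profiler.dump_stats(prof_path)

# Load the file and print the 20 entries with the highest cumulative time,
# the non-interactive equivalent of `sort cumtime` followed by `stats 20`.
buffer = io.StringIO()
stats = pstats.Stats(prof_path, stream=buffer)
stats.strip_dirs().sort_stats("cumtime").print_stats(20)
report_text = buffer.getvalue()
print(report_text)
```

To inspect an existing profile, only the last block is needed, with `prof_path` pointing at your own .prof file.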
1.3. Sbatch
1.3.1. Original command
# Run the Python script with srun and pass the unique PID
srun --ntasks=1 --exclusive python3 run.py -myPID "$pid" &
1.3.2. Profiling options using strace (preferred)
The following commands record the timing of each system call, but not the Python functions that trigger those calls.
strace -o ./ignore/trace_${pid}_output.txt -T -tt -f srun --ntasks=1 --exclusive python3 run.py -myPID "$pid"
strace -o ./ignore/trace_${pid}_output.txt -ff -tt -e trace=all python3 -m trace --trace run.py -myPID "$pid"
strace -o ./ignore/trace_${pid}_output.txt -T -tt -f -e trace=all srun --ntasks=1 --exclusive python3 -m trace --trace run.py -myPID "$pid"
trace_file="/scratch/users/bastudil/ignore/trace_${SLURM_ARRAY_TASK_ID}_${pid}.txt"
strace -o $trace_file -T -tt -f -e trace=all srun --ntasks=1 --exclusive python3 -m trace --trace run.py -myPID "$pid" &
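Trace files produced with `-T -tt -f` can get large, so it helps to total the time per system call. The Python sketch below assumes the usual layout of such a file (PID, timestamp, call, and the duration in angle brackets at the end of the line). The sample lines and the names `LINE_RE` and `summarize` are illustrative only.

```python
import re
from collections import defaultdict

LINE_RE = re.compile(
    r"^(?:\d+\s+)?"               # optional PID column (written when -f is used with -o)
    r"\d{2}:\d{2}:\d{2}\.\d+\s+"  # wall-clock timestamp from -tt
    r"(?:<\.\.\. )?(\w+)"         # syscall name, also on '<... name resumed>' lines
    r".*<(\d+\.\d+)>\s*$"         # time spent in the call, from -T
)

def summarize(lines):
    """Return [(syscall, [count, total_seconds]), ...] sorted by total time, largest first."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # signal, exit and '<unfinished ...>' lines do not match
            totals[match.group(1)][0] += 1
            totals[match.group(1)][1] += float(match.group(2))
    return sorted(totals.items(), key=lambda item: item[1][1], reverse=True)

# Made-up sample in the format strace writes; replace with open(trace_file).
SAMPLE = """
4021 10:15:02.000100 openat(AT_FDCWD, "data.csv", O_RDONLY) = 3 <0.000050>
4021 10:15:02.000200 read(3, "a,b,c"..., 4096) = 4096 <0.000300>
4021 10:15:02.000600 read(3, "", 4096) = 0 <0.000100>
4022 10:15:02.000700 write(1, "ok", 2) = 2 <0.000020>
4021 10:15:02.000900 close(3) = 0 <0.000010>
"""

summary = summarize(SAMPLE.splitlines())
for name, (count, seconds) in summary:
    print(f"{name:10s} calls={count:3d} total={seconds:.6f}s")
```

strace's own `-c` option prints a similar per-call summary, but it replaces the per-line trace unless you also pass `-C`.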
To track the open file descriptors (FDs) while the job runs:
# Define the output file for FDs
FD_LOG_FILE="./ignore/fd_log_${SLURM_JOB_ID}.txt"
# Run your job in the background and remember its PID
your_program &
job_pid=$!
# Log all open FDs every few seconds while the job is alive
while kill -0 "$job_pid" 2>/dev/null; do
    echo "Logging FDs at $(date)" >> "$FD_LOG_FILE"
    lsof -p "$job_pid" >> "$FD_LOG_FILE"
    sleep 5  # Log every 5 seconds, adjust as needed
done
# Wait for the job to complete
wait "$job_pid"
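The log written by the loop above can be reduced to one number per snapshot. The Python sketch below counts numeric FD entries in each `lsof` snapshot. The sample log is made up for illustration, and `count_fds` is a hypothetical helper name.

```python
MARKER = "Logging FDs at "

def count_fds(log_text):
    """Return [(timestamp, open FD count), ...], one entry per snapshot in the log."""
    snapshots = []
    for line in log_text.splitlines():
        if line.startswith(MARKER):
            snapshots.append([line[len(MARKER):], 0])
        elif snapshots:
            columns = line.split()
            # The 4th lsof column is FD. Entries such as '3r' or '0u' are real
            # descriptors; 'cwd', 'txt', 'mem' and the header 'FD' are not.
            if len(columns) > 3 and columns[3][0].isdigit():
                snapshots[-1][1] += 1
    return [tuple(snapshot) for snapshot in snapshots]

# Made-up sample; replace with open(FD_LOG_FILE).read().
SAMPLE_LOG = """\
Logging FDs at Mon Jan  1 10:00:00 UTC 2024
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
python3 4021 user  cwd    DIR    8,1     4096  123 /home/user
python3 4021 user    0u   CHR  136,0      0t0    3 /dev/pts/0
python3 4021 user    3r   REG    8,1    10240  456 /home/user/data.csv
Logging FDs at Mon Jan  1 10:00:05 UTC 2024
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
python3 4021 user  cwd    DIR    8,1     4096  123 /home/user
python3 4021 user    0u   CHR  136,0      0t0    3 /dev/pts/0
python3 4021 user    3r   REG    8,1    10240  456 /home/user/data.csv
python3 4021 user    4w   REG    8,1        0  789 /home/user/out.txt
"""

counts = count_fds(SAMPLE_LOG)
for timestamp, n_open in counts:
    print(f"{timestamp}: {n_open} open FDs")
```

A count that keeps growing between snapshots may point to files or sockets that are never closed.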
perf record -F 99 -g -o $trace_data -- srun --ntasks=1 --exclusive python3 -m trace --trace run.py -myPID "$pid" &
perf report -i "$trace_data"
strace -o $trace_file -T -tt -f -e trace=all srun --ntasks=1 --exclusive python3 -m cProfile -o $pstats run.py -myPID "$pid" >> "$python_console" 2>> "$python_error" &
python -m snakeviz "\\wsl.localhost\Ubuntu\home\bastudil_linux\sherlock_scratch\ignore\55148051_80_profile.pstats"
pip install pyinstrument
strace -o $trace_file -T -tt -f -e trace=all srun --ntasks=1 --exclusive pyinstrument -o $python_pyinstrument run.py -myPID "$pid" >> "$python_console" 2>> "$python_error" &
If some modules take a long time to import, you can preload them using sitecustomize.py. Python imports this file automatically whenever the interpreter starts, so it is a convenient place to import libraries that all your scripts use in an HPC environment. Each new interpreter still pays the import cost once, at startup. To find the directory where sitecustomize.py should go, print the site-packages paths:
import site
print(site.getsitepackages())
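As a rough sketch of what the sitecustomize.py file itself might contain, the code below imports a list of modules at startup and records how long each import took. The module list, the `preload` helper, and the `TIMINGS` name are all assumptions for illustration; replace the list with the libraries your scripts actually use.

```python
# sitecustomize.py (sketch): place it in one of the directories printed by
# site.getsitepackages() so that Python imports it at startup.
import importlib
import time

# Hypothetical list; the last entry shows how a missing module is handled.
PRELOAD_MODULES = ["json", "csv", "not_a_real_module_xyz"]

def preload(names):
    """Import each module by name; return {name: seconds taken} for those that loaded."""
    timings = {}
    for name in names:
        start = time.perf_counter()
        try:
            importlib.import_module(name)
        except ImportError:
            # A missing optional module should not break interpreter startup.
            continue
        timings[name] = time.perf_counter() - start
    return timings

TIMINGS = preload(PRELOAD_MODULES)
```

The recorded timings show which imports are slow and so which modules are worth listing. Running `python3 -X importtime run.py` gives a more detailed breakdown of import times.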
1.3.3. Profiling options using /usr/bin/time
/usr/bin/time -v srun --ntasks=1 --exclusive python3 run.py -myPID "$pid" > ./ignore/task_${pid}_output.txt 2> ./ignore/task_${pid}_time.txt &
1.3.4. Profiling options using perf
perf stat -e 'cpu-clock,task-clock,context-switches,page-faults,cycles,instructions,cache-misses' srun --ntasks=1 --exclusive python3 run.py -myPID "$pid"