Performance Analysis Resources
Profiling
- Brendan Gregg's perf examples: Examples of how to use perf to understand performance on Linux and profile hardware performance counters
- Tracy
- Magic Trace
- Visual Studio's built-in profiling tools
- Xcode Instruments[1]
Codegen
- Compiler Explorer: Tool for analyzing compiler codegen
  - Tip: Make sure to turn on optimizations (-O3). You will need to be careful to write your code in such a way that the compiler doesn't optimize it all away.
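As a rough illustration of the tip about code being optimized away, the sketch below contrasts two functions you might paste into Compiler Explorer. The function names are hypothetical and chosen only for this example. The first has no runtime inputs, so an optimizing compiler will typically fold it into a single constant; the second takes its data as parameters and returns the result, which generally forces the compiler to emit the actual loop.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical example: every input is known at compile time, so at -O3
// compilers typically replace the whole loop with one precomputed constant.
// The generated code then tells you nothing about the loop itself.
std::uint32_t sum_constant() {
    std::uint32_t total = 0;
    for (std::uint32_t i = 0; i < 1000; ++i) {
        total += i * i;
    }
    return total;
}

// Hypothetical example: the data and its length arrive as parameters and the
// result is returned, so the work cannot be folded away or discarded as dead.
std::uint32_t sum_squares(const std::uint32_t* data, std::size_t n) {
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i] * data[i];
    }
    return total;
}
```

Comparing the output for the two functions at -O3 should make the difference visible: one is usually a bare constant load, the other a real (often vectorized) loop.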
NOTE
It's a common pitfall to try to estimate x86 performance by eye. This is generally impossible on modern CPUs because of microarchitectural features such as pipelining, superscalar execution, branch prediction, and register renaming, and it is especially true for complicated architectures like x86. As such, it's necessary to always profile and benchmark in addition to looking at the generated code.
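To complement reading the generated code with a measurement, a minimal timing helper could look like the sketch below. Both `best_time_ns` and `sum_to` are hypothetical names introduced for this example, not part of any library. It assumes wall-clock time from `std::chrono::steady_clock` is good enough for a first look; a dedicated profiler or benchmarking framework will give more trustworthy numbers.

```cpp
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical helper: calls fn() `repeats` times and returns the fastest
// observed wall-clock time in nanoseconds. Taking the minimum reduces noise
// from scheduling and cold caches. If repeats <= 0, the maximum int64 value
// is returned because nothing was measured.
template <typename Fn>
std::int64_t best_time_ns(Fn&& fn, int repeats) {
    using clock = std::chrono::steady_clock;
    std::int64_t best = std::numeric_limits<std::int64_t>::max();
    for (int r = 0; r < repeats; ++r) {
        const auto start = clock::now();
        fn();
        const auto stop = clock::now();
        const std::int64_t ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
        if (ns < best) {
            best = ns;
        }
    }
    return best;
}

// Hypothetical workload: sums the integers in [0, n). An optimizer may
// rewrite this loop as a closed-form expression, which is one more reason
// to check the generated code alongside the timing.
std::uint64_t sum_to(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        total += i;
    }
    return total;
}
```

A typical call stores the result in a `volatile` variable so the work is not discarded, for example `volatile std::uint64_t sink; best_time_ns([&] { sink = sum_to(100000); }, 10);`.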
x86
- uiCA: Tool for simulating instruction performance on x86
  - Tip: Select llvm-mca as well as uiCA, and also select "Trace Table"
- uiCA's instruction table
- llvm-mca tool on Compiler Explorer
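When using llvm-mca on a larger function, it can help to restrict the analysis to one region. llvm-mca recognizes `# LLVM-MCA-BEGIN` and `# LLVM-MCA-END` comments in the assembly, which can be emitted from C++ through inline assembly. The sketch below assumes a GCC or Clang compatible compiler targeting x86; the function `dot` and the region name `dot_body` are hypothetical.

```cpp
#include <cstddef>

// Hypothetical kernel: dot product of two float arrays of length n.
float dot(const float* a, const float* b, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        // Emits an assembly comment that llvm-mca treats as the start of a
        // named analysis region. It produces no machine instructions.
        __asm volatile("# LLVM-MCA-BEGIN dot_body" ::: "memory");
        acc += a[i] * b[i];
        // Marks the end of the region opened above.
        __asm volatile("# LLVM-MCA-END" ::: "memory");
    }
    return acc;
}
```

The markers act as compiler barriers, so they can block optimizations such as loop vectorization. The marked code may therefore differ from what the compiler would emit without them, and it is worth comparing both versions.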
[1] This is an excellent tutorial for Instruments: https://www.jviotti.com/2024/01/29/using-xcode-instruments-for-cpp-cpu-profiling.html