Rust benchmarking with Criterion
July 03, 2024
I was recently looking into benchmarking some Rust code and thought I would write up how to do so with the Criterion library. Code for this post is available here at GitHub.
For this toy example we'll be writing code to determine the distance between two points. This is relatively easy to do with the Pythagorean theorem, but the downside is that it requires a relatively slow square root calculation.
Many times you don't need the exact distance. Say you have a point and a vector of points and want to know which one in the vector is the closest. In this case all you care about is how the distances compare to each other, so you can skip the square root and use the squared distance instead.
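To make the idea concrete, here is a minimal sketch of a closest-point search that never takes a square root. It uses plain (x, y) tuples, and the closest_index helper is a hypothetical name for illustration, not part of this post's library. Because the square root is monotonic, ordering by squared distance gives the same winner as ordering by true distance.

```rust
/// Squared Euclidean distance between two 2D points given as (x, y) tuples.
fn distance_squared(a: (f64, f64), b: (f64, f64)) -> f64 {
    let dx = a.0 - b.0;
    let dy = a.1 - b.1;
    dx * dx + dy * dy
}

/// Hypothetical helper: index of the candidate closest to `origin`,
/// or `None` when `candidates` is empty.
fn closest_index(origin: (f64, f64), candidates: &[(f64, f64)]) -> Option<usize> {
    candidates
        .iter()
        .enumerate()
        // Compare squared distances only; no sqrt is needed for ordering.
        .min_by(|(_, p), (_, q)| {
            distance_squared(origin, **p).total_cmp(&distance_squared(origin, **q))
        })
        .map(|(index, _)| index)
}

fn main() {
    let origin = (0.0, 0.0);
    let candidates = [(3.0, 4.0), (1.0, 1.0), (-2.0, 0.5)];
    // Squared distances are 25.0, 2.0 and 4.25, so index 1 is the closest.
    println!("{:?}", closest_index(origin, &candidates));
}
```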
So how much faster would it be to use squared distance? Let's find out.
Create a library
Let's create a library and add Criterion as a dependency.
$ cargo new --lib rust_benchmarking_example
$ cd rust_benchmarking_example
Now let's add criterion as a dev-dependency
$ cargo add criterion --dev --features html_reports
Your Cargo.toml file should now look something like this
[package]
name = "rust_benchmarking_example"
version = "0.1.0"
edition = "2021"
[dependencies]
[dev-dependencies]
criterion = { version = "0.5.1", features = ["html_reports"] }
Code
Open src/lib.rs and add the following code
pub struct Vector2 {
    pub x: f64,
    pub y: f64,
}

impl Vector2 {
    pub fn new(x: f64, y: f64) -> Vector2 {
        Vector2 { x, y }
    }

    pub fn distance(&self, rhs: &Vector2) -> f64 {
        let x = self.x - rhs.x;
        let y = self.y - rhs.y;
        (x * x + y * y).sqrt()
    }

    pub fn distance_squared(&self, rhs: &Vector2) -> f64 {
        let x = self.x - rhs.x;
        let y = self.y - rhs.y;
        x * x + y * y
    }
}
Unit tests
Let's make sure everything is working as expected. Add the following tests to src/lib.rs.
#[cfg(test)]
mod tests {
    use super::*;

    fn approximately(lhs: f64, rhs: f64) -> bool {
        (lhs - rhs).abs() < f64::EPSILON
    }

    #[test]
    fn distance_test() {
        let a = Vector2::new(1.0, 5.0);
        let b = Vector2::new(-2.0, 1.0);
        let result = a.distance(&b);
        assert!(approximately(result, 5.0), "{result} was not equal to 5.0");
        let result = b.distance(&a);
        assert!(approximately(result, 5.0), "{result} was not equal to 5.0");
    }

    #[test]
    fn distance_squared_test() {
        let a = Vector2::new(0.0, 0.0);
        let b = Vector2::new(3.0, 4.0);
        let result = a.distance_squared(&b);
        assert!(
            approximately(result, 25.0),
            "{result} was not equal to 25.0"
        );
        let result = b.distance_squared(&a);
        assert!(
            approximately(result, 25.0),
            "{result} was not equal to 25.0"
        );
    }
}
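The approximately helper in these tests uses f64::EPSILON as an absolute tolerance, which is fine here because the expected values are small and exactly representable. For larger magnitudes, a tolerance that scales with the inputs tends to be more forgiving. A minimal sketch follows; approximately_rel and the 1e-9 tolerance are hypothetical choices for illustration and are not part of this post's code.

```rust
/// Hypothetical helper: true when `lhs` and `rhs` differ by at most
/// `rel_tol` times the larger magnitude. The scale has a floor of 1.0,
/// so values near zero are effectively compared with an absolute tolerance.
fn approximately_rel(lhs: f64, rhs: f64, rel_tol: f64) -> bool {
    let scale = lhs.abs().max(rhs.abs()).max(1.0);
    (lhs - rhs).abs() <= rel_tol * scale
}

fn main() {
    // 0.1 + 0.2 is not exactly 0.3 in binary floating point.
    assert!(approximately_rel(0.1 + 0.2, 0.3, 1e-9));
    // An absolute f64::EPSILON check would reject this pair.
    assert!(approximately_rel(1.0e12 + 0.001, 1.0e12, 1e-9));
    // Clearly different values are still rejected.
    assert!(!approximately_rel(1.0, 1.1, 1e-9));
}
```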
Create a benchmark
Now let's create a benchmark with Criterion. Benchmark files live outside of the src directory, in a directory called benches that you will need to create. Let's do that and also create a file in it called distance.rs
At this point your directory structure should look like this
$ tree
.
├── Cargo.lock
├── Cargo.toml
├── benches
│   └── distance.rs
├── src
│   └── lib.rs
Add the following to your Cargo.toml file.
[[bench]]
name = "distance"
harness = false
and the following code to benches/distance.rs
use criterion::{criterion_group, criterion_main, Criterion};
use rust_benchmarking_example::*;

pub fn criterion_benchmark(c: &mut Criterion) {
    c.bench_function("distance", |bench| {
        bench.iter(|| {
            let a = Vector2::new(0.0, 0.0);
            let b = Vector2::new(3.0, 4.0);
            a.distance(&b);
        })
    });
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
You can then run the benchmark from the command line with cargo bench, which will run the code a number of times.
In the output look for a line like:
distance time: [934.09 ps 935.48 ps 937.28 ps]
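The three values describe a confidence interval (95% by default) around the measured time per iteration:
- 934.09 ps - lower bound of the confidence interval
- 935.48 ps - Criterion's best estimate of the time per iteration
- 937.28 ps - upper bound of the confidence interval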
The results are cached, so if you run cargo bench again it will compare the results against the most recent run. Here we can see that no real change was detected, which makes sense: other than some background noise on your computer, we are running the same code.
distance time: [934.73 ps 935.46 ps 936.38 ps]
change: [-0.1119% +0.0493% +0.2134%] (p = 0.55 > 0.05)
No change in performance detected.
And since we enabled the html_reports feature we can see them by opening target/criterion/report/index.html in a web browser, where you should see something that looks like this
Benchmark squared distance implementation
Now let's benchmark our distance squared code. Change
c.bench_function("distance", |bench| {
    bench.iter(|| {
        let a = Vector2::new(0.0, 0.0);
        let b = Vector2::new(3.0, 4.0);
        a.distance(&b);
    })
});
to
c.bench_function("distance", |bench| {
    bench.iter(|| {
        let a = Vector2::new(0.0, 0.0);
        let b = Vector2::new(3.0, 4.0);
        a.distance_squared(&b);
    })
});
and run cargo bench again. On my machine I get this error
Benchmarking distance: Analyzing
Criterion.rs ERROR: At least one measurement of benchmark distance took zero time per iteration. This should not be possible. If using iter_custom, please verify that your routine is correctly measured.
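It appears that the distance_squared() call effectively takes zero time to run and can't be measured. That makes some sense, since even the slower version using a square root took less than a nanosecond. A likely contributor is that the inputs are compile-time constants and the result is discarded, which leaves the optimizer free to fold or remove the computation entirely, so the benchmark may be timing little more than an empty loop.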
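One common way to keep the optimizer from removing work in a benchmark is std::hint::black_box, which hides a value from the optimizer without changing it. The sketch below is an assumption about how the routines could be written, not something measured for this post: distance_routine and distance_squared_routine are hypothetical names, and Vector2 is repeated here only so the block compiles on its own. The Criterion wiring is shown as a comment because it needs the criterion crate.

```rust
use std::hint::black_box;

pub struct Vector2 {
    pub x: f64,
    pub y: f64,
}

impl Vector2 {
    pub fn distance(&self, rhs: &Vector2) -> f64 {
        self.distance_squared(rhs).sqrt()
    }

    pub fn distance_squared(&self, rhs: &Vector2) -> f64 {
        let x = self.x - rhs.x;
        let y = self.y - rhs.y;
        x * x + y * y
    }
}

/// Hypothetical benchmark routine: the inputs pass through `black_box`, so the
/// compiler cannot treat them as known constants, and the result is returned
/// rather than discarded.
pub fn distance_routine(a: &Vector2, b: &Vector2) -> f64 {
    black_box(a).distance(black_box(b))
}

/// Same idea for the squared variant.
pub fn distance_squared_routine(a: &Vector2, b: &Vector2) -> f64 {
    black_box(a).distance_squared(black_box(b))
}

fn main() {
    let a = Vector2 { x: 0.0, y: 0.0 };
    let b = Vector2 { x: 3.0, y: 4.0 };

    // In benches/distance.rs this could be wired up roughly as:
    //   c.bench_function("distance_squared", |bench| {
    //       bench.iter(|| distance_squared_routine(&a, &b))
    //   });
    println!("distance = {}", distance_routine(&a, &b));
    println!("distance_squared = {}", distance_squared_routine(&a, &b));
}
```

Returning the value from the closure passed to bench.iter, instead of ending the statement with a semicolon, also gives Criterion a result to consume. Whether this yields a measurable time for such a tiny function will depend on your machine and compiler version.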
If you really want a measurement, you could artificially make the code slower by disabling compiler optimizations when benchmarking. Add this to your Cargo.toml
[profile.bench]
opt-level = 0
but you generally wouldn't want to do so, since measurements taken on an unoptimized build won't be very useful. If you were to do so, you would see output like
distance time: [9.9843 ns 10.000 ns 10.017 ns]
distance time: [9.3626 ns 9.3730 ns 9.3848 ns]
change: [-6.2155% -6.0317% -5.8404%] (p = 0.00 < 0.05)
Performance has improved.
Alternatively, you could do one benchmark run where only the a.distance(&b); call happens and another where it calls both that and a.distance_squared(&b); and you can see that the distance_squared call does not add any noticeable overhead.
But realistically the code in this toy example is probably so fast that it is not a great candidate for benchmarking.