I would like to announce the arrival of justbus-rs v0.2.0!
justbus-rs is an API for LTA's bus arrival timings, focused on performance. Initially, justbus-rs was developed as an example project using lta-rs, but I got carried away making things fast after I benchmarked it with wrk and found that it could serve 10x more req/s than a similar project I found on GitHub called arrivelah. After more optimisation, the current version is now 25x faster!
Disclaimer
I have no intention of bashing nodejs and/or the author of that project. It's a great project, and I just wanted to find out the difference in performance between the two languages.
So how did I make it so fast?
Before we move on to anything, here are the benchmarks. They were run on my personal computer with an i7 3770k @ 4.4GHz and 16GB RAM @ 2200MHz.
```
zeon@zeon-desktop ~ wrk -c100 -d15s -t4 http://localhost:8080/api/v1/timings/83139
Running 15s test @ http://localhost:8080/api/v1/timings/83139
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.10ms   27.83ms 839.99ms   99.60%
    Req/Sec    64.04k    17.90k   89.25k    46.88%
  3812462 requests in 15.09s, 6.37GB read
  Non-2xx or 3xx responses: 115
Requests/sec: 252570.08
Transfer/sec:    431.87MB
```
Hello World Benchmark
```
zeon@zeon-desktop ~ wrk -c100 -d15s -t4 http://localhost:8080/api/v1/dummy
Running 15s test @ http://localhost:8080/api/v1/dummy
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.33ms    2.74ms   38.44ms   89.56%
    Req/Sec    61.03k    15.00k   92.95k    61.47%
  3643319 requests in 15.10s, 444.74MB read
Requests/sec: 241334.14
Transfer/sec:     29.46MB
```
You might be thinking: wow, how can the endpoint serving live data be as fast as the one serving a static hello world? Let's take a look at the source code.
```rust
async fn get_timings(
    bus_stop: web::Path<u32>,
    lru: web::Data<Cache<u32, String>>,
    client: web::Data<LTAClient>,
) -> Result<HttpResponse, JustBusError> {
    let bus_stop = bus_stop.into_inner();
    let in_lru = lru.get(bus_stop);

    let res = match in_lru {
        // Cache hit: return the pre-serialized JSON directly
        Some(f) => HttpResponse::Ok().content_type("application/json").body(f),
        // Cache miss: hit the LTA API, serialize once, and cache the string
        None => {
            let arrivals = get_arrival(&client, bus_stop, None)
                .await
                .map_err(JustBusError::ClientError)?
                .services;

            let arrival_str = serde_json::to_string(&arrivals).unwrap();
            lru.insert(bus_stop, arrival_str.clone());

            HttpResponse::Ok()
                .content_type("application/json")
                .body(arrival_str)
        }
    };

    Ok(res)
}
```
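For completeness, here's roughly how such a handler could be wired up. This is a minimal sketch, not justbus-rs's actual main: it assumes actix-web 2.x, lta-rs's `LTAClient::with_api_key` constructor, and a `Cache::with_ttl` constructor like the one sketched further down; the TTL value is an arbitrary example.

```rust
use actix_web::{web, App, HttpServer};
use lta::lta_client::LTAClient; // import path may differ between lta-rs versions
use std::time::Duration;

#[actix_rt::main]
async fn main() -> std::io::Result<()> {
    // Shared state is created once and wrapped in web::Data (an Arc under the hood)
    let client = web::Data::new(LTAClient::with_api_key("your_api_key"));
    // 15s TTL is an example value, not necessarily what justbus-rs uses
    let cache = web::Data::new(Cache::<u32, String>::with_ttl(Duration::from_secs(15)));

    HttpServer::new(move || {
        App::new()
            .app_data(client.clone())
            .app_data(cache.clone())
            .route("/api/v1/timings/{bus_stop}", web::get().to(get_timings))
    })
    .bind("127.0.0.1:8080")?
    .run()
    .await
}
```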
As you can see from the code above, the strategy boils down to:

- Caching responses in a `Cache<u32, String>`. Caching the serialized JSON rather than the structs avoids re-serializing on every request. This optimisation added another 100k req/s over the initial `Cache<u32, Vec<ArrivalBusService>>`.
- Going lock-free with `Arc<Cache<u32, String>>`. Initially I used `Arc<parking_lot::RwLock<Cache<u32, Vec<ArrivalBusService>>>>`, which yielded 100k req/s, still way faster than the nodejs version. A sketch of such a cache follows this list.
- Using the fastest web framework available (i.e. `Actix-web`).
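The `Cache` type itself isn't shown in this post, so here is a minimal sketch of what a concurrent TTL cache along these lines could look like. It uses the dashmap crate (a sharded concurrent map, so not strictly lock-free, but it avoids a single global lock); the actual justbus-rs internals may differ.

```rust
use dashmap::DashMap;
use std::hash::Hash;
use std::time::{Duration, Instant};

/// A concurrent cache that treats entries older than `ttl` as misses.
pub struct Cache<K: Eq + Hash, V: Clone> {
    ttl: Duration,
    map: DashMap<K, (Instant, V)>,
}

impl<K: Eq + Hash, V: Clone> Cache<K, V> {
    pub fn with_ttl(ttl: Duration) -> Self {
        Cache {
            ttl,
            map: DashMap::new(),
        }
    }

    /// Returns a clone of the cached value, or None if absent or expired.
    pub fn get(&self, key: K) -> Option<V> {
        self.map.get(&key).and_then(|entry| {
            let (inserted_at, value) = entry.value();
            if inserted_at.elapsed() < self.ttl {
                Some(value.clone())
            } else {
                None
            }
        })
    }

    /// Inserts or refreshes an entry, timestamping it with `Instant::now()`.
    pub fn insert(&self, key: K, value: V) {
        self.map.insert(key, (Instant::now(), value));
    }
}
```

Expired entries are simply treated as misses and overwritten by the next successful fetch, so no background eviction is needed, and on the hot path a hit is just a shard lookup plus a `String` clone.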
You're not telling me the whole picture. What are the drawbacks?
It's Rust itself. While most of the newer programming languages released within the past decade (e.g. Kotlin, Go) were designed with simplicity in mind to improve developer productivity, Rust has a lot of boilerplate (though not as much as Java). Rust's compile times are also much longer than those of most mainstream languages.
In essence, you are paying in terms of:

- Developer productivity
- Compile time (it takes a whopping 4 minutes to compile in `--release` mode on my personal computer)

in exchange for performance and, most importantly, runtime safety.