Crystal by Numbers

Benchmarking Kemal, Ruby on Rails, and Sinatra

TL;DR Kemal delivers 8.3x more requests than Rails and 1.5x than Sinatra, using only 15MiB (against 110MiB in Rails and 47MiB in Sinatra) and 56% of CPU (against 109% in Rails and 114% in Sinatra).

Why doing this?

When you hear anyone speaking about performance, you see a lot of comparative with C lang. But remember: the machine runs the binary code after compiling C, and not the C code itself.

What if we could compile Ruby (or actually a very similar dialect of Ruby) directly into fast machine code? That’s where Crystal lang comes in.

In order to create a benchmark of Crystal, I chose to compare using the workflow I am already used to: web development.

In Crystal’s world I chose Kemal and in Ruby’s I chose both Ruby on Rails and Sinatra.

Software used to build the APIs

Anything else, such as PostgreSQL, Docker, and docker-compose, are the same for all apps.

The apps’ behavior

The code for all three apps are on GitHub (Crystal API using Kemal and Ruby on Rails and Sinatra API).

I chose a very large database (IMDB) to also compare how all of the options behave under I/O pressure. Most synthetic benchmarks only test CPU or memory usage in isolation. But in a real world scenario, many cases are bound by a lot of I/O.

In addition to a heavy database, I want to add at least a simple authentication layer to match more closely a real-world scenario, instead of using simple “Hello World” examples.

The API resulted in two endpoints.

The first one, /titles, returns the Id, Title, Production Year, and the Kind of a Movie. The Kind itself is a relation with kind_type table, so we have to perform a second query to fetch that information and mix up its results into the final JSON.

The second one, /names, basically returns the actors and actresses.

The queries for both of them are the same.

In Kemal (using Crecto), a regular GET /titles is like:

 1.61ms SELECT title.* FROM title ORDER BY id desc LIMIT 10 OFFSET 0
734.6µs SELECT kind_type.* FROM kind_type WHERE  kind_type.id IN ('2', '2', '7', '7', '1', '7', '7', '7', '7', '7')

In Rails:

D, [2017-04-25T16:05:57.377097 #1] DEBUG -- : [68d13454-3488-40ea-9bac-8c75cb1357a2]   Title Load (0.5ms)  SELECT  "title".* FROM "title" ORDER BY id desc LIMIT $1 OFFSET $2  [["LIMIT", 10], ["OFFSET", 0]]
D, [2017-04-25T16:05:57.379987 #1] DEBUG -- : [68d13454-3488-40ea-9bac-8c75cb1357a2]   KindType Load (0.5ms)  SELECT "kind_type".* FROM "kind_type" WHERE "kind_type"."id" IN (2, 7, 1)

In Sinatra (using Sequel):

I, [2017-04-25T16:10:29.526161 #1]  INFO -- : (0.000529s) SELECT "id", "title", "production_year", "kind_id" FROM "title" ORDER BY "id" DESC LIMIT 10 OFFSET 0
I, [2017-04-25T16:10:29.527145 #1]  INFO -- : (0.000338s) SELECT * FROM "kind_type" WHERE ("kind_type"."id" IN (2, 7, 1))

Last but not least: I am not using any “extraordinary” programming concept, like metaprogramming in Ruby or macros in Crystal.

How did we analyze it?

Hardware

The machine used on the test is a laptop which has an SSD, Intel(R) Core(TM) i5–3337U CPU @ 1.80GHz, and 8GB RAM.

No swap memory was activated during the tests.

Requests

The software used to get the numbers for the request was wrk:

% wrk -t10 -c30 -d20s -H "Authorization: Token token:cf45e7b42fbc800c61462988ad1156d2" "http://localhost:3000/titles"

This runs a benchmark for 20 seconds, using 10 threads, and keeping 30 HTTP connections open.

I got the following values for the requests:

Requests/Sec using wrkRequests/Sec using wrk

As expected, GET /titles is the slowest endpoint for all frameworks.

CPU and memory usage

In order to get the numbers for CPU and memory usage I used docker stats like this:

% docker stats --format "{{.MemUsage}}:{{.CPUPerc}}" "crystaldemo_sinatra-prod_1" --no-stream

And while using wrk, I got the following values:

CPU and Memory usage by docker statsCPU and Memory usage by docker stats

For this simple scenario, with controlled (paginated) API endpoints receiving around the same amount of data from the database each time, the memory usage remains stable even after many requests.

But the interesting thing to notice is how both Rails and Sinatra need to exhaust the CPU (2 processes in Puma) while a single Kemal process uses no more than half the capacity of the CPU core, meaning that it’s being limited by the I/O from the database.

And with all those numbers I draw this chart:

Conclusion

The memory usage was expected to be lower in Crystal because it’s compiled down to a very small (6.9Mb) native binary, instead of having to load a heavy interpreter (Ruby) with lots of data (Rails/Sinatra frameworks).

I felt at home developing in Crystal, as it’s built to be really close to Ruby’s features and standard APIs. Objects, classes, everything. Only one thing will annoy you: sometimes you have to explicitly state the type of some variables. Thankfully, there is a type called union type, which in my opinion is a killer feature for a compiled language.

I am at home developing in Sinatra. If you are used to create code outside Rails or even to create POROs, it is possible to develop in Crystal with no problem at all.

For Heroku, there is a buildpack and the deploy is as easy as usual.

If you want to discover other technical details, I encourage you to check out the Crystal and Rails/Sinatra repositories.

Mind you: it does not take too long to learn the language, but as with any language, it takes time to master it 😉

Further reading for Crystal and Kemal:

A special thanks to Serdar Dogruyol for helping out with the development 🙂

Thanks to Fabio Akita, Marcle Rodrigues, Will Soares, Thiago Araújo Silva, and Alessandro Dias.

We want to work with you. Check out our "What We Do" section!