Sunday, April 27, 2014

Fast file for writing terms

Recently I needed to write a massive amount of sensor data into a database, and I quickly ran into its limitations. After some analysis I found that the data can be written independently, based on the source it comes from. So one solution is to write the sensor data into files belonging to the individual sources.

That solution caused no real problems until I needed to implement a Bitcask-like merge operation. During a merge we open a data file for reading and create a new file for writing. We read every record from the first file, check whether its retention condition still holds, and write the record to the new file if we need to keep it. This requires a massive number of small writes (around 1 KB each). The speed of the copy wasn't very convincing, to put it mildly.
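The merge loop described above can be sketched roughly as follows. This is only an illustration under assumptions that are not from my real code: the module name merge_sketch, the length-prefixed record format and the Keep callback are all made up for the example.

```erlang
-module(merge_sketch).
-export([merge/3]).

%% ASSUMED record format (hypothetical): <<Size:32, Payload:Size/binary>>,
%% with non-empty payloads. Keep is a fun(Payload) -> boolean() that stands
%% in for the retention condition.
%% Returns {ok, NumberOfRecordsKept}.
merge(OldPath, NewPath, Keep) when is_function(Keep, 1) ->
    {ok, In} = file:open(OldPath, [raw, binary, read]),
    {ok, Out} = file:open(NewPath, [raw, binary, write]),
    try copy(In, Out, Keep, 0)
    after
        ok = file:close(In),
        ok = file:close(Out)
    end.

%% Read one record at a time and write it out only if Keep says so.
%% A truncated header or payload makes the pattern match fail (crash).
copy(In, Out, Keep, Kept) ->
    case file:read(In, 4) of
        eof ->
            {ok, Kept};
        {ok, <<Size:32>>} ->
            {ok, <<Payload:Size/binary>>} = file:read(In, Size),
            case Keep(Payload) of
                true ->
                    %% One small write per kept record: this is the slow part.
                    ok = file:write(Out, [<<Size:32>>, Payload]),
                    copy(In, Out, Keep, Kept + 1);
                false ->
                    copy(In, Out, Keep, Kept)
            end
    end.
```

Every kept record costs one file:write/2 call here, which is exactly the pattern that made the copy slow.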

Erlang file types

In Erlang there are two types of file we can use. The first is the (non-raw) file, which spawns a process for the file, so every file operation is a message sent to that process, which reacts by reading or writing data. One can expect this to work well with larger binaries, but it won't perform brilliantly when the binaries are small. The other is the raw file, where no controlling process is spawned, so all we have is a wrapped Erlang port. According to some fprof profiling, a big share of the computing time is then spent in port communication.
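To make the difference concrete, here is a minimal sketch that opens the same file both ways. The module name file_modes and the demo/1 function are made up for illustration; file:open/2, file:write/2 and file:close/1 are the standard calls.

```erlang
-module(file_modes).
-export([demo/1]).

%% Writes one small binary through a non-raw and then a raw handle.
%% Returns {NormalIsPid, RawIsPid}.
demo(Path) ->
    %% Non-raw: the io device is the pid of a process that owns the file.
    {ok, Dev} = file:open(Path, [write, binary]),
    ok = file:write(Dev, <<"record">>),
    NormalIsPid = is_pid(Dev),
    ok = file:close(Dev),

    %% Raw: no controlling process; only the process that opened the
    %% file can use the handle.
    {ok, Fd} = file:open(Path, [write, binary, raw]),
    ok = file:write(Fd, <<"record">>),
    RawIsPid = is_pid(Fd),
    ok = file:close(Fd),
    {NormalIsPid, RawIsPid}.
```

The non-raw handle is a pid, so each write is a message round trip; the raw handle is not, so each write goes straight to the driver call.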

Fast file

That is what drove me to implement a fast_file based on Joe Armstrong's idea. Instead of writing each piece of data to a place that requires a cross-context call (kernel call or port command), we collect the data in a buffer and flush it only when the buffer has grown big enough.

The fast file module defines a record which holds a buffer for reading and writing. Yes, one buffer. While we are writing, the buffer serves as a write buffer. If we want to read, the buffer is synced and then serves as a read buffer. So fast file remembers the last operation, too.
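The write side of the idea can be sketched like this. It is neither my fast_file module nor Joe's code, only a minimal write-only illustration: the module name buffered_writer, the bw record and the Limit parameter are all hypothetical, and the read side and the "last operation" bookkeeping are left out.

```erlang
-module(buffered_writer).
-export([open/2, write/2, flush/1, close/1]).

%% Hypothetical record: the raw file handle, the pending data as an iolist,
%% the number of pending bytes, and the flush threshold in bytes.
-record(bw, {fd, buf = [], size = 0, limit}).

%% Open Path for writing; a flush happens once Limit bytes are pending.
open(Path, Limit) when is_integer(Limit), Limit > 0 ->
    case file:open(Path, [raw, binary, write]) of
        {ok, Fd} -> {ok, #bw{fd = Fd, limit = Limit}};
        Error -> Error
    end.

%% Append Bin to the buffer. The caller must keep the returned record,
%% because the old one no longer reflects the buffer.
write(#bw{buf = Buf, size = Size, limit = Limit} = W, Bin) when is_binary(Bin) ->
    W1 = W#bw{buf = [Buf, Bin], size = Size + byte_size(Bin)},
    case W1#bw.size >= Limit of
        true -> flush(W1);
        false -> {ok, W1}
    end.

%% Hand all pending data to the file in a single file:write/2 call.
flush(#bw{size = 0} = W) ->
    {ok, W};
flush(#bw{fd = Fd, buf = Buf} = W) ->
    case file:write(Fd, Buf) of
        ok -> {ok, W#bw{buf = [], size = 0}};
        Error -> Error
    end.

%% Flush what is left, then close the file.
close(W) ->
    case flush(W) of
        {ok, #bw{fd = Fd}} -> file:close(Fd);
        Error -> Error
    end.
```

Because the buffer only ever holds whole binaries passed to write/2, a flush never splits a record, as long as each record is written with one write/2 call.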


I wrote smaller and bigger chunks of binaries into a normal Erlang file, a raw file and a fast file. I ran the tests on my laptop (Core i5 2.4GHz, 6GB RAM, 640GB 5400rpm HDD, ext4).

Test            Normal file   Raw file    Fast file
100 big         280ms         15ms        24ms
1000 big        2 336ms       123ms       222ms
10000 small     338ms         153ms       7ms
100000 small    2 366ms       1 604ms     79ms
200000 small    4 854ms       3 088ms     163ms

With one million writes, only fast file finished without running into the timeout (763ms). We can see that buffering still pays off, at least for small writes.

How dangerous is it to buffer data?

Questions come up, such as: what if the process, the Erlang VM or the OS crashes? There is also a usability cost. Since fast_file works on an ever-changing record, we need to keep the updated fast file record whenever a read or write happens. A normal file is much more comfortable to use: we have an {ok, file:io_device()}, and reads and writes leave the io device (the port in most cases) unchanged.

If the process crashes, we lose the data that hasn't been written yet. The good news is that we don't cross record boundaries when writing, so we don't need to repair the file when we open it after a crash. If the Erlang VM crashes, the story is the same. If the OS crashes, it depends on how the OS handles the file buffer. On Linux, filesystems such as ext4 accept a commit=nrsec option when mounting a device. It means that every nrsec seconds all data is synced to the device. If the crash happens between two commits, there is a chance of data loss.

Until I find a good place for my implementation, you can check Joe's elib1_fast_write.erl.




About me

My name is Richárd Jónás and I live in Budapest, Hungary. In this blog I want to share my coding experiences in Erlang, Elixir and other languages I use. Some topics are simpler ones, but you can use them as a reference. I also present some of my thoughts about developing distributed systems.