Sunday, February 14, 2016

Elixir - REST with cowboy

To be honest, in the last couple of months when I implemented REST interfaces in cowboy, I always implemented the happy path and followed an okay-we-will-see principle for the corner cases. That didn't feel good enough, so I decided to dig deeper into how to properly implement a REST interface in cowboy. The first thing we need to decide is how many handler modules we need. The fewer the better, of course: we don't want to maintain a bunch of modules for a bunch of different HTTP methods. It seemed logical to me to have one module per REST resource (product, shop, user, etc.).

The next question is how to handle the different outcomes: not found, not authorized, malformed request, and so on. So I checked the cowboy documentation and found that in REST modules we need to implement an initialization function and handlers for the different content types. For a while I lived with those happily... but they are not enough. There are other callback functions which are good for something.

Cracking the code

In order to know how to implement a proper REST interface we need to understand how cowboy implements REST handlers. Under the hood cowboy REST handlers are implemented as a finite state machine in the cowboy_rest module. Check the source! Yes, I know, it is an intimidatingly long module, but don't fear it: the functions are just the states of the automaton.

As you can see, the first interaction with the REST FSM is the invocation of rest_init/2. Here we have the opportunity to inspect the HTTP request and set a default state which will be stored by the underlying REST FSM. You can also make certain initialisations here if you want (read some cookie values, and so on).

A bit below you can find a good example of how our callbacks are called. Check the known_methods/2 function: it will call our ?MODULE:known_methods/2 callback. There is a wrapper function to execute that call, called call/3. It gives back no_call if the callback is not exported (the default case, in a sense), or whatever value our callback function returns. If you are unsure what a callback should return, check the source of the function in cowboy_rest which calls it. Here you can see that we can return {:halt, req, state} from our callback, or {["GET", "POST"], req, state} as a normal response. The next/3 function sets the next state of the REST FSM. Easy, right?
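
For example, a known_methods/2 callback in our handler could look like the sketch below; the maintenance-header check is made up for illustration, but it shows both return shapes: a method list keeps the FSM going, while :halt stops it early after we reply ourselves.

def known_methods(req, state) do
  case :cowboy_req.header("x-maintenance", req) do
    {"true", req2} ->
      # reply ourselves, then stop the REST FSM early with :halt
      {:ok, req3} = :cowboy_req.reply(503, [], "maintenance", req2)
      {:halt, req3, state}
    {_, req2} ->
      {["GET", "HEAD", "OPTIONS", "POST", "PUT", "DELETE"], req2, state}
  end
end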

Flowcharts

Since most of us understand graphs better, here are the REST flowcharts which describe what happens on the different return values of the different callbacks.

In known_methods we can check and return the methods that are possible for a resource at all (the OPTIONS method can be used to query them). Generally it follows from the rest handler what is implemented and what is not; sometimes it depends on the content type of the request, so we can implement a little logic which generates the possible HTTP methods. In allowed_methods we can restrict the methods for a specific resource (or for a specific user identified by a cookie). Let us say we have a read-only table in the SQL database, so we don't want to provide the POST and PUT methods. In is_authorized we can check permissions depending on the current user, if we have such a feature.
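
As a sketch, an is_authorized/2 callback could look like this; the token check is a made-up placeholder (Store.Auth.valid_token?/1 is not part of cowboy). Returning {false, value} makes cowboy reply 401 with that value in the WWW-Authenticate header.

# Sketch of an is_authorized/2 callback; Store.Auth.valid_token?/1 is a
# hypothetical helper of ours.
def is_authorized(req, state) do
  case :cowboy_req.header("authorization", req) do
    {"Bearer " <> token, req2} ->
      if Store.Auth.valid_token?(token) do
        {true, req2, state}
      else
        {{false, "Bearer realm=\"store\""}, req2, state}
      end
    {_, req2} ->
      {{false, "Bearer realm=\"store\""}, req2, state}
  end
end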

We have the content_types_provided callback for specifying which content types we can generate and which of our sub-handler functions produces each of them. Later you will see how the content type the client wants is bound to the functions we have. In content_types_accepted we can do the same for the incoming data, based on the request headers.

Let us implement a very simple, CRUD-like data access service for products. It will have

  • GET / for listing all the data in json
  • GET /:id for fetching a specific product (results in 200 or 404)
  • POST / for creating a new product (results in 201 CREATED)
  • PUT /:id for updating a product
  • DELETE /:id for deleting a product
You can implement PATCH as homework. It is a partial update, where the client sends only the changed part of the object.

defmodule Store.ProductHandler do
  require Logger

  def init(protocol, _req, _opts) do
    Logger.info("In init/3 #{inspect protocol}")
    {:upgrade, :protocol, :cowboy_rest}
  end

  # init state to be the empty map
  def rest_init(req, _state) do
    {method, req2} = :cowboy_req.method(req)
    {path_info, req3} = :cowboy_req.path_info(req2)
    state = %{method: method, path_info: path_info}
    Logger.info("state = #{inspect state}")
    {:ok, req3, state}
  end

  def content_types_provided(req, state) do
    {[{"application/json", :handle_req}], req, state}
  end

  def content_types_accepted(req, state) do
    {[{{"application", "json", :"*"}, :handle_in}], req, state}
  end

  def allowed_methods(req, state) do
    {["GET", "PUT", "POST", "PATCH", "DELETE"], req, state}
  end

  # Handling 404 code
  def resource_exists(req, %{:path_info => []} = state) do
    {true, req, state}
  end
  def resource_exists(req, %{:path_info => [id]} = state) do
    Logger.info("Checking if #{id} exists")
    case :ets.lookup(:repo, String.to_integer(id)) do
      [{_id, obj}] ->
        {true, req, Map.put(state, :obj, obj)}
      _ ->
        {false, req, state}
    end
  end

  # Handle DELETE method
  def delete_resource(req, %{:obj => obj} = state) do
    :ets.delete(:repo, obj["id"])
    {true, req, state}
  end

  def handle_req(req, %{:obj => obj} = state) do
    # Handle GET /id
    {Poison.encode!(obj), req, state}
  end
  def handle_req(req, state) do
    # Handle GET /
    response = :ets.tab2list(:repo)
                |> Enum.map(fn({_id, obj}) -> obj end)
                |> Poison.encode!
    {response, req, state}
  end

  # Don't allow post on missing resource -> 404
  def allow_missing_post(req, state) do
    {false, req, state}
  end

  # Handle PUT or POST if resource is not missing
  def handle_in(req, state) do
    {:ok, body, req2} = :cowboy_req.body(req)
    obj = Poison.decode!(body)
    Logger.info("Accepting #{inspect obj}")
    :ets.insert(:repo, [{obj["id"], obj}])
    {true, req2, state}
  end
end

(That is what I like in Elixir: I didn't even need to reformat my code after pasting it here ;) ).

You can see that in rest_init I cache the method and the path elements in the state, which is a map. The handle_req function handles the requests which typically don't have request bodies (GET, HEAD). In handle_in I handle incoming data (POST, PUT).

In resource_exists I check if a specific record exists in the ETS table. But if we get the index (the list of all objects) there is no id and we don't need such a check. When I check the existence of an object, I can cache the object, since it is very probable that I will need it again. I cache it in the REST FSM state, which is freed after the REST request is served.

In handle_req I split the two cases: fetching the index or fetching a specific object. We get into handle_in if the request is a POST or PUT. So I take the body of the request and decode the JSON with Poison. It gives back a map, so I can build the tuple which I store in the ETS table. Here we can give back true, which results in a 204 NO CONTENT, or {true, url}, which for POST results in a 201 CREATED with the new URL in the Location header.
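
If we wanted the POST branch to answer 201 with the new URL, handle_in could look like this variant of the function above (the /product/... URL format is an assumption of this example):

# Sketch: an accept callback returning {true, url} so that cowboy
# replies 201 Created with a Location header for POST requests.
def handle_in(req, state) do
  {:ok, body, req2} = :cowboy_req.body(req)
  obj = Poison.decode!(body)
  :ets.insert(:repo, [{obj["id"], obj}])
  {{true, "/product/#{obj["id"]}"}, req2, state}
end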

In allow_missing_post, by giving back false we say that we don't want clients to POST to a non-existing resource. So POST / is allowed (in this example the key comes from the request body), but POST /99 is not allowed if the product with id 99 doesn't exist. In delete_resource we cover the DELETE method.

From the flowchart you can see in detail how, for example, a DELETE request is handled: with the if-match header we can check whether the object to be deleted has changed since we fetched it, by comparing ETags, and so on. So we have tons of choices to fully exploit the cowboy REST handlers.
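
Cowboy evaluates those conditional headers against an optional generate_etag/2 callback. A sketch for our handler could look like this (deriving the ETag from a hash of the cached object is just one possible choice):

# Sketch of a generate_etag/2 callback; hashing the cached object is
# only an illustration of how an ETag could be derived.
def generate_etag(req, %{:obj => obj} = state) do
  etag = :erlang.phash2(obj) |> Integer.to_string()
  {{:strong, etag}, req, state}
end
def generate_etag(req, state) do
  # no cached object (e.g. the index), no ETag
  {:undefined, req, state}
end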

For testing I use the Advanced REST Client Google Chrome extension. It records my HTTP requests, so I can easily replay them from the history.

Future changes

If you look at a very recent version of cowboy, you can see that several things will change. For example, rest_init and rest_terminate are removed, init/3 becomes init/2, and the callback return values are refactored too. We will probably cover the changes later, but since the version used above is the newest cowboy release on hex.pm, we can safely use it (1.0.4 at the moment).


Saturday, February 13, 2016

Elixir - the first steps 1

In this blog series I want to share my first steps in Elixir. Okay, maybe not the very first steps, since I have been watching this very exciting language for some months. Now I have some kind of picture in my mind of how I can develop something more interesting than Hello World.

To be serious and effective we need to set a goal big enough to be worth implementing. I have always been on the server side, so let us remain there for now.

Let us build a webshop

Okay, at first it seems to be a brave promise, but with big goals we can achieve a lot, right? So let us start to build an Elixir project which uses Riak as a database backend, and see how we can write an HTML5, Elixir, Riak (NoSQL) application: not just read and talk about these things, but actually do them.

What is Elixir?

Elixir is a dynamic, functional language which runs on the Erlang ecosystem. Its inventor is José Valim, who found it interesting to port a Ruby-like language to the Erlang VM. In my opinion that is a nice idea, since the Erlang syntax is a bit old and sometimes not sexy for newcomers, while the Erlang VM is a great thing when we are speaking about scalability and robustness.

Elixir code is written into files with the .ex extension; after compilation they become .beam files, as you got used to in Erlang. Those beam files are then loaded and run by the Erlang VM. Let us install Elixir.

kiex, the kerl of Elixir

In Erlang, with kerl we can manage different Erlang installations in a single environment. It is useful since we want to see how our project builds with older Erlang versions. The same kind of tool was made for Elixir, so let us install that first.

\curl -sSL https://raw.githubusercontent.com/taylor/kiex/master/install | bash -s

This will install kiex into the ~/.kiex directory. By running kiex we can see its commands:

$ ~/.kiex/bin/kiex
kiex commands:
    list                      - shows currently installed Elixirs
    list known                - shows available Elixir releases
    list branches             - shows available Elixir branches
    install <version>         - installs the given release version
    use <version> [--default] - uses the given version for this shell
    shell <version>           - uses the given version for this shell
    default <version>         - sets the default version to be used
    selfupdate                - updates kiex itself
    implode                   - removes kiex and all installed Elixirs
    reset                     - resets default Elixir version to null

With list known we can get the available Elixir versions from the repo.

$ ~/.kiex/bin/kiex list known
Getting the available releases from https://github.com/elixir-lang/elixir/releases

Known Elixir releases:
    1.2.1
    1.2.0
    1.2.0-rc.1
...

We can install the latest version by running ~/.kiex/bin/kiex install 1.2.1 (kiex will register it as elixir-1.2.1). The only thing you need to do afterwards is to source the commands that kiex use elixir-1.2.1 emits. From that point on you can start the Elixir shell with iex.

Mix, finally

A really great thing in Elixir is that it ships with a build tool called Mix. In Erlang we have long been looking for great build tools, and finally rebar and erlang.mk were born; in Elixir that problem has been solved from the first day. With mix help we can see everything we can do with the tool. Let us create a new project; mix help new shows how to create new projects.

$ mix new webshop --app webshop --sup --module Store
* creating README.md
* creating .gitignore
* creating mix.exs
* creating config
* creating config/config.exs
* creating lib
* creating lib/webshop.ex
* creating test
* creating test/test_helper.exs
* creating test/webshop_test.exs

Your Mix project was created successfully.
You can use "mix" to compile it, test it, and more:

    cd webshop
    mix test

Run "mix help" for more commands.

mix.exs is the spine of the project: it is the application descriptor. It contains the Erlang-style application descriptor data, the applications to be started and the dependencies. And mix.exs is itself an Elixir module (!).

defmodule Store.Mixfile do
  use Mix.Project

  def project do
    [app: :webshop,
     version: "0.0.1",
     elixir: "~> 1.1",
     build_embedded: Mix.env == :prod,
     start_permanent: Mix.env == :prod,
     deps: deps]
  end

  def application do
    [applications: [:logger],
     mod: {Store, []}]
  end

  defp deps do
    []
  end
end

It says that the project requires at least Elixir 1.1, and that it starts the logger application. There are no dependencies. In config/config.exs we can specify configuration values for the applications. Elixir talks about environments when speaking about projects: there are dev, test and prod environments, and with

MIX_ENV=dev iex -S mix

we can run the application in the dev environment. We can also run the tests with mix test, or just compile the application with mix compile. With mix run or iex -S mix we can run the application; the latter runs it in console mode, so we can inspect what is happening while the application runs.
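
For example, config/config.exs can hold application settings and pull in an environment-specific file; this sketch mirrors the hint mix generates in comments, and the :repo_table key is made up.

# config/config.exs - a sketch; the :repo_table key is an assumption
use Mix.Config

config :webshop, repo_table: :repo

# pull in config/dev.exs, config/test.exs or config/prod.exs depending
# on MIX_ENV (those files have to exist if this line is uncommented)
# import_config "#{Mix.env}.exs"

At runtime such a value can be read with Application.get_env(:webshop, :repo_table).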

So now we have an Elixir environment, we know how to install different Elixir versions on the same OS, and we have created a project with mix. Now we can do some programming work.

First steps

Mix generated some code for us to make the start easier: a module file, a test file and a config file. Let us check them.

defmodule Store do
  use Application

  def start(_type, _args) do
    import Supervisor.Spec, warn: false

    children = [
      # worker(Store.Worker, [arg1, arg2, arg3]),
    ]

    opts = [strategy: :one_for_one, name: Store.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

If you check the source code of the Application module in Elixir, you can see that use Application defines a default stop function and also puts the behaviour attribute into the code. So we only need to implement the start function to keep the compiler from complaining.

In the Supervisor.Spec module we can find convenience functions for defining supervision tree elements such as supervisors and workers. With the worker function it is easier to define a worker child than writing complex supervisor specs by heart. One way or another, the start function should return {:ok, pid} in case of success. In the background the Erlang application controller collects that pid and inserts it into an ETS table, so it won't be forgotten, and the process is monitored by the application controller.
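
For instance, inside the start/2 function above (where Supervisor.Spec is already imported) we could declare children like this; Store.Worker and Store.SubSup are hypothetical modules of ours.

# Sketch: declaring children with the Supervisor.Spec helpers.
children = [
  # calls Store.Worker.start_link([name: Store.Worker])
  worker(Store.Worker, [[name: Store.Worker]]),
  # a nested supervisor child
  supervisor(Store.SubSup, [])
]

Supervisor.start_link(children, strategy: :one_for_one, name: Store.Supervisor)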

To put it all together, this is a very basic Elixir application which shows how an Erlang-style application can be created. I like the simplicity of the code; there is much less boilerplate, which is fine. In the background, however, we need to be aware of what happens with the Erlang OTP processes.


Sunday, January 24, 2016

Build Riak cluster from source

In the last couple of months I demonstrated what Riak is, what features it has and how easy it is to manage. That is fine, and one can believe that it is easy to install and set up Riak, but the best way to prove that is to show how to set up a cluster.

Build a cluster

There are many ways to install Riak. There are pre-built binaries per platform, so for example if you run Red Hat Enterprise Linux you can choose the appropriate package. But now we just want to play with Riak: we don't really want to set up multiple physical (or virtual) hosts, just build a cluster. We can easily do that by compiling the source and building a dev cluster.

Check the Riak downloads page for binaries and the source package. With the next couple of commands you will have the Riak source in your riak-2.1.3 directory.

mkdir ~/riak
cd ~/riak
curl http://s3.amazonaws.com/downloads.basho.com/riak/2.1/2.1.3/riak-2.1.3.tar.gz -O
tar xzf riak-2.1.3.tar.gz
cd riak-2.1.3

So far so good; now we need to build Riak with 5 nodes. You need to have Erlang R16 or 17 in your path to build Riak, so activate it first. With DEVNODES=5 make devrel we can build Riak and create 5 dev nodes. As a result we get Riak nodes in the dev/dev1, dev/dev2, etc. directories with non-conflicting ports. You can check that in dev/dev1/etc/riak.conf and so on. Here are some port-related lines from riak.conf.

...
## Name of the Erlang node
##
## Default: dev1@127.0.0.1
##
## Acceptable values:
##   - text
nodename = dev1@127.0.0.1
## listener.http. is an IP address and TCP port that the Riak
## HTTP interface will bind.
##
## Default: 127.0.0.1:10018
##
## Acceptable values:
##   - an IP/port pair, e.g. 127.0.0.1:10011
listener.http.internal = 127.0.0.1:10018

## listener.protobuf. is an IP address and TCP port that the Riak
## Protocol Buffers interface will bind.
##
## Default: 127.0.0.1:10017
##
## Acceptable values:
##   - an IP/port pair, e.g. 127.0.0.1:10011
listener.protobuf.internal = 127.0.0.1:10017
...

So we have 5 isolated Riak instances here. Let us run the first 3 and build a cluster. During cluster building we tell each node to join another, designated node; in our example we tell dev2 and dev3 to join dev1. It doesn't happen instantly as we execute the commands: a cluster plan is created, which we need to check and then commit in order for the changes to kick in.

$ cd dev
$ for i in dev{1..3}; do $i/bin/riak start; done
$ dev1/bin/riak-admin member-status
=============================== Membership ================================
Status     Ring    Pending    Node
---------------------------------------------------------------------------
valid     100.0%      --      'dev1@127.0.0.1'
---------------------------------------------------------------------------
Valid:1 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

So the nodes are running, and with member-status we can check that dev1 holds the whole ring. Riak puts all key-value pairs in a ring which is divided into 64 parts by default, called vnodes or partitions. Now the dev2 and dev3 nodes will join dev1, and we check the cluster plan.

$ dev2/bin/riak-admin cluster join dev1@127.0.0.1
Success: staged join request for 'dev2@127.0.0.1' to 'dev1@127.0.0.1'
$ dev3/bin/riak-admin cluster join dev1@127.0.0.1
Success: staged join request for 'dev3@127.0.0.1' to 'dev1@127.0.0.1'
$ dev1/bin/riak-admin cluster plan
============================= Staged Changes ==============================
Action         Details(s)
---------------------------------------------------------------------------
join           'dev2@127.0.0.1'
join           'dev3@127.0.0.1'
---------------------------------------------------------------------------


NOTE: Applying these changes will result in 1 cluster transition

###########################################################################
                       After cluster transition 1/1
###########################################################################

=============================== Membership ================================
Status     Ring    Pending    Node
---------------------------------------------------------------------------
valid     100.0%     34.4%    'dev1@127.0.0.1'
valid       0.0%     32.8%    'dev2@127.0.0.1'
valid       0.0%     32.8%    'dev3@127.0.0.1'
---------------------------------------------------------------------------
Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

WARNING: Not all replicas will be on distinct nodes

Transfers resulting from cluster changes: 42
  21 transfers from 'dev1@127.0.0.1' to 'dev3@127.0.0.1'
  21 transfers from 'dev1@127.0.0.1' to 'dev2@127.0.0.1'

You can see how the ring will be distributed after dev2 and dev3 join the cluster (which is a 1-node cluster right now). Now we commit the changes and check how the partitions move.

$ dev1/bin/riak-admin cluster commit
Cluster changes committed

$ dev1/bin/riak-admin transfers
'dev3@127.0.0.1' waiting to handoff 1 partitions
'dev2@127.0.0.1' waiting to handoff 1 partitions
'dev1@127.0.0.1' does not have 8 primary partitions running

Active Transfers:

$ dev1/bin/riak-admin member-status
=============================== Membership ================================
Status     Ring    Pending    Node
---------------------------------------------------------------------------
valid      75.0%     34.4%    'dev1@127.0.0.1'
valid      17.2%     32.8%    'dev2@127.0.0.1'
valid       7.8%     32.8%    'dev3@127.0.0.1'
---------------------------------------------------------------------------
Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

$ dev1/bin/riak-admin transfers
'dev3@127.0.0.1' waiting to handoff 3 partitions
'dev2@127.0.0.1' waiting to handoff 3 partitions
'dev1@127.0.0.1' waiting to handoff 15 partitions
'dev1@127.0.0.1' does not have 3 primary partitions running

Active Transfers:

$ dev1/bin/riak-admin transfers
No transfers active

Active Transfers:

When we see that there are no active transfers we are ready: all the partitions are distributed. Let us check it with riak-admin member-status.

$ dev1/bin/riak-admin member-status
=============================== Membership ================================
Status     Ring    Pending    Node
---------------------------------------------------------------------------
valid      34.4%      --      'dev1@127.0.0.1'
valid      32.8%      --      'dev2@127.0.0.1'
valid      32.8%      --      'dev3@127.0.0.1'
---------------------------------------------------------------------------
Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

Node failure

Let us simulate the situation when node 2 is down. Bring it down with dev2/bin/riak stop.

$ dev1/bin/riak-admin ring-status
================================ Claimant =================================
Claimant:  'dev1@127.0.0.1'
Status:     up
Ring Ready: true

============================ Ownership Handoff ============================
No pending changes.

============================ Unreachable Nodes ============================
The following nodes are unreachable: ['dev2@127.0.0.1']

This happens when dev2 crashes for some reason. Probably our monitoring system will detect this faster than we could by checking the ring status. But it can also happen that dev2 is not down but unreachable (a netsplit). Netsplits can happen when the net tick messages cannot be delivered on time (for example on an overloaded network), even though the node itself may be fine. So it is a different situation from a node crash, and we need different monitoring to detect netsplits.


Extending the cluster

Let us suppose that Black Friday is coming and we expect a growth in the number of transactions. We don't want to extend the Riak cluster node by node; instead we add two nodes in one step (which is the recommended way of extending the cluster). Let us start the two nodes and join them to the cluster.

$ dev4/bin/riak start
$ dev5/bin/riak start
$ dev4/bin/riak-admin cluster join dev1@127.0.0.1
$ dev5/bin/riak-admin cluster join dev1@127.0.0.1
$ dev1/bin/riak-admin cluster plan
$ dev1/bin/riak-admin cluster commit

We pretty much know what to expect from these commands, but I pasted the cluster plan here. It shows how many partitions will be moved during the cluster extension.

$ dev1/bin/riak-admin cluster plan
============================= Staged Changes ==============================
Action         Details(s)
---------------------------------------------------------------------------
join           'dev4@127.0.0.1'
join           'dev5@127.0.0.1'
---------------------------------------------------------------------------


NOTE: Applying these changes will result in 1 cluster transition

###########################################################################
                       After cluster transition 1/1
###########################################################################

=============================== Membership ================================
Status     Ring    Pending    Node
---------------------------------------------------------------------------
valid      34.4%     20.3%    'dev1@127.0.0.1'
valid      32.8%     20.3%    'dev2@127.0.0.1'
valid      32.8%     20.3%    'dev3@127.0.0.1'
valid       0.0%     20.3%    'dev4@127.0.0.1'
valid       0.0%     18.8%    'dev5@127.0.0.1'
---------------------------------------------------------------------------
Valid:5 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

Transfers resulting from cluster changes: 49
  4 transfers from 'dev3@127.0.0.1' to 'dev5@127.0.0.1'
  4 transfers from 'dev1@127.0.0.1' to 'dev2@127.0.0.1'
  4 transfers from 'dev3@127.0.0.1' to 'dev1@127.0.0.1'
  4 transfers from 'dev2@127.0.0.1' to 'dev4@127.0.0.1'
  4 transfers from 'dev1@127.0.0.1' to 'dev3@127.0.0.1'
  4 transfers from 'dev3@127.0.0.1' to 'dev2@127.0.0.1'
  4 transfers from 'dev2@127.0.0.1' to 'dev5@127.0.0.1'
  5 transfers from 'dev1@127.0.0.1' to 'dev4@127.0.0.1'
  4 transfers from 'dev2@127.0.0.1' to 'dev1@127.0.0.1'
  4 transfers from 'dev1@127.0.0.1' to 'dev5@127.0.0.1'
  4 transfers from 'dev3@127.0.0.1' to 'dev4@127.0.0.1'
  4 transfers from 'dev2@127.0.0.1' to 'dev3@127.0.0.1'

That is it

So basically we know how to build a cluster on our development machine. Obviously, if we install a Riak cluster in a production server environment we need to act differently (install binary packages which use the system-wide /etc, /var/lib and /var/log directories). But the thinking is the same: new nodes always join existing nodes in the cluster, and we always have to check the cluster plan.


Wednesday, January 20, 2016

Templating with ErlyDTL 2.

In this second blog post I introduce some advanced features of ErlyDTL templating (in the first post I showed how to create custom tags and filters): template inclusion, template extension and the basic setup of a template-enabled Erlang application.

Set up environment

Create a directory and download erlang.mk. For the last couple of projects I have used erlang.mk over rebar because I find it easier to customize.

curl https://raw.githubusercontent.com/ninenines/erlang.mk/master/erlang.mk -O

In the project we will have src and templates directories. Let us create a Makefile with which we can compile the application, including the templates, and also create the release itself. The Makefile looks like this:

PROJECT = product
DEPS = cowboy erlydtl

include erlang.mk

Issuing a make command will bootstrap erlang.mk, and then we can compile the application with make app. Let us create a simple RESTful application.

Restful Cowboy

The product application will have an application description file (product.app.src), the application behaviour (product.erl), a supervisor (product_sup.erl) and a REST handler (product_rest.erl). The supervisor won't do anything; cowboy is started and the dispatch rules are registered by the application module. That is the minimal set of Erlang files we can live with. The files go into the src directory. See the product.app.src file:

{application, product, [
    {description, "Simple product RESTful app"},
    {id, "product"},
    {vsn, "0.0.1"},
    {modules, []},
    {applications, [
        kernel, stdlib,
        cowboy, erlydtl
    ]},
    {registered, []},
    {mod, {product, []}}
]}.

It contains our minimal needs: the dependent applications (cowboy and erlydtl), and the {mod, {product, []}} tuple which designates product as the application callback module.

-module(product).
-behaviour(application).

-export([start/2, stop/1]).

start(_StartType, _Args) ->
    Rules = [{'_', [{"/product/:id", product_rest, []}]}],
    Dispatch = cowboy_router:compile(Rules),
    cowboy:start_http(product_http, 5, [{port, 8080}],
                      [{env, [{dispatch, Dispatch}]}]),
    product_sup:start_link().

stop(_State) ->
    ok.

The application module compiles and registers the cowboy dispatch rules, starts the cowboy listener and finally starts the supervisor.

-module(product_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    {ok, {{one_for_one, 1, 1}, []}}.

Now let us stop for a bit and create relx.config to see if we already have the minimal application we want. With relx the release creation is very simple: we just need to specify the release name and the applications we use, and that is it. During development it is a good idea to specify dev_mode; this way the available OTP release won't be copied into our _rel directory, symlinks will be created instead. Put relx.config into the project root; erlang.mk downloads the relx executable automatically, so when we execute make rel, make calls relx to create the release (in the _rel directory by default). If extended_start_script is true, relx also creates a product script in the _rel/product/bin directory with which we can start the application (or we can use make run).

{release, {product, "0.0.1"},
          [cowboy, erlydtl, product]}.
{extended_start_script, true}.
{dev_mode, true}.

With make rel the Erlang release is built, so with make run we can run the application. Bingo.

Implement rest handler

For the sake of simplicity the REST handler contains a wired-in database. It holds a generic product (with id 1) which will be rendered by the generic.dtl template (see later), and another product which is rendered by the guitar.dtl template.

-module(product_rest).

-export([init/3,
        content_types_provided/2,
        content_types_accepted/2,
        allowed_methods/2,
        handle_get/2,
        handle_post/2]).

init(_Protocol, _Req, _Opts) ->
    {upgrade, protocol, cowboy_rest}.

content_types_provided(Req, State) ->
    Handlers = [{<<"application/json">>, handle_get}],
    {Handlers, Req, State}.

content_types_accepted(Req, State) ->
    Accepted = [{{<<"application">>, <<"json">>, '*'}, handle_post}],
    {Accepted, Req, State}.

allowed_methods(Req, State) ->
    {[<<"GET">>, <<"POST">>, <<"OPTIONS">>], Req, State}.

handle_get(Req, State) ->
    {Param, _} = cowboy_req:binding(id, Req),
    Id = binary_to_integer(Param),
    {ok, Msg} = case {Id, get_product(Id)} of
                    {1, P} ->
                        generic_dtl:render([{product, P}]);
                    {2, P} ->
                        guitar_dtl:render([{product, P}])
                end,
    {Msg, Req, State}.

%% Sample POST handler for sake of example :)
handle_post(Req, State) ->
    {ok, Body, Req2} = cowboy_req:body(Req),
    {process_json(Body), Req2, State}.

process_json(Binary) ->
    case post_handler(Binary) of
        ok ->
            true;
        {error, _Reason} ->
            halt
    end.

get_product(Id) ->
    case Id of
        1 ->
            #{id => 1,
              description => <<"Guitar leather bag">>,
              category => #{name => <<"Other">>}};
        2 ->
            #{id => 2,
              description => <<"Jackson SL-3">>,
              category => #{name => <<"Electric guitar">>},
              frets => 24,
              body => <<"Alder">>,
              pickup => <<"Seymour Duncan">>}
    end.

post_handler(_) ->
    ok.

This is the longest module; the exported functions are the mandatory callbacks which implement the REST API. The get_product/1 function is the wired-in product database, and in handle_get/2 we take the id sent in the URL path, fetch the product and then render it conditionally.

The base template is generic.dtl which is

{
    "id": {{ product.id }},
    "description": "{{ product.description }}",
    {% block category %}
    "category": "{{ product.category.name }}"
    {% endblock %}
    {% block specific %}
    {% endblock %}
}

It defines the generic part, serializing id and description, which exist in every product. We give a default implementation for category, and we expect the specific part to be defined by specific products like guitars, keyboards, etc. The guitar.dtl template extends the generic template and refines the implementation of category.

{% extends "generic.dtl" %}

{% block category %}
    "category": "Guitar/{{ product.category.name }}"
{% endblock %}
{% block specific %},
    "frets": {{ product.frets }},
    "body": "{{ product.body }}"
{% endblock %}

We can use {% include "file.dtl" %} to externalize complex and/or reusable parts of the template. It is not a big deal: all variables which are visible in the including template will be visible in the included template as well.
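
For example, the category line could live in its own file (category.dtl is a made-up file name here):

{# category.dtl, a hypothetical shared fragment #}
"category": "{{ product.category.name }}"

and in generic.dtl we could simply write {% include "category.dtl" %} in its place; the product variable is visible inside the included file as well.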

Takeaways

So when we need to create formatted messages which collect a number of variables, or which behave differently depending on some input parameters, we can use ErlyDTL templates with success. Building JSONs, or the maps which are the data source of JSONs, by hand can make the source hard to understand: a lot of boilerplate code and only a few lines of real business logic. If that is the case, use templates. In part 1 you can learn how to implement a custom ErlyDTL library, which gives you the possibility to enrich the functionality of the templates you create.


Sunday, January 17, 2016

Introduce erlang.mk

Just as Java has no single standard tool for this, Erlang doesn't have one standard tool for handling dependencies and compiling the application modules either. In Java, Maven and Ivy are there for dependency handling; the Erlang world also has a palette to choose from when we are talking about dependency handling.

In my earlier projects I used rebar, which was good, but as I wanted to extend my builds with specific steps (generating sys.config from a template, or starting Riak during testing) I found it difficult to solve those problems with rebar. Then I found erlang.mk. At first I found it overly complex, but as I made more and more project builds with it I came to like it, mostly because of its extensibility. In this post I would like to show how erlang.mk works and how one can extend the build lifecycle.

How erlang.mk works

Basically erlang.mk is a big parametric Makefile. First we provide the parameter values, and then we include erlang.mk. Let us create an empty directory and download the erlang.mk bootstrap file.

curl https://raw.githubusercontent.com/ninenines/erlang.mk/master/erlang.mk -O

And create a Makefile:

PROJECT = webshop
PROJECT_DESCRIPTION = A webshop application

DEPS = cowboy jsx ejson lager riakc

dep_ejson = git https://github.com/jonasrichard/ejson.git

include erlang.mk

A typical erlang.mk Makefile contains the project identification and description, and the DEPS variable listing the dependencies we require. Dependencies are given by project name, so a question may pop into our mind: how does erlang.mk know the URL of jsx or cowboy? erlang.mk ships with the names and repository URLs of the commonly used Erlang dependencies, so they are in the erlang.mk file itself. However, ejson is not such a package, therefore I need to give its repository URL in the dep_ejson variable.

After running make, erlang.mk is downloaded and then all the dependencies are fetched and compiled. With make help you can see the targets erlang.mk defines by default. With make deps erlang.mk gets the dependencies from the various source repositories; if you add a new dependency you need to fetch it with make deps in case it isn't fetched automatically. make app compiles the dependencies and the application itself. With make test we can run unit and common tests. make rel builds the Erlang release, which we can run with the generated start script, or we can run the project directly with make run.

As always, we can pass parameters to the targets via environment variables: if we want to run the unit tests with cover enabled, we run COVER=1 make test cover-report. In the same way we can set ERLC_OPTS, which holds the command line switches of the Erlang compiler. We can also write a build-environment-sensitive Makefile, so that with ENV=dev make run we run the application in the development environment.
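
A sketch of such an environment-sensitive Makefile (ENV is our own variable here, not something erlang.mk defines; note that setting ERLC_OPTS before the include replaces erlang.mk's default flags):

PROJECT = webshop
DEPS = cowboy jsx ejson lager riakc

ENV ?= prod

ifeq ($(ENV),dev)
# debug-friendly compiler flags for ENV=dev make run
ERLC_OPTS = +debug_info
endif

include erlang.mk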

Build lifecycle

Simple Makefiles can be debugged with make -sn, which prints to stderr what make would do. Try this with erlang.mk and it will result in tons of messages. This is because erlang.mk creates variables containing Erlang code snippets which are evaluated with erl -eval; it is a fair way of defining custom build targets by writing Erlang code. So it is much easier to open the source of erlang.mk, which shows you what will be done. At first this many-thousand-line Makefile can be intimidating, but after a bit of analysis you can see that it starts with the general part, then come the embedded parts (unit testing, cover, dialyzer), then a few thousand lines of known dependencies, and then the 3rd party plugins.


In a Makefile one can write something like

.PHONY: compile eunit

compile:
    erlc src/*.erl

eunit: compile
    erlc test/*.erl -o test
    erl -pz ebin test -eval "run the eunit tests :)"

This is nice if the parameters are correctly specified: the compile target compiles the source, and the eunit target depends on compile, compiles the tests and runs them. The only problem is that if we define such a framework, nobody can add hooks to be executed before or after compilation or before or after running the unit tests.

In most cases erlang.mk uses double-colon rules. There may be several double-colon rules with the same name; in that case all of them run in the order of their occurrence. The app target is such a double-colon target, so one can hook commands before and after it. If we write a rule before including erlang.mk, that rule occurs earlier than the rules in erlang.mk, so it will be executed before the erlang.mk app rule, and vice versa. So one can, for example, filter sources before compilation and create an archive file afterwards.

app::
    filter source files, replace things in them

include erlang.mk

app::
    tar cvzf beams.tar.gz ebin/

The only problem with this solution is that in a Makefile the first target is the default target, and we have just overridden the default target of erlang.mk, which was all, with app (erlang.mk has all:: deps app rel as its first target). So we need to write .DEFAULT_GOAL := all somewhere in the Makefile to set the default target back. Why is this a problem? Because if somebody uses our project as a dependency, erlang.mk goes into each deps/project directory and executes make without specifying a target; in our case app (or rel, if we override that one) would be the default target, which may or may not work.

But back to business: let us build a release.

Specify relx.config

By default erlang.mk uses relx to build releases. relx makes Erlang release creation very simple, or at least it simplifies the first steps very much. To run relx we need a relx.config file which describes what relx has to do. Here is a simple relx.config:

{release, {webshop_release, "0.0.1"}, [
    webshop
]}.
{extended_start_script, true}.

relx relies on the content of the application app file (in this case ebin/webshop.app), namely on the applications tuple which lists the dependencies. So in relx.config we don't need to specify the dependencies of our applications, only the applications themselves; relx traverses the app files and collects all the required applications. After running make rel the release is created in the _rel directory.

How to go on?

We are just scratching the surface of what we can do with erlang.mk. There are a lot of plugins which execute not-so-common tasks, and we can also define 3rd party plugins for erlang.mk. Anyway, in this post I described the basic idea of how erlang.mk can be used; the advanced topics can be understood once this introduction is clear. The main takeaway is: if you have a question about erlang.mk, always use the source.


Sunday, January 10, 2016

Templating with ErlyDTL 1.

Last time I needed to implement a RESTful service in Erlang which received a large JSON and, depending on its content, had to react to different events (like a new product registered in a webshop, etc.). Reacting meant that the application had to call services with JSON bodies. I don't know if you have ever built even medium-sized JSONs (or maps) in your source code; it doesn't look good.

Let us imagine that we have a product service which needs to provide various information about products, like code, description and price. We have some database which can give us maps describing our products. Those maps can be arbitrarily deep, so we want some kind of chained access to things 3 or 4 levels deep in the maps. The products look like this:

-module(product).

-export([create_product/0, map_to_json/0]).

-spec create_product() -> map().
create_product() ->
    #{id => 483,
      product_code => "C7HR-BCH",
      brand => "Schecter",
      description => "Schecter C7HR-BCH",
      category => #{name => "electric guitar",
                    category => #{name => "7 string model"}},
      frets => 24,
      body => "mahogany",
      pickup => "2x EMG 707TW"}.

map_to_json() ->
    jsx:encode(create_product()).

In this module we handle data as maps. Maps can be converted to JSON and back; I am using the jsx library to do that. During decoding we should give jsx a hint that we expect maps: jsx:decode(Binary, [return_maps]).

In this simple example you can see what happens when we have embedded JSON objects in the code. When I needed to extract object paths from JSONs, the situation was even worse. So let us suppose that we need to create a category path from this product, resulting in the string "electric guitar/7 string model".

-module(product).

-export([get_category_path/1]).

get_category_path(Product) ->
    string:join(get_categories(Product), "/").

get_categories(Product) ->
    case maps:is_key(category, Product) of
        true ->
            C = maps:get(category, Product),
            [maps:get(name, C) | get_categories(C)];
        false ->
            []
    end.

The problem is that we need to take care of safely accessing the map elements, otherwise we will face badkey exceptions. That can make the code complex, even when we know that a missing name field is fine and an empty string will do for now: the category path will be rendered on a web page anyway and someone will fix the data later, but we don't want to crash the page generation.


Solve the problem with templates

ErlyDTL implements Django templates in the Erlang environment. You write templates and save them into files with the .dtl extension. The ErlyDTL compiler compiles them to Erlang source files, which in the next step are compiled to beam files with erlc as usual. A template therefore becomes a generated Erlang module: my.dtl, for example, becomes the my_dtl module. That module has a render(Vars) function which renders the template with the context we provide by passing the variables to render. The result is an iolist, which is good from an optimization point of view, but we need to be aware of that when we pass the result to our own functions (io:format("~s", ...) handles iolists, but not every function is prepared for them).
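
For experimenting, a template can also be compiled and loaded directly from the Erlang shell; in this sketch the templates/ path and the out_dir value are assumptions.

%% compile my.dtl into the my_dtl module and render it
erlydtl:compile_file("templates/my.dtl", my_dtl, [{out_dir, "ebin"}]),
{ok, IoList} = my_dtl:render([{product, product:create_product()}]),
io:format("~s~n", [IoList]).

The my.dtl template for our product could look like this: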

{
    "id": {{ product.id }},
    "productCode": "{{ product.product_code }}",
    "brand": "{{ product.brand }}",
    "description": "{{ product.description }}",
    "category": "{{ product.category.name }}",
    "subCategory": "{{ product.category.category.name }}",
    "frets": {{ product.frets|default:22 }},
  {% if product.body %}
    "body": "mahogany",
  {% endif %}
    "pickup": 2x EMG 707TW"
}

Templates contain tags, expressions and plain text (see the ErlyDTL GitHub page for details). Between double curly braces you can write an expression, which is evaluated in the variable context passed to the render() function of the module. So basically I need to define a product variable and put it into the context.

my_dtl:render([{product, product:create_product()}]).

With tags like if or ifequal we can write control structures, so if the guitar body is not specified we don't emit such a property in the JSON. Also, with filters I can say that if the number of frets is not specified, the default should be 22. It depends on the business logic, but you probably see what they are good for.

Custom tags, filters

In most situations the functionality Django/ErlyDTL gives us is enough, but as always, the last 10% of the problems is what makes software development complex. So we are almost there, but we need to query the product price, or the availability, from an external database. How do we solve that, given that a template can only contain pre-defined variables, expressions, control structures and text-formatting helpers?

The big power ErlyDTL gives us is the possibility to extend the functionality with custom tags and filters, in other words a custom library. To create such a library we need to create a module which implements the erlydtl_library behaviour. The module has to list all the filter and tag names it defines, and it needs to export the functions which implement those tags and filters. A custom filter is a one or two parameter function which gets the value of the variable we want to filter; the second, optional parameter is the parameter of the filter (like the default value of the default filter in the example above). Custom tags are two parameter functions which get the variables provided in the parameter list and a list of rendering options. A custom tag may return new variable bindings, so by executing a custom tag we can define variables in the page context. A very powerful tool.

-module(guitar_lib).

-export([version/0, inventory/1]).
-export([get_price/2, frets/2]).

version() -> 1.
inventory(filters) -> [frets];
inventory(tags) -> [get_price].

%% In the lack of a fret number, we can give defaults
%% depending on the guitar brand
frets(undefined, Brand) ->
    case Brand of
        <<"Schecter">> -> 24;
        <<"Jackson">> -> 24;
        _ -> 22
    end;
frets(FretNum, _Brand) ->
    FretNum.

get_price(Vars, _Opts) ->
    case lists:keyfind(id, 1, Vars) of
        {id, Id} ->
            %% Let it crash if service fails
            {ok, Price} = guitar_store:get_price_by_id(Id),
            [{value, Price}];
        false ->
            %% No id specified, we can crash or we can
            %% leave the context variables as they are
            []
    end.

We have the module, so we need to compile it, and we need to tell the erlydtl compiler that we have a library to be loaded during compilation. We can do that by adding {libraries, [{guitar, guitar_lib}]} to the compiler options. Under the guitar name the guitar_lib module becomes accessible in the templates. So we can rewrite the template a bit now.

{% load guitar %}
{% get_price id=product.id as price %}
{
    "id": {{ product.id }},
    "productCode": "{{ product.product_code }}",
    "brand": "{{ product.brand }}",
    "description": "{{ product.description }}",
    "price": "{{ price.value|default:"n/a" }}",
    "category": "{{ product.category.name }}",
    "subCategory": "{{ product.category.category.name }}",
    "frets": {{ product.frets|frets:product.brand }},
  {% if product.body %}
    "body": "mahogany",
  {% endif %}
    "pickup": 2x EMG 707TW"
}

We don't want 0 prices, because if somebody manages to buy a product at that price, we would have to ship it to them. Also, the frets filter gets the brand of the product so that it can give sensible brand-specific defaults. With the load tag we load the library, so all the features of that library become accessible. If we use a small number of libraries whose functionalities don't collide, we can load them by default by adding the {default_libraries, [guitar]} tuple to the compiler options.
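
Putting it together, the compile call for the template could look like this sketch (the paths are assumptions):

%% register the custom library when compiling the template
erlydtl:compile_file("templates/my.dtl", my_dtl,
                     [{out_dir, "ebin"},
                      {libraries, [{guitar, guitar_lib}]},
                      %% with default_libraries the load tag can be omitted
                      {default_libraries, [guitar]}]).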

Summary

So a custom library is a powerful feature of ErlyDTL, since it lets us implement custom business logic in templates. Also, in the lack of a set tag, we can now implement a specific variable setter (including some business logic). As always, the advice is: don't make the template library overly complex, since the goal is to have an easy-to-read template.


Wednesday, July 9, 2014

Property-based testing

When I worked as a Java developer I didn't hear about this kind of testing. We wrote small unit tests, some integration tests and a lot of end-to-end tests, but property-based testing somehow never came up.

What is property-based testing?

Property-based testing is a good addition when we already have enough unit tests but want to make sure that our functions or modules are prepared for any kind of incoming data. The main idea behind the two tools I know (PropEr, QuickCheck) is that we shouldn't write the test cases: we should generate them instead. From the function specification one can easily guess what kind of inputs a function can receive. If we have an add/2 function and the specification says that it adds two numbers, we know that both parameters will be numbers, so we can generate any number of test cases.

The problem is that we don't know the expected result, since we don't know the input parameters. Okay, we could write that add(x, y) =:= x + y, but then we would be reimplementing the function itself inside the test code, and we don't want to do that. Instead we can find properties of those operations which describe their nature.

Addition is commutative, so add(x, y) =:= add(y, x). It is trivial here, but testing an equals method in Java that way is a very useful test. Property-based tests become much more useful when we have the reverse operation at hand. Imagine that we implemented a sub/2 function which subtracts the second parameter from the first. Great, now we can test the two functions together: sub(add(x, y), y) =:= x. If we implement a test that way, PropEr will generate 100 tests with random numbers and check whether the condition holds. Sometimes we get surprising test failures because we didn't think about -0 or 0.0 or -0.0, things like that.
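
Written with PropEr, that property could look like the sketch below (mymath is a hypothetical module holding add/2 and sub/2):

-module(mymath_prop).

-include_lib("proper/include/proper.hrl").

-export([prop_add_sub/0]).

%% subtracting Y from add(X, Y) must give back X for any two integers
prop_add_sub() ->
    ?FORALL({X, Y}, {integer(), integer()},
            mymath:sub(mymath:add(X, Y), Y) =:= X).

Running proper:quickcheck(mymath_prop:prop_add_sub()) executes 100 random cases by default.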

If the test fails, PropEr has the exact test case on which it failed. It can be very complicated, containing long lists or big floats, and the error can be hard to understand if the tool just spits out those numbers. Instead, these tools shrink the test case: they convert it to a simpler form and check whether it still fails. Once the minimal failing test case is found, it is reported.

JSON name conversion

I implemented a simple framework which helps to convert Erlang records to JSON. Right now it can convert an Erlang record to JSON, but there is no way back. So I started to implement decoding too, and since encoding and decoding are two reverse operations, we can use QuickCheck or PropEr to test the two operations against each other.

The project is here: ejson github repo. Our first task is to convert an Erlang atom to a JSON name. Unfortunately the Erlang atom set is wider than the set of JSON names, and since we want to convert JSON values to Javascript objects, we need to put some restrictions on record field names.

A record field name looks like number_of_connections, and it should be converted into numberOfConnections. The atom contains lowercase letters, underscores and numbers, and the JSON name will be camel cased accordingly: if there is an underscore in the name, the next character becomes a capital letter. These restrictions make it possible to convert the JSON names back into Erlang atoms unambiguously.

-module(ejson_prop).

-include_lib("proper/include/proper.hrl").
-include_lib("eunit/include/eunit.hrl").

all_test() ->
    ?assertEqual(true,
                 proper:quickcheck(camel_case_prop(),
                                   [{to_file, user}])).

identifier_char() ->
    frequency([
        {$z - $a + 1, choose($a, $z)},
        {3, $_},
        {10, choose($0, $9)}
      ]).

record_name() ->
    ?LET(Chars, list(identifier_char()),
         list_to_atom(Chars)).

camel_case_prop() ->
    ?FORALL(Name, 
        ?SUCHTHAT(R, record_name(),
                  ejson_util:is_convertable_atom(R)),
            begin
                CC = ejson_util:atom_to_binary_cc(Name),
                ejson_util:binary_to_atom_cc(CC) =:= Name
            end).

I am using PropEr and eunit together. This module has an eunit test which is the main entry point; it is picked up by eunit and executed, and it calls PropEr to check the camel_case_prop property. The property basically says that for every generated Name, if we convert the name to a binary (cc means camel case) and then convert that binary back, the resulting atom should be equal to the generated one.

The FORALL macro is the executor which generates test cases (see the documentation); the generated test case is bound to the Name variable. The record_name() function is a generator which generates atoms as specified above: it generates a list of identifier characters, where the characters come from another generator. Inside identifier_char() there is a frequency generator (a generator as well) which produces weighted choices: with weight 26 it chooses a letter between 'a' and 'z', with weight 3 an underscore, and with weight 10 a decimal digit.

Obviously we need to filter out names like '1st_step' or '_main', so we include the SUCHTHAT macro to drop all test cases which don't conform to is_convertable_atom/1 (which enforces those rules). So let us create an ejson_util Erlang module and start putting the functions there.

is_convertable_atom(Atom) ->
    %% true if the atom can be converted by the
    %% two functions unambiguously
    L = atom_to_list(Atom),
    start_with_char(L) andalso proper_underscore(L).

start_with_char([L|_]) when L >= $a andalso L =< $z ->
    true;
start_with_char(_) ->
    false.

%% If there is an underscore, it needs to
%% follow by a letter
proper_underscore([]) ->
    true;
proper_underscore([$_, L | T]) when L >= $a
                            andalso L =< $z ->
    proper_underscore(T);
proper_underscore([$_ | _]) ->
    false;
proper_underscore([_ | T]) ->
    proper_underscore(T).

Now PropEr can generate test cases. Let us implement the atom-binary conversions while continuously running the property tests. The tests can be run this way:

./rebar compile
./rebar eunit apps=ejson

Try to implement the functions by yourself, it is a very useful experience. At first the test will fail for the empty atom '', and so on. Here is the final implementation of the two functions and the helper functions as well.

atom_to_binary_cc(Atom) ->
    CC = camel_case(atom_to_list(Atom), []),
    list_to_binary(lists:reverse(CC)).

binary_to_atom_cc(Binary) ->
    UScore = underscore(binary_to_list(Binary), []),
    list_to_atom(lists:reverse(UScore)).

camel_case([], R) ->
    R;
camel_case([L], R) ->
    [L|R];
camel_case([$_, L | T], R) ->
    camel_case(T, [string:to_upper(L) | R]);
camel_case([H | T], R) ->
    camel_case(T, [H | R]).

underscore([], R) ->
    R;
underscore([Cap | T], R) when Cap >= $A
                      andalso Cap =< $Z ->
    underscore(T, [Cap + 32, $_ | R]);
underscore([Low | T], R) ->
    underscore(T, [Low | R]).

About me

My name is Richárd Jónás and I live in Budapest, Hungary. In this blog I want to share my coding experiences in Erlang, Elixir and the other languages I use. Some topics are simpler ones, but you can use them as a reference. I also present some of my thoughts about developing distributed systems.