Prelude

Some months ago I decided to take a more serious look at Elixir (I’ll skip the gory details about my background, studying different technologies, etc.). While I confess that I like F# and functional programming in general, the main reason here was Phoenix, the web framework. After getting done with the preparations and some Elixir from the website, I tried to write something with Phoenix: just some CRUD and possibly a chat app. Things were going too smoothly, without a hassle! I suspected that was because of Phoenix. So I decided to write something else, a Telegram bot, to go deeper and actually understand Elixir and the BEAM (the Erlang virtual machine).

Fortunately I found 4 or 5 days of free time (from both business and family). So I decided to do a sandboxed, four-day-long Living-With-Elixir. That was very helpful, and this is what I got out of it.

How to start thinking in Elixir

There are three concepts that we need to keep in mind:

  1. Our code snippets/blocks have no idea of other code snippets, even inside the very same file. They are not aware of each other’s existence. There’s just messaging: they communicate by sending messages. Each snippet runs inside a process, something like a green thread, coroutine or goroutine. Every process has a unique process id, or pid, so we can send messages to it.

  2. We do not modify state. There is no such thing as mutable state. Instead, we just pick up our state, do something with it, generate a new state (since there is no mutation) and recur with the new state (which becomes the old state of the next recursion). In short, we keep our state using recursion and modify it using recursion and messaging.

    While processing state, we can receive messages.

  3. OTP seems confusing to people, for some reason that I don’t understand. When we recursively modify our state, we may receive messages at each recursion. Instead of writing the recursive code and handling all sorts of messages ourselves, we just write a bunch of callbacks and hand them over to OTP. Depending on what we need, OTP will make a server out of those callbacks, or a state machine, or something else, and it provides us with lots of extra goodies too.

As a developer, your mind will surely see things differently later, and you may even find some inaccuracies here, but this way of thinking served me very well; it helped me get started relatively fast and led me to actual results.

So our app consists of three concepts: a swarm of running snippets, recursive functions, and OTP, which is a set of callbacks to be injected inside a recursive function.
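To make the first two ideas concrete, here is a minimal sketch of a process that keeps its state by recursion and changes it only in response to messages. The module name Counter and the message shapes {:add, n} and {:get, from} are made up for this illustration; they are not part of the bot.

```elixir
defmodule Counter do
  # The current state is simply the argument of the recursive function.
  def loop(count) do
    receive do
      {:add, n} ->
        # "Modify" the state by recurring with a new value.
        loop(count + n)

      {:get, from} ->
        # Reply to the asking process, then keep the same state.
        send(from, {:count, count})
        loop(count)
    end
  end
end

# Start the loop in its own process and talk to it through its pid.
pid = spawn(fn -> Counter.loop(0) end)
send(pid, {:add, 2})
send(pid, {:add, 3})
send(pid, {:get, self()})

count =
  receive do
    {:count, n} -> n
  after
    1_000 -> :timeout
  end

IO.inspect(count, label: "count")
```

The receive block inside loop/1 is exactly the part that OTP takes off our hands when we switch to callbacks.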

The rest is all familiar: functions and the like, and tooling of course; a hassle-free one, mix!

Creating something

Let’s create a new app:

$ mix new acrobot --sup --module Acrobot

Installing Elixir is straightforward. And mix is a build tool that comes with Elixir.

Here we tell mix to create a new app for us, named acrobot. The --module Acrobot option tells mix to create a module named Acrobot, where we will put our code, and --sup means our application will be created with a supervision tree. If you do not know about something, don’t worry. These are pretty simple concepts. Just pretend you know about them. For example, remember that swarm of code snippets we spoke of? The swarm of processes? A supervisor is simply a process that monitors and supervises other processes. We can tell it to restart them if they fail, or to start new ones so we can talk to them and tell them what to do. By the way, this way a failing process does not have to take your app down with it! Of course we have the option to make it crash, if one might like to do so; who knows!

This is what mix built for us:

[Image: Elixir Start 1]

Not much yet and, like any other programming endeavor, we will add stuff as we go. We are using the Atom editor. To open a directory as a project, enter $ atom -n acrobot/. The packages used for Elixir programming in Atom are:

[Image: Elixir Start 2]

The next step is accessing the Telegram Bot API. For that there is a nice Elixir package named nadia. Assuming you have already created your bot and have your bot access token (How do I create a bot?), first we add the dependency, in this case the nadia package; inside the file mix.exs, add it like this:

defp deps do
  [{:nadia, "~> 0.4.2"}]
end

Then run $ mix deps.get from inside the project directory. This fetches our dependencies, and mix always provides clear instructions about what to do if something did not go as expected.

Now head over to file config/config.exs and set the bot token you’ve got when creating your bot:

config :nadia,
  token: "bot token"

Other options are also available for nadia, like connection timeout, environment variable and the like.

And finally, since almost all packages are some sort of supervised service, we have to tell our app to start nadia when it starts. Head back to mix.exs and add :nadia to the list of applications (you could read that as services) that should be started at startup:

def application do
  # Specify extra applications you'll use from Erlang/Elixir
  [extra_applications: [:logger, :nadia],
   mod: {Acrobot.Application, []}]
end

As you see, the other application that will start is :logger, which we will use to log things. To start a REPL, type $ iex -S mix. Some compilation will take place (a bit longer the first time), and to see that we are actually connecting to the Telegram Bot API, we call the Nadia.get_me function:

iex(1)> Nadia.get_me
{:ok,
 %Nadia.Model.User{first_name: "MyAcrobot", id: 123456789, last_name: nil,
  username: "MyAcrobotBot"}}

To see if there are any updates (new messages) we call the Nadia.get_updates function:

iex(2)> Nadia.get_updates limit: 5
{:ok,
 [%Nadia.Model.Update{callback_query: nil, chosen_inline_result: nil,
   edited_message: nil, inline_query: nil,
   message: %Nadia.Model.Message{audio: nil, caption: nil,
    channel_chat_created: nil,
    chat: %Nadia.Model.Chat{first_name: "Kaveh", id: 123456789,
     last_name: "Shahbazian", title: nil, type: "private", username: "idc0d"},
    contact: nil, date: 1490177869, delete_chat_photo: nil, document: nil,
    edit_date: nil, entities: nil, forward_date: nil, forward_from: nil,
    forward_from_chat: nil,
    from: %Nadia.Model.User{first_name: "Kaveh", id: 123456789,
     last_name: "Shahbazian", username: "idc0d"}, group_chat_created: nil,
    left_chat_member: nil, location: nil, message_id: 3085,
    migrate_from_chat_id: nil, migrate_to_chat_id: nil, new_chat_member: nil,
    new_chat_photo: [], new_chat_title: nil, photo: [], pinned_message: nil,
    reply_to_message: nil, sticker: nil, supergroup_chat_created: nil,
    text: "Aloha!", venue: nil, video: nil, voice: nil}, update_id: 45285403}]}

As you see, we got the message Aloha!

The Bot

First let’s add another dependency, credo, which helps us write clean Elixir code:

defp deps do
  [
    {:nadia, "~> 0.4.2"},
    {:credo, "~> 0.7", only: [:dev, :test]}
  ]
end

To get this dependency too, we run $ mix deps.get again. To see what credo would suggest to improve our code quality, we just run $ mix credo.

[Image: Credo Output]

(from credo GitHub page)

Since we do not have much code yet, credo will not tell us much about the codebase - yet.

What would we like our bot to do? We could go for a full-blown bot that has a separate agent to handle each user concurrently (probably containing a state machine handling some logic) and the like. In that case we would have to manage/supervise our swarm of users and monitor them in case they get disconnected or crash. And maybe we would add a database for tracking our users, sending them different messages occasionally or providing them with data of interest to them. We should also be able to answer callback requests or even inline requests.

But for now we just provide the server time to each user that connects to our bot.

From this point forward, the only remaining concepts besides those three essential ones are normal, mostly familiar, sequential Elixir syntax. We use tuples, lists and maps, and special forms of map like structs/records.

Some might not have encountered pattern matching before. It’s just a technique for taking the structure of some data apart, decomposing it into child elements and then using those elements. We can still access “members” of a struct using dot syntax (like myObject.member in JavaScript). But pattern matching gives us the opportunity to say: if an object has a member named member, then put it inside a variable and give it to me! It is very much like duck typing, but instead of using just the names (of members), we use the structure of the data, like structural typing in TypeScript and interfaces in Go.
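A small sketch may help here. The map below is a made-up, heavily simplified stand-in for an incoming message; it shows dot access, destructuring with the match operator, and what happens when the shape does not fit.

```elixir
# A map shaped like a (much simplified) incoming message.
msg = %{chat: %{id: 42, type: "private"}, text: "Aloha!"}

# Dot access, as in other languages.
dot_id = msg.chat.id

# Pattern matching: describe the shape and bind the parts we care about.
# The pattern only has to name the keys it needs; extra keys are ignored.
%{chat: %{id: chat_id}, text: text} = msg

# When the shape does not fit, `case` moves on to the next pattern.
result =
  case %{text: "no chat here"} do
    %{chat: %{id: id}} -> {:ok, id}
    _other -> :no_chat
  end

IO.inspect({dot_id, chat_id, text, result})
```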

Getting the chat_id

Create file /lib/acrobot/bot.ex. This is a function that gives us the chat id:

defp get_chat_id(%{:chat => %{:id => id}}) when id != nil do
  {:ok, id}
end

defp get_chat_id(msg) do
  Logger.warn "unknown #{inspect msg}"
  {:unknown_message}
end

defp means we are defining get_chat_id as a private function inside this module. The %{:chat => %{:id => id}} is a pattern for the argument. The argument should match this pattern, which means it should be a map (%{}) that contains the :chat key, and the value of that key should be another map containing an :id key. The value of :id is then bound to the identifier id. As you see, we can put guards on our arguments, like when id != nil. The get_chat_id function returns a tuple {:ok, id}, which says :ok, here is the id.

Did we define get_chat_id twice? While that would be meaningless in many other programming languages, here it has a very interesting meaning. Those are different clauses of the same get_chat_id function! As you see, the second clause has no restriction on its argument. So any data that does not match the pattern declared in the first clause of get_chat_id goes to the second clause, and there we log it as a warning, since there should not be an argument that does not match our desired pattern!
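To see the clauses being tried in order, here is a stand-alone sketch. ChatIdDemo is a hypothetical module for experimenting; the function is public (def) only so that it can be called from outside, and the logging is left out.

```elixir
defmodule ChatIdDemo do
  # First clause: only matches a map with a non-nil id under :chat.
  def get_chat_id(%{:chat => %{:id => id}}) when id != nil do
    {:ok, id}
  end

  # Second clause: catches everything the first one rejected.
  def get_chat_id(_msg) do
    {:unknown_message}
  end
end

matched = ChatIdDemo.get_chat_id(%{chat: %{id: 123}, text: "hi"})
# The pattern fits, but the guard fails, so the second clause is used.
nil_id = ChatIdDemo.get_chat_id(%{chat: %{id: nil}})
# No :chat key at all, so the pattern itself does not fit.
no_chat = ChatIdDemo.get_chat_id(%{text: "hi"})

IO.inspect({matched, nil_id, no_chat})
```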

Last update_id

We also need to tell Telegram that we have already got the updates up to some point and that we are interested in the messages/updates that came after that. By passing an update_id to Telegram we tell it after which point it should fetch updates for us. For that we define a find_max_id function, which gives us the biggest update_id in a list of updates.

defp find_max_id(max_id, []) do
  max_id
end

defp find_max_id(max_id, [h|t]) do
  id =
    if h.update_id > max_id do
      h.update_id
    else
      max_id
    end
  find_max_id id, t
end

Again we use multi-clauses to have a clear declaration of our function. In the first clause we say if the list of updates is empty ([]) then just return the previous max_id. Easy to read and understand! In the second clause we say if the list of updates is not empty and has a head item (h), then put the rest of the list inside t (employing [h|t] pattern). If the list had just one item, then t would match as empty list []. Body of the second clause is clear. We pick the largest of h.update_id and max_id, then we continue our search inside the rest of the list (t as in tail) by calling find_max_id recursively.
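For comparison, the same search can be written with Enum.reduce/3 from the standard library. This is only an alternative sketch, not the code the bot uses; MaxIdDemo is a hypothetical module name and the updates are plain maps standing in for Nadia structs.

```elixir
defmodule MaxIdDemo do
  # Walk the list once, carrying the largest update_id seen so far.
  def find_max_id(max_id, updates) do
    Enum.reduce(updates, max_id, fn update, acc ->
      max(update.update_id, acc)
    end)
  end
end

updates = [%{update_id: 7}, %{update_id: 12}, %{update_id: 9}]

with_updates = MaxIdDemo.find_max_id(0, updates)
# An empty list just gives back the previous max_id.
empty = MaxIdDemo.find_max_id(5, [])

IO.inspect({with_updates, empty})
```

The hand-written version has the advantage of showing the head/tail recursion explicitly, which is the point of this section.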

We define a helper function to append new updates to the list of previous updates (our state) and at the same time find max update_id:

defp append_update(max_id, old_list, new_list) do
  res = old_list ++ new_list
  {find_max_id(max_id, res), res}
end

Answer incoming messages

The Telegram Bot API calls an incoming message an update. We just want to tell the date and time as an answer to the user. So we write a function and pass all incoming messages to it (we are ignoring many details here, such as Telegram’s limit of at most 30 outgoing messages per second):

defp answer_incoming([]) do
end

defp answer_incoming([h|t]) do
  case get_chat_id(h.message) do
    {:ok, chat_id} ->
      dt = :calendar.local_time()
      Nadia.send_message chat_id, "#{inspect dt}", parse_mode: :html
    err ->
      Logger.error "#{inspect err}"
  end
  answer_incoming t
end

We read it as, if the list of incomings is empty then do nothing, else, try to get the chat id, and if you got it, send back the date and time, then do the same for the rest of the incomings.

V1 No OTP

To better understand how OTP works, we start with a simple process. What can we do with processes? We can start them (which in the Elixir world is called spawning), we can send messages to them and we can receive messages from them. We can also wire processes together, which is called linking, so that if one process fails, the other one fails too, or gets notified, depending on how we linked them. This is the basis of the fantastic supervision trees that make a robust, fault-tolerant app!
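A minimal sketch of both kinds of wiring, using only functions from Kernel and Process. The exit reasons :boom and :oops are arbitrary values picked for the illustration.

```elixir
# Monitoring: we get a :DOWN message when the other process ends.
{pid, ref} = spawn_monitor(fn -> exit(:boom) end)

down_reason =
  receive do
    {:DOWN, ^ref, :process, ^pid, reason} -> reason
  after
    1_000 -> :timeout
  end

# Linking: by default a crash in a linked process would take us down too.
# Trapping exits turns the crash into an ordinary message instead,
# which is what supervisors rely on.
Process.flag(:trap_exit, true)
linked = spawn_link(fn -> exit(:oops) end)

exit_reason =
  receive do
    {:EXIT, ^linked, reason} -> reason
  after
    1_000 -> :timeout
  end

IO.inspect({down_reason, exit_reason})
```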

To schedule getting updates from Telegram and also keeping track of last update id (the offset), we add this function:

defp schedule_updates(offset, sleep_ms \\ 0) do
  s = self()
  Task.start(fn ->
    if sleep_ms > 0 do
      Process.sleep(sleep_ms)
    end

    # @rcvd_timeout (in seconds) is a module attribute defined at the top of Acrobot.Bot
    res = Nadia.get_updates offset: offset, timeout: @rcvd_timeout
    case res do
      {:ok, updates} ->
        send(s, {:updates, updates})
      _ ->
        send s, res
    end
  end)
end

Task.start just calls the builtin function spawn to create a process, with some ceremonial helper properties that we ignore for now. self() is the pid (process id) of the process we are currently running inside, very much like this or self in object-oriented programming languages. We capture it in s before starting the task, because inside the task’s function self() would be the task’s own pid.
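The same pattern can be tried without any network access. In this sketch fake_fetch is a hypothetical stand-in for Nadia.get_updates that always returns two made-up updates; everything else follows the shape of schedule_updates.

```elixir
# Capture our own pid before starting the task.
parent = self()

# Stand-in for Nadia.get_updates: no network involved, just a fixed answer.
fake_fetch = fn _offset -> {:ok, [%{update_id: 1}, %{update_id: 2}]} end

{:ok, _task_pid} =
  Task.start(fn ->
    case fake_fetch.(0) do
      {:ok, updates} -> send(parent, {:updates, updates})
      other -> send(parent, other)
    end
  end)

# The task runs concurrently; its result arrives in our mailbox.
received =
  receive do
    {:updates, updates} -> updates
  after
    1_000 -> :timeout
  end

IO.inspect(received)
```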

First version of our handler would be:

def handle_v1({:updates, []}, state) do
  schedule_updates(0)
  {:noreply, state}
end

def handle_v1({:updates, updates}, state) do
  l = state
  {max_id, res} = append_update 0, l, updates

  answer_incoming res

  schedule_updates(max_id + 1)
  # if succeeded, state would be empty list
  state = []
  {:noreply, state}
end

And our main recursive loop would be:

def handle_v1_loop(state) do
  state =
    if state == :start do
      schedule_updates(0)
      []
    else
      state
    end
  receive do
    {:updates, []} ->
      {:noreply, state} = handle_v1({:updates, []}, state)
      handle_v1_loop(state)
    {:updates, updates} ->
      {:noreply, state} = handle_v1({:updates, updates}, state)
      handle_v1_loop(state)
  end
end

To run our app, go to command line and type $ iex -S mix. The Elixir REPL will start, running our application. To start the main loop, inside the REPL, enter:

iex(1)> spawn fn -> Acrobot.Bot.handle_v1_loop(:start) end

Now if you send something to this bot, it will return the current date and time. As you see, inside the body of the handle_v1_loop(...) function there are several concerns. First, we have to initialize the starting state. Second, we need to handle different kinds of received messages, using different handlers. This is OTP! Instead of stuffing everything inside a recursive loop, OTP will create that recursive loop for us. We just have to hand over a bunch of callbacks to OTP for initialization and for handling different kinds of messages. It’s very much like implementing an interface in OOP languages.

The way our code is written, it’s not fault tolerant: if the loop fails, nothing restarts it, and the bot simply stops answering.

Let’s instead just write some callbacks and let the OTP take over controlling things.

V2 OTP

Inside the file /lib/acrobot/application.ex resides the root supervisor of our application, a bit like the main function in other languages, just far more powerful. First let’s add a supervisor for all Tasks in the application. Fortunately there is a builtin one, which we add to the children inside application.ex (the Acrobot.Application module).

children = [
  # Starts a worker by calling: Acrobot.Worker.start_link(arg1, arg2, arg3)
  # worker(Acrobot.Worker, [arg1, arg2, arg3]),

  supervisor(Task.Supervisor, [[name: Acrobot.TaskSupervisor]])
]

Here we added Task.Supervisor to the children list of our application supervisor. Yeah! Supervisors can supervise other supervisors too! Acrobot.TaskSupervisor is just the name we register this supervisor under, for convenience.

Now we redefine schedule_updates function to take advantage of this supervisor:

defp schedule_updates(offset, sleep_ms \\ 0) do
  s = self()
  Task.Supervisor.start_child(Acrobot.TaskSupervisor, fn ->
    if sleep_ms > 0 do
      Process.sleep(sleep_ms)
    end

    # @rcvd_timeout (in seconds) is a module attribute defined at the top of Acrobot.Bot
    res = Nadia.get_updates offset: offset, timeout: @rcvd_timeout
    case res do
      {:ok, updates} ->
        send(s, {:updates, updates})
      _ ->
        send s, res
    end
  end)
end

And the callbacks that we will pass to OTP, inside /lib/acrobot/bot.ex:

defmodule Acrobot.Bot do
  use GenServer
  require Logger

  @moduledoc """
  Our fantastic bot!
  """

  # the delays are passed to Process.sleep/1, which expects milliseconds
  @error_delay  15_000 # milliseconds
  @start_delay  5_000  # milliseconds
  @rcvd_timeout 30     # seconds

  def start_link() do
    GenServer.start_link(__MODULE__, [], [])
  end

  ## server callbacks

  def init(state) do
    schedule_updates(0, @start_delay)
    {:ok, state}
  end

  def handle_info({:failed, update}, state) do
    l = state
    state = l ++ [update]
    {:noreply, state}
  end

  def handle_info({:updates, []}, state) do
    schedule_updates(0)
    {:noreply, state}
  end

  def handle_info({:updates, updates}, state) do
    l = state
    {max_id, res} = append_update 0, l, updates

    answer_incoming res

    schedule_updates(max_id + 1)
    # if succeeded, state would be empty list
    state = []
    {:noreply, state}
  end

  def handle_info({:error, msg}, state) do
    Logger.error "#{inspect msg}"
    schedule_updates(0, @error_delay)
    {:noreply, state}
  end

  def handle_info(msg, state) do
    Logger.warn "unknown #{inspect msg}"
    schedule_updates(0, @error_delay)
    {:noreply, state}
  end

  ## helpers

  # helpers we've added so far go here ...
end

As you see, we are implementing a behaviour (which is like an interface in OOP); use GenServer means this module implements the GenServer behaviour. start_link starts this module as a new process and links it to the caller, init does the initialization and handle_info handles (receives) the updates. There are other handlers we can add that have different uses, which we can read about in the GenServer docs and samples. Read the docs!
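As a rough sketch of those other handlers, here is a tiny GenServer that uses handle_cast and handle_call next to handle_info. CounterServer and its messages are invented for this illustration and have nothing to do with the bot.

```elixir
defmodule CounterServer do
  use GenServer

  def start_link(initial) do
    GenServer.start_link(__MODULE__, initial)
  end

  ## server callbacks

  def init(initial), do: {:ok, initial}

  # cast: fire and forget, the sender gets no reply
  def handle_cast({:add, n}, count), do: {:noreply, count + n}

  # call: the caller waits until we reply
  def handle_call(:get, _from, count), do: {:reply, count, count}

  # info: any plain message delivered with send/2
  def handle_info({:add, n}, count), do: {:noreply, count + n}
end

{:ok, pid} = CounterServer.start_link(0)
GenServer.cast(pid, {:add, 2})
send(pid, {:add, 3})
total = GenServer.call(pid, :get)

IO.inspect(total, label: "total")
```

There is no receive and no explicit recursion here; OTP runs the loop and passes the state from one callback to the next.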

And we have to add it to the children of our root supervisor - our Application - inside application.ex - the Acrobot.Application module.

children = [
  # Starts a worker by calling: Acrobot.Worker.start_link(arg1, arg2, arg3)
  # worker(Acrobot.Worker, [arg1, arg2, arg3]),

  supervisor(Task.Supervisor, [[name: Acrobot.TaskSupervisor]]),
  worker(Acrobot.Bot, [])
]

Now if anything crashes beyond our expectations, we will see it in the log, and our processes will simply get restarted (and yes, there are options one can modify!).
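The restart can be watched in a stand-alone script. Note the assumptions: this sketch uses the child specification style of Elixir 1.5 and later (module names and {module, options} tuples) instead of the worker/supervisor helpers shown above, and RestartDemo.Worker is a hypothetical do-nothing server.

```elixir
defmodule RestartDemo.Worker do
  use GenServer

  # `use GenServer` also generates child_spec/1, which calls start_link/1.
  def start_link(_arg) do
    GenServer.start_link(__MODULE__, :ok, name: __MODULE__)
  end

  def init(:ok), do: {:ok, :ok}
end

children = [
  {Task.Supervisor, name: RestartDemo.TaskSupervisor},
  RestartDemo.Worker
]

{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)

old_pid = Process.whereis(RestartDemo.Worker)
ref = Process.monitor(old_pid)

# Simulate a crash "beyond our expectations".
Process.exit(old_pid, :kill)

receive do
  {:DOWN, ^ref, :process, ^old_pid, _reason} -> :ok
after
  1_000 -> :timeout
end

# Give the supervisor a moment, then look the worker up by name again.
Process.sleep(100)
new_pid = Process.whereis(RestartDemo.Worker)
restarted? = is_pid(new_pid) and new_pid != old_pid

IO.inspect(restarted?, label: "restarted?")
```

With the :one_for_one strategy only the crashed child is restarted; the Task.Supervisor next to it is left alone.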

This was the essence of Elixir programming!

Conclusion

Of course the original bot I’ve created spawns new processes per user, does the bookkeeping of processes using gproc, and other things. But these three were the essential tools one needs to think in Elixir. I’ve also played with dialyzer, a fantastic tool that brings static code analysis to Elixir. Overall it was a pleasant experience, but I wasn’t going as fast as I’d like, due to:

  1. I’ve never implemented a large code base in a dynamic language. It was hard for me to find my way around. To be fair, Elixir was far less painful than the others (with some help from Atom).

  2. Finding things, libraries and functions was going slowly (My Side Problem - not familiar with ecosystem).

  3. Doing TDD - again, I’m not that accustomed to TDD in general (Except for main functionality, My Side Problem - not familiar with ecosystem).

  4. Multi-clause functions and debugging them; it was hard for me to find out which data is malformed or which clause is wrong - though it was amazing!

  5. Handling state; pulling state along by your teeth everywhere - getting used to GenServer could eliminate this (My Side Problem - not familiar with ecosystem).

  6. Finding the specs of functions - what do these parameters that start_link accepts mean in each case? (My Side Problem - not familiar with ecosystem)

  7. Unpleasant corner cases, like Task (and to some extent Agent) not being very cooperative when called from OTP - I did not expect that, and finding out what was wrong was not going fast enough (Probably My Side Problem - not familiar with ecosystem, but I do not like this).

The things I liked most: multi-clause functions, pattern matching, pipe function chains (F# developers love this and to some extent it resembles LINQ for C# developers) and :observer.start, which shows your app as a living organism before your eyes! The Elixir ecosystem is not that young, but you might find yourself in need of something that you have to create yourself, though you can use many existing Erlang libraries. I’ve used couchbeam for connecting to CouchDB without a hitch!

Complete code can be found here.