# Having fun with Immutable Data Structures: Implementing a Markov Chain in Scala

--

In this article, I will describe a **simple** implementation of a Markov chain using Scala. In particular, I will show how to build a **purely functional** data structure, without the use of any mutable state or variables.

On the way, I will introduce the usage of the builder design pattern, and I will describe a simple method that trains automatically a Markov Chain from a training set.

This article will help in understanding immutable data structures, helping to reach the right **mindset** since these data structures are not so common in **object-oriented** programming like Java or C++.

Generally speaking, immutable objects are **easier/simpler** to reason about. Plus they make **concurrent** programming easier.

The knowledge of **Scala** is not mandatory, you can follow also if you have only a basic Java knowledge.

## What Is a Markov Chain?

Simply said, a Markov Chain is a **directed graph**. It is composed of nodes and directed edges, where each edge links a source node to a destination node.

Plus, each edge is labelled with a probability: this is the **probability of jumping** (or walking) from a current node to a next node. The only constraint is that given a node the **sum of the probabilities **of the outgoing edges should be** 1**.

For example in the above chain, if I’m currently sleeping I have the 60% probability of taking a run as next action, the 20% probability of remaining at sleep or the 20% probability of eating ice cream. The process of going from one node to the next is also called “*random walk*”. The probability of jumping to a particular next state depends only on the current step, the previous path does not help in choosing the next step.

This model helps to solve many mathematical problems (like queueing theory in networking), but also it has many practical applications like studying the **interaction of users with a website** or to implement **text suggestion/autocomplete **(given that you are typing “*good*” what is the probability that your next word will be “*morning*” or “*evening*”?).

# Implementing a chain builder

The first step is to design a builder for a **node** of the graph:

The value of the node is a **generic T value**, then we have a sequence of outgoing links (made up of output state and the probability of walking on that link). Instead of modifying a private field/variable, for each call to *linkedTo(to: T, prob: Probability) *the **copy** method will return a new instance of NodeBuilder, containing the old fields plus the new one updated.

When building the chain the probability constraints are **relaxed**. Only when transforming a *NodeBuilder* to a *MarkovNode* I’m checking that **the sum of the probability is 1**, this is ensured in the *toNode* method. The list of outgoing states is implemented using an **immutable list**-like data structure (appending an element to the list will return a new instance of the list without modifying the old instance. Note that the list is **not copied**, just the new first item points to previous immutable old list).

The chain builder maintains the state of all the nodes of the chain:

I’m using a map for each state *T* we have a related *NodeBuilder* containing the outgoing links. The method *linkTo(from: T, to: T, prob: Probability)* produce a new instance of *ChainBuilder* containing a new immutable *Map* (constructed in a similar way as immutable *Seq* described before, the new *Map* is constructed pointing to the old *Map* data when possible, and copying only a few pointers when needed. The scala documentation says that the **cost of creating a new immutable Map **adding a new item to an old

*Map*is

*effectively constant time**, but this might depend on some assumptions such as maximum length of a vector or distribution of hash keys*).

Finally, this is how the builder usage appears to the end-user:

Probabilities are represented as a rational number (*Probability*(3, 10) = 0.3) to avoid approximations when training the chain from a training set.

# The Markov chain implementation

Finally, the Markov chain implementation is trivial. Each node *T* is pointing to a list of nodes, represented by these classes:

The code here is not complete, the full implementation contains also some convenient methods (like random walks) and some validation in the constructor (ensuring that each destination node is also part of the graph, avoiding **dangling edges**).

# Training the Markov chain from a sequence of steps

I implemented also a utility method that trains a chain from a sequence of ordered steps. For each step, I’m simply counting all the occurrence of distinct next chain steps and using that count to generate the related probability of jump. For example, imagine to train a model from a given textbook:

The Markov chain will then contains for each state (a word) the probability of jumping to a next state (the next word in the phrase). I provided a **random walk** method that generates a random sequence of states (in this example a random English text).

This is a simple implementation, but on my (not-so-new) laptop, I can easily train a model from large **books** in a few seconds and then generating **random phrases** in English using a random walk.

Here you can find more examples and a **full implementation**: https://github.com/DarioBalinzo/Markov4s