Intro to Enumerables and Streams in Elixir

In an attempt to better understand Elixir, I have tasked myself with solving one of programming’s most notorious coding challenges: the Fibonacci number. However, before I can take a crack at solving Fibonacci, I need to develop some basic knowledge of two core modules in Elixir: the Enum and Stream modules.

Since we’ll be covering a lot of ground, I decided to split this undertaking into three separate blog posts. First, I’m going to examine Enumerables and Elixir’s Enum module. Then, I’m going to discuss the advantages of Elixir’s Stream module and when it’s a good time to use it. And finally, I’m going to wrap it all up by combining these modules to solve the Even Fibonacci number problem on Project Euler.

If you’re like me and are in the process of learning Elixir, then you’re going to have to become pretty familiar with Elixir’s Enum module. According to Dave Thomas, the author of Programming Elixir, the Enum module is “the workhorse of collections.” Thomas goes even further to say that the “Enum module is probably the most used of all the Elixir libraries.”

Okay, so Enum is important. But what’s it actually capable of?

Before we dive into Enum, let’s take a step back and discuss what an Enumerable is. Simply put, an Enumerable is a data structure that can be iterated over. In Elixir, the collections we iterate over most often are lists, maps, and ranges. Iterating over a collection is great and everything, but we’re programmers; we want to be able to do something to the items we’re iterating over. This is where the Enum module comes in.
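To make "can be iterated over" concrete, here is a minimal sketch that hands three different collection types to the same Enum functions. The variable names are only illustrative:

```elixir
# Lists, ranges, and maps can all be passed to Enum functions.
list_count = Enum.count([10, 20, 30])
range_as_list = Enum.to_list(1..3)

# Iterating over a map yields {key, value} tuples.
map_as_list = Enum.to_list(%{a: 1, b: 2})

IO.inspect(list_count)     #=> 3
IO.inspect(range_as_list)  #=> [1, 2, 3]
IO.inspect(map_as_list)    #=> [a: 1, b: 2]
```

The last result is printed as a keyword list, which is simply how Elixir displays a list of {atom, value} tuples.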

The Enum module provides generic functions that can be applied to items in a collection. Say you have a list of digits from 1 to 3 and you want to multiply each digit by 2. With the help of the Enum.map function, you can simply iterate over each item in the list and perform the calculation:

Enum.map([1, 2, 3], fn (x) -> x * 2 end)
#=> [2, 4, 6]

The map function likely feels familiar, but the rest of the call may seem somewhat cryptic if it’s your first time reading Elixir code.

In most object-oriented programming languages, if you want to multiply each item in a collection by 2, you would call the map function (or method) on the collection itself. For example, in Ruby you would do:

[1,2,3].map { |x| x * 2 }
#=> [2, 4, 6]

Here, map is a method called on the array itself ([1,2,3]), and it returns a new array holding the updated values.

Elixir, being a functional programming language, keeps its data structures immutable, whereas Ruby’s can be changed in place (for example with map!). Because of this, Enum.map never modifies the collection you pass in. It builds a new list from the results of the function and returns that, without ever changing the original collection.

So, the list that was passed into Enum.map([1, 2, 3], fn (x) -> x * 2 end) is completely different from the list that is returned when the function is executed ([2, 4, 6]).
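A quick way to convince yourself of this is to bind the list to a variable first and inspect it after the call. This is just a sketch; the names original and doubled are made up for the example:

```elixir
original = [1, 2, 3]

# Enum.map returns a new list built from the function's results.
doubled = Enum.map(original, fn x -> x * 2 end)

IO.inspect(doubled)   #=> [2, 4, 6]
IO.inspect(original)  #=> [1, 2, 3]
```

The original binding still holds [1, 2, 3] after the call.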

map is just one of many Enum functions that you can use to operate on a collection in Elixir. If you have programming experience you’ll likely recognize many of Enum’s functions as they appear in most contemporary programming languages.

Here are a few other examples of common functions included in the Enum module:

# Sum 
Enum.sum([1,2,3])
#=> 6

# Concat 
Enum.concat([1,2,3], [4, 5, 6])
#=> [1, 2, 3, 4, 5, 6]

# Filter 
Enum.filter([1,2,3,4,5], &(rem(&1, 2) == 0))
#=> [2, 4]

# Reduce
Enum.reduce([1,2,3], fn(x, acc) -> x * acc end)
#=> 6

# Map 
Enum.map(%{1 => 1, 2 => 2, 3 => 3}, fn {_k, v} -> v * 2 end)
#=> [2, 4, 6]

The list goes on, and the more exposure you have to Elixir, the more you’ll find yourself expanding your Enum function repertoire (the official documentation for the Enum module lists every function).
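You may have noticed that the snippets above write anonymous functions in two ways: with fn ... end and with the & capture shorthand, where &1 stands for the first argument. Here is a small sketch showing that the two spellings behave the same (the variable names are only illustrative):

```elixir
# Long form: an explicit anonymous function.
double_fn = fn x -> x * 2 end

# Short form: the capture operator, with &1 as the first argument.
double_capture = &(&1 * 2)

with_fn = Enum.map([1, 2, 3], double_fn)
with_capture = Enum.map([1, 2, 3], double_capture)

IO.inspect(with_fn)                  #=> [2, 4, 6]
IO.inspect(with_fn == with_capture)  #=> true
```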

Let’s complicate things. Say I wanted to take a collection of numbers, multiply each item by itself, filter out the odd numbers, and then calculate the sum of what remains. We can certainly perform each function by itself, assign the return value to a variable, and then pass that variable into the subsequent function like so:

list = Enum.map(1..5, &(&1 * &1))
filtered_list = Enum.filter(list, &(rem(&1, 2) == 0))
Enum.sum(filtered_list)
#=> 20

Although this returns the correct result, it doesn’t seem very Elixir-esque. Do we really need to bind all these return values to variables? Each variable is only used once, as the first argument of the function that follows it. Instead of passing in a variable, let’s pass in the function call itself and skip the variable binding:

Enum.sum(Enum.filter(Enum.map(1..5, &(&1 * &1)), &(rem(&1, 2) == 0)))
#=> 20

Does this get us our solution? Sure. Does it hurt to look at? Most definitely. Programming is hard enough as it is; let’s not complicate it by writing code that’s hard to follow. Fortunately for us (and our eyes), Elixir provides an elegant solution to this problem.

The Pipe Operator

The pipe operator (|>) is a useful and ubiquitous operator in Elixir code. It passes the return value of one function as the first argument to the next function, allowing you to do what we did above, but without all that hideous nesting. To demonstrate, let’s concatenate two lists of numbers and then find the sum of the concatenated list. So instead of doing this:

Enum.sum(Enum.concat([1,2,3], [4,5,6]))
#=> 21

We could utilize the pipe operator and do this:

Enum.concat([1,2,3], [4,5,6]) |> Enum.sum
#=> 21

That’s all the pipe operator does: it takes the value of the expression on the left and passes it in as the first argument to the function call on the right. If that function takes only one argument (the piped value), you write the call with no arguments at all, as in Enum.sum above (often written with parentheses as Enum.sum()).
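When the function on the right takes more than one argument, the piped value fills the first slot and you supply the rest yourself. A minimal sketch, with variable names chosen just for this example:

```elixir
# These two lines are equivalent: the list becomes Enum.map's first argument.
nested = Enum.map([1, 2, 3], &(&1 * 2))
piped = [1, 2, 3] |> Enum.map(&(&1 * 2))

# Same idea with Enum.join/2: the list is piped in, "-" is the second argument.
joined = [1, 2, 3] |> Enum.join("-")

IO.inspect(nested == piped)  #=> true
IO.inspect(joined)           #=> "1-2-3"
```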

Now with the help of the pipe operator we can finally turn our code into something bearable to look at:

Enum.map(1..5, &(&1 * &1))
|> Enum.filter(&(rem(&1, 2) == 0))
|> Enum.sum
#=> 20

Much better.

One thing to note about the Enum module is that all of its functions are eager. This means that each function works through the whole collection and returns its result immediately. Elixir’s Stream module, on the other hand, is full of lazy functions. Instead of acting on the collection right away, a Stream describes the work to be done and only performs it when a result is actually requested. In my next blog post, I’ll take a deeper dive into what this actually means and why lazy functions are useful.
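As a small preview, and only as a sketch ahead of the fuller discussion, here is the difference in practice. The eager version has to finish with a finite range, while the lazy version can describe an endless sequence because nothing is computed until Enum.take asks for values:

```elixir
# Eager: Enum.map walks the whole range and returns a list right away.
eager_squares = Enum.map(1..3, &(&1 * &1))

# Lazy: Stream.iterate/2 describes the endless sequence 1, 2, 3, ...
# and Stream.map/2 describes squaring it. No squares are computed yet.
lazy_squares = Stream.iterate(1, &(&1 + 1)) |> Stream.map(&(&1 * &1))

# Only now are values produced, and only the three we ask for.
first_three = Enum.take(lazy_squares, 3)

IO.inspect(eager_squares)  #=> [1, 4, 9]
IO.inspect(first_three)    #=> [1, 4, 9]
```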