Two common mistakes with the colon operator in R

R has a colon operator which makes it really easy to define a sequence of integers. For example, the code 1:10 generates a vector consisting of the integers from 1 to 10 (inclusive). However, using the colon operator is not without its pitfalls! I will highlight two common mistakes here.

First, imagine that you have a variable n which has value 5. What do you think the following code prints out?

for (i in 1:n+1) print(i)

My first instinct is that it should print out the numbers 1, 2, …, 6 (inclusive), with one number on each line. Wrong! Instead, this is the output we get:

[1] 2
[1] 3
[1] 4
[1] 5
[1] 6

What is going on here? The problem is one of operator precedence. Just as multiplication and division come before addition and subtraction in arithmetic, in R the : operator comes before +. Hence, the code written above is interpreted as

for (i in (1:n)+1) print(i)

which is why the numbers 2 to 6 are printed out instead of the numbers 1 to 6. If we want to print the numbers 1 to n+1 inclusive, we need parentheses to enforce the intended order of evaluation:

for (i in 1:(n+1)) print(i)
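One way to convince yourself of how R parses these expressions is to compare the resulting vectors directly. The snippet below is only a small sketch; it assumes n is 5 as in the example above, and the variable names shifted, explicit and intended are made up for illustration.

```r
n <- 5

# `:` binds more tightly than `+`, so these two expressions are the same.
shifted  <- 1:n + 1      # parsed as (1:n) + 1
explicit <- (1:n) + 1

# Parentheses around n + 1 give the sequence we actually wanted.
intended <- 1:(n + 1)

print(shifted)                       # 2 3 4 5 6
print(identical(shifted, explicit))  # TRUE
print(intended)                      # 1 2 3 4 5 6
```

The same reasoning applies to subtraction: 1:n-1 is parsed as (1:n)-1, not 1:(n-1).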

Let’s move on to the second common mistake. Let’s say I have a vector vec and I want to print its elements one by one. The first instinct of most of us would be to write something like this:

for (i in 1:length(vec)) print(vec[i])

This works most of the time, but not all the time. Consider what happens when vec is an empty vector:

vec <- c()
for (i in 1:length(vec)) print(vec[i])

NULL
NULL

What happened here? The problem is that the colon operator can return a descending sequence of integers! In the code above, length(vec) has value 0, so 1:length(vec) is the same as c(1, 0). It prints out vec[1] and vec[0], which are both NULL.

To avoid this problem, use the seq_along function instead:

for (i in seq_along(vec)) print(vec[i])
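To see the difference concretely, the sketch below counts how many times each loop body runs on an empty vector. The counter names are hypothetical; seq_len is included as the base R counterpart for when you have a length rather than a vector.

```r
vec <- c()                 # c() returns NULL, which has length 0

print(1:length(vec))       # 1 0
print(seq_along(vec))      # integer(0)

# Count how many times each loop body runs.
count_colon <- 0
for (i in 1:length(vec)) count_colon <- count_colon + 1

count_seq <- 0
for (i in seq_along(vec)) count_seq <- count_seq + 1

print(c(count_colon, count_seq))   # 2 0

# seq_len() behaves the same way when you only have a length.
print(seq_len(0))          # integer(0)
```

Because seq_along returns a zero-length vector here, the loop body is skipped entirely instead of running twice.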

You may think that this is not really a big problem; after all, it only fails when we have an empty vector, right? Well, there are two responses to that. First, you don't want your code to ever do anything unintended. In this case the mistake was easy to catch, but it is much harder to catch when it is buried three levels deep in code that is thousands of lines long. Second, this mistake crops up more easily when you don't start from the first element of the vector: 2:length(vec), for example, also gives a descending sequence when vec has just one element.
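To illustrate the second response, here is a sketch of a loop that is meant to skip the first element. The guard shown, dropping the first index from seq_along, is just one possible approach and is not the only way to handle it; the name idx is made up for this example.

```r
vec <- c(10)               # one element, so there is nothing after the first

print(2:length(vec))       # 2 1  (descending, so a loop over it runs twice)

# One possible guard: drop the first index from seq_along().
idx <- seq_along(vec)[-1]
print(idx)                 # integer(0)

for (i in idx) print(vec[i])   # prints nothing

# With a longer vector the same idiom yields 2, 3, ...
print(seq_along(c(10, 20, 30))[-1])   # 2 3
```

With 2:length(vec), the loop would instead visit vec[2] (which is NA) and then vec[1], even though the intent was to skip the first element.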
