Small gotcha when using negative indexing

Negative indexing is a common way in R to drop unwanted elements from a vector, or unwanted rows/columns from a matrix. For example, the code below drops the third column from the matrix M:

M <- matrix(1:9, nrow = 3)
M
#      [,1] [,2] [,3]
# [1,]    1    4    7
# [2,]    2    5    8
# [3,]    3    6    9

M[, -3]
#      [,1] [,2]
# [1,]    1    4
# [2,]    2    5
# [3,]    3    6
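
The same idea applies to vectors and to rows. The short sketch below is only an illustration; the vector v is a made-up example that does not appear elsewhere in this post:

```r
# v is a hypothetical example vector, used only for illustration
v <- c(10, 20, 30, 40)

# Drop the second element
v[-2]
# [1] 10 30 40

# Drop the first and fourth elements
v[-c(1, 4)]
# [1] 20 30

# Drop the first row of the matrix from the example above
M <- matrix(1:9, nrow = 3)
M[-1, ]
#      [,1] [,2] [,3]
# [1,]    2    5    8
# [2,]    3    6    9
```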

Now, let’s say we want to write a function with signature dropColumns(x, exclude) that takes in a matrix x and a vector of indices exclude, and returns a matrix without the columns listed in exclude. (The result in the code snippet above could be achieved with the call dropColumns(M, 3).) A natural attempt (without checking that the indices are within bounds) would be the following:

dropColumns <- function(x, exclude) {
  x[, -exclude, drop = FALSE]
}

Let’s try it out:

dropColumns(M, c(1, 2))
#      [,1]
# [1,]    7
# [2,]    8
# [3,]    9

What happens if exclude is the empty vector (i.e. we don’t want to drop any columns)? The code below shows two different ways of passing an empty vector, and neither of them gives the expected result!

dropColumns(M, c())
# Error in -exclude : invalid argument to unary operator

dropColumns(M, integer(0))
# [1,]
# [2,]
# [3,]
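
To see where each call goes wrong, it can help to look at the negated index on its own. The snippet below is a small diagnostic sketch; the name neg_null is just an illustrative variable:

```r
M <- matrix(1:9, nrow = 3)

# c() evaluates to NULL, and unary minus is not defined for NULL,
# so the error is raised before any indexing takes place.
is.null(c())
# [1] TRUE
neg_null <- tryCatch(-c(), error = function(e) conditionMessage(e))
neg_null  # holds the error message instead of a negated index

# Negating a zero-length integer vector gives a zero-length integer vector,
# so M[, -integer(0)] is the same request as M[, integer(0)]: select no columns.
identical(-integer(0), integer(0))
# [1] TRUE
dim(M[, -integer(0), drop = FALSE])
# [1] 3 0
```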

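Moral of the story: when doing negative indexing with variables, you need to check whether the variable is a vector of length zero. The first call fails because c() is NULL, which cannot be negated; the second returns no columns because -integer(0) is still integer(0), and an empty index selects nothing. The revised function below is one way to get around this issue: let me know if there are more elegant ways to achieve this!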

dropColumns <- function(x, exclude) {
  if (length(exclude) > 0) 
    x[, -exclude, drop = FALSE]
  else x
}

dropColumns(M, c())
#      [,1] [,2] [,3]
# [1,]    1    4    7
# [2,]    2    5    8
# [3,]    3    6    9

dropColumns(M, integer(0))
#      [,1] [,2] [,3]
# [1,]    1    4    7
# [2,]    2    5    8
# [3,]    3    6    9

dropColumns(M, c(1, 2))
#      [,1]
# [1,]    7
# [2,]    8
# [3,]    9
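
One possible alternative, sketched here under the assumption that exclude only ever holds positive column numbers, is to avoid negative indexing altogether and select the columns to keep with setdiff(). The name dropColumns2 is hypothetical and only used to keep it apart from the function above:

```r
# Hypothetical variant: compute the columns to keep instead of negating exclude.
# Assumes exclude contains positive column numbers (or is empty / NULL).
dropColumns2 <- function(x, exclude) {
  keep <- setdiff(seq_len(ncol(x)), exclude)
  x[, keep, drop = FALSE]
}

M <- matrix(1:9, nrow = 3)

# Both kinds of empty input return the full matrix
identical(dropColumns2(M, c()), M)
# [1] TRUE
identical(dropColumns2(M, integer(0)), M)
# [1] TRUE

dropColumns2(M, c(1, 2))
#      [,1]
# [1,]    7
# [2,]    8
# [3,]    9
```

A side effect of this approach is that indices in exclude that fall outside 1..ncol(x) are simply ignored, since they never appear in seq_len(ncol(x)).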

4 thoughts on “Small gotcha when using negative indexing”

  1. However, this does not have to do with negative indexing.

    > M
         [,1] [,2] [,3]
    [1,]    1    4    7
    [2,]    2    5    8
    [3,]    3    6    9
    > M[,c()]

    [1,]
    [2,]
    [3,]
    > M[,integer(0)]

    [1,]
    [2,]
    [3,]

    It has to do with what kinds of objects are valid to index a matrix (or vector)

    > c()
    NULL

    > (1:4)[c()]
    integer(0)

    Consider also:
    > (1:4)[NA]
    [1] NA NA NA NA

    > M[,NA]
         [,1] [,2] [,3]
    [1,]   NA   NA   NA
    [2,]   NA   NA   NA
    [3,]   NA   NA   NA

    • That’s true, although I would note that M[, c()] and M[, integer(0)] give results that one would expect, whereas the versions with negative indexing do not.

      • I’m not sure about that. In the case of c(), we have

        > c()
        NULL
        > -c()
        Error in -c() : invalid argument to unary operator

        The result is not due to a negative index, it’s due to an error that occurs before the index is passed to the matrix.

        With regard to integer(0), we have

        > integer(0)
        integer(0)
        > -integer(0)
        integer(0)
        > all.equal(-integer(0),integer(0))
        [1] TRUE

        So in this case the negative index and the positive index are actually the same index.

        R is indeed a strange and sometimes subtle language!

  2. I don’t know if this is really “more elegant”, but it can solve your problem:
    dropColumns <- function(x, exclude) {
      x[, -c(ncol(x) + 1, exclude), drop = FALSE]
    }

    dropColumns(M, c()) will return the whole M matrix; so will the call dropColumns(M, 0).

    The trick is that negative indexing with a column number that doesn’t belong to the object does nothing.
