A short note on the startsWith function

The startsWith function comes with base R, and determines whether entries of an input start with a given prefix. (The endsWith function does the same thing but for suffixes.) The following code checks if each of “ant”, “banana” and “balloon” starts with “a”:

startsWith(c("ant", "banana", "balloon"), "a")
# [1]  TRUE FALSE FALSE

The second argument (the prefix to check) can also be a vector. The code below checks if “ant” starts with “a” and if “ant” starts with “b”:

startsWith("ant", c("a", "b"))
# [1]  TRUE FALSE

Where things might get a bit unintuitive is when both arguments are vectors of length >1. Why do you think the line of code below returned the result it did?

startsWith(c("ant", "banana", "balloon"), c("a", "b"))
# [1] TRUE TRUE FALSE

This makes sense when we look at the documentation for startsWith‘s return value:

A logical vector, of “common length” of x and prefix (or suffix), i.e., of the longer of the two lengths unless one of them is zero when the result is also of zero length. A shorter input is recycled to the output length.

startsWith(x, prefix) checks if x[i] starts with prefix[i] for each i. In our line of code above, the function checks if “ant” starts with “a” and “banana” starts with “b”. Since x had length greater than prefix, we “recycle” prefix and check if “balloon” starts with “a”.

If you want to check if each x[i] starts with any prefix[j] (with j possibly being different from i), we could do the following:

x <- c("ant", "banana", "balloon")
prefix <- c("a", "b")
has_prefix <- sapply(prefix, function(p) startsWith(x, p))
has_prefix
#          a     b
# [1,]  TRUE FALSE
# [2,] FALSE  TRUE
# [3,] FALSE  TRUE

apply(has_prefix, 1, any)
# [1] TRUE TRUE TRUE
Advertisement

3 thoughts on “A short note on the startsWith function

    • From the documentation: startsWith() is equivalent to but much faster than

      substring(x, 1, nchar(prefix)) == prefix

      or also

      grepl(“^prefix”, x)

      where prefix is not to contain special regular expression characters (and for grepl, x does not contain missing values, see below).

      Liked by 2 people

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s