Chapter 10 Counting TRUE values in logical vectors

Welcome back to Quantitative Reasoning! In our previous tutorial, we wrote this snippet of R code to create a subset of the titanic data frame that contained only the rows associated with 40-year-old crew members.

titanic[titanic$age == 40 & titanic$class == "Crew", ]

How many 40-year-old crew members were there? Their number must be the same as the number of rows in the data frame we’ve just created. In tutorial 05, we learned that we can find the number of rows with nrow(). We simply pass the expression above as an argument to the nrow() function. I’m going to copy and paste the previous expression into the parentheses.

nrow(titanic[titanic$age == 40 & titanic$class == "Crew", ])
## [1] 26

We conclude that there were 26 crew members who were exactly 40 years of age. So far so good, but this last command looks really complicated. At the end of this tutorial, I will show you an alternative that is a little bit shorter. To get there, we must take a short detour and learn how R converts logical vectors into numeric vectors.

We noticed in tutorial 02 that we can’t use character objects in arithmetic operations.

"a" + 1
## Error in "a" + 1: non-numeric argument to binary operator

But, interestingly, we can use logical vectors in arithmetic operations without causing an error.

TRUE + 1
## [1] 2

R applies the following rules when it encounters a logical value in an arithmetic operation: TRUE gets converted to 1, and FALSE becomes 0. In programming, this silent conversion from one data type (here logical) to another (here numeric) is called “type coercion”. It happens not only when we use the + operator, but also when we use the sum() function. Applied to a logical vector v, sum() tells us how many elements in v are TRUE.

v <- c(TRUE, FALSE, FALSE, TRUE, TRUE)
sum(v)
## [1] 3
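
To make the coercion rule concrete, here is a small sketch that performs the conversion explicitly with as.numeric() and checks that sum() agrees with counting the TRUE elements by hand. The vector v is the same as above, repeated so that the snippet runs on its own. The name n_true is only chosen for illustration.

```r
v <- c(TRUE, FALSE, FALSE, TRUE, TRUE)

# Explicit conversion: TRUE becomes 1, FALSE becomes 0
as.numeric(v)
## [1] 1 0 0 1 1

# Arithmetic coerces logical values in the same way
TRUE + TRUE
## [1] 2
FALSE * 10
## [1] 0

# sum() adds up the 1s, so it counts the TRUE elements
n_true <- sum(v)
n_true
## [1] 3

# Same count, obtained by keeping only the TRUE elements
length(v[v])
## [1] 3
```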

Let’s return to our earlier question: how many 40-year-old crew members were on the Titanic? We can simply sum the elements in the logical vector titanic$age == 40 & titanic$class == "Crew".

sum(titanic$age == 40 & titanic$class == "Crew")
## [1] 26
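
If you would like to check that both approaches agree without loading the titanic data, the following sketch may help. It uses a small made-up data frame called toy, which is only a stand-in: its values, and the class label "1st", are assumptions for illustration and are not taken from the real data.

```r
# Hypothetical stand-in for the titanic data frame (made-up values)
toy <- data.frame(
  age   = c(40, 40, 25, 40, 31),
  class = c("Crew", "1st", "Crew", "Crew", "Crew")
)

# Logical vector: TRUE for rows that are 40-year-old crew members
is_40_crew <- toy$age == 40 & toy$class == "Crew"
is_40_crew
## [1]  TRUE FALSE FALSE  TRUE FALSE

# Counting by subsetting the rows and calling nrow()
n_by_nrow <- nrow(toy[is_40_crew, ])

# Counting by summing the logical vector directly
n_by_sum <- sum(is_40_crew)

n_by_nrow
## [1] 2
n_by_sum
## [1] 2
```

One caveat: if a column contained missing values (NA), the comparison could produce NA for some rows, and sum() would then return NA unless it is called with na.rm = TRUE. The result of 26 above shows that this does not happen for this particular condition.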

The command with the sum() function is shorter and more direct than our earlier version with nrow(). Can you use the sum() function in a similar manner to find out how many of these 40-year-old crew members survived the disaster and how many of them were male? Try it out yourself!

Here is a summary of the main points of this tutorial.

  • It’s possible to perform arithmetic operations with logical vectors.
  • In that case, TRUE is converted to 1, and FALSE gets transformed into 0.
  • We can find the number of TRUE elements in a vector with the sum() function.

Next time, we will learn how to conveniently summarise information in a data frame with contingency tables.

See you soon.