purrr: using %in% with a list-column


I have a column of question responses and a column of possible correct_answers. I'd like to create a third (logical) column (correct) to show whether a response matches one of the possible correct answers.

I think I need a purrr function, but I'm not sure how to combine one of the map functions with %in%.

library(tibble)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(purrr)

data <- tibble(
  response = c('a', 'b', 'c'),
  correct_answers = rep(list(c('a', 'b')), 3)
)

# works but correct answers specified manually
data %>%
  mutate(correct = response %in% c('a', 'b'))
#> # A tibble: 3 x 3
#>   response correct_answers correct
#>   <chr>    <list>          <lgl>  
#> 1 a        <chr [2]>       TRUE   
#> 2 b        <chr [2]>       TRUE   
#> 3 c        <chr [2]>       FALSE

# doesn't work
data %>%
  mutate(correct = response %in% correct_answers)
#> # A tibble: 3 x 3
#>   response correct_answers correct
#>   <chr>    <list>          <lgl>  
#> 1 a        <chr [2]>       FALSE  
#> 2 b        <chr [2]>       FALSE  
#> 3 c        <chr [2]>       FALSE

Created on 2018-11-05 by the reprex package (v0.2.1)


2 Answers

5

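%in% doesn't look inside the elements of a list; it matches each response against the list elements as whole objects. Use mapply (base R) or one of purrr's map2 functions to loop over the two columns in parallel and check each pair: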

data %>% mutate(correct = mapply(function (res, ans) res %in% ans, response, correct_answers))
# A tibble: 3 x 3
#  response correct_answers correct
#  <chr>    <list>          <lgl>  
#1 a        <chr [2]>       TRUE   
#2 b        <chr [2]>       TRUE   
#3 c        <chr [2]>       FALSE  

Or use map2_lgl, which returns a logical vector directly:

library(purrr)
data %>% mutate(correct = map2_lgl(response, correct_answers, ~ .x %in% .y))
# A tibble: 3 x 3
#  response correct_answers correct
#  <chr>    <list>          <lgl>  
#1 a        <chr [2]>       TRUE   
#2 b        <chr [2]>       TRUE   
#3 c        <chr [2]>       FALSE 

Or, as @thelatemail commented, both can be simplified by passing %in% directly as the function:

data %>% mutate(correct = mapply(`%in%`, response, correct_answers)) 
data %>% mutate(correct = map2_lgl(response, correct_answers, `%in%`))
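To make the difference concrete, here is a minimal base R sketch (no dplyr or purrr, so it runs in a bare session) that rebuilds the two columns from the question as plain vectors. The names naive, rowwise and rowwise_strict are only illustrative, and the vapply variant is an assumed alternative, not something taken from the answers above.

```r
# Same data as in the question, as plain vectors instead of tibble columns.
response <- c("a", "b", "c")
correct_answers <- rep(list(c("a", "b")), 3)

# Plain %in% matches each response against the list's elements as whole
# objects, so a single letter never matches a two-letter vector.
naive <- response %in% correct_answers

# Pairwise check: element i of response against element i of correct_answers.
# unname() drops the names that mapply() derives from the character input.
rowwise <- unname(mapply(`%in%`, response, correct_answers))

# vapply() variant that fails loudly if any check is not a single logical.
rowwise_strict <- vapply(
  seq_along(response),
  function(i) response[[i]] %in% correct_answers[[i]],
  logical(1)
)

print(naive)
print(rowwise)
print(rowwise_strict)
```

Inside mutate() the same idea applies unchanged, since the columns are passed to mapply or map2_lgl as ordinary vectors.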

2

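Try the unlist function. Note that this pools the correct answers from every row into one vector, so it only gives the right result when all rows share the same set of correct answers, as in this example: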

library(tibble)
library(dplyr)

data <- tibble(
    response = letters,
    correct_answers = rep(list(sample(letters, 4)), 26)
)

result = 
    data %>% 
    mutate(correct = response %in% unlist(correct_answers))
table(result$correct)
#FALSE  TRUE 
#   22     4 
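As a quick check of when the unlist approach and a row-by-row check disagree, the sketch below uses a hypothetical data set (not from the question) in which each row has its own set of correct answers. It uses base R only; pooled and rowwise are illustrative names.

```r
# Hypothetical data: each row has a different set of correct answers.
response <- c("a", "b", "c")
correct_answers <- list(c("a", "b"), c("c"), c("c", "d"))

# unlist() merges every row's answers into one pool: "a", "b", "c", "c", "d".
pooled <- response %in% unlist(correct_answers)

# Row-by-row check compares each response only with its own row's answers.
rowwise <- unname(mapply(`%in%`, response, correct_answers))

print(pooled)   # "b" counts as correct because it is valid for row 1
print(rowwise)  # "b" is rejected for row 2, whose only answer is "c"
```

If every row really does carry the identical answer set, the two results coincide and the unlist version is the shorter option.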
