# Apply a function to pairs of columns in a loop in R

My data look like this:

```
a1 <- runif(30, 1, 100)
b1 <- runif(30, 1, 100)
c1 <- runif(30, 1, 100)
a2 <- runif(30, 1, 100)
b2 <- runif(30, 1, 100)
c2 <- runif(30, 1, 100)
dframe <- data.frame(a1=a1, b1=b1, c1=c1, a2=a2, b2=b2, c2=c2)
```

I want to calculate the correlation between a1 and a2, b1 and b2, c1 and c2, but I'd like to do it in an efficient way, avoiding writing one line of code for each correlation. I tried to write a for loop but I did not succeed.


### 2 Answers

Here is one option:

```
lapply(split.default(dframe, sub("\\d+$", "", names(dframe))), cor)
#$a
#          a1        a2
#a1 1.0000000 0.1132033
#a2 0.1132033 1.0000000
#$b
#           b1         b2
#b1 1.00000000 0.09113974
#b2 0.09113974 1.00000000
#$c
#           c1         c2
#c1  1.0000000 -0.2066311
#c2 -0.2066311  1.0000000
```

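We split your data frame column-wise with `split.default`, grouping columns whose names are identical once the trailing digits are removed, and then iterate over the resulting list with `lapply`, calling `cor` on each group.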
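If only the single correlation per pair is needed rather than a full 2×2 matrix, the same split can be reduced to one number per group. This is a minimal sketch, assuming every group contains exactly two columns; the `set.seed` call and the names `groups` and `pair_cor` are illustrative and not part of the answer above.

```r
set.seed(1)  # assumption: fixed seed so the numbers are reproducible
dframe <- data.frame(a1 = runif(30, 1, 100), b1 = runif(30, 1, 100),
                     c1 = runif(30, 1, 100), a2 = runif(30, 1, 100),
                     b2 = runif(30, 1, 100), c2 = runif(30, 1, 100))

# Group columns by their name with the trailing digits stripped: a, b, c
groups <- split.default(dframe, sub("\\d+$", "", names(dframe)))

# One correlation per group; assumes exactly two columns in each group
pair_cor <- sapply(groups, function(g) cor(g[[1]], g[[2]]))
pair_cor
```

The result is a named numeric vector with one entry per prefix, which is usually easier to work with than a list of matrices.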

markus
posted this

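A base R idea:

```
sapply(unique(gsub('\\d+', '', names(dframe))), function(i)
  cor(dframe[grepl(i, names(dframe))]))
#              a          b           c
#[1,] 1.00000000  1.0000000  1.00000000
#[2,] 0.01987806 -0.2247265 -0.08667891
#[3,] 0.01987806 -0.2247265 -0.08667891
#[4,] 1.00000000  1.0000000  1.00000000
```

Each column is the 2×2 correlation matrix for one pair, flattened column by column, so rows 2 and 3 hold the correlation between the two columns of that pair. The values differ from the first answer because `runif` draws new random numbers on each run.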
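One thing to watch with this approach: `grepl(i, names(dframe))` matches the prefix anywhere in a column name, so with other column names a prefix like `a` could also select an unrelated column that merely contains that letter. The sketch below anchors the pattern and returns one correlation per pair. It assumes every prefix has exactly two columns; the `set.seed` call and the names `prefixes`, `cols`, and `pair_cor` are illustrative, not from the answer.

```r
set.seed(1)  # assumption: fixed seed so the numbers are reproducible
dframe <- data.frame(a1 = runif(30, 1, 100), b1 = runif(30, 1, 100),
                     c1 = runif(30, 1, 100), a2 = runif(30, 1, 100),
                     b2 = runif(30, 1, 100), c2 = runif(30, 1, 100))

# Unique column prefixes with the trailing digits stripped: "a", "b", "c"
prefixes <- unique(sub("\\d+$", "", names(dframe)))

pair_cor <- vapply(prefixes, function(p) {
  # Anchored pattern: the prefix followed only by digits, nothing else
  cols <- grep(paste0("^", p, "\\d+$"), names(dframe), value = TRUE)
  # Assumes exactly two matching columns per prefix
  cor(dframe[[cols[1]]], dframe[[cols[2]]])
}, numeric(1))
pair_cor
```

`vapply` with `numeric(1)` also fails loudly if a prefix does not yield a single number, which `sapply` would silently accept.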

Sotos
posted this
