Searching two columns in a data frame

4265 views r
-1

I have a data frame with three columns. The first and second column include the name of places and the third include value. There are 50 unique places. I want to search similar combination in column 1 and Column 2 and and add corresponding values. Similar combinations, for example, include VillageA in Column 1 and VillageD in Column 2, and vice versa (VillageD in column 1 and VillageA in column 2).

Is there easiest way to do in R?

Reproducible examples:

value<-rnorm(6,0.5)
from<-c("VillageA","VillageB","VillageC","VillageD", "VillageB","VillageD")
to<-c("VillageD","VillageC", "VillageB","VillageA","VillageD","VillageB")
df<-data.frame(from,to,value)
df

   from       to             value
1 VillageA VillageD   1.8903532567673
2 VillageB VillageC 0.868595180019032
3 VillageC VillageB  1.47556560739867
4 VillageD VillageA  1.09236209542305
5 VillageB VillageD  1.17212213945941
6 VillageD VillageB   1.8903532567673

Expected result

from       to             value
VillageA VillageD   2.982715352 ## VillageD and VillageA 
VillageB VillageC   2.344160787 ## VillageC and VillageB
VillageB VillageD   3.062475396 ## VillageD and VillageB

There is no definitive combination (A--B or B--A).

answered question

1 Answer

10

Convert factors to character, create new columns in a consistent order by putting the alphabetically first village in one and the alphabetically last in the other, then sum by group:

df$from = as.character(df$from)
df$to = as.character(df$to)

df$a = pmin(df$from, df$to)
df$b = pmax(df$from, df$to)
aggregate(value ~ a + b, data = df, FUN = sum)
#          a        b      value
# 1 VillageB VillageC  0.6702636
# 2 VillageA VillageD  1.6532692
# 3 VillageB VillageD -1.2560672

posted this

Have an answer?

JD

Please login first before posting an answer.