# Take row of a dataframe that add up to a given value

4074 views
-5

I have the following dataframe: For which a have to take users, from the id column, untill the total value add up to a given value, 14 for example. How can I chose the rows to satisfy this condition in an effitient way?

This is the data I'm using for the example:

``````{'id': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8, 8: 9, 9: 10},
'valor': {0: 5.690813772729765,
1: 5.502473982705203,
2: 7.341171631905721,
3: 6.792634352953639,
4: 3.3972025109972535,
5: 3.417867922325758,
6: 7.336228970419381,
7: 0.048008919685266216,
8: 2.365638019103776,
9: 0.9593678139592221}}
``````

Anything that you've tried, which didn't work?

I'm traying this with loops, but I'm not sure how to test all the cases and even then it is taking for ever to run.

Does it have to be exactly 14?

@LuisRamonRamirezRodriguez you should add the try in the question.

@Nathan The closest one "<="

If you need 'the closest one', you'd have to check every single possibility, which is 2**10 in your example (still doable), but in larger databases this is going to take forever... Why exactly do you need to do this?

@LuisRamonRamirezRodriguez - what is expected output?

3

You could do it like this:

``````total = 0
i = 0
while total < 14:
total += data['id'][i]
i += 1
print(total)
``````

It does not necessarily add up to exactly 14, but basing on your question, that is not necessary.

posted this
5

``````df = df.loc[df['valor'].cumsum() <= 14, 'id']
print (df)
0    1
1    2
Name: id, dtype: int64
``````

Details:

``````print (df['valor'].cumsum())
0     5.690814
1    11.193288
2    18.534459
3    25.327094
4    28.724296
5    32.142164
6    39.478393
7    39.526402
8    41.892040
9    42.851408
Name: valor, dtype: float64

print (df['valor'].cumsum() <= 14)
0     True
1     True
2    False
3    False
4    False
5    False
6    False
7    False
8    False
9    False
Name: valor, dtype: bool
``````

posted this