Looping structure and Pandas

7

I'm learning Python and, along with it, pandas and some data science tools. While doing the exercises from a book, I wrote the code below in IPython, but I get an error message when the block is executed:

for i in range(len(df1)):
    if (df1['Temperature'][i]-df1['Temperature'][i-1]) > 0.1:
        print (df1['Temperature'][i])

Traceback (most recent call last):

File "<ipython-input-140-9f31dd23b324>", line 2, in <module>
    if (df1['Temperature'][i]-df1['Temperature'][i-1]) > 0.1:

  File "D:\Programas\Anaconda\lib\site-packages\pandas\core\series.py", line 766, in __getitem__
    result = self.index.get_value(self, key)

  File "D:\Programas\Anaconda\lib\site-packages\pandas\core\indexes\base.py", line 3103, in get_value
    tz=getattr(series.dtype, 'tz', None))

  File "pandas\_libs\index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value

  File "pandas\_libs\index.pyx", line 114, in pandas._libs.index.IndexEngine.get_value

  File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc

  File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item

  File "pandas\_libs\hashtable_class_helper.pxi", line 964, in pandas._libs.hashtable.Int64HashTable.get_item

KeyError: -1

Here df1 is a DataFrame and Temperature is one of its columns. The code is meant to compare each pair of consecutive values in that column and print the temperature when the difference between them exceeds 0.1. What am I doing wrong?


Did you read your error message? The last line is pretty unequivocal.

I wasn't relating it to my code; I thought it was an internal pandas message. My bad.

You should avoid chained indexing. df1['Temperature'][i] should be replaced with df1.loc[i, 'Temperature']. But it's irrelevant since DYZ's answer shows you how to properly use pandas to solve this problem without a loop.
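
To illustrate the label-based versus position-based access this comment touches on, here is a minimal sketch. The sample values are made up and assume the default integer index, which is what the KeyError: -1 in the traceback suggests.

```python
import pandas as pd

# Made-up sample data with the default RangeIndex (labels 0, 1, 2).
df1 = pd.DataFrame({"Temperature": [20.0, 20.05, 20.3]})

# Label-based access: 0 is a label in the index, -1 is not.
first = df1.loc[0, "Temperature"]
try:
    df1.loc[-1, "Temperature"]
    missing_label = False
except KeyError:
    missing_label = True  # same kind of KeyError as in the traceback

# Position-based access: -1 means "last row", as with a Python list.
last = df1["Temperature"].iloc[-1]

print(first, missing_label, last)
```

So -1 only works as "the last element" when the access is positional (.iloc); with label-based access it has to exist as a label in the index.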

2 Answers

8

In the statement below:

if (df1['Temperature'][i]-df1['Temperature'][i-1]) > 0.1:

when i is 0, i-1 is -1, so df1['Temperature'][i-1] looks up the label -1. The index has no such label, which is what the KeyError: -1 in the error message is telling you. One way to fix this is to start the range at 1. Row 0 is not skipped, because it is still read as the previous value (i-1) when i is 1. You can try:

for i in range(1, len(df1)):

Note: you mentioned comparing consecutive rows. If you don't care whether the temperature is increasing or decreasing, you can compare the absolute value of the difference instead.
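
Putting this together, a minimal runnable sketch of the loop starting at 1. The sample values are made up and stand in for the real df1; .iloc is my choice here so that the lookup is by position whatever the index labels are.

```python
import pandas as pd

# Made-up sample data; the real df1 comes from the book exercise.
df1 = pd.DataFrame({"Temperature": [20.0, 20.05, 20.3, 19.9, 20.5]})

rising = []   # temperatures that rose by more than 0.1
changed = []  # temperatures that moved by more than 0.1 in either direction

for i in range(1, len(df1)):  # start at 1 so that i - 1 is never -1
    current = df1["Temperature"].iloc[i]
    previous = df1["Temperature"].iloc[i - 1]
    if current - previous > 0.1:
        rising.append(current)
    if abs(current - previous) > 0.1:
        changed.append(current)

print(rising)
print(changed)
```

With this data, 20.3 and 20.5 count as rises, while 19.9 is only picked up by the abs() check because it is a drop.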

3

You should not use loops like that in Pandas. Pandas works best when your code is vectorized:

# shift(1) lines each row up with the previous one, matching [i] - [i-1] in the loop
big_difference = (df1["Temperature"] - df1["Temperature"].shift(1)) > 0.1
print(df1.loc[big_difference, "Temperature"])
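
The loop in the question subtracts the previous row from the current one, which is the difference Series.diff() computes. The following is a sketch of that variant on made-up sample values, not necessarily what the book intends.

```python
import pandas as pd

# Made-up sample data; the real df1 comes from the book exercise.
df1 = pd.DataFrame({"Temperature": [20.0, 20.05, 20.3, 19.9, 20.5]})

# diff() gives current minus previous; the first row is NaN,
# and NaN > 0.1 evaluates to False, so row 0 is never selected.
step = df1["Temperature"].diff()

rising = df1.loc[step > 0.1, "Temperature"]         # increases only
changed = df1.loc[step.abs() > 0.1, "Temperature"]  # either direction

print(rising)
print(changed)
```

The boolean mask keeps the original index, so the result shows which rows the jumps occurred at (rows 2 and 4 for the increases in this sample).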
