Convert the difference between two dates into an integer

I have a data frame (df) whose head looks like this:

        DATE  BBG  FLAG  DATE_DIFF
0 2014-03-04    A     0        NaT
1 2014-04-04    B     1    31 days
2 2014-04-07    C     0     3 days
3 2014-04-08    D     0     1 days
4 2014-04-09    E     0     1 days

The column DATE_DIFF is the difference between the date in a row and the date in the row above. I used the following to calculate DATE_DIFF:

df['DATE_DIFF'] = (df['DATE']-df['DATE'].shift(1))

I am looking for a way to convert the DATE_DIFF column to integer so that the resulting data frame looks like this:

        DATE  BBG  FLAG  DATE_DIFF
0 2014-03-04    A     0        NaN
1 2014-04-04    B     1         31 
2 2014-04-07    C     0          3 
3 2014-04-08    D     0          1 
4 2014-04-09    E     0          1 

I have tried to use:

df['DATE_DIFF'] = (df['DATE']-df['DATE'].shift(1)).astype(int)

but that just gives me the error:

TypeError: cannot astype a timedelta from [timedelta64[ns]] to [int32]

Any help would be much appreciated

Thanks

2 Answers

5

If you have two date objects, you can just subtract them. The result is a timedelta, and its days attribute is a plain integer.

from datetime import date

d0 = date(2008, 8, 18)
d1 = date(2008, 9, 26)
delta = d1 - d0
print(delta.days)  # 39

The relevant section of the docs: https://docs.python.org/library/datetime.html

13

You can do exactly what you are already doing; just add .dt.days to the result:

df['diff'] = (df['DATE']-df['DATE'].shift(1)).dt.days

        DATE  BBG  FLAG  diff
0 2014-03-04    A     0   NaN
1 2014-04-04    B     1  31.0
2 2014-04-07    C     0   3.0
3 2014-04-08    D     0   1.0
4 2014-04-09    E     0   1.0

Note that the dtype will be float, not int, because the first row is NaN and a regular NumPy integer column cannot hold a missing value.
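
If whole numbers are needed despite the missing first row, one option is to cast the result to pandas' nullable integer dtype "Int64" (capital I). This is only a sketch: it assumes a pandas version that supports nullable integers (0.24 or later), and the sample frame is rebuilt here from the head shown in the question.

```python
import pandas as pd

# Sample frame rebuilt from the head shown in the question
df = pd.DataFrame({
    "DATE": pd.to_datetime(
        ["2014-03-04", "2014-04-04", "2014-04-07", "2014-04-08", "2014-04-09"]
    ),
    "BBG": list("ABCDE"),
    "FLAG": [0, 1, 0, 0, 0],
})

# Same calculation as above: .dt.days gives float64 because row 0 is NaN
days = (df["DATE"] - df["DATE"].shift(1)).dt.days

# "Int64" is the nullable integer dtype; the missing first row becomes <NA>
df["DATE_DIFF"] = days.astype("Int64")

print(df)
print(df["DATE_DIFF"].dtype)
```

The first row stays missing (shown as <NA>) rather than being filled in. If a concrete value such as 0 makes sense for your data, days.fillna(0).astype(int) would give an ordinary int column instead.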
