Extract list from tuples and transpose in python

I have the dataframe given below. I want to extract the first list from each tuple in the list_of_topics column and transpose that list so that each topic id becomes a column.

import pandas as pd

data = {'Document_No':[0.0,1.0], 'list_of_topics': [
([(0, 0.14572892),
  (1, 0.014889247),
  (11, 0.44593897)],
 [(4, [0]), (5, [4]), (6, [11]), (7, [11]), (8, [11, 4]), (9, [11, 4])],
 [(4, [(0, 0.9999998)]),
  (7, [(11, 0.9999998)]),
  (9, [(4, 0.05520946), (11, 0.93936676)])]),
([(0, 0.2453892),
  (11, 0.78657897)],
 [(4, [0]), (5, [4]), (6, [11]), (7, [11]), (8, [11, 4]), (9, [11, 4])],
 [(4, [(0, 0.9999998)]),
  (7, [(11, 0.9999998)]),
  (9, [(4, 0.05520946), (11, 0.93936676)])])
]}

df = pd.DataFrame(data)

desired result:

  Document_No     0            1                 11
0          0.0  0.14572892  0.014889247     0.44593897
1          1.0  0.2453892   0               0.78657897

My solution:

pd.DataFrame([[j[0] for j in i] for i in df['list_of_topics']], index=df['Document_No']).transpose()
Out[245]: 
Document_No                    0.0                    1.0
0                  (0, 0.14572892)        (0, 0.14572892)
1                         (4, [0])               (4, [0])
2            (4, [(0, 0.9999998)])  (4, [(0, 0.9999998)])

I am not getting the desired result. Can anyone help me find what I am doing wrong?

1 Answer

You can keep only the required list of tuples in the column and use a regular expression to extract the values:

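# Keep only the first list of (topic, weight) tuples from each row
df.list_of_topics = df.list_of_topics.apply(lambda x: x[0])
df.list_of_topics = df.list_of_topics.astype(str)
# Raw string with an escaped dot, so only decimal numbers are matched
df[[0,1,2]] = df.list_of_topics.str.extractall(r'(\d+\.\d+)').unstack().astype(float)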

Out:

   Document_No                                     list_of_topics           0            1           2
0          0.0  [(0, 0.14572892), (1, 0.014889247), (11, 0.445...  0.14572892  0.014889247  0.44593897
1          1.0  [(0, 0.2453892), (1, 0.14076247), (11, 0.78657...   0.2453892   0.14076247  0.78657897
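
Note that the regex approach numbers the new columns by match position (0, 1, 2), not by topic id, so it will not line up when a document is missing a topic. The attempt in the question goes wrong for a different reason: `[j[0] for j in i]` takes the first element of each of the three lists, rather than taking the first list `i[0]` itself. A minimal sketch that works on the tuples directly, assuming every row's first element is a list of (topic_id, weight) pairs, could look like this. The names `topics` and `result` are only illustrative, and the sample data is a shortened stand-in for the data in the question:

```python
import pandas as pd

# Shortened stand-in for the question's data: only the first list in each
# tuple is used below, so the other two lists are trimmed.
data = {
    'Document_No': [0.0, 1.0],
    'list_of_topics': [
        ([(0, 0.14572892), (1, 0.014889247), (11, 0.44593897)],
         [(4, [0])], [(4, [(0, 0.9999998)])]),
        ([(0, 0.2453892), (11, 0.78657897)],
         [(4, [0])], [(4, [(0, 0.9999998)])]),
    ],
}
df = pd.DataFrame(data)

# x[0] is the first list of (topic_id, weight) pairs; dict() turns it into
# {topic_id: weight}, so each topic id becomes a column label.
topics = pd.DataFrame(
    [dict(x[0]) for x in df['list_of_topics']],
    index=df.index,
).fillna(0)  # topics absent from a document get weight 0

# Put Document_No back in front of the topic columns.
result = pd.concat([df[['Document_No']], topics], axis=1)
print(result)
```

Because the columns come from the dictionary keys, they are labelled 0, 1 and 11 as in the desired result, and a topic that is missing from a document is filled with 0 instead of shifting the other values.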
