2

I'm having a problem with pandas. I have a dataframe with three columns: 'id1','id2','amount'.

From this, I would like to create another dataframe which index is 'id1', which columns is 'id2', and the cells contain the corresponding 'amount'.

Let's go for an example:

import pandas as pd
df = pd.DataFrame([['first_person','first_item',10],['first_person','second_item',6],['second_person','first_item',18],['second_person','second_item',36]],columns = ['id1','id2','amount'])

which yields:

     id1              id2             amount
0    first_person     first_item      10
1    first_person     second_item     6
2    second_person    first_item      18
3    second_person    second_item     36

And from this I would like to create a second dataframe which is:

                 first_item    second_item
first_person     10            6
second_person    18            36

Of course, before posting I've worked on it for a time, but all I've managed to do for this is a double 'for loop'... Which for the size of my dataframes is nowhere to be computable. Would you know how to do this in a more pythonic way? (which would obviously be far more efficient than 'for' loops!)

Laurel
  • 5,965
  • 14
  • 31
  • 57
ysearka
  • 3,805
  • 5
  • 20
  • 41

1 Answers1

4

I think you can use pivot with rename_axis (new in pandas 0.18.0):

print df
             id1          id2  amount
0   first_person   first_item      10
1   first_person  second_item       6
2  second_person   first_item      18
3  second_person  second_item      36

print df.pivot(index='id1', columns='id2', values='amount')
        .rename_axis(None)
        .rename_axis(None, axis=1)

               first_item  second_item
first_person           10            6
second_person          18           36
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • It seems to fit perfectly what I need! Execpt for that renaming index problem. Indeed, the '.rename_axis(None)' method returns the following error: 'must pass an index to rename' (I am under Python 2.7 if it matters). – ysearka Apr 26 '16 at 12:44
  • And what is version of `pandas`? `print pd.show_versions()` – jezrael Apr 26 '16 at 12:45
  • 0.17.1 which explains why the rename_axis doesn't work. – ysearka Apr 26 '16 at 12:46
  • If you can update, use `df.index.name = None` and `df.columns.names = None` – jezrael Apr 26 '16 at 12:47
  • Without updating I found on another post: http://stackoverflow.com/questions/29765548/remove-index-name-in-pandas This works perfectly for erasing index and columns names. Thank you for the pivot trick, it helps me a lot! – ysearka Apr 26 '16 at 12:48