I have six columns of data. The fourth column holds the same values as the first, but some values are missing. Using Python, how can I realign the fourth column (along with the fifth and sixth) so that matching values fall on the same row?

Sample data

255 12  0.1     255 12  0.1
256 13  0.1     259 15  0.15
259 15  0.15    272 18  0.12
272 18  0.12            
290 19  0.09            

Desired output

255 12  0.1     255 12  0.1
256 13  0.1     
259 15  0.15    259 15  0.15
272 18  0.12    272 18  0.12        
290 19  0.09            
  • Could you post sample data and expected output? – Zero Apr 08 '16 at 06:28
  • I guess there is a way to do that with the csv module, but maybe there is a more Pythonic way. – Whitefret Apr 08 '16 at 09:46
  • I was able to do this in Excel with the help of this thread: http://stackoverflow.com/questions/23136316/comparing-two-columns-in-excel-inserting-blank-rows-moving-associated-data, but a Python solution would be more convenient because I have to work on large data. – shikshaw reincarnates Apr 08 '16 at 10:00
  • I meant with the csv module, not in Excel. But that would mean reading from one CSV file and writing to another, and the code would take a few lines. – Whitefret Apr 08 '16 at 10:58
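
The csv-module route mentioned in the comments could look roughly like the sketch below. It assumes the first value of each block is unique and can serve as the lookup key, and it writes to an in-memory buffer instead of a real file. The names left_rows, right_rows, right_by_key and aligned_text are made up for illustration.

```python
import csv
import io

# Sample rows from the question, split into the two three-column blocks.
left_rows = [
    ("255", "12", "0.1"),
    ("256", "13", "0.1"),
    ("259", "15", "0.15"),
    ("272", "18", "0.12"),
    ("290", "19", "0.09"),
]
right_rows = [
    ("255", "12", "0.1"),
    ("259", "15", "0.15"),
    ("272", "18", "0.12"),
]

# Index the right-hand block by its first value (assumed to be unique).
right_by_key = {row[0]: row for row in right_rows}

# Write each left row followed by its match, or three empty cells if none.
out = io.StringIO()
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
for row in left_rows:
    match = right_by_key.get(row[0], ("", "", ""))
    writer.writerow(row + match)

aligned_text = out.getvalue()
print(aligned_text)
```

For real data, the two row lists would be read with csv.reader and the buffer replaced by an output file opened with newline="".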

1 Answer

You can try merge, using a left join so that every row of the first three columns is kept:

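import pandas as pd

# df holds the six columns from the question, labelled a to f
print(df)
     a   b     c      d     e     f
0  255  12  0.10  255.0  12.0  0.10
1  256  13  0.10  259.0  15.0  0.15
2  259  15  0.15  272.0  18.0  0.12
3  272  18  0.12    NaN   NaN   NaN
4  290  19  0.09    NaN   NaN   NaN

# left join: keep every (a, b) row and attach the matching (d, e, f) row
print(pd.merge(df[['a','b','c']],
               df[['d','e','f']],
               left_on=['a','b'],
               right_on=['d','e'],
               how='left'))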

     a   b     c      d     e     f
0  255  12  0.10  255.0  12.0  0.10
1  256  13  0.10    NaN   NaN   NaN
2  259  15  0.15  259.0  15.0  0.15
3  272  18  0.12  272.0  18.0  0.12
4  290  19  0.09    NaN   NaN   NaN
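
For a version that runs end to end, here is a minimal sketch that rebuilds the sample frame from the question. The column names a to f follow the output above. The variable names left, right and aligned, and the dropna step that discards the empty padding rows before merging, are my own assumptions and not a required part of the approach.

```python
import numpy as np
import pandas as pd

# Sample data from the question; NaN marks the missing cells in columns d-f.
df = pd.DataFrame({
    "a": [255, 256, 259, 272, 290],
    "b": [12, 13, 15, 18, 19],
    "c": [0.10, 0.10, 0.15, 0.12, 0.09],
    "d": [255, 259, 272, np.nan, np.nan],
    "e": [12, 15, 18, np.nan, np.nan],
    "f": [0.10, 0.15, 0.12, np.nan, np.nan],
})

left = df[["a", "b", "c"]]
# Drop the all-NaN padding rows so only real records take part in the join.
right = df[["d", "e", "f"]].dropna(how="all")

# A left join keeps every row of `left`, in its original order, and fills
# d, e, f with NaN where no (a, b) == (d, e) match exists.
aligned = pd.merge(left, right,
                   left_on=["a", "b"], right_on=["d", "e"],
                   how="left")
print(aligned)
```

If the first column alone identifies a row, joining on left_on="a" and right_on="d" should give the same alignment for this sample.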
jezrael