
I have a dataframe, something like:

|   | a | b                |
|---|---|------------------|
| 0 | a | {'d': 1, 'e': 2} |
| 1 | b | {'d': 3, 'e': 4} |
| 2 | c | NaN              |
| 3 | d | {'f': 5}         |
| 4 | d | {'e':8,'f': 5}   |
| 5 | d | {'e':9,'f': 5}   |
| 6 | d | {'f': 7}         |

I am using the following code from "How can I create column from dictionary keys in same dataframe?": `df.join(pd.DataFrame.from_records(df['b'].mask(df.b.isna(), {}).tolist()))`, and I am getting a result like:

|   | a | b                | d | e | f |
|---|---|------------------|---|---|---|
| 0 | a | {'d': 1, 'e': 2} | 1 | 2 |nan|
| 1 | b | {'d': 3, 'e': 4} | 3 | 8 |nan|
| 2 | c | NaN              |nan|nan|nan|
| 3 | d | {'f': 5}         |nan|nan| 5 |
| 4 | d | {'e':8,'f': 5}   |nan| 4 | 5 |
| 5 | d | {'e':9,'f': 5}   |nan|nan| 5 |
| 6 | d | {'f': 7}         |nan|nan| 7 |

Why are the values in `e` being allocated seemingly at random instead of to their adjacent rows? How can I solve this issue?

Thanks in advance!

abhi
  • The `b` column in the above and below do not match? Is that intentional? – Henry Ecker Jun 22 '21 at 04:35
  • @HenryEcker, I guess that's a typo – ThePyGuy Jun 22 '21 at 04:41
  • @HenryEcker not the b column, if you look at the expanded columns **e** and it's value in **b** – abhi Jun 22 '21 at 04:48
  • Right. But row 4 and 5 are `{'e':4,'f': 5}` and `{'e':2,'f': 5}` above and `{'e':8,'f': 5}` `{'e':9,'f': 5}` below. I don't know which data to start with to try and replicate the issue. – Henry Ecker Jun 22 '21 at 04:51
  • Maybe that should be the point of my question. I cannot replicate this behaviour, would you provide a [MRE](https://stackoverflow.com/help/minimal-reproducible-example) as a single contiguous block of code that can be copied into a clean workspace and reproduce the issue? – Henry Ecker Jun 22 '21 at 04:52
  • @HenryEcker sorry it was a typo, I've corrected it. – abhi Jun 22 '21 at 05:11

1 Answer

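The reason is probably that the original DataFrame does not have a default RangeIndex. `join` aligns on index labels, and the new DataFrame built from the list gets a default RangeIndex, so its rows are matched to the wrong rows of the original.

You need to pass `index=df.index` when building the new DataFrame so that it aligns correctly: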

df.join(pd.DataFrame(df['b'].mask(df.b.isna(), {}).tolist(), index=df.index))
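
Since the question has no reproducible example, here is a minimal sketch of how the misalignment could arise. The shuffled index is an assumption made up for illustration (the real index of `df` is not shown in the question), and a plain list comprehension stands in for the `mask(...)` call so that the NaN handling is explicit.

```python
import numpy as np
import pandas as pd

# Data from the question; the non-default index is hypothetical.
df = pd.DataFrame(
    {
        "a": ["a", "b", "c", "d", "d", "d", "d"],
        "b": [
            {"d": 1, "e": 2},
            {"d": 3, "e": 4},
            np.nan,
            {"f": 5},
            {"e": 8, "f": 5},
            {"e": 9, "f": 5},
            {"f": 7},
        ],
    },
    index=[0, 4, 2, 3, 1, 5, 6],  # assumed: labels are not 0..n-1 in order
)

# Replace missing values with empty dicts (stand-in for the mask(...) call).
records = [d if isinstance(d, dict) else {} for d in df["b"]]

# The expanded frame gets a default RangeIndex, so join matches by label
# and pairs rows with the wrong dictionaries.
misaligned = df.join(pd.DataFrame(records))

# Reusing df.index keeps each expanded row next to its source row.
aligned = df.join(pd.DataFrame(records, index=df.index))

print(misaligned)
print(aligned)
```

With this assumed index, the row labelled 4 holds `{'d': 3, 'e': 4}` but receives `e = 8` in `misaligned`, which resembles the swapped values in the question; in `aligned` every row gets the values from its own dictionary.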
jezrael