0

I have a dataframe:

Name    SubName
AB      ABCD
UI      10UI09
JK      89-JK-07
yhk     100yhk0A

I need to add a column containing the characters of SubName that are not in Name:

Name    SubName    Remainder
AB      ABCD       CD
UI      10UI09     1009
JK      89-JK-07   89--07
yhk     100yhk0A   1000A 
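
For anyone who wants to reproduce this, here is a minimal sketch of the frame above. It assumes pandas is imported as `pd`; the `expected` list is not part of the data and is only there to check answers against.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "Name": ["AB", "UI", "JK", "yhk"],
    "SubName": ["ABCD", "10UI09", "89-JK-07", "100yhk0A"],
})

# The Remainder values wanted for each row, in order.
expected = ["CD", "1009", "89--07", "1000A"]
```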
spd

3 Answers

2

You can also use apply to build the new column row by row:

df["Remainder"] = df.apply(lambda x: x["SubName"].replace(x["Name"], ""), axis=1)

Output:

Name    SubName    Remainder
AB       ABCD        CD
UI      10UI09      1009
JK     89-JK-07    89--07
yhk    100yhk0A     1000A
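
One caveat, offered tentatively: `str.replace` removes `Name` only where it occurs as a contiguous substring. That matches the examples in the question, but it is not the same as dropping each character of `Name` individually. A small sketch of the difference, using a made-up row (`"AB"` / `"A1B2"`) that is not in the original data:

```python
import pandas as pd

# Hypothetical second row: the characters of Name are scattered in SubName.
df = pd.DataFrame({"Name": ["AB", "AB"], "SubName": ["ABCD", "A1B2"]})

# Substring removal: only a contiguous "AB" is removed.
by_substring = df.apply(lambda x: x["SubName"].replace(x["Name"], ""), axis=1)

# Per-character removal: every "A" and every "B" is removed.
by_chars = [
    "".join(c for c in sub if c not in set(name))
    for name, sub in zip(df["Name"], df["SubName"])
]

print(list(by_substring))  # ['CD', 'A1B2']
print(by_chars)            # ['CD', '12']
```

Which behaviour is wanted depends on the data; for the four rows in the question both give the same result.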
1

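You need to loop over the rows here. One option is a regex character class built from Name:

import re
# re.escape keeps characters such as "-" or "]" in Name literal;
# an empty Name would give an invalid pattern, so SubName is returned unchanged.
df['Remainder'] = [re.sub(f'[{re.escape(a)}]', '', b) if a else b
                   for a, b in zip(df['Name'], df['SubName'])]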

Alternative with join and set (could be faster in some cases; the := operator requires Python 3.8+):

df['Remainder'] = [''.join([c for c in b if c not in S])
                   if (S:=set(a)) else b
                   for a,b in zip(df['Name'], df['SubName'])
                  ]

Output:

  Name   SubName Remainder
0   AB      ABCD        CD
1   UI    10UI09      1009
2   JK  89-JK-07    89--07
3  yhk  100yhk0A     1000A
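
As a sanity check, the set-based variant can be wrapped in a small function and tried on edge cases. This is only a sketch: `strip_chars` is a hypothetical helper name, and the last two rows (an empty `Name`, and a `Name` made of `-` and `]`) are invented for illustration. Because it never builds a regex, such characters need no escaping.

```python
import pandas as pd

def strip_chars(name: str, sub: str) -> str:
    """Return sub without any character that occurs in name."""
    drop = set(name)
    return "".join(c for c in sub if c not in drop)

df = pd.DataFrame({
    "Name": ["AB", "UI", "JK", "yhk", "", "-]"],
    "SubName": ["ABCD", "10UI09", "89-JK-07", "100yhk0A", "abc", "a-b]c"],
})

df["Remainder"] = [strip_chars(a, b) for a, b in zip(df["Name"], df["SubName"])]
print(df["Remainder"].tolist())
# ['CD', '1009', '89--07', '1000A', 'abc', 'abc']
```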
mozway
1
df['Remainder'] = df.apply(lambda x: x.SubName.replace(x.Name, ''), axis=1)
ouflak