Compare strings in two pandas columns and write remainder in new column

Question

I have a dataframe:

Name    SubName
AB      ABCD
UI      10UI09
JK      89-JK-07
yhk     100yhk0A

I need to add a column containing the characters of SubName that are not in Name:

Name    SubName    Remainder
AB      ABCD       CD
UI      10UI09     1009
JK      89-JK-07   89--07
yhk     100yhk0A   1000A

score 2 · Answer 1 · answered Mar 23 '22 at 12:38

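You can also use apply to build the new column. Note that str.replace removes Name only where it appears as a contiguous substring, which is the case for every row of the sample data: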

df["Remainder"] = df.apply(lambda x: x["SubName"].replace(x["Name"], ""), axis=1)

Output:

Name    SubName    Remainder
AB       ABCD        CD
UI      10UI09      1009
JK     89-JK-07    89--07
yhk    100yhk0A     1000A
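
A minimal sketch of how the substring approach differs from removing individual characters. It may explain a result that looks identical to SubName. The strings below are hypothetical and are not taken from the question's data.

```python
# Hypothetical values, not taken from the question's data.
name = "AB"
contiguous = "xxABxx"   # "AB" appears as one substring
scattered = "AxxBxx"    # "A" and "B" appear, but not next to each other

# str.replace removes only whole occurrences of the substring.
removed_contiguous = contiguous.replace(name, "")
removed_scattered = scattered.replace(name, "")

# Filtering character by character removes every character found in name.
chars = set(name)
filtered_scattered = "".join(c for c in scattered if c not in chars)

print(removed_contiguous)   # xxxx
print(removed_scattered)    # AxxBxx (unchanged)
print(filtered_scattered)   # xxxx
```

If Name can be scattered through SubName in the real data, a character-based filter is probably the safer choice.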

answered Mar 23 '22 at 12:38

AmrSherbiny

It is giving me the same SubName column. – spd Mar 25 '22 at 05:53

score 1 · Accepted Answer · answered Mar 23 '22 at 12:27

You need a loop here. One option is a regex character class built from the characters of Name:

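import re
# Build a character class from the characters of Name and strip them from SubName.
# re.escape guards against regex metacharacters such as '-' or ']' in Name.
df['Remainder'] = [re.sub(f'[{re.escape(a)}]', '', b) if a else b
                   for a, b in zip(df['Name'], df['SubName'])]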

Alternative with join and set (could be faster in some cases):

df['Remainder'] = [''.join([c for c in b if c not in S])
                   if (S:=set(a)) else b
                   for a,b in zip(df['Name'], df['SubName'])
                  ]

output:

  Name   SubName Remainder
0   AB      ABCD        CD
1   UI    10UI09      1009
2   JK  89-JK-07    89--07
3  yhk  100yhk0A     1000A
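
For reference, a self-contained sketch that rebuilds the sample data from the question and applies the character-filter idea. The helper name strip_chars is hypothetical and is not part of pandas or of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["AB", "UI", "JK", "yhk"],
    "SubName": ["ABCD", "10UI09", "89-JK-07", "100yhk0A"],
})

def strip_chars(name: str, sub: str) -> str:
    """Hypothetical helper: drop every character of `name` from `sub`."""
    chars = set(name)
    return "".join(c for c in sub if c not in chars)

# Row-wise loop over both columns, as in the list comprehensions above.
df["Remainder"] = [strip_chars(a, b) for a, b in zip(df["Name"], df["SubName"])]
print(df)
```

Wrapping the logic in a small function keeps the list comprehension short and avoids the walrus operator, which needs Python 3.8 or later.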

Please go through this: https://stackoverflow.com/q/71634131/17778275 – spd Mar 27 '22 at 09:12

score 1 · Answer 3 · edited Mar 23 '22 at 14:47

df['Remainder'] = df.apply(lambda x: x.SubName.replace(x.Name, ''), axis=1)

edited Mar 23 '22 at 14:47

ouflak

answered Mar 23 '22 at 13:14

Subhash Tulsyan

https://stackoverflow.com/q/71634131/17778275 – spd Mar 27 '22 at 09:12
