0

In cases specific special characters could be added as follows:

=FILTER(D1:D, REGEXMATCH(D1:D, "^("&REGEXREPLACE(TEXTJOIN("|", 1, A:A),"([().])","\\$1")&")$")=FALSE)

But when wanting to analyze all of them, the best approach would be a negative match, fetching all the different values of A-Z, a-z, and 0-9.

So, I tried to reproduce it with a direct addition that finds the values but when matching the values, it returns the number of matches to zero:

=FILTER(D1:D, REGEXMATCH(D1:D, "^("&REGEXREPLACE(TEXTJOIN("|", 1, A:A),"([^A-Za-z0-9])","\\$1")&")$")=FALSE)

What is the correct approach in this case?

Google Sheets data with expected result

Digital Farmer
  • 1,705
  • 5
  • 17
  • 67
  • Do you have sample data with expected results? Btw, `([^A-Za-z0-9])` is better written like `[\W_]`, any non-word character *or* underscore. Also, if not captured you can refer to `\\$0`. – JvdV Jun 27 '22 at 13:35
  • hello @JvdV , sorry for not putting the spreadsheet link, I completely forgot. – Digital Farmer Jun 27 '22 at 13:45
  • Been looking at the desired results but I cant yet figure out the rules why that is the desired results. – JvdV Jun 27 '22 at 14:24
  • Hi @JvdV I'm looking for values that are in column D but not in column A. – Digital Farmer Jun 27 '22 at 14:27
  • I thought I had an answer but I'm still confused with the expected results (your explaination in the sheet). Are you looking for case-sensitive matching as well? – JvdV Jun 27 '22 at 15:09
  • @JvdV Yes, I'm filtering out the values that exist in column D but don't exist in column A, taking into account any differences, including uppercase or lowercase letters, accents, punctuation, and special characters. – Digital Farmer Jun 27 '22 at 15:29
  • Then, why is 'Toronto FC' from D:D in your desired results when clearly it is in column A:A with extra '*' being the only difference. Basically all of your examples are found in column A:A? – JvdV Jun 27 '22 at 17:11
  • Because ```Toronto FC``` is different to ```Toronto * FC``` @JvdV (means that the team name has changed in the database, so from now on it will no longer appear as ```Toronto * FC``` but as ```Toronto FC```) and this is how I will need to officially use it from now on. – Digital Farmer Jun 27 '22 at 17:25

1 Answers1

2
=FILTER(D1:D, 
  REGEXMATCH(D1:D, 
    "^("&TEXTJOIN("|", 1, 
      REGEXREPLACE(A:A,"[.*+?^${}()|[\]\\]","\\$0"))&")$")=FALSE)

Regular expression of all special characters: [.*+?^${}()|[\]\\]

Reference:

Escaping

idfurw
  • 5,727
  • 2
  • 5
  • 18