Given the following text (note the newline character):

 foo bar foo5 bar foo-bar foo qux quux\n foo

I wish to get all matches for foo and qux quux, except where either appears next to a digit, letter, or underscore, i.e.

foo bar foo5 bar foo-bar foo qux quux\n bar foo

Using the following regex:

(?:\W)(foo|qux quux)(?:\W|$)

I get a match for all desired occurrences of foo:

foo bar foo5 bar foo-bar foo qux quux\n bar foo

The problem is that I don't get a match for qux quux, because the single space that precedes it has already been consumed by the trailing non-capturing group (?:\W|$) of the 3rd match:

 ...
 Match 3
   Full match: " foo "
   Group 1   : "foo"
 ...
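
A minimal sketch that reproduces this, assuming Python's re module (the question does not name a regex engine, so that choice is only for illustration) and treating \n as a real newline:

```python
import re

# Sample text from the question; "\n" is an actual newline character here.
text = " foo bar foo5 bar foo-bar foo qux quux\n foo"

# Original pattern: each match also consumes the \W character on either side.
original = re.compile(r"(?:\W)(foo|qux quux)(?:\W|$)")

for m in original.finditer(text):
    # m.group(0) is the full match (including the consumed neighbors),
    # m.group(1) is the captured word.
    print(repr(m.group(0)), "->", repr(m.group(1)))

captured = [m.group(1) for m in original.finditer(text)]
print(captured)  # ['foo', 'foo', 'foo', 'foo']
```

The third full match is " foo ", so the scan resumes at the q of qux quux, and there is no \W character left in front of it to satisfy the leading group.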

How can I also get qux quux?

NOTE: I understand that inserting two spaces between foo and qux quux would make my regex work, but that is silly.

nnunes

1 Answer

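Use word boundaries; they do what you are looking for. \b is a zero-width assertion, so it does not consume the neighboring character, and the space between foo and qux quux remains available to both matches. Like this: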

\b(foo|qux quux)\b

Here is a little example you can play with:

https://regex101.com/r/oYeixv/1

And a nice write-up of what a word boundary is: https://www.regular-expressions.info/wordboundaries.html
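
For a quick local check, here is a minimal sketch of the word-boundary pattern, assuming Python's re module (the question does not say which engine is in use, so treat that as an assumption) and using the sample text from the question with \n as a real newline:

```python
import re

# Sample text from the question; "\n" is an actual newline character here.
text = " foo bar foo5 bar foo-bar foo qux quux\n foo"

# \b matches the position between a word character and a non-word character
# (or the string edge) without consuming anything.
bounded = re.compile(r"\b(foo|qux quux)\b")

# With a single capture group, findall returns the captured strings.
found = bounded.findall(text)
print(found)  # ['foo', 'foo', 'foo', 'qux quux', 'foo']
```

foo5 is skipped because there is no boundary between o and 5, while the foo in foo-bar still matches because - is a non-word character.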

sniperd