I have the code below, which removes every whole word that contains any of the patterns:

$patterns = ["are", "finite", "get", "er"];
$string = "You are definitely getting better today";

$re = '\S*('.implode('|', $patterns).')\S*';
$string = preg_replace('#'.$re.'#', '', $string);
$string = preg_replace('#\h{2,}#', ' ', $string);
echo $string;

The output of the above code is:

You today

I want to split this code into two functions: the first should remove only whole words that appear in the pattern list, and the second should remove only words that contain a pattern without being an exact match.

I expect this output from the first function, which removes only whole words:

You definitely getting better today (**are** is removed)

and this output from the second function, which removes whole words that contain a pattern:

You are today (**definitely getting better** are removed)
wp78de
DMP

1 Answer

The first part is basic: match only whole keywords by wrapping the alternation in word boundaries (there are dozens of existing Q&As on this).

\b(?:are|finite|get|er)\b

This can be applied to your code like this: $re = '\b(?:'.implode('|', $patterns).')\b';
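
As a minimal sketch, the whole-word case could be wrapped in a function like the one below. The name `removeWholeWords` is hypothetical, and the `preg_quote` call and the final `trim` are additions that assume the keywords are literal strings and that leading or trailing gaps are unwanted.

```php
<?php
// Hypothetical helper: removes only words that are exactly one of the keywords.
function removeWholeWords(array $patterns, string $string): string
{
    // Escape each keyword so it is matched literally (assumption: plain-text keywords).
    $quoted = array_map(function ($p) {
        return preg_quote($p, '#');
    }, $patterns);

    // \b on both sides restricts the match to standalone words.
    $re = '\b(?:' . implode('|', $quoted) . ')\b';
    $string = preg_replace('#' . $re . '#', '', $string);

    // Collapse the gap left behind by a removed word, as in the question's code.
    return trim(preg_replace('#\h{2,}#', ' ', $string));
}

echo removeWholeWords(["are", "finite", "get", "er"], "You are definitely getting better today");
// You definitely getting better today
```

Only "are" is removed here, because in "definitely", "getting" and "better" each keyword has a word character on at least one side, so one of the two `\b` assertions fails.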

The second part is a bit more involved: you still expand each substring match to cover the entire word, but you want to skip words that are themselves whole keywords.
A negative lookahead achieves this:

(?!\b(?:are|finite|get|er)\b)\S*(?:are|finite|get|er)\S*

Sample code:

$patterns = ["are", "finite", "get", "er"];
$string = "You are definitely getting better today";
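$alternations = implode('|', $patterns);
// Skip positions where a whole keyword starts, then match any word containing a keyword.
$re = '(?!\b(?:'.$alternations.')\b)\S*(?:'.$alternations.')\S*';
$string = preg_replace('#'.$re.'#', '', $string);
// Collapse the gaps left by the removed words.
$string = preg_replace('#\h{2,}#', ' ', $string);
echo $string;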
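
The same idea could be packaged as the second function the question asks for. This is a sketch under assumptions: the name `removeWordsContaining` is hypothetical, and `preg_quote` and `trim` are additions not present in the original code.

```php
<?php
// Hypothetical helper: removes words that contain a keyword,
// but leaves words that are exactly a keyword untouched.
function removeWordsContaining(array $patterns, string $string): string
{
    // Escape each keyword so it is matched literally (assumption: plain-text keywords).
    $quoted = array_map(function ($p) {
        return preg_quote($p, '#');
    }, $patterns);
    $alternations = implode('|', $quoted);

    // The negative lookahead rejects a match that would start on a whole keyword;
    // \S*(?:...)\S* then expands a substring hit to the full whitespace-delimited word.
    $re = '(?!\b(?:' . $alternations . ')\b)\S*(?:' . $alternations . ')\S*';
    $string = preg_replace('#' . $re . '#', '', $string);

    // Collapse the gaps left behind by removed words.
    return trim(preg_replace('#\h{2,}#', ' ', $string));
}

echo removeWordsContaining(["are", "finite", "get", "er"], "You are definitely getting better today");
// You are today
```

One caveat: the lookahead only guards the position where a whole keyword starts. If one keyword contains another (for example "here" and "er"), the engine can still match from inside the protected word, so such lists may need a stricter pattern.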

If \b does not work for you and you would rather treat whitespace as the word boundary, use lookarounds:

(?<=\s)(?:are|finite|get|er)(?=\s)
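
Note that `(?<=\s)` and `(?=\s)` require an actual whitespace character on both sides, so a keyword at the very start or end of the string is not matched. The sketch below compares that pattern with a variant using `(?<!\S)` and `(?!\S)`, which is a swapped-in alternative and not part of the original answer. The sample string and variable names are made up for illustration.

```php
<?php
$patterns = ["are", "finite", "get", "er"];
$alternations = implode('|', $patterns);
$string = "are they re-get ready"; // illustrative input, not from the question

// Requires whitespace on both sides: misses "are" at the start of the string.
$strict = '#(?<=\s)(?:' . $alternations . ')(?=\s)#';

// "Not preceded/followed by a non-space character": also true at string edges.
$lenient = '#(?<!\S)(?:' . $alternations . ')(?!\S)#';

$strictResult = trim(preg_replace($strict, '', $string));
$lenientResult = trim(preg_replace($lenient, '', $string));

echo $strictResult, "\n";  // are they re-get ready
echo $lenientResult, "\n"; // they re-get ready
```

In both cases "re-get" is left alone because the hyphen is not whitespace, whereas `\bget\b` would match the "get" after the hyphen.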


wp78de