
I need a regex that will match several strings in a specific order separated by anything including newlines.

So, if the 3 strings are cat, <dog, </bird> then:

cat abcd abc <dog abc </bird>

matches, but

cat abcd abc </bird> abc <dog

does not.

EDIT: one more example:

catabcd abc <dog abc </bird>

and any such variation where the search terms are not separated by word boundaries should also match.

One final example, it should be greedy in that:

cat abcd
</bird>
<dog
<dog
cat
</bird>

Does NOT match.

I have tried lookahead: (?=.*?cat)(?=.*?dog)(?=.*?bird).* but this does not enforce order (and this particular example only works on one line).
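To illustrate why the lookahead attempt does not enforce order, here is a minimal sketch. It assumes Python's re module as a stand-in for the Notepad++ engine (the lookahead syntax is the same, but that choice is mine, not part of the original setup), and the variable names are made up for illustration.

```python
import re

# The lookahead attempt from the question, copied as-is.
LOOKAHEADS = re.compile(r"(?=.*?cat)(?=.*?dog)(?=.*?bird).*")

in_order = "cat abcd abc <dog abc </bird>"
out_of_order = "cat abcd abc </bird> abc <dog"

# Every lookahead is evaluated from the same starting position, so each one
# only checks that its term occurs somewhere later on the line.
print(bool(LOOKAHEADS.search(in_order)))      # True
print(bool(LOOKAHEADS.search(out_of_order)))  # True, although the order is wrong
```

Because the three assertions are independent of each other, the pattern behaves like "contains all three terms" rather than "contains them in this order".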

Note: I am using Notepad++, but can resort to Perl if necessary.

Glen Yates

3 Answers


"can resort to perl if necessary"

Here is the way to do it with Perl.

"separated by anything including newlines"

In Perl, the s modifier makes . match any character, including a newline (the modifier treats the string as a single line).

Thus, you can match your input this way: m/.*cat.*dog.*bird.*/s.

Here is the source code; it prints matches:

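#!/bin/perl -W
use strict;

# Sample input in which the three terms sit on different lines.
my $content = " cat abcd
abc dog abc
bird";

# The /s modifier lets . match newlines, so the pattern can span lines.
print "matches\n" if ($content =~ m/.*cat.*dog.*bird.*/s);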
Alexandre Fenyo

I'm not sure where you found lookaheads, since they are usually more complex to understand than the basic features in regex... which are what I would use for your task given the info you provided:

\bcat\b.*?\bdog\b.*?\bbird\b


Make sure that 'Regular expression' and '. matches newline' are both checked, and that your cursor is at the beginning of the file.

The \b anchors that I used ensure that each word matches only as a whole word: it must be neither preceded nor followed by another word character (so cat will match, but cats will not).
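As a rough check of what the \b anchors change, here is a small sketch. It assumes Python's re module, with re.DOTALL standing in for the '. matches newline' checkbox; the sample strings and variable names are illustrative and not taken from Notepad++ itself.

```python
import re

# re.DOTALL plays the role of Notepad++'s '. matches newline' option.
with_boundaries = re.compile(r"\bcat\b.*?\bdog\b.*?\bbird\b", re.DOTALL)
without_boundaries = re.compile(r"cat.*?dog.*?bird", re.DOTALL)

spaced = "cat abcd\nabc <dog abc\n</bird>"   # terms on separate lines
glued = "catabcd abc <dog abc </bird>"       # "cat" runs into "abcd"

print(bool(with_boundaries.search(spaced)))    # True
print(bool(with_boundaries.search(glued)))     # False: no boundary after "cat"
print(bool(without_boundaries.search(glued)))  # True
```

If the search terms may be glued to other word characters, dropping the \b anchors is the simpler variant.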

Jerry
  • Jerry, thanks for the answer, but please see the edit, in that the search terms might not be bounded by word boundaries. Also, is there a way to get this to work across lines on regexr.com which is where I was testing this? – Glen Yates Sep 08 '17 at 13:06
  • @GlenYates Ok, then I think you can remove them. I added that description at the end so that you could change it yourself if you didn't need them ^^ And I don't like regexr, it doesn't have all the things I believe it should have. I use regex101.com where I can have the `s` flag to make `.` match newlines, and there's a new feature there called unit tests that was added since I used it last time that looks useful! xD https://regex101.com/r/hpQ8Jl/1/tests – Jerry Sep 08 '17 at 13:14
  • Please see my hopefully last edit, I can't have specifically the last search term in the middle of what would otherwise be a match, i.e. `cat ` should NOT match. – Glen Yates Sep 08 '17 at 14:41
  • @GlenYates Ah, I just saw your comment now. Welp, too late now. – Jerry Sep 25 '17 at 16:41

You may need something like this:

cat(?:(?!bird|cat).)*dog(?:(?!dog|bird).)*bird

It uses negative lookahead assertions: after the matched cat, no further cat or bird may appear before the dog, and after the matched dog, no further dog or bird may appear before the bird.
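The behaviour can be checked against the cases raised in the question and the comments. This is only a sketch: it assumes Python's re module with re.DOTALL in place of Notepad++'s '. matches newline' option, and the helper name `matches` is hypothetical.

```python
import re

# The pattern from this answer; DOTALL lets . cross line breaks.
PATTERN = re.compile(
    r"cat(?:(?!bird|cat).)*dog(?:(?!dog|bird).)*bird",
    re.DOTALL,
)

def matches(text: str) -> bool:
    """Return True if the three terms occur in order somewhere in text."""
    return PATTERN.search(text) is not None

should_match = [
    "cat dog bird",
    "cat dog dog bird",
    "cat dog cat bird",
    "catabcd abc <dog abc </bird>",
]
should_not_match = [
    "cat abcd abc </bird> abc <dog",
    "cat abcd\n</bird>\n<dog\n<dog\ncat\n</bird>",
]

for text in should_match + should_not_match:
    print(repr(text), matches(text))
```

The last string is rejected because every cat in it reaches a bird before it reaches a dog, which is what the first `(?!bird|cat)` guard prevents.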

Shakiba Moshiri
  • This is getting closer, however there are 2 additional cases that don't work, `cat dog cat bird` and `cat dog dog bird`. Perhaps more clearly, I can't have a dog (or multiple dogs) between a cat and a bird, regardless of the existence of other cats between the dog and the bird. Please note, I've pretty much given up on a regex solution for this and am going to code. – Glen Yates Sep 08 '17 at 17:25
  • What are they? Comment here. – Shakiba Moshiri Sep 08 '17 at 17:27
  • To clarify my above comment, when I say "I can't have a dog (or multiple dogs) between a cat and a bird...", I mean I want to detect this condition and thus it should be a match. – Glen Yates Sep 08 '17 at 17:40
  • @GlenYates I am not sure I understand you correctly. So it should match multiple `dog`s between only one `cat` and only one `bird`? If that is not the case, please add all the possibilities you have to your question. – Shakiba Moshiri Sep 08 '17 at 17:45
  • Basically yes, but additionally the existence of 1-n cats between the dog and the bird should still result in a match, so we have: `cat dog bird`, `cat dog dog bird`, and `cat dog cat bird` should all match. – Glen Yates Sep 08 '17 at 17:55
  • @GlenYates try this one – Shakiba Moshiri Sep 08 '17 at 17:59
  • You sir, are a genius! – Glen Yates Sep 08 '17 at 18:09