If this
(°[0-5])
matches °4
and this
((°[0-5][0-9]))
matches °44
Why does this
((°[0-5])|(°[0-5][0-9]))
match °4 but not °44?
If this
(°[0-5])
matches °4
and this
((°[0-5][0-9]))
matches °44
Why does this
((°[0-5])|(°[0-5][0-9]))
match °4 but not °44?
Because when you use logical OR in regex the regex engine returns the first match when it find a match with first part of regex (here °[0-5]
), and in this case since °[0-5]
match °4
in °44
it returns °4
and doesn't continue to match the other case (here °[0-5][0-9]
):
((°[0-5])|(°[0-5][0-9]))
A|B, where A and B can be arbitrary REs, creates a regular expression that will match either A or B. An arbitrary number of REs can be separated by the '|' in this way. This can be used inside groups (see below) as well. As the target string is scanned, REs separated by '|' are tried from left to right. When one pattern completely matches, that branch is accepted. This means that once A matches, B will not be tested further, even if it would produce a longer overall match. In other words, the '|' operator is never greedy. To match a literal '|', use \|, or enclose it inside a character class, as in [|].
You are using shorter match first in regex alternation. Better use this regex to match both strings:
°[0-5][0-9]?
Because the alternation operator |
tries the alternatives in the order specified and selects the first successful match. The other alternatives will never be tried unless something later in the regular expression causes backtracking. For instance, this regular expression
(a|ab|abc)
when fed this input:
abcdefghi
will only ever match a
. However, if the regular expression is changed to
(a|ab|abc)d
It will match a
. Then since the next characyer is not d
it backtracks and tries then next alternative, matching ab
. And since the next character is still not d
it backtracks again and matches abc
...and since the next character is d
, the match succeeds.
Why would you not reduce your regular expression from
((°[0-5])|(°[0-5][0-9]))
to this?
°[0-5][0-9]?
It's simpler and easier to understand.