Is it possible to use a regex to match "February 2009", for example?
Asked
Active
Viewed 6.5k times
39
-
3Is it allowed to match "Undecimber 15000"? – kennytm Apr 16 '10 at 19:05
-
The restrictions are January - December, followed by 1990 - 2010. Fortunately, non-english isn't a concern. – Jeremy Apr 16 '10 at 19:08
-
1*followed by 1990 - 2010* — Is it intentional to have this tight upper limit? There's only 8 months left to 2011. – kennytm Apr 16 '10 at 19:16
5 Answers
59
Along the lines of
\b(?:Jan(?:uary)?|Feb(?:ruary)?|...|Dec(?:ember)?) (?:19[7-9]\d|2\d{3})(?=\D|$)
that's
\b # a word boundary (?: # non-capturing group Jan(?:uary)? # Jan(uary) |Feb(?:ruary)? # |... # and so on |Dec(?:ember)? # Dec(ember) ) # end group # a space (?: # non-capturing group 19[7-9]\d|2\d{3} # 1970-2999 ) # end group (?=\D|$) # followed by: anything but a digit or the end of string

Tomalak
- 332,285
- 67
- 532
- 628
33
I had to work on this to match a few fringe examples, but I ended up using
(\b\d{1,2}\D{0,3})?\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?)\D?(\d{1,2}\D?)?\D?((19[7-9]\d|20\d{2})|\d{2})
to capture dates with word months in them

Beerswiller
- 659
- 6
- 8
-
4Just a minor thing, for the months instead of (Nov|Dec) it should be (?:Nov|Dec), or at least I had to change that in order for it to work with Python otherwise it was returning an empty [''] match – Walter R Aug 03 '17 at 19:04
-
You can add (?i)(regex_part_to_make_case_insensitive) or (?i)regex_part_to_make_case_insensitive(?-i) depending on the regex processor you are using. – Onyr Apr 22 '21 at 16:04
4
Modifying Beerswiller's answer, if you want "st"/"nd"/"rd" variations:
(\b\d{1,2}\D{0,3})?\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?)\D?(\d{1,2}(st|nd|rd|th)?)?(([,.\-\/])\D?)?((19[7-9]\d|20\d{2})|\d{2})*

Pedro Machin
- 51
- 3
-
-
@ZujajMisbahKhan it is not possible to make all mathematical calculations in a regular expression to see which months have 28, 29, 30 or 31, also depending on whether the year is odd or not. For this you can take the resulting matched date and validate it with you language of choice functions. – Alex P. May 10 '23 at 18:05
3
This regex accounts for some spacing around the comma.
Sometimes it's not always in the right place.
((\b\d{1,2}\D{0,3})?\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?)\D?)(\d{1,2}(st|nd|rd|th)?)?((\s*[,.\-\/]\s*)\D?)?\s*((19[0-9]\d|20\d{2})|\d{2})*
-
1I would suggest `Sep(?:t(?:ember)?)?`. I've seen Sept used in numerous datasets. – Sancarn Jun 08 '23 at 09:09
0
The regex below will take into account the max number of days for the relevant month and also take into account leap years for February.
^(((0[1-9]|[12][0-9]|3[01])[ ]\b(?:Jan(?:uary)?|Mar(?:ch)?|May|Jul(?:y)?|Aug(?:ust)?|Oct(?:ober)?|Dec(?:ember)?)|(0[1-9]|[12][0-9]|30)[ ]\b(?:Apr(?:il)?|Jun(?:e)?|Sep(?:tember)?|Nov(?:ember)?)|(0[1-9]|1\d|2[0-8])[ ]\b(?:Feb(?:ruary)?))[ ]\d{4}|29[ ]\b(?:Feb(?:ruary)?)[ ](\d{2}(0[48]|[2468][048]|[13579][26])|([02468][048]|[1359][26])00))$

LW001
- 2,452
- 6
- 27
- 36

Sarah Reynolds
- 1
- 1