39

Is it possible to use a regex to match "February 2009", for example?

Jeremy
  • 9,023
  • 20
  • 57
  • 69

5 Answers5

59

Along the lines of

\b(?:Jan(?:uary)?|Feb(?:ruary)?|...|Dec(?:ember)?) (?:19[7-9]\d|2\d{3})(?=\D|$)

that's

\b                  # a word boundary
(?:                 # non-capturing group
  Jan(?:uary)?      # Jan(uary)
  |Feb(?:ruary)?    #
  |...              # and so on
  |Dec(?:ember)?    # Dec(ember)
)                   # end group
                    # a space
(?:                 # non-capturing group
  19[7-9]\d|2\d{3}  # 1970-2999
)                   # end group
(?=\D|$)            # followed by: anything but a digit or the end of string
Tomalak
  • 332,285
  • 67
  • 532
  • 628
33

I had to work on this to match a few fringe examples, but I ended up using

(\b\d{1,2}\D{0,3})?\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?)\D?(\d{1,2}\D?)?\D?((19[7-9]\d|20\d{2})|\d{2})

to capture dates with word months in them

Beerswiller
  • 659
  • 6
  • 8
  • 4
    Just a minor thing, for the months instead of (Nov|Dec) it should be (?:Nov|Dec), or at least I had to change that in order for it to work with Python otherwise it was returning an empty [''] match – Walter R Aug 03 '17 at 19:04
  • You can add (?i)(regex_part_to_make_case_insensitive) or (?i)regex_part_to_make_case_insensitive(?-i) depending on the regex processor you are using. – Onyr Apr 22 '21 at 16:04
4

Modifying Beerswiller's answer, if you want "st"/"nd"/"rd" variations:

(\b\d{1,2}\D{0,3})?\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?)\D?(\d{1,2}(st|nd|rd|th)?)?(([,.\-\/])\D?)?((19[7-9]\d|20\d{2})|\d{2})*

  • It also passes 34rth July 1999 as a valid date! – Zujaj Misbah Khan Oct 20 '20 at 06:24
  • @ZujajMisbahKhan it is not possible to make all mathematical calculations in a regular expression to see which months have 28, 29, 30 or 31, also depending on whether the year is odd or not. For this you can take the resulting matched date and validate it with you language of choice functions. – Alex P. May 10 '23 at 18:05
3

This regex accounts for some spacing around the comma.

Sometimes it's not always in the right place.

((\b\d{1,2}\D{0,3})?\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?)\D?)(\d{1,2}(st|nd|rd|th)?)?((\s*[,.\-\/]\s*)\D?)?\s*((19[0-9]\d|20\d{2})|\d{2})*
Acapulco
  • 3,373
  • 8
  • 38
  • 51
Todd
  • 41
  • 1
0

The regex below will take into account the max number of days for the relevant month and also take into account leap years for February.

^(((0[1-9]|[12][0-9]|3[01])[ ]\b(?:Jan(?:uary)?|Mar(?:ch)?|May|Jul(?:y)?|Aug(?:ust)?|Oct(?:ober)?|Dec(?:ember)?)|(0[1-9]|[12][0-9]|30)[ ]\b(?:Apr(?:il)?|Jun(?:e)?|Sep(?:tember)?|Nov(?:ember)?)|(0[1-9]|1\d|2[0-8])[ ]\b(?:Feb(?:ruary)?))[ ]\d{4}|29[ ]\b(?:Feb(?:ruary)?)[ ](\d{2}(0[48]|[2468][048]|[13579][26])|([02468][048]|[1359][26])00))$
LW001
  • 2,452
  • 6
  • 27
  • 36