This question is a follow-up to a previous question I asked about how best to model different kinds of time quantities and timeframes: In a database, how to store event occurrence dates and timeframes for fast/elegant querying?
Given a table of events, I'd like the simplest way to model and query events that have these kinds of occurrences:
- One-time: XY Rock band has a show on Dec. 12, 2014 at the Rockhouse
- Annually: Volunteer at the soup kitchen on Thanksgiving morning
- Monthly: Free night at the MoMA every first Saturday
- Weekly: Regular business hours
I've been kicking around doing a schema in this form:
- Name
- Description
- start_datetime
- end_datetime
- frequency_type (string, e.g. 'Weekly', 'Monthly')
- mon (boolean)
- tues
- wed
- thu
- fri
- sat
- sun (all booleans)
- schedule (text)
- frequency_description (text)
A common use case I foresee: on a given Tuesday, say 4/5/2016, I want to find everything that is happening that day, including all businesses that are open on regular Tuesdays, anything that happens monthly on a Tuesday, and anything happening on that specific date.
So the pseudocode query would be something like:
SELECT * FROM events WHERE `tues` = TRUE OR DATE(start_datetime) = '2016-04-05'
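To make the pseudocode concrete, here is a minimal sketch of the single-table idea using Python's sqlite3 module. SQLite stands in for whatever database is actually used, booleans are stored as 0/1 integers, and the sample rows and the `events_on` helper are made up for illustration.

```python
import sqlite3
from datetime import date

# Column names follow the schema above; index matches date.weekday() (0 = Monday).
DAY_COLUMNS = ["mon", "tues", "wed", "thu", "fri", "sat", "sun"]

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        start_datetime TEXT,
        frequency_type TEXT NOT NULL,
        mon INTEGER NOT NULL DEFAULT 0, tues INTEGER NOT NULL DEFAULT 0,
        wed INTEGER NOT NULL DEFAULT 0, thu INTEGER NOT NULL DEFAULT 0,
        fri INTEGER NOT NULL DEFAULT 0, sat INTEGER NOT NULL DEFAULT 0,
        sun INTEGER NOT NULL DEFAULT 0
    )
""")

# Illustrative rows only (not real data).
conn.execute("INSERT INTO events (name, frequency_type, tues, wed) VALUES ('Bakery hours', 'Weekly', 1, 1)")
conn.execute("INSERT INTO events (name, frequency_type, tues) VALUES ('Book club', 'Monthly', 1)")
conn.execute("INSERT INTO events (name, start_datetime, frequency_type) VALUES ('Rock show', '2016-04-05 20:00:00', 'Once')")
conn.execute("INSERT INTO events (name, start_datetime, frequency_type) VALUES ('Older show', '2014-12-12 21:00:00', 'Once')")


def events_on(conn, day):
    """Return names of events flagged for day's weekday or starting on that date."""
    day_col = DAY_COLUMNS[day.weekday()]  # taken from a fixed list, so safe to interpolate
    sql = (
        f"SELECT name FROM events "
        f"WHERE {day_col} = 1 OR DATE(start_datetime) = ? ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql, (day.isoformat(),))]


print(events_on(conn, date(2016, 4, 5)))
```

Note that the monthly 'Book club' row matches on every Tuesday, which is exactly the over-inclusion that has to be filtered out afterwards.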
At the application/controller level I could apply the necessary logic to exclude all "monthly" Tuesday events that don't happen on the first Tuesday, using a key/store in frequency_description. (For discussion's sake, I'm going to ignore the "annual" edge case in which something happens every fourth Thursday of November or some such thing.) It'd be nice to do that exclusion in the query, but I'm not sure how to design the table to allow that and still keep a simple SELECT.
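The application-level exclusion could look roughly like the sketch below. It assumes frequency_description holds a small JSON object with a made-up `week_of_month` key (1 = first, 2 = second, and so on); that format, and the helper names, are my own placeholders and not something I've settled on.

```python
import json
from datetime import date


def week_of_month(day):
    """1 for the first occurrence of this weekday in the month, 2 for the second, ..."""
    return (day.day - 1) // 7 + 1


def occurs_on(event, day):
    """Post-filter for rows the SQL query already matched on weekday or date.

    `event` is a dict with 'frequency_type' and an optional JSON string in
    'frequency_description' (hypothetical format: {"week_of_month": 1}).
    """
    if event["frequency_type"] != "Monthly":
        return True  # one-time and weekly rows need no further check
    rule = json.loads(event.get("frequency_description") or "{}")
    wanted = rule.get("week_of_month")
    return wanted is None or wanted == week_of_month(day)


first_tuesday = {"frequency_type": "Monthly", "frequency_description": '{"week_of_month": 1}'}
print(occurs_on(first_tuesday, date(2016, 4, 5)))   # 4/5/2016 is the first Tuesday of April
print(occurs_on(first_tuesday, date(2016, 4, 12)))  # second Tuesday
```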
I'm also predicting that it's not necessary to do a query in which I find all businesses open on Tuesday at 9 AM, so the individual day fields can just be space-efficient booleans, with the schedule field being a date-store of my non-normalized specific information. The application will have logic to parse and format it for display.
Is this overkill? Let's say 70% of my events will be one-time, which eliminates the need for the mon, tues, wed, etc. columns and the schedule and frequency_description text-key-stores...
Should I instead have two tables? One for events, and one for some kind of event_relation in which the day_fields and key-store-textfields are joined?
That seems like a more efficient use of space. On the other hand, my query would have to be a SELECT with a JOIN, which may be slower.
When dealing with record counts on the order of 10k to 100k and simple EC2 hosting, should I care more about efficient space usage in my database (not just pure data storage space, but all the associated overhead with text fields and numerous columns), or should I care more about simple SELECT statements?
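For comparison, here is a rough sketch of the two-table variant, again with sqlite3 as a stand-in. Only recurring events get a row in event_relation; the column layout, sample rows, and the `events_on_joined` helper are assumptions for illustration, and the sketch says nothing about how the two layouts would actually perform.

```python
import sqlite3
from datetime import date

DAY_COLUMNS = ["mon", "tues", "wed", "thu", "fri", "sat", "sun"]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        start_datetime TEXT
    );
    CREATE TABLE event_relation (
        event_id INTEGER NOT NULL REFERENCES events(id),
        frequency_type TEXT NOT NULL,
        mon INTEGER NOT NULL DEFAULT 0, tues INTEGER NOT NULL DEFAULT 0,
        wed INTEGER NOT NULL DEFAULT 0, thu INTEGER NOT NULL DEFAULT 0,
        fri INTEGER NOT NULL DEFAULT 0, sat INTEGER NOT NULL DEFAULT 0,
        sun INTEGER NOT NULL DEFAULT 0,
        schedule TEXT,
        frequency_description TEXT
    );
    -- Illustrative rows only.
    INSERT INTO events (id, name, start_datetime) VALUES
        (1, 'Rock show', '2016-04-05 20:00:00'),
        (2, 'Bakery hours', NULL),
        (3, 'Older show', '2014-12-12 21:00:00');
    INSERT INTO event_relation (event_id, frequency_type, tues, wed)
        VALUES (2, 'Weekly', 1, 1);
""")


def events_on_joined(conn, day):
    """One-time events match on date; recurring ones match via the joined day flag."""
    day_col = DAY_COLUMNS[day.weekday()]  # fixed list, safe to interpolate
    sql = (
        f"SELECT e.name FROM events e "
        f"LEFT JOIN event_relation r ON r.event_id = e.id "
        f"WHERE r.{day_col} = 1 OR DATE(e.start_datetime) = ? "
        f"ORDER BY e.id"
    )
    return [row[0] for row in conn.execute(sql, (day.isoformat(),))]


print(events_on_joined(conn, date(2016, 4, 5)))
```

The LEFT JOIN keeps one-time events that have no event_relation row; for those, the day flag is NULL and only the date comparison can match.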