-1

I'm looking to create a yearly overview of tasks that employees have performed. Tasks are stored in a schedule table where each task has an identifier (e.g. 'ABC'), a starting date, and an end date. The columns of this table thus are: id, employee, task, startDate, and endDate.

Some data could include:

id employee task startDate endDate
1 Mr. Anderson ABC 2023-01-05 2023-01-08
2 Mr. Anderson DEF 2023-01-06 2023-01-07
3 Ms. Beatrice ABC 2023-01-04 2023-01-06
4 Mr. Anderson ABC 2023-01-10 2023-01-12

What I would like to do, is show a cross tab, where you have employees on the left, and the number of days these employees have performed every type of task in a given period of time (i.e. 2023).

So, the result would be:

Employee ABC DEF
Mr. Anderson 7 2
Ms. Beatrice 3 0

I know how to COUNT the occurrence of every task for every employee, which I'm doing with something like:

SELECT COUNT(id) FROM schedule 
WHERE employee='Mr. Anderson' AND task='ABC'

This gives me 2 because there are 2 entries in the database where Mr. Anderson performs ABC. However, he performs ABC for 4+3 days (end date included in the count), so the result should be 7.

Does anyone know how I can update my query to give me the wanted result?

A solution with PIVOT would be acceptable as well (I haven't yet figured out how to work with PIVOT, but if there's a more elegant way to do this (without doing the query for every employee and every task) then please, be my guest.

EDIT: Sorry, I'm on MySQL.

Laurens Swart
  • 1,234
  • 9
  • 24

1 Answers1

1

Instead of using COUNT() us SUM() and aggregate the number of days between the start and end dates.

While inside SQL, I do not recommend pivoting the data, that's best left outside of SQL. So, aim for three columns; the employee, the task, and the sum of elapsed days. (Both the SQL language and the DBMS implementations are designed with this normalised structure in mind.)

SELECT
  employee,
  task,
  SUM(DATEDIFF(endDate, startDate)+1)   total_days
FROM
  schedule
GROUP BY
  employee,
  task

The +1 is needed because you've opted to use inclusive end dates. (Normally they'd be exclusive; the range between the 1st of Jan and the 1st of Jan should normally be 0 days)

MatBailie
  • 83,401
  • 18
  • 103
  • 137
  • Hey! Thanks for your advice and your query. When I try to run it on my database, I'm getting an error: Incorrect parameter count in the call to native function 'DATEDIFF' – Laurens Swart Jan 09 '23 at 18:00
  • @LaurensSwart As I said, that's SQL Server syntax. I'll edit the answer for MySQL. – MatBailie Jan 09 '23 at 18:40
  • Ah right, sorry, I wasn't aware that SQL Server != MySQL :-) Thank you so much for your help! – Laurens Swart Jan 09 '23 at 18:40
  • SQL Server is a product owned by Microsoft, but MySQL is a product now owned by Oracle. – MatBailie Jan 09 '23 at 18:42
  • Ah I see. You edited your answer but the query looks unchanged to me? – Laurens Swart Jan 09 '23 at 18:52
  • @LaurensSwart `DATEDIFF(DAY, startDate, endDate)` *(SQL Server; how many Days from startDate to endDate)* became `DATEDIFF(endDate, startDate)` *(MySQL; what is endDate minus startDate, implicitly measured in days)* – MatBailie Jan 09 '23 at 21:27
  • Damn this query is fast :-D! I was querying COUNTs for every employee and every task, which took the page like 5 seconds to load. Now it's almost instantaneous. Thank you so much! – Laurens Swart Jan 10 '23 at 07:54
  • @LaurensSwart Each query you run has it's own overhead. Being able to query multiple employees / tasks at once is always going to be faster, even if badly written. SQL is all about *sets* of data. – MatBailie Jan 10 '23 at 08:08
  • Thanks again! Do you think you can also help me with a follow-up question? https://stackoverflow.com/questions/75067457/sum-amount-of-days-an-employee-performs-a-task-using-datediff-but-only-count-da – Laurens Swart Jan 10 '23 at 08:45