0

I have a table that acts as a message log, with the two key tables being TIMESTAMP and TEXT. I'm working on a query that grabs all alerts (from TEXT) for the past 30 days (based on TIMESTAMP) and gives a daily average for those alerts.

Here is the query so far:

--goback 30 days start at midnight
declare @olderdate as datetime
set @olderdate = DATEADD(Day, -30, DATEDIFF(Day, 0, GetDate()))

--today at 11:59pm
declare @today as datetime
set @today = dateadd(ms, -3, (dateadd(day, +1, convert(varchar, GETDATE(), 101))))
print @today

--Grab average alerts per day over 30 days
select 
    avg(x.Alerts * 1.0 / 30)
from
    (select count(*) as Alerts 
     from MESSAGE_LOG 
     where text like 'The process%' 
       and text like '%has alerted%'
       and TIMESTAMP between @olderdate and @today) X

However, I want to add something that checks whether there were any alerts for a day and, if there are no alerts for that day, doesn't include it in the average. For example, if there are 90 alerts for a month but they're all in one day, I wouldn't want the average to be 3 alerts per day since that's clearly misleading.

Is there a way I can incorporate this into my query? I've searched for other solutions to this but haven't been able to get any to work.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • 1
    Well that is how average works. If you had 90 alerts and averaging them over 30 days what would you expect the average to be? Do you want the "average" to be the average of only those days where there are alerts? Would the average in your example be 90? Isn't that more misleading than saying we had an average of 3 per day? – Sean Lange Jan 24 '19 at 20:14
  • did you have a column named TIMESTAMP that is actually of datatype DATETIME? if so that's a bit confusing. – TheMouseMaster Jan 24 '19 at 20:25

3 Answers3

0

This isn't written for your query, as I don't have any DDL or sample data, thus I'm going to provide a very simple example instead of how you would do this.

USE Sandbox;
GO

CREATE TABLE dbo.AlertMessage (ID int IDENTITY(1,1),
                               AlertDate date);

INSERT INTO dbo.AlertMessage (AlertDate)
VALUES('20190101'),('20190101'),('20190105'),('20190110'),('20190115'),('20190115'),('20190115');
GO

--Use a CTE to count per day:
WITH Tots AS (
    SELECT AlertDate,
           COUNT(ID) AS Alerts
    FROM dbo.AlertMessage
    GROUP BY AlertDate)
--Now the average
SELECT AVG(Alerts*1.0) AS DayAverage
FROM Tots;
GO

--Clean up
DROP TABLE dbo.AlertMessage;
Thom A
  • 88,727
  • 11
  • 45
  • 75
0

Instead of dividing by 30 to get the average, divide by the count of distinct days in your results.

select 
    avg(x.Alerts * 1.0 / x.dd)
from
    (select count(*) as Alerts, count(distinct CAST([TIMESTAMP] AS date)) AS dd
...
Tab Alleman
  • 31,483
  • 7
  • 36
  • 52
0

You're trying to compute a double-aggregate: The average of daily totals.

Without using a CTE, you can try this as well, which is generalized a bit more to work for multiple months.

--get a list of events per day
DECLARE @Event TABLE
(
    ID INT NOT NULL IDENTITY(1, 1)
    ,DateLocalTz DATE NOT NULL--make sure to handle time zones
    ,YearLocalTz AS DATEPART(YEAR, DateLocalTz) PERSISTED
    ,MonthLocalTz AS DATEPART(MONTH, DateLocalTz) PERSISTED
)
/*
INSERT INTO @Event(EntryDateLocalTz)
SELECT DISTINCT CONVERT(DATE, TIMESTAMP)--presumed to be in your local time zone because you did not specify
FROM dbo.MESSAGE_LOG
WHERE UPPER([TEXT]) LIKE 'THE PROCESS%' AND UPPER([TEXT]) LIKE '%HAS ALERTED%'--case insenitive
*/
INSERT INTO @Event(DateLocalTz)
VALUES ('2018-12-31'), ('2019-01-01'), ('2019-01-01'), ('2019-01-01'), ('2019-01-12'), ('2019-01-13')
--get average number of alerts per alerting day each month
--  (this will not return months with no alerts,
--  use a LEFT OUTER JOIN against a month list table if you need to include uneventful months)
SELECT
    YearLocalTz
    ,MonthLocalTz
    ,AvgAlertsOfAlertingDays = AVG(CONVERT(REAL, NumDailyAlerts))
FROM
    (
        SELECT
            YearLocalTz
            ,MonthLocalTz
            ,DateLocalTz
            ,NumDailyAlerts = COUNT(*)
        FROM @Event
        GROUP BY YearLocalTz, MonthLocalTz, DateLocalTz
    ) AS X
GROUP BY YearLocalTz, MonthLocalTz
ORDER BY YearLocalTz ASC, MonthLocalTz ASC

Some things to note in my code:

  1. I use PERSISTED columns to get the month and year date parts (because I'm lazy when populating tables)
  2. Use explicit CONVERT to escape integer math that rounds down decimals. Multiplying by 1.0 is a less-readable hack.
  3. Use CONVERT(DATE, ...) to round down to midnight instead of converting back and forth between strings
  4. Do case-insensitive string searching by making everything uppercase (or lowercase, your preference)
  5. Don't subtract 3 milliseconds to get the very last moment before midnight. Change your semantics to interpret the end of a time range as exclusive, instead of dealing with the precision of your datatypes. The only difference is using explicit comparators (i.e. use < instead of <=). Also, DATETIME resolution is 1/300th of a second, not 3 milliseconds.
  6. Avoid using built-in keywords as column names (i.e. "TEXT"). If you do, wrap them in square brackets to avoid ambiguity.
Elaskanator
  • 1,135
  • 10
  • 28