1

I want to find the number of consecutive records (in chronological order, according to a field [when]) matching a condition (here data > 10). Thanks to this answer I arrived at this:

SELECT tt.i, COUNT(*) AS count
FROM  t tt WHERE data > 10
GROUP BY i, (SELECT COUNT(*) FROM t WHERE [when] < tt.[when] AND NOT data > 10 AND i=tt.i );

I also need the results to be sorted chronologically, and although it looks like GROUP BY sorts them automatically, it does not seem guaranteed, so I need to add an ORDER BY clause:

SELECT tt.i, COUNT(*) AS count
FROM  t tt WHERE data > 10
GROUP BY i, (SELECT COUNT(*) FROM t WHERE [when] < tt.[when] AND NOT data > 10 AND i=tt.i )
ORDER BY i, (SELECT COUNT(*) FROM t WHERE [when] < tt.[when] AND NOT data > 10 AND i=tt.i );

This works, but is not good due to subquery repetition. So I want to use an alias:

SELECT tt.i, COUNT(*) AS count, (SELECT COUNT(*) FROM t WHERE [when] < tt.[when] AND NOT data > 10 AND i=tt.i ) AS xid
FROM  t tt WHERE data > 10
GROUP BY i, xid
ORDER BY i, xid;

This works too, but I do not want xid to be in my results; I put it in the SELECT just to be aliased. Furthermore, I do not understand why this works, since according to this answer GROUP BY is executed before the SELECT, hence should no have access to the alias xid.

Here is my test table, in case one would like to try the query out:

CREATE TABLE t(i INT , [when] DATETIME, data INT);
INSERT INTO t(i, [when], data) VALUES (1, '20130813', 1);
INSERT INTO t(i, [when], data) VALUES (2, '20130812', 121);
INSERT INTO t(i, [when], data) VALUES (1, '20130811', 132);
INSERT INTO t(i, [when], data) VALUES (2, '20130810', 15);
INSERT INTO t(i, [when], data) VALUES (1, '20130809', 9);
INSERT INTO t(i, [when], data) VALUES (2, '20130808', 1435);
INSERT INTO t(i, [when], data) VALUES (1, '20130807', 143);
INSERT INTO t(i, [when], data) VALUES (2, '20130806', 18);
INSERT INTO t(i, [when], data) VALUES (1, '20130805', 19);
INSERT INTO t(i, [when], data) VALUES (2, '20130804', 1);
INSERT INTO t(i, [when], data) VALUES (1, '20130803', 1234);
INSERT INTO t(i, [when], data) VALUES (2, '20130802', 124);
INSERT INTO t(i, [when], data) VALUES (1, '20130801', 6);
  1. How could I make my query correct (ensure a correct order), avoiding both repeating the subquery and outputting xid in the results?
  2. Why does my last query work in spite of the clause execution order?
Community
  • 1
  • 1
unagi
  • 428
  • 3
  • 15

1 Answers1

1

To remove columns from the final result, use a subquery:

SELECT only, what, I, want
FROM (SELECT ...);

The SQL standard specifies that aliases from the SELECT clause are not available in the GROUP BY clause to allow that specific execution order. However, SQLite does not enforce all restrictions from the standard, and if it actually finds a definition for xid, it just uses it. But when a standard-conforming interpretation and some other one conflict, SQLite tends to choose the former; consider the result of this query:

SELECT a AS b
FROM (SELECT 1 AS a, 'x' AS b UNION ALL
      SELECT 2,      'y'      UNION ALL
      SELECT 3,      'x')
GROUP BY b;
CL.
  • 173,858
  • 17
  • 217
  • 259
  • Thank you. 1. Is the SELECT guaranteed to output results in the same order it found them? I am concerned by the fact that the outer SELECT would be allowed to change the order of the results given by the inner one; in which case I would have to ensure the ordering of the outer query too. 2. Is it imprudent to use this query unsupported by the SQL standard? I have tried several RDBMS on [SQL Fiddle](http://www.sqlfiddle.com/#!15/569ba/1) and they all seemed to accept it: 3. Is it possible without an additional subquery? In this simple case the overhead might not be high however, I thought. – unagi Feb 27 '16 at 23:57
  • 1
    If you have doubts, you can *move* the ORDER BY to the outer query. There is no database that actually conforms 100% to the standard. The subquery will be flattened or implemented as a coroutine, if possible, so there will be no overhead. – CL. Feb 28 '16 at 07:21
  • That is enough to solve my problem, thank you. However regarding 2.: would it be possible to write this query so that it is compliant with the SQL standard? The fact that no RDMS is 100% standard compliant does not imply that it is not possible to write all/most-RDMS compatible SQL. At least Oracle and SQL server seem to reject alias use in GROUP BY clause. – unagi Feb 28 '16 at 09:36
  • That would be a disappointing standard if there was no way to write it without repeating the subquery. – unagi Feb 28 '16 at 10:56
  • 1
    I mean, `SELECT ... FROM (SELECT ... AS xid ...) GROUP BY xid ORDER BY xid`. – CL. Feb 28 '16 at 15:09
  • Moving the GROUP BY to the outer query, indeed! Thank you. Edit: I just tried it out and it yields different results. – unagi Feb 28 '16 at 22:21