3

I've got a Snowflake SQL query I'm trying to execute in R via ODBC connection that looks like this

SET quiet=TRUE;

USE SOMEDATABASE.SOMESCHEMA;

--Select timestamp of last sale per customer
DROP TABLE IF EXISTS sales;
CREATE TEMPORARY TABLE sales(CustomerId VARCHAR(16777216), SaleTS TIMESTAMP_NTZ(9));

INSERT INTO sales
SELECT CustomerId, 
       SaleTS
FROM SALES
WHERE SaleTS>= '2020-11-19 00:00:00'
AND SaleTS <= '2020-11-19 23:59:59.999'
GROUP BY CustomerId;

--Use temp table to get correct row from sales table
SELECT  SUM(SalesDetail.price) as SumPrice
        COUNT(*) as SoldVolume
FROM sales
LEFT JOIN SALES as SalesDetail
    ON Sales.CustomerId = SalesDetail.CustomerId 
    AND sales.SaleTS = SalesDetail.SaleTS 

Querying Microsoft SQL Server from R I'd normally include set nocount no; at the top of the query to ensure only the last step is returned to R to avoid the error Actual statement count 6 did not match the desired statement count 1. Error makes sense, SQL is returning 6 components when R is expecting 1 (6 one for each step in my SQL query). In Snowflake there doesn't appear to be an option to set nocount on in the same way. My question is how do I avoid the above error. Does anyone have any experience of executing a mutli-step Snowflake SQL query via R? How can I get R to receive just the last statement from the ODBC connection. So far I've tried set nocount=TRUE;, set echo=FALSE;, set message=FALSE;, SET quiet=TRUE etc

Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
Morgan Ball
  • 760
  • 9
  • 23
  • you are inserting from same table into a temp table and joining back to the same table? why not just get the sum and count directly? – eshirvana Nov 20 '20 at 17:40
  • and also `SET NOCOUNT ON` option just get rids of count messages sql returns , not the result , if you have multiple set results ( multiple selects) returning back from sql , you still get nultiple result set in your output – eshirvana Nov 20 '20 at 17:42
  • This is just an example, my actual script is longer and involves more than 1 table. Really would like to solve the mulit-step issue. There are plenty of good reasons to structure a SQL query in this way rather than heavily nested single query approach. – Morgan Ball Nov 20 '20 at 17:55
  • R see the messages as an output and that's the point I need to stop the messages. – Morgan Ball Nov 20 '20 at 17:56

1 Answers1

3

Snowflake SQL is expressive enough and proposed code could be structured as single query:

WITH cte AS (
    SELECT CustomerId, MAX(SaleTS) AS SaleTS  -- here agg function is required
    FROM SALES
    WHERE SaleTS>= '2020-11-19 00:00:00'
    AND SaleTS <= '2020-11-19 23:59:59.999'
    GROUP BY CustomerId
)
SELECT  SUM(SalesDetail.price) as SumPrice
        COUNT(*) as SoldVolume
FROM cte
LEFT JOIN SALES as SalesDetail
    ON Sales.CustomerId = SalesDetail.CustomerId 
    AND sales.SaleTS = SalesDetail.SaleTS;

The original query is using the same name for both table and temporary table differing only by case sales vs SALES, which is error-prone.

Second: database and schema could be set up during establishing connection, so there is no need for USE inside script. Alternatively fully qualified name could be used in the script.


I guess the intent of query is as follow:

WITH cte AS (
  SELECT *
  FROM SOMEDATABASE.SOMESCHEMA.SALES
  WHERE SaleTS BETWEEN '2020-11-19 00:00:00' AND '2020-11-19 23:59:59.999'
  QUALIFY ROW_NUMBER() OVER(PARTITION BY CustomerId ORDER BY SaleTS DESC) = 1
)
SELECT COUNT(*) AS SoldVolume, SUM(price) as SumPrice
FROM cte;

If it is possible that a single person has two entries for exactly the same SaleTS then RANK() OVER(...) should be used instead.

Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
  • 1
    Nice use of `with`, I take your point that in this way you can remove the need for temp tables. – Morgan Ball Nov 25 '20 at 15:34
  • 1
    Note with Oracle, I have often has to do ```SELECT * FROM (WITH cte as (...))``` as the drivers seem to expect ```SELECT``` when calling from R (or VBA). Just FYI in case OP runs into this in R. +1 as SQL is more expressive than people generally give it credit for. – Cole Nov 26 '20 at 17:20
  • what if you are using set var1='test'; before this? the set is considered a separate statement it seems – mike01010 Jun 16 '23 at 16:56