1

I have a table like the following:

| user_id | product_purchased |
-------------------------------
|    111  |        A           |
|    111  |        B           |
|    222  |        B           |
|    222  |        B           |
|    333  |        C           |
|    444  |        A           |

I want to pivot the table to have user ids as rows and counts of each product purchased as by the user as columns. So for the above table, this would look like:

| user_id | product A | product B | product C |
-----------------------------------------------
|    111  |     1      |      1    |     0    |
|    222  |     0      |      2    |     0    |
|    333  |     0      |      0    |     1    |
|    444  |     1      |      0    |     0    |

I know this can be done manually using countif statements:

#standardsql
select user_id,
       countif(product_purchased = 'A') as 'A',
       countif(product_purchased = 'B') as 'B',
       etc,
group by user_id

However, in reality the table has too many possible products to make it feasible to write all of the options out manually. Is there a way to do this pivoting in a more automated and elegant way?

bbodek
  • 99
  • 3
  • 9
  • You can put the results into arrays, but I don't know if that meets your needs. – Gordon Linoff Aug 05 '18 at 21:06
  • Hey Gordon, I basically need a way to create the 2nd table above with an arbitrary numbers of columns dependent on how many products there are in the 1st table, but without manually writing logic for each of the individual products. – bbodek Aug 05 '18 at 21:09

1 Answers1

4

in reality the table has too many possible products to make it feasible to write all of the options out manually

Below is for BigQuery Standard SQL

You can do this in two steps - first prepare dynamically pivot query by running below

#standardSQL
SELECT CONCAT('SELECT user_id, ', 
  STRING_AGG(
    CONCAT('COUNTIF(product_purchased = "', product_purchased, '") AS product_', product_purchased)
  ), 
  ' FROM `project.dataset.your_table` GROUP BY user_id')
FROM (
  SELECT product_purchased 
  FROM `project.dataset.your_table`  
  GROUP BY product_purchased
)

as a result you will get string representing the query that you need to run to get desired result

As an example, if to apply to dummy data from your question

#standardSQL
WITH `project.dataset.your_table` AS (
  SELECT 111 user_id, 'A' product_purchased UNION ALL
  SELECT 111, 'B' UNION ALL
  SELECT 222, 'B' UNION ALL
  SELECT 222, 'B' UNION ALL
  SELECT 333, 'C' UNION ALL
  SELECT 444, 'A' 
)
SELECT CONCAT('SELECT user_id, ', 
  STRING_AGG(
    CONCAT('COUNTIF(product_purchased = "', product_purchased, '") AS product_', product_purchased)
  ), 
  ' FROM `project.dataset.your_table` GROUP BY user_id')
FROM (
  SELECT product_purchased 
  FROM `project.dataset.your_table`  
  GROUP BY product_purchased
)

you will get below query (formatted for better view here)

SELECT
  user_id,
  COUNTIF(product_purchased = "A") AS product_A,
  COUNTIF(product_purchased = "B") AS product_B,
  COUNTIF(product_purchased = "C") AS product_C
FROM `project.dataset.your_table`
GROUP BY user_id

Now, you can just run this to get desired result without manual coding

Again, if to run it against dummy data from your question

#standardSQL
WITH `project.dataset.your_table` AS (
  SELECT 111 user_id, 'A' product_purchased UNION ALL
  SELECT 111, 'B' UNION ALL
  SELECT 222, 'B' UNION ALL
  SELECT 222, 'B' UNION ALL
  SELECT 333, 'C' UNION ALL
  SELECT 444, 'A' 
)
SELECT
  user_id,
  COUNTIF(product_purchased = "A") AS product_A,
  COUNTIF(product_purchased = "B") AS product_B,
  COUNTIF(product_purchased = "C") AS product_C
FROM `project.dataset.your_table`
GROUP BY user_id   
-- ORDER BY user_id

you get expected result

Row user_id product_A   product_B   product_C    
1   111     1           1           0    
2   222     0           2           0    
3   333     0           0           1    
4   444     1           0           0    

Is there a way to do this pivoting in a more automated and elegant way?

You can easily automate above using any client of your choice

Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230