3

I am somewhat stuck right now and could use some quick MySQL help.

Say I have a table containing reviews for products, columns product_id, customer, date. Now for each distinct product, I want to get the 3 first reviews. ORDER BY date is straightforward, but I'm stuck on how to get exactly 3 entries for each distinct product_id. Since it's all in the same table, INNER JOIN doesn't really make any sense, neither does GROUP BY. So, any ideas?

Edit: I've been successfully using this query on SQLFiddle:

SELECT * FROM (
  SELECT customer, DATE, p_asin, @curTop := 
   CASE WHEN (
     @prevP_asin <> p_asin
   )
   THEN 1 
   ELSE @curTop +1
   END AS Top, @prevP_asin := p_asin
  FROM reviews2 r, (
    SELECT @curTop :=0
  ) A
 ORDER BY r.p_asin, r.date ) B
WHERE top <=3

But for some reason, when I try to apply it to my tables in PhpMyAdmin, the numbering with Top doesn't start by 1 for new products, instead counts all together. How can Fiddle and the actual table behave so differently?

void
  • 41
  • 6

3 Answers3

2

The following will only work if we can assume date values are unique for a given product. If that is not true, then you will need to add a unique value into the where statement after "p1.date < p.date" so each row gets a unique rank.

SQLFiddle here

SELECT * FROM (
    SELECT 
      product_id,
      customer, (
      select count(*) as cnt from products p1 
          where p1.product_id = p.product_id and
                p1.date < p.date
        ) rnk
    FROM  
      products P
    ORDER BY
      P.product_id, P.date
    ) A
  WHERE rnk < 3
tylert
  • 111
  • 4
1

In Oracle, you could use rank() over but that isn't available in MySQL. I used this page for an idea on how to do it: rank over in mysql

Also, here it is in SQLFiddle

SELECT * FROM (
SELECT 
  product_id,
  customer,
  @curRank := case when (@prevProductID <> product_id)
               then  1 else   @curRank + 1 
             end as Rnk,
  @prevProductID := product_id
FROM
   products p, (
     SELECT @curRank := 0
   ) r
ORDER BY
  P.product_id, P.date
) A
  WHERE Rnk <= 3
Community
  • 1
  • 1
tylert
  • 111
  • 4
  • Thanks, that works wonderfully on SQLFiddle - I copied part of my schema there and adjusted your statement; but for some reason (even though it should be the same) when I try to execute it on my database (using PhpMyAdmin) it doesn't start counting from 1 again for each different customer (or product in my table), but just counts through all the entries, leaving me with only 3 rows from the last WHERE. How can it be that MySQL behaves so differently? – void May 02 '13 at 13:18
  • Ah, I misread the problem and made it get the first three rows for each Customer. Let me rewrite it for Product_ID. Also -- I updated the SQLFiddle. Can you post your query -- if it is different than mine, and maybe we can see why it's not working? – tylert May 02 '13 at 14:00
  • Does the schema in your SQLFiddle match the schema exactly on your site? I can't think of why it would return a different result. – tylert May 02 '13 at 14:12
  • Yes, I exported my schema in order to ensure that it's identical. Same schema, same query; that's what I don't really get. – void May 02 '13 at 14:15
  • Unfortunately I don't think I know enough about MySQL to suggest how to debug that. Short of figuring out why that query is not working, I have a different suggestion I'll post. – tylert May 02 '13 at 14:37
1

I know it's been a while since the question was asked, but now that SQL 8.0 has been out for a while, figured i might add that we can use ROW_NUMBER() to partition by the key column (product_id in the OP's question.) + order by another column (date in the OP's question.)

In a nut shell, what we want to do is:

  1. Join the tables in question
  2. ORDER BY the key column (product_id) AND by any another columns you desire.
  3. Give row numbers to each record––resetting the count at each distinct value for the key column [each new product_id we encounter]
  4. use the row number from above step as a WHERE condition to select the 1st n rows (e.g. WHERE row_number <= 3 to get the 1st 3 rows for each distinct value from the key column)

Example:

SELECT FULL_TABLE.*
FROM (
         SELECT P.*,
                PD.*,
                ROW_NUMBER() over (
                    PARTITION BY P.id
                    ORDER BY P.date
                    ) AS row_num
         FROM Products AS P
                  LEFT JOIN ProductDetails AS PD on P.id = PD.productId
         ORDER BY P.id, row_num
     ) AS FULL_TABLE
WHERE row_num <= 3;
T. Dayya
  • 690
  • 11
  • 12