0

Given the following table (products_filter):

enter image description here

How can I do a SELECT ... FROM products LEFT JOIN products_filter ... in such a way that it only returns products which have ALL the specified (filter_id,filter_value) pairs.

Example: for (filter_id, filter_value) = (1,1),(3,0) it should only return the product with id 90001, because it matches both values.

2 Answers2

1

If the specified filter pairs is restricted to a deifnite number the the following query should work.

Select a. Product_id
From products a
Left outer join
(Select product_id,filter_id,filter_value,count(*)
From product_filter
Where filter_id in (1,1) and filter_value in(3,0)
Group by product_id,filter_id,filter_value
Having count(*)=2)b
On(a.product_id=b.product_id)
Conrad Frix
  • 51,984
  • 12
  • 96
  • 155
  • If you highlight text and press the `{}` button it will format it as a code sample. Also a distinct count might be needed here since you could have 2 rows with duplicate filter values – Conrad Frix Jan 25 '16 at 19:05
  • I think the filter_Id and filter values sets are not going to work. For one filter_value isn't 3 the ID is... secondarily the paired sets would be transitive meaning that if we had XY and AB as values you's would return XA, XB, YA, YB; not just XA and YB. However the overall approach is solid. – xQbert Jan 25 '16 at 19:22
0

As you only said you wanted the PRODUCTS values having the desired filter attributes... I've limited results to just product.*

The below query uses an inline view with the count of distinct filters by product ID. The outer where clause then uses the distinct count (in case duplicate filters could exist for a product) of the filter_IDs.

The # in the where clause should always match the number of where clause paired sets in the inline view.

Your sample data indicated that the paired sets could be a subset of all filters. so this ensures each filter pair (or more) exists for the desired product.

SELECT p.* 
FROM products p
LEFT JOIN (SELECT product_ID, count(Distinct filter_ID) cnt
           FROM products_Filter
           WHERE (Filter_ID = 1 and filter_value = 1)
              or (Filter_ID = 3 and filter_value = 0)
           GROUP BY Product_ID) pf
   on P.Product_ID = PF.Product_ID
 WHERE pf.cnt = 2
xQbert
  • 34,733
  • 2
  • 41
  • 62
  • this seems to work, could you please mention what indexes should i create in order to avoid performance issues? –  Jan 25 '16 at 20:02
  • Indexes on P.Product_Id and PF.ProductID as well as a combined index on Filter_ID and Filter_value would improve performance of the above. if not already existing. Additionally if we can guarantee filter_ID is distinct for each product_ID in Products_Filter then removal of key word distinct in the count could also improve performance as the check wouldn't' have to execute. If there is a UNIQUE index on Product_Id and filter_Id in products_Filter... then key word distinct could be safely removed. However if a product can have multiple values for the same filter; then you couldn't do this. – xQbert Jan 25 '16 at 20:06
  • a product can have only one filter_value per filter_id, i have added unique index on (product_id,filter_id) but if i remove the keywork `distinct` i get ` Using temporary;`, which is not happening with distinct in place –  Jan 25 '16 at 20:17
  • hmm... [this stack article](http://stackoverflow.com/questions/13633406/using-index-using-temporary-using-filesort-how-to-fix-this) shows why 'using temporary' may exist; but I can't see how/why it would apply here. if there is a unique index on the desired columns, I can't see why this would occur. DId you add an order by to either select? – xQbert Jan 25 '16 at 22:34