4

I have been tasked with creating a search function that when searched, certain fields will have more weight than others.

Here is an simplified example.

cars (table)
year, make, model, color, type (columns)

Let's say someone searches for the following:

Year: 1968
Make: Ford
Model: Mustang
Color: Red
Type: Sports Car

If the cars in the table have none of the correct fields they should not show up, but if record has some of the correct fields but not all they should still show up. But certain fields should be weighted higher than others.

For instance maybe they are weighted like this:
Column - Weight
Year - 30
Make - 100
Model - 85
Color - 10
Type - 50

So if a record matches the search in the "make" field and the "model" field, that record would be above a record that matched in the "year", "color" and "type" field, because of the weights we placed on each column.

So lets say that the query matches at least one field for two records in the database, they should be ordered by the most relevant based on the weight:

1971, Ford, Fairlane, Blue, Sports Car (weight = 185)
1968, Dodge, Charger, Red, Sports Car (weight = 90)

I have been racking my brain trying to figure out how to make this work. If anyone has done something like this please give me an idea of how to make it work.

I would like to do as much of the work in MySQL as possible via joins, I think this will be bring up the results faster than doing most of the work in PHP. But any solution to this problem would be much appreciated.

Thanks in advance

  • how about a text column you just plop meta info in and use FULLTEXT search in natural language mode. See this [link](http://stackoverflow.com/a/30677347/1816093). I am not suggesting for a second it adheres to your weighting scheme – Drew Sep 03 '15 at 14:27
  • You may want to check [sphinx][1] or [elasticsearch][2] for that matter [1]: http://sphinxsearch.com/ [2]: https://www.elastic.co/products/elasticsearch – Gori Sep 03 '15 at 14:34

4 Answers4

3

Bear with me, this is going to be a strange query, but it seems to work on my end.

SELECT SUM(
  IF(year = "1968", 30, 0) +
  IF(make = "Ford", 100, 0) +
  IF(model = "Mustang", 85, 0) +
  IF(color = "Red", 10, 0) +
  IF(type = "Sports Car", 50, 0)
) AS `weight`, cars.* FROM cars 
  WHERE year = "1968" 
  OR make = "Ford"
  OR model = "Mustang"
  OR color = "Red"
  OR type = "Sports Car"
GROUP BY cars.id
ORDER BY `weight` DESC;

Basically, this groups all results by their id (which is necessary for the SUM() function, does some calculations on the different fields and returns the weight as a total value, which is then sorted highest-lowest. Also, this will only return results where one of the columns matches a supplied value.

Since I don't have an exact copy of your database, run some tests with this on your end and let me know if there's anything that needs to be adjusted.

Expected Results:

+============================================================+
| weight | year | make      | model    | color  | type       |
|============================================================|
| 130    | 1968 | Ford      | Fairlane | Blue   | Roadster   |
| 100    | 2014 | Ford      | Taurus   | Silver | Sedan      |
| 60     | 2015 | Chevrolet | Corvette | Red    | Sports Car |
+============================================================+

So, as you can see, the results would list the closest matches, which in this case are two Ford (+100) vehicles, one from 1968 (+30), and a Red Sports Car (10 + 50) as the closest matches (using your criteria)

One more thing, if you also want to display the rest of the results (ie results with a 0 weight match score) simply remove the WHERE ... OR ..., so it will check against all records. Cheers!

Further to the comments below, checking the weight after a LEFT JOIN on a pivot table:

SELECT SUM(
  IF(cars.year = "1968", 30, 0) +
  IF(cars.make = "Ford", 100, 0) +
  IF(cars.model = "Mustang", 85, 0) +
  IF(cars.color = "Red", 10, 0) +
  IF(types.name = "Sports Car", 50, 0)
) AS `weight`, cars.*, types.* FROM cars 
LEFT JOIN cars_types ON cars_types.car_id = cars.id
LEFT JOIN types ON cars_types.type_id = types.id
  WHERE year = "1968" 
  OR cars.make = "Ford"
  OR cars.model = "Mustang"
  OR cars.color = "Red"
  OR types.name = "Sports Car"
GROUP BY cars.id
ORDER BY `weight` DESC;

Here is a picture of the LEFT JOIN in practice:

enter image description here

As you can see, the Cobalt matches on color (silver) and model (Cobalt) (85 + 10) while the Caliber matches on type (Sports Car) (50). And yes, I know a Dodge Caliber isn't a Sports Car, this was for example's sake. Hope that helped!

Tim Lewis
  • 27,813
  • 13
  • 73
  • 102
  • I think this will be my best option. Thanks a lot for the help and the very quick response. :-) – ThEwHiStLeR Sep 03 '15 at 16:55
  • No problem. Glad this could help you. Cheers :) – Tim Lewis Sep 03 '15 at 17:29
  • Now a monkey wrench was thrown into this. Some of the results now have multiple options. Example: Let's say the type column was removed and instead we have 2 new tables, one called types, and one is car_types. Types holds all types of cars like Muscle Car, Luxury Car, Sedan, etc. and car_types has a car_id column and a type_id column so that you can select multiple types for each car. So now I need to weight that field based on if the car has any of the types that may be selected. Any ideas? Thanks – ThEwHiStLeR Sep 10 '15 at 19:33
  • So essentially a pivot table for `Cars <-> Cars_Types <-> Types`. Let me look into that, might be tricky... I may not have an answer for you before 5 (when work is done, hope that's alright) – Tim Lewis Sep 10 '15 at 20:25
  • I think I figured it out... see next comment for how I got it to work – ThEwHiStLeR Sep 10 '15 at 20:29
  • SELECT SUM( IF(`cars`.`year` = "1968", 30, 0) + IF(`cars`.`make` = "Ford", 100, 0) + IF(`cars`.`model` = "Mustang", 85, 0) + IF(`cars`.`color` = "Red", 10, 0) + IF(`car_type`.`type_id` = "1", 50, 0) ) AS `weight`, `cars`.*, group_concat(`TYPE`.`TYPE`) FROM `cars` LEFT JOIN `car_type` ON (`car_type`.`car_id` = `cars`.`car_id`) LEFT JOIN `TYPE` ON (`TYPE`.`type_id` = `car_type`.`type_id`) WHERE `cars`.`YEAR` = "1968" OR `cars`.`make` = "Ford" OR `cars`.`model` = "Mustang" OR `cars`.`color` = "Red" OR `car_type`.`type_id` = '1' GROUP BY `cars`.`car_id` ORDER BY `weight` DESC – ThEwHiStLeR Sep 10 '15 at 20:30
  • You're close. I'll post the updated answer that I came up with. – Tim Lewis Sep 10 '15 at 20:39
0

If I understand your logic you can just do something like direct comparison in PHP between the value requested and the value returned. The query will sound like:

SELECT Year,Make,Model,Color,Type 
FROM table 
WHERE year='$postedyear' OR make='$postedmake' 
   OR model='$postedmodel' OR color='$postedcolor'

Then in php looping between the results:

foreach($results as $result){
    $score = 0;
    if($result['year']==$postedyear{$score=$score+30;}
    //continue with the other with the same logic.
}

After each foreach iteration $score will be the score of that selected row. If you push the score to the $result array you can also sort it by score before displaying the results.

Lelio Faieta
  • 6,457
  • 7
  • 40
  • 74
0

Variation on @lelio-faieta

In php you can have a result array containing arrays of values for each item matching at least one of the search terms, the associative array of values to match and the associate array of weights, both with the same indexes. You would just get an array of matches for each index. (maybe use array_intersect_assoc()) Then you multiply by the weights and sum, add to the original data. Then you do have to sort the result array at that point.

Elin
  • 6,507
  • 3
  • 25
  • 47
  • I prefer not to play too much with array in php with things like sorting, intersect ecc if it is not necessary because I have seen that on big set of data array function are often very slow but your point is good too! – Lelio Faieta Sep 03 '15 at 15:29
-2

There is a solution doing this via the mysql query directly, but that would end up with an overgrown resource thirsty query for every single search you perform. Doing it in PHP is not much difference in resource usage, bounding to several loops in results and processing it.

I've had a very similar project and my best suggestion would be: "use SphinxSearch"

Very easy to install, needs a bit of a learning curve to setup afterwards, but very similar to mysql queries etc. With this you can apply weights to every column match and rank your results afterwards. Also, it is a multitude of time faster that typical mysql queries.

  • Since SO is a site for sharing knowledge can you share with us the way to do it with a query? – Lelio Faieta Sep 03 '15 at 14:59
  • Someone already beat me to it. I would write a very similar example to Tim Lewis above: Quoting: SELECT SUM( IF(year = "1968", 30, 0) + IF(make = "Ford", 100, 0) + IF(model = "Mustang", 85, 0) + IF(color = "Red", 10, 0) + IF(type = "Sports Car", 50, 0) ), AS `weight`, cars.* FROM cars WHERE year = "1968" OR make = "Ford" OR model = "Mustang" OR color = "Red" OR type = "Sports Car" GROUP BY cars.id ORDER BY `weight` DESC; – Angelos Hadjiphilippou Sep 03 '15 at 20:02