2

I am wondering if it is possible to do a MATCH() AGAINST() (fulltext) search in a way that words, who are not directly next to each other, need to be in a specific order? On my site, when users type words between double-quotes, the search will only display results that have those words in that particular order. For example, if this is the search term:

"pizza road"

it will return 'The best pizza down the road', but not 'the road to the best pizza'.

I know that BOOLEAN mode will allow for 'phrase' searches, but this means the words need to be next to each other (separated with just a space, period, comma, etc), which obviously won't return either of the results listed above: http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html

Currently I am using regexp to perform this action, but needless to say that this is a bad way to search a varchar column (pointlessly fulltext indexed), even though it is just little under 200,000 rows. So what I would like to know is if it is possible to perform the same search using just MATCH() AGAINST()? Or should I combine it with with the regexp? Or is there even another method I haven't thought of?

I can provide table/ code samples upon request.

Thanks in advance!

Battle_707
  • 708
  • 5
  • 15

2 Answers2

2

Have you tried something like:

SELECT * 
FROM `table` 
WHERE MATCH(`first_column`) 
      AGAINST('+pizza +road' IN BOOLEAN MODE)
      HAVING `first_column` LIKE '%pizza %road%' 
Michael Robinson
  • 29,278
  • 12
  • 104
  • 130
  • @Qtax it might, I haven't tested it (can't test right now :( ) – Michael Robinson Jun 09 '11 at 09:32
  • 1
    No, this won't work. An additional problem is that in this row, the words are typically not space separated. The words are normally separated by periods, underscores and sometimes hyphens. This means that any type of LIKE statement will need to use '%%', meaning it won't use an index. – Battle_707 Jun 09 '11 at 14:43
1

Full text search, as typically implemented in relational databases, is a hack designed to provide only a very limited subset of expected functionality in an attempt to compete with better technology. Full text indexing is the Apple to RDBMS's Orange.

To be able to search using what amounts to a 'NEAR' operator as supported in traditional full text search engines, you would need to implement your own inverted index with positional meta data, which is how traditional search engines do it.

And while certainly possible, most developers would rather use a full text search engine to solve this problem, rather than expend the substantial effort needed to reinvent this wheel.

Rob Raisch
  • 17,040
  • 4
  • 48
  • 58
  • I am a little confused by that last line of yours; I have a fulltext index on the column. Can you please clarify what you mean a little more? What you are saying about the inverted indexes sounds like a smart way. I am not necessarily concerned with disk space, so a larger index won't be an issue. What might be an issue is that in my search script I append a wildcard by default. Would that affect this reversed index much? Also, another question on the side: the column is utf8 encoded. Would changing that to latin1 speed things up? – Battle_707 Jun 12 '11 at 06:27
  • Creating your own inverted index is a complex and time consuming process. This is not an index in the sense that Mysql uses. See my answer to http://stackoverflow.com/questions/6034976/how-to-optimize-mysql-boolean-full-text-search-or-what-to-replace-it-with-c/6146107#6146107 for more detail. – Rob Raisch Jun 12 '11 at 14:09