What is the order MySQL joins the tables, how is it chosen and when does STRAIGHT_JOIN comes in handy?
-
Similar question: http://stackoverflow.com/questions/512294/when-to-use-straight-join-with-mysql – Jake Jul 12 '11 at 20:18
3 Answers
MySQL
is only capable of doing nested loops (possibly using indexes), so if both join tables are indexed, the time for the join is calculated as A * log(B)
if A
is leading and B * log(A)
if B
is leading.
It is easy to see that the table with fewer records satisfying the WHERE
condition should be made leading.
There are some other factors that affect the join performance, such as WHERE
conditions, ORDER BY
and LIMIT
clauses etc. MySQL
tries to predict the time for the join orders and if statistics are up to date does it quite well.
STRAIGHT_JOIN
is useful when the statistics are not accurate (say, naturally skewed) or in case of bugs in the optimizer.
For instance, the following spatial join:
SELECT *
FROM a
JOIN b
ON MBRContains(a.area, b.area)
is subject to a join swap (the smaller table is made leading), however, MBRContains
is not converted to MBRWithin
and the resulting plan does not make use of the index.
In this case you should explicitly set the join order using STRAIGHT_JOIN
.

- 413,100
- 91
- 616
- 614
-
-
And what statistics are you talking about? Haven't heard of it in mysql :( – Vadim Samokhin Jul 24 '11 at 16:14
-
@Zapadlo: logarithm base depends on several factors, including index sparseness and record size. Statistics are stored in `INFORMATION_SCHEMA.STATISTICS`. – Quassnoi Jul 25 '11 at 08:31
As others have stated about the optimizer and which tables may meet the criteria on smaller result sets, but that may not always work. As I had been working with gov't contract / grants database. The table was some 14+ million records. However, it also had over 20 lookup tables (states, congressional districts, type of business classification, owner ethnicity, etc)
Anyhow with these smaller tables, the join was using one of the small lookups, back to the master table and then joining all the others. It CHOKED the database and cancelled the query after 30+ hours. Since my primary table was listed FIRST, and all subsequent were lookup and joined AFTER, just adding STRAIGHT_JOIN at the top FORCED the order I had listed and the complex query was running again in just about 2 hrs (expected for all it had to do).
Get whatever is your primary basis to the top with all subsequent extras later I've found, definitely helps.

- 47,638
- 12
- 72
- 142
-
1But what is the reason? It doesn't correlate with common sense, or I miss something ;) – Vadim Samokhin Jul 24 '11 at 16:11
The order of tables is specified by the optimizer. Straight_join comes in handy when the optimizer does it wrong, which is not so often. I used it only once in a big join, where the optimizer gave one particular table at first place in join (I saw it in explain select command), so I placed the table so that it is joined later in the join. It helped a lot to speed up the query.

- 57,621
- 49
- 238
- 373