MySQL Innodb fail to use index due to extremely wrong rows estimation

Question

I've an innodb table, the query on the table looks like below.

SELECT *
FROM x
WHERE now() BETWEEN a AND b

I've create a composite index on (a,b), the query returns around 4k rows, while the total number of rows in the table is around 700k.

However, when I get EXPLAIN of the execution plan, I found that the query did not use the expected index. Because the estimated rows is around 360k, extremely larger than the actual value.

I know just like many posts (such as Why the rows returns by "explain" is not equal to count()?) had explained, EXPLAIN only get roughly estimation. But FORCE INDEX solution is very tricky and may bring potential performance risks in the future.

Is there any way that I can make MySQL to get a more accurate estimation (the current one is 90 times larger)? Thanks.

No, it fails to use the index because the index is useless. It is more efficient to scan the table. — Rick James, Nov 14 '16 at 20:00

score 0 · Answer 1 · edited Jun 20 '20 at 09:12

0

InnoDB only keeps approximate row counts for tables. This is explained in the documentation of SHOW TABLE STATUS:

Rows

The number of rows. Some storage engines, such as MyISAM, store the exact count. For other storage engines, such as InnoDB, this value is an approximation, and may vary from the actual value by as much as 40 to 50%.

I don't think there's any way to make InnoDB keep accurate row counts, it's just not how it works.

edited Jun 20 '20 at 09:12

Community

1
1

answered Nov 10 '16 at 07:34

Barmar

741,623
53
500
612

Thanks for your response. So if we don't split the table into small ones, the only way to speed up query is `FORCE INDEX`? My concern is that when data distribution changed or index changed in the future, we may not be aware of the effect of `FORCE INDEX`. It's too tricky to select index manually. – twds Nov 10 '16 at 07:56
I haven't really researched it, so I can't answer that. – Barmar Nov 10 '16 at 07:58
My guess is they determined that the impact on query optimization isn't big enough to require them to solve the problem of keeping accurate counts. – Barmar Nov 10 '16 at 07:59

score 0 · Answer 2 · answered Nov 14 '16 at 19:59

This particular construct is hard to optimize:

WHERE constant BETWEEN col1 AND col2

No MySQL index can be devised to make it run fast. The attempts include:

INDEX(col1) -- will scan last half of table
INDEX(col2) -- will scan first half of table
INDEX(col1, col2) -- will scan last half of table

(Whether it does more of the work in the index BTree depends on ICP, covering, etc. But, in any case, lots of rows must be touched.)

One reason it cannot be improved upon is that the 'last' row in the 'half' might actually match.

If the (col1, col2) pairs do not overlap, then there is a possibility of improving the performance by being able to stop after one row. But MySQL does not know if you have this case, so it cannot optimize. Here is an approach for non-overlapping IP-address lookups that is efficient.

MySQL Innodb fail to use index due to extremely wrong rows estimation

2 Answers2