A table raw_data
has an index ix_raw_data_timestamp
:
CREATE TABLE IF NOT EXISTS public.raw_data
(
ts timestamp without time zone NOT NULL,
log_msg character varying COLLATE pg_catalog."default",
log_image bytea
)
CREATE INDEX IF NOT EXISTS ix_raw_data_timestamp
ON public.raw_data USING btree
(ts ASC NULLS LAST)
TABLESPACE pg_default;
For some reason the index is not used for the following query (and therefore is very slow):
SELECT ts,
log_msg
FROM raw_data
ORDER BY ts ASC
LIMIT 5e6;
The result of EXPLAIN (analyze, buffers, format text)
for the query above:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=9752787.07..10336161.14 rows=5000000 width=50) (actual time=789124.600..859046.614 rows=5000000 loops=1)
Buffers: shared hit=12234 read=888521, temp read=2039471 written=2664654
-> Gather Merge (cost=9752787.07..18421031.89 rows=74294054 width=50) (actual time=789085.442..822547.099 rows=5000000 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=12234 read=888521, temp read=2039471 written=2664654
-> Sort (cost=9751787.05..9844654.62 rows=37147027 width=50) (actual time=788203.880..795491.054 rows=1667070 loops=3)
Sort Key: "ts"
Sort Method: external merge Disk: 1758904kB
Worker 0: Sort Method: external merge Disk: 1762872kB
Worker 1: Sort Method: external merge Disk: 1756216kB
Buffers: shared hit=12234 read=888521, temp read=2039471 written=2664654
-> Parallel Seq Scan on raw_data (cost=0.00..1272131.27 rows=37147027 width=50) (actual time=25.436..119352.861 rows=29717641 loops=3)
Buffers: shared hit=12141 read=888520
Planning Time: 5.240 ms
JIT:
Functions: 7
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 0.578 ms, Inlining 76.678 ms, Optimization 24.578 ms, Emission 13.060 ms, Total 114.894 ms
Execution Time: 877489.531 ms
(20 rows)
But it is used for this one:
SELECT ts,
log_msg
FROM raw_data
ORDER BY ts ASC
LIMIT 4e6;
EXPLAIN (analyze, buffers, format text)
of the query above is:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.57..9408157.15 rows=4000000 width=50) (actual time=15.081..44747.127 rows=4000000 loops=1)
Buffers: shared hit=24775 read=61155
-> Index Scan using ix_raw_data_timestamp on raw_data (cost=0.57..209691026.73 rows=89152864 width=50) (actual time=2.218..16077.755 rows=4000000 loops=1)
Buffers: shared hit=24775 read=61155
Planning Time: 1.306 ms
JIT:
Functions: 3
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 0.406 ms, Inlining 1.121 ms, Optimization 7.917 ms, Emission 3.721 ms, Total 13.165 ms
Execution Time: 59028.951 ms
(10 rows)
Needless to say that the aim is to get all queries to use the index no matter the size, but I cannot seem to find a solution.
PS:
- There's about
89152922
rows in the database.
Edit:
After increasing the memory to 2G (SET work_mem = '2GB';
), the query is a little faster (doesn't use disk anymore) but still nowhere as fast:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=5592250.54..6175624.61 rows=5000000 width=50) (actual time=215887.445..282393.743 rows=5000000 loops=1)
Buffers: shared hit=12224 read=888531
-> Gather Merge (cost=5592250.54..14260731.75 rows=74296080 width=50) (actual time=215874.072..247030.062 rows=5000000 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=12224 read=888531
-> Sort (cost=5591250.52..5684120.62 rows=37148040 width=50) (actual time=215854.323..221828.921 rows=1667147 loops=3)
Sort Key: "ts"
Sort Method: top-N heapsort Memory: 924472kB
Worker 0: Sort Method: top-N heapsort Memory: 924379kB
Worker 1: Sort Method: top-N heapsort Memory: 924281kB
Buffers: shared hit=12224 read=888531
-> Parallel Seq Scan on raw_data (cost=0.00..1272141.40 rows=37148040 width=50) (actual time=25.899..107034.903 rows=29717641 loops=3)
Buffers: shared hit=12130 read=888531
Planning Time: 0.058 ms
JIT:
Functions: 7
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 0.642 ms, Inlining 53.860 ms, Optimization 23.848 ms, Emission 11.768 ms, Total 90.119 ms
Execution Time: 300281.654 ms
(20 rows)