From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | James Coleman <jtc331(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> |
Subject: | Re: [PATCH] Incremental sort (was: PoC: Partial sort) |
Date: | 2018-09-06 17:39:47 |
Message-ID: | 16617.1536255587@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
James Coleman <jtc331(at)gmail(dot)com> writes:
> A fairly common planning problem for us is what we call "most recent first" queries; i.e., "the 50 most recent <table> rows for a <foreign key>".
> Here's a basic setup:
> -- created_at has very high cardinality
> create table foo(pk serial primary key, owner_fk integer, created_at timestamp);
> create index idx_foo_on_owner_and_created_at on foo(owner_fk, created_at);
> -- technically this data guarantees unique created_at values,
> -- but there's no reason it couldn't be modified to have a few
> -- random non-unique values to prove the point
> insert into foo(owner_fk, created_at)
> select i % 100, now() - (i::text || ' minutes')::interval
> from generate_series(1, 1000000) t(i);
> And here's the naive query to get the results we want:
> select *
> from foo
> where owner_fk = 23
> -- pk is only here to disambiguate/guarantee a stable sort
> -- in the rare case that there are collisions in the other
> -- sort field(s)
> order by created_at desc, pk desc
> limit 50;
If you're concerned about the performance of this case, why don't you make
an index that actually matches the query?
regression=# create index on foo (owner_fk, created_at, pk);
CREATE INDEX
regression=# explain analyze select * from foo where owner_fk = 23 order by created_at desc, pk desc limit 50;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..110.92 rows=50 width=16) (actual time=0.151..0.280 rows=50 loops=1)
-> Index Only Scan Backward using foo_owner_fk_created_at_pk_idx on foo (cost=0.42..20110.94 rows=9100 width=16) (actual time=0.146..0.255 rows=50 loops=1)
Index Cond: (owner_fk = 23)
Heap Fetches: 50
Planning Time: 0.290 ms
Execution Time: 0.361 ms
(6 rows)
There may be use-cases for Alexander's patch, but I don't find this
one to be terribly convincing.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | R, Siva | 2018-09-06 18:02:20 | Re: Bug in ginRedoRecompress that causes opaque data on page to be overrun |
Previous Message | James Coleman | 2018-09-06 16:51:53 | Re: [PATCH] Incremental sort (was: PoC: Partial sort) |