Re: [PATCH] Incremental sort (was: PoC: Partial sort)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date: 2018-09-06 17:39:47
Message-ID: 16617.1536255587@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

James Coleman <jtc331(at)gmail(dot)com> writes:
> A fairly common planning problem for us is what we call "most recent first" queries; i.e., "the 50 most recent <table> rows for a <foreign key>".

> Here's a basic setup:

> -- created_at has very high cardinality
> create table foo(pk serial primary key, owner_fk integer, created_at timestamp);
> create index idx_foo_on_owner_and_created_at on foo(owner_fk, created_at);

> -- technically this data guarantees unique created_at values,
> -- but there's no reason it couldn't be modified to have a few
> -- random non-unique values to prove the point
> insert into foo(owner_fk, created_at)
> select i % 100, now() - (i::text || ' minutes')::interval
> from generate_series(1, 1000000) t(i);

> And here's the naive query to get the results we want:

> select *
> from foo
> where owner_fk = 23
> -- pk is only here to disambiguate/guarantee a stable sort
> -- in the rare case that there are collisions in the other
> -- sort field(s)
> order by created_at desc, pk desc
> limit 50;

If you're concerned about the performance of this case, why don't you make
an index that actually matches the query?

regression=# create index on foo (owner_fk, created_at, pk);
CREATE INDEX
regression=# explain analyze select * from foo where owner_fk = 23 order by created_at desc, pk desc limit 50;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..110.92 rows=50 width=16) (actual time=0.151..0.280 rows=50 loops=1)
-> Index Only Scan Backward using foo_owner_fk_created_at_pk_idx on foo (cost=0.42..20110.94 rows=9100 width=16) (actual time=0.146..0.255 rows=50 loops=1)
Index Cond: (owner_fk = 23)
Heap Fetches: 50
Planning Time: 0.290 ms
Execution Time: 0.361 ms
(6 rows)

There may be use-cases for Alexander's patch, but I don't find this
one to be terribly convincing.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message R, Siva 2018-09-06 18:02:20 Re: Bug in ginRedoRecompress that causes opaque data on page to be overrun
Previous Message James Coleman 2018-09-06 16:51:53 Re: [PATCH] Incremental sort (was: PoC: Partial sort)