| From: | Michał Kłeczek <michal(at)kleczek(dot)org> |
|---|---|
| To: | Ants Aasma <ants(dot)aasma(at)cybertec(dot)at> |
| Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, Alexandre Felipe <alexandre(dot)felipe(at)tpro(dot)io>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: New access method for b-tree. |
| Date: | 2026-02-04 07:13:11 |
| Message-ID: | E313FDE4-8138-44CC-99CE-60F38251D878@kleczek.org |
| Lists: | pgsql-hackers |
> On 3 Feb 2026, at 22:42, Ants Aasma <ants(dot)aasma(at)cybertec(dot)at> wrote:
>
> On Mon, 2 Feb 2026 at 01:54, Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>> I'm also wondering how common is the targeted query pattern? How common
>> it is to have an IN condition on the leading column in an index, and
>> ORDER BY on the second one?
>
> I have seen this pattern multiple times. My nickname for it is the
> timeline view. Think of the social media timeline, showing posts from
> all followed accounts in timestamp order, returned in reasonably sized
> batches. The naive SQL query will have to scan all posts from all
> followed accounts and pass them through a top-N sort. When the total
> number of posts is much larger than the batch size this is much slower
> than what is proposed here (assuming I understand it correctly) -
> effectively equivalent to running N index scans through Merge Append.
>
> My workarounds I have proposed users have been either to rewrite the
> query as a UNION ALL of a set of single value prefix queries wrapped
> in an order by limit. This gives the exact needed merge append plan
> shape. But repeating the query N times can get unwieldy when the
> number of values grows, so the fallback is:
>
> SELECT * FROM unnest(:friends) id, LATERAL (
> SELECT * FROM posts
> WHERE user_id = id
> ORDER BY tstamp DESC LIMIT 100)
> ORDER BY tstamp DESC LIMIT 100;
>
> The downside of this formulation is that we still have to fetch a
> batch worth of items from scans where we otherwise would have only had
> to look at one index tuple.
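For reference, the UNION ALL rewrite mentioned in the quoted text could look roughly like the sketch below. The table and column names (posts, user_id, tstamp) come from the quoted query; the literal ids 1, 2, 3 and the existence of a b-tree index on (user_id, tstamp) are assumptions made only for illustration. Whether the planner actually picks a Merge Append over per-user index scans depends on the PostgreSQL version, the statistics and the available indexes, so it is worth checking with EXPLAIN.

```sql
-- Sketch only: assumes an index such as posts (user_id, tstamp) exists
-- and that the followed accounts are the hypothetical ids 1, 2 and 3.
SELECT *
FROM (
    -- One single-value prefix query per followed account.
    SELECT * FROM posts WHERE user_id = 1
    UNION ALL
    SELECT * FROM posts WHERE user_id = 2
    UNION ALL
    SELECT * FROM posts WHERE user_id = 3
) AS t
-- The outer ORDER BY ... LIMIT is what lets the branches be merged
-- in timestamp order instead of being sorted as a whole.
ORDER BY tstamp DESC
LIMIT 100;
```

Each additional followed account adds one more branch, which is why this form becomes unwieldy as the list grows.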
GiST can handle this kind of query, as it supports multiple sort orders.
The only problem is that GiST does not support ORDER BY on a plain column.
One possible workaround is [1], but as described there, it does not play well with partitioning.
I’ve started drafting support for ORDER BY column in GiST (see [2]).
I think it would be easier to implement and maintain than a new index access method (IAM), but I don’t have enough knowledge and experience to implement it myself.
[1] https://www.postgresql.org/message-id/3FA1E0A9-8393-41F6-88BD-62EEEA1EC21F%40kleczek.org
[2] https://www.postgresql.org/message-id/B2AC13F9-6655-4E27-BFD3-068844E5DC91%40kleczek.org
—
Kind regards,
Michal