| From: | Tomas Vondra <tomas(at)vondra(dot)me> |
|---|---|
| To: | Alexandre Felipe <alexandre(dot)felipe(at)tpro(dot)io>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: New access method for b-tree. |
| Date: | 2026-02-01 23:54:14 |
| Message-ID: | a652515e-278a-4838-815b-ecd2ad4495f6@vondra.me |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hello Felipe,
On 2/1/26 11:02, Alexandre Felipe wrote:
> Hello Hackers,
>
> Please check this out,
>
> It is an access method to scan a table sorting by the second column of
> an index, filtered on the first.
> Queries like
> SELECT x, y FROM grid
> WHERE x in (array of Nx elements)
> ORDER BY y, x
> LIMIT M
>
> Can execute streaming the rows directly from disk instead of loading
> everything.
>
> Using btree index on (x, y)
>
> On a grid with N x N will run by fetching only what is necessary
> A skip scal will run with O(N * Nx) I/O, O(N x Nx) space, O(N x Nx *
> log( N * Nx)) comput (assuming a generic in memory sort)
>
> The proposed access method does it O(M + Nx) I/O, O(Nx) space, and O(M *
> log(Nx)) compute.
>
So how does this compare to skip scan in practice? It's hard to compare,
as the patch does not implement an actual access path, but I tried this:
CREATE TABLE merge_test_int (
prefix_col int4,
suffix_col int4
);
INSERT INTO merge_test_int
SELECT p, s
FROM generate_series(1, 10000) AS p,
generate_series(1, 1000) AS s;
CREATE INDEX merge_test_int_idx
ON merge_test_int (prefix_col, suffix_col);
and then
1) master
SELECT * FROM merge_test_int
WHERE prefix_col IN (1,3,4,5,6,7,8,9,10,11,12,13,14,15)
AND suffix_col >= 900
ORDER BY suffix_col LIMIT 100;
vs.
2) merge scan
SELECT * FROM test_btree_merge_scan_int(
'merge_test_int',
'merge_test_int_idx',
ARRAY[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
900,
100);
And with explain analyze we get this:
1) Buffers: shared hit=26 read=25
2) Buffers: shared hit=143 read=17
So it seems to access many more buffers, even if the number of reads is
lower. Presumably the merge scan is not always better than skip scan,
probably depending on number of prefixes in the query etc. What is the
cost model to decide between those two?
If you had to construct the best case and worst cases (vs. skip scan),
what would that look like?
I'm also wondering how common is the targeted query pattern? How common
it is to have an IN condition on the leading column in an index, and
ORDER BY on the second one?
regards
--
Tomas Vondra
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Michael Paquier | 2026-02-01 23:54:21 | Re: AIX support |
| Previous Message | Gyan Sreejith | 2026-02-01 23:05:19 | Re: [Proposal] Adding Log File Capability to pg_createsubscriber |