Re: Fully documenting the design of nbtree row comparison scan keys

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Victor Yegorov <vyegorov(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Fully documenting the design of nbtree row comparison scan keys
Date: 2025-10-31 17:40:40
Message-ID: CAH2-Wz=dCBTZk+a=R1q8tm7MdPshPZiuugRdcZx57KetJRK26A@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Oct 31, 2025 at 5:06 AM Victor Yegorov <vyegorov(at)gmail(dot)com> wrote:
> I took a look at the patch. Proposed comments look highly valuable, especially around NULLs, doesn't look immediately obvious, so definitely requires a comment.
> Looks good to commit.

Cool.

> Wouldn't it be good to add such information also into the user documentation, say into
> https://www.postgresql.org/docs/current/functions-comparisons.html#ROW-WISE-COMPARISON

I wasn't thinking about the user-facing documentation. But I think
that you make a good point.

The main problem with that documentation of row compares is that we
don't say anything about keyset pagination with a multicolumn index
(not anywhere). That is almost the entire reason why this feature
exists -- but we don't actually say anything about it to users. No
wonder the feature is underused.

Separately, there is a risk that NULLs will break applications that
implement keyset pagination. Attached test case shows what I mean by
this.

To summarize the test case:

The test case shows that QUERY 1 returns slightly more rows than QUERY
2 due to the presence of NULLs in lower-order index columns. This
seems surprising. (The underlying issue is more or less the same issue
that makes row comparisons confusing to the implementation, especially
in places like _bt_set_startikey).

A user might expect that it won't matter how many or how few "keyset
pages"/row compare queries were used -- one big query (no pagination)
should give the same answer as (say) 4 "keyset pagination" queries
(just in smaller pieces/sets of rows). But as the test case shows,
when there are NULLs in lower-order index columns, that isn't
necessarily true. QUERY 1 and QUERY 2 are only guaranteed to return
precisely the same rows if we somehow make sure that there can be no
rows returned with NULLs in lower order columns (e.g., by using a
primary key for keyset pagination queries, or by always adding IS NOT
NULL conditions against lower-order columns like "b" and "c").

--
Peter Geoghegan

Attachment Content-Type Size
null_pagination_issue.sql application/octet-stream 3.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Álvaro Herrera 2025-10-31 18:02:32 Re: pg_createsubscriber --dry-run logging concerns
Previous Message Heikki Linnakangas 2025-10-31 17:33:54 Re: AIX support