Re: index prefetching

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: index prefetching
Date: 2025-08-13 23:22:28
Message-ID: ffd81af3-a2dd-4f41-b2be-7547cd61c6e3@vondra.me
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 8/13/25 23:36, Peter Geoghegan wrote:
> On Wed, Aug 13, 2025 at 1:01 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>> This seems rather bizarre, considering the two tables are exactly the
>> same, except that in t2 the first column is negative, and the rows are
>> fixed-length. Even heap_page_items says the tables are exactly the same.
>>
>> So why would the index get so different like this?
>
> In the past, when I required *perfectly* deterministic results for
> INSERT INTO test_table ... SELECT * FROM source_table bulk inserts
> (which was important during the Postgres 12 and 13 nbtree work), I
> found it necessary to "set synchronize_seqscans=off". If I was writing
> a test such as this, I'd probably do that defensively, even if it
> wasn't clear that it mattered. (I'm also in the habit of using
> unlogged tables, because VACUUM tends to set their pages all-visible
> more reliably than equivalent logged tables, which I notice that
> you're also doing here.)
>

The tables are *exactly* the same, block by block. I double checked that
by looking at a couple pages, and the only difference is the inverted
value of the "a" column.

> That said, I *think* that the "locally shuffled" heap TID pattern that
> we see with "t2"/"idx2" is mostly (perhaps entirely) caused by the way
> that you're inverting the indexed column's value when initially
> generating "t2". A given range of values such as "1 through to 4"
> becomes "-4 through to -1" as their tuples are inserted into t2.

Right.

> You're effectively inverting the order of the bigint indexed column
> "a" -- but you're *not* inverting the order of the imaginary
> tie-breaker heap column (it *remains* in ASC heap TID order in "t2").
>

I have no idea what I'm supposed to do about that. As you say the
tie-breaker is imaginary, selected by the system on my behalf. If it
works like this, doesn't that mean it'll have this unfortunate effect on
all data sets with negative correlation?

> In general, when doing this sort of analysis, I find it useful to
> manually verify that the data that I generated matches my
> expectations. Usually a quick check with pageinspect is enough. I'll
> just randomly select 2 - 3 leaf pages, and make sure that they all
> more or less match my expectations.
>

I did that for the heap, and that's just as I expected. But the effect
on the index surprised me.

regards

--
Tomas Vondra

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2025-08-13 23:35:52 Re: [PATCH] Silence Valgrind about SelectConfigFiles()
Previous Message Andres Freund 2025-08-13 23:19:26 Re: index prefetching