Re: Make TID Scans recalculate the TIDs less often

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Make TID Scans recalculate the TIDs less often
Date: 2025-09-17 09:51:34
Message-ID: CAApHDvpicvz9+eBxrN9QuQJu5b=hCBPC88Gu_t9_+iZb4YvH8w@mail.gmail.com
Lists: pgsql-hackers

On Wed, 17 Sept 2025 at 18:29, Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
> I heard of the following use case: a data transfer system partitions big tables by ctid ranges to mimic a parallel seqscan, but over many network connections. Some performance improvement was claimed, along with resistance to connection failures (when one connection breaks, only that one partition is rescanned with the same snapshot).

> Would your patch improve performance in such a case?

I suspect they're just running a SELECT * against a single table with
"WHERE ctid BETWEEN" some fixed range of TIDs. If that's the case then
this won't help, as there are no rescans, so the TID range is only
calculated once. Also, I imagine TID Range Scans are less affected
than TID Scans simply because TidListEval() is more complex than
TidRangeEval().
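
To make that use case concrete, here is a rough sketch of how such a transfer tool might split a table into fixed ctid ranges, one query per connection. This is only a guess at what such tools do: the function name, the table name and the block count are all made up for illustration, and half-open ranges are used instead of BETWEEN so that no maximum offset has to be assumed. Each generated query is run once, so its TID range is computed once and there is nothing for a rescan optimisation to save.

```python
def ctid_range_queries(table, total_blocks, parts):
    """Build one query per contiguous block range [start, end).

    Hypothetical helper for illustration only. The table name is
    interpolated directly, so it must come from a trusted source.
    """
    if parts <= 0:
        raise ValueError("parts must be positive")
    if total_blocks <= 0:
        return []
    step = -(-total_blocks // parts)  # ceiling division
    queries = []
    for start in range(0, total_blocks, step):
        end = min(start + step, total_blocks)
        # ctid < '(end,0)' covers every tuple in blocks below 'end'.
        queries.append(
            f"SELECT * FROM {table} "
            f"WHERE ctid >= '({start},0)' AND ctid < '({end},0)'"
        )
    return queries


if __name__ == "__main__":
    for q in ctid_range_queries("big_table", 10, 3):
        print(q)
```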

What you'd need for this to ever be slow is lots of rescans, so
something like TID Scan on the inside of a nested loop. If you look at
ExecReScanTidScan() you'll see it pfrees the tss_TidList, which
requires that the list gets built all over again on the next call to
TidNext(). It's the call to TidListEval() that is potentially
expensive due to the expression evaluation, memory allocation, sorting
and distinct work.
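
The rescan cost described above can be illustrated with a toy model. This is not PostgreSQL code; the class and its members are invented stand-ins (tid_list for tss_TidList, _eval_tid_list for TidListEval(), rescan for ExecReScanTidScan()), and the only point is to show that discarding the list on every rescan means one full rebuild per outer row of a nested loop.

```python
class ToyTidScan:
    """Toy model of a TID Scan node; names are illustrative assumptions."""

    def __init__(self, tid_exprs):
        self.tid_exprs = tid_exprs  # callables standing in for TID expressions
        self.tid_list = None        # stands in for tss_TidList
        self.pos = 0
        self.list_builds = 0        # how many times the list was rebuilt

    def _eval_tid_list(self):
        # Stands in for TidListEval(): evaluate, sort, de-duplicate.
        tids = [expr() for expr in self.tid_exprs]
        self.tid_list = sorted(set(tids))
        self.list_builds += 1

    def next(self):
        # Build the list lazily, on the first fetch after a rescan.
        if self.tid_list is None:
            self._eval_tid_list()
        if self.pos >= len(self.tid_list):
            return None
        tid = self.tid_list[self.pos]
        self.pos += 1
        return tid

    def rescan(self):
        # Throw the list away, forcing a rebuild on the next fetch.
        self.tid_list = None
        self.pos = 0


def nested_loop(outer_rows, inner):
    """Rescan the inner side once per outer row and collect the pairs."""
    results = []
    for outer in outer_rows:
        inner.rescan()
        tid = inner.next()
        while tid is not None:
            results.append((outer, tid))
            tid = inner.next()
    return results


if __name__ == "__main__":
    scan = ToyTidScan([lambda: (0, 2), lambda: (0, 1), lambda: (0, 2)])
    rows = nested_loop(range(5), scan)
    print(len(rows), "rows,", scan.list_builds, "list builds")
```

In this model the TID expressions never change between rescans, so every rebuild after the first repeats identical work.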

David
