Re: [PATCH] ANALYZE: hash-accelerate MCV tracking for equality-only types

From: Chengpeng Yan <chengpeng_yan(at)Outlook(dot)com>
To: Tatsuya Kawata <kawatatatsuya0913(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, John Naylor <johncnaylorls(at)gmail(dot)com>
Subject: Re: [PATCH] ANALYZE: hash-accelerate MCV tracking for equality-only types
Date: 2026-02-02 04:05:36
Message-ID: AE365752-F487-4443-A35F-B16079E81A52@Outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> On Feb 1, 2026, at 17:09, Tatsuya Kawata <kawatatatsuya0913(at)gmail(dot)com> wrote:
>
> Regarding Ilia's simplification proposal:
>
> > if (was_count1 && j < firstcount1)
> > firstcount1--;
> > if (c1_cursor < firstcount1)
> > c1_cursor = firstcount1;
>
> The simplification looks appealing. However, I may be misunderstanding something -
> should the logic perhaps be:
>
> if (was_count1 && j <= firstcount1)
> firstcount1++;
> if (c1_cursor < firstcount1)
> c1_cursor = firstcount1;
>
> My reasoning:
> - Without `<=`, the case where j == firstcount1 would leave firstcount1 pointing
> to a value with count=2 (no longer a singleton).
> - I believe `firstcount1--` should be `firstcount1++`, since when a singleton
> is promoted to multiply-seen, the singleton region shrinks from the left,
> meaning the boundary moves right (increases).
>
> Please correct me if I'm missing something.
>
> Regards,
> Tatsuya Kawata
>

Hi,

Thanks for the careful review and the suggestions.

For v3 I did not change the algorithm/behavior; I only adjusted the
comment in the match path to make the bubble-up / cursor interaction
more explicit.

On the firstcount1 / c1_cursor handling: c1_cursor is an index-based
“clock hand” over the count==1 region used to get FIFO eviction without
physically shifting the singleton subarray. The extra cursor adjustment
in the match path is there to preserve the same FIFO victim sequence as
the original shift-based code.

The key case is when a singleton is matched again (was_count1) and then
bubbles up into the count>1 prefix. That bubble-up effectively shifts
track[firstcount1..match_index-1] right by one. If c1_cursor points
anywhere in that shifted range (or at match_index itself), it must
advance by one as well; otherwise the cursor would start referring to a
different element than before the shift and we’d evict a newer singleton
next, breaking FIFO equivalence.

Concrete example: singleton region is [A,B,C,D] and c1_cursor points to
B (next to evict). If D is seen again, it becomes count=2 and bubbles
up, shifting [A,B,C] one slot to the right. B’s index changes, so
c1_cursor must move with it; otherwise the next eviction would
incorrectly target A (or whatever now sits at the old index), not B as
in the shift-based implementation. This is also why the condition uses
<= match_index: the cursor may point at the promoted singleton slot as
well.

Regarding the boundary direction: since firstcount1 is “index of first
singleton”, promotions shrink the singleton region and move the boundary
to the right. In hash mode that’s handled by advancing firstcount1 while
track[firstcount1].count > 1, and then clamping c1_cursor back into the
singleton region if needed.

If anything above still sounds off, happy to walk through a specific
trace.

--
Best regards,
Chengpeng Yan

Attachment Content-Type Size
v3-0001-ANALYZE-speed-up-MCV-tracking-for-equality-only-t.patch application/octet-stream 10.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chao Li 2026-02-02 04:07:57 Re: Incorrect errno used in OpenWalSummaryFile()
Previous Message Chao Li 2026-02-02 03:58:10 Re: splitting pg_resetwal output strings