Re: wrong query results on bf leafhopper

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, tharar(at)amazon(dot)com
Subject: Re: wrong query results on bf leafhopper
Date: 2025-05-20 03:55:24
Message-ID: CAApHDvr78CS20xhrmh2xbMa1rTSC8vOPmT8JndFS0L9H99xb2A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 17 May 2025 at 01:19, Andres Freund <andres(at)anarazel(dot)de> wrote:
> @@ -42,7 +42,7 @@
> -> Nested Loop (actual rows=1000.00 loops=N)
> -> Seq Scan on tenk1 t2 (actual rows=1000.00 loops=N)
> Filter: (unique1 < 1000)
> - Rows Removed by Filter: 9000
> + Rows Removed by Filter: 8982
> -> Memoize (actual rows=1.00 loops=N)
> Cache Key: t2.twenty
> Cache Mode: logical
> @@ -178,7 +178,7 @@
> -> Nested Loop (actual rows=1000.00 loops=N)
> -> Seq Scan on tenk1 t1 (actual rows=1000.00 loops=N)
> Filter: (unique1 < 1000)
> - Rows Removed by Filter: 9000
> + Rows Removed by Filter: 8981
> -> Memoize (actual rows=1.00 loops=N)
> Cache Key: t1.two, t1.twenty
> Cache Mode: binary

Note that the actual row count is 1000 still, so that pretty much
discounts corruption with the stored unique1 values. Unfortunately,
that doesn't reduce the number of possible other reasons by very much.

> For a moment I thought this could be a bug in memoize, but that doesn't
> actually make sense - the failure isn't in memoize, it's the seqscan.

I don't have any bright ideas what the cause might be right now, but I
agree that it seems unlikely to be anything related to Memoize.

It might be worth adding a query like: "select count(odd),min(ctid)
from tenk1;" that should use a Seq Scan plan (ideally max(ctid) too,
but that won't be stable over CPU architectures). Maybe also "select
unique1/1000,count(odd) from tenk1 group by 1 order by 1;" so we can
see if there's any sort of consistency or pattern as to which tuples
are missing. Maybe those will provoke some ideas.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2025-05-20 03:59:13 Re: Regression in statement locations
Previous Message Lukas Fittl 2025-05-20 03:43:25 Re: Add comment explaining why queryid is int64 in pg_stat_statements