Re: Negative cache entries for memoize

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Negative cache entries for memoize
Date: 2023-04-05 21:23:31
Message-ID: CAApHDvptH3p=C4Mm0Pk-u8A=h58vi=m-Y9eBP528bmgBLX1b7w@mail.gmail.com
Lists: pgsql-hackers

On Thu, 6 Apr 2023 at 03:12, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> During two presentations, I was asked if negative cache entries were
> created for cases where inner-side lookups returned no rows.
>
> It seems we don't do that. Has this been considered or is it planned?

It does allow negative cache entries, so I'm curious about what you
did to test this.

A cache entry is always marked as complete (i.e., valid to use for
lookups) when we execute the subnode to completion. In some plan
shapes we might not execute the inner side until it returns NULL. For
example, a Nested Loop Semi Join skips to the next outer row as soon
as it matches the first inner row. This can leave an incomplete cache
entry, where Memoize can't be certain whether it contains all of the
rows from the subnode.

There is really no special code for the negative entry case. There
are simply no matching rows, so the inner node returns NULL on the
first call and the cache entry is always marked as complete. So
negative entries will even work in the semi-join case.
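
To illustrate the completeness rule described above, here is a toy
model in Python. It is only a sketch and not PostgreSQL's actual
implementation: the names ToyMemoize, Entry, scan and semi_join_probe
are all made up for this example, and the real executor handles
incomplete entries, memory limits and eviction differently. The point
is just that an entry becomes usable only after the subnode has been
read to the end, which for an empty result happens on the first call.

```python
class Entry:
    """Cached rows for one key, plus a flag saying the row set is whole."""

    def __init__(self):
        self.rows = []
        self.complete = False


class ToyMemoize:
    """Hypothetical stand-in for a Memoize node above an inner scan."""

    def __init__(self, subnode):
        # subnode(key) must return an iterator over the matching inner rows.
        self.subnode = subnode
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def scan(self, key):
        entry = self.cache.get(key)
        if entry is not None and entry.complete:
            # Only complete entries are trusted, including empty ones.
            self.hits += 1
            yield from entry.rows
            return

        # Miss: (re)build the entry while streaming rows to the caller.
        self.misses += 1
        entry = Entry()
        self.cache[key] = entry
        for row in self.subnode(key):
            entry.rows.append(row)
            yield row
        # Reached only if the caller drained the subnode to its end.
        entry.complete = True


def semi_join_probe(memoize, key):
    """Mimic a semi join: stop after the first matching inner row."""
    return next(memoize.scan(key), None) is not None


# Assumed sample data: key 1 has two inner rows, key 0 has none.
inner_rows = {1: ["a", "b"]}
node = ToyMemoize(lambda key: iter(inner_rows.get(key, [])))

semi_join_probe(node, 0)  # empty result: subnode ends on the first call
semi_join_probe(node, 0)  # served from the (negative) cache entry
semi_join_probe(node, 1)  # stops after the first row: entry stays incomplete
```

In this model the second probe for key 0 is a cache hit even though
the entry holds zero rows, while the probe for key 1 leaves an entry
that cannot be reused because the scan was abandoned early.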

Here's a demo of the negative entries working with normal joins:

create table t0 (a int);
insert into t0 select 0 from generate_series(1,1000000);
create table t1 (a int primary key);
insert into t1 select x from generate_series(1,1000000)x;
vacuum analyze t0,t1;
explain (analyze, costs off, timing off, summary off)
select * from t0 inner join t1 on t0.a=t1.a;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Nested Loop (actual rows=0 loops=1)
   ->  Seq Scan on t0 (actual rows=1000000 loops=1)
   ->  Memoize (actual rows=0 loops=1000000)
         Cache Key: t0.a
         Cache Mode: logical
         Hits: 999999  Misses: 1  Evictions: 0  Overflows: 0  Memory Usage: 1kB
         ->  Index Only Scan using t1_pkey on t1 (actual rows=0 loops=1)
               Index Cond: (a = t0.a)
               Heap Fetches: 0

David
