Re: What to call an executor node which lazily caches tuples in a hash table?

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
Cc: Zhihong Yu <zyu(at)yugabyte(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: What to call an executor node which lazily caches tuples in a hash table?
Date: 2021-03-31 04:06:40
Message-ID: CAApHDvp7qU+H1+Ek-a0R+vY_QJp=B7f=gprZxzo3ZSDZMYFRuQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, 31 Mar 2021 at 14:43, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
> At last, I still want to vote for "Tuple(s) Cache", which sounds simple and enough.
> I was thinking if we need to put "Lazy" in the node name since we do build cache
> lazily, then I found we didn't call "Materialize" as "Lazy Materialize", so I think we
> can keep consistent.

I thought about this a little more and I can see now why I put the
word "Cache" in the original name. I do now agree we really need to
keep the word "Cache" in the name.

The EXPLAIN ANALYZE output talks about "hits", "misses" and "evictions",
all of which are things that caches do.

   ->  Result Cache (actual rows=1 loops=403)
         Cache Key: c1.relkind
         Hits: 398  Misses: 5  Evictions: 0  Overflows: 0  Memory Usage: 1kB
         ->  Aggregate (actual rows=1 loops=5)

I don't think there's any need to put the word "Lazy" in the name. If
we're keeping "Cache", then laziness is already implied, since most
caches only store results for values that have actually been looked up.
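
As a rough illustration of that point, here is a minimal Python sketch of
a keyed cache that is filled only when a key is looked up, and that counts
hits, misses and evictions. It is not the executor node's code: the class
name LazyCache, the compute callback, the max_entries limit and the
least-recently-used eviction policy are all assumptions made for the
example. The lookup pattern (403 lookups over 5 distinct keys) is chosen
to mirror the loop counts in the EXPLAIN output above.

```python
from collections import OrderedDict


class LazyCache:
    """Toy keyed cache, populated only on lookup (hypothetical, not PostgreSQL code)."""

    def __init__(self, compute, max_entries):
        # compute: function called on a miss to produce the value for a key.
        # max_entries: assumed to be >= 1; stands in for a memory limit.
        self.compute = compute
        self.max_entries = max_entries
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0
        self.evictions = 0

    def lookup(self, key):
        if key in self.entries:
            # Hit: reuse the stored value and mark the key as recently used.
            self.hits += 1
            self.entries.move_to_end(key)
            return self.entries[key]

        # Miss: nothing is cached ahead of time, so compute the value now.
        self.misses += 1
        value = self.compute(key)
        if len(self.entries) >= self.max_entries:
            # Evict the least recently used entry (assumed policy).
            self.entries.popitem(last=False)
            self.evictions += 1
        self.entries[key] = value
        return value


# 403 lookups cycling over 5 distinct keys, with room for all of them.
keys = ["r", "i", "v", "t", "S"]
cache = LazyCache(compute=lambda k: k.upper(), max_entries=16)
for i in range(403):
    cache.lookup(keys[i % len(keys)])

print(f"Hits: {cache.hits} Misses: {cache.misses} Evictions: {cache.evictions}")
```

Each distinct key is computed once, on its first lookup, and every later
lookup for it is a hit. That on-demand behaviour is what "Lazy" would have
been describing.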

I'm just not sure if "Tuple" is the best word or not. I primarily
think of "tuple" as the word we use internally, but a quick grep of
the docs reminds me that's not the case. The word is used all over the
documentation. We have GUCs like parallel_tuple_cost and cpu_tuple_cost.
So it does seem like the sort of thing anyone who is interested in
looking at the EXPLAIN output should know about. I'm just not
massively keen on using that word in the name. The only other options
that come to mind are "Result" and "Parameterized". However, I think
"Parameterized" does not add much meaning. I think most people would
expect a cache to have a key. I sort of see why I went with "Result
Cache" now.

Does anyone else like the name "Tuple Cache"?

David
