Re: Subplan result caching

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Subplan result caching
Date: 2018-05-23 16:43:58
Message-ID: ec033922-579a-647e-f042-2ee4614c70ec@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 23/05/18 19:25, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
>> I've been working on a patch to add a little cache to SubPlans, to speed
>> up queries with correlated subqueries, where the same subquery is
>> currently executed multiple times with the same parameters. The idea is
>> to cache the result of the subplan, with the correlation vars as the
>> cache key.
>
> I find this pretty bogus as it stands, because it assumes without proof
> that the subquery will deliver identical results for any two parameter
> values that are considered equal by the datatype's default equality
> operator. An easy counterexample is a subquery whose result depends on
> the text conversion of a float8 parameter: zero and minus zero have
> different text forms, but are equal according to float8eq.

Good point.

> To make this
> patch safe, I think you'd need to grovel through the subquery and make
> sure that the parameters are only used as inputs to operators that belong
> to the type's default btree or hash opfamily. (Many other cases would
> work in practice, but we have no semantic knowledge that would let us be
> sure of that.)

Hmm. First thing that comes to mind is to use the raw bytes as cache
key, only treating Datums as equal if their binary representation is
identical.

> That's doable no doubt, but I wonder whether that leaves you in a place
> that's any better than the plan-time-decorrelation approach you proposed
> in the earlier thread. I liked that better TBH; this one seems like
> a very ad-hoc reinvention of a hash join. I don't especially like the
> unpredictable number of executions of the subquery that it results in,
> either.

I'd certainly prefer the plan-time-decorrelation, too. But I suspect
that there's always going to be some cases that cannot be decorrelated,
where a cache like this would help.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-05-23 16:51:38 Re: Subplan result caching
Previous Message Andres Freund 2018-05-23 16:27:14 Re: PG11 jit failing on ppc64el