Re: Rethinking the parameter access hooks for plpgsql's benefit

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Rethinking the parameter access hooks for plpgsql's benefit
Date: 2015-03-11 05:52:48
Message-ID: CAFj8pRC52nramuKSTo-GerNSdY53mJNUz9rh+owvaeoVyKvE7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2015-03-11 0:24 GMT+01:00 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On the technical side, I do agree that the requirement to allocate and
> > zero an array for every new simple expression is unfortunate, but I'm
> > not convinced that repeatedly invoking the hook-function is a good way
> > to solve that problem. Calling the hook-function figures to be
> > significantly more expensive than an array-fetch, so you can almost
> > think of the ParamListInfo entries themselves as a cache of data
> > retrieved from the hook function. In that view of the world, the
> > problem here is that initializing the cache is too expensive. Your
> > solution to that is to rip the cache out completely, but what would be
> > better is keep the cache and make initializing it cheaper - e.g. by
> > sharing one ParamListInfo across all expressions in the same scope; or
> > by not zeroing the memory and instead tracking the the first N slots
> > are the only ones we've initialized; or $your_idea_here.
>
> Getting back to the actual merits of this patch --- you're right, it
> was not a good idea as proposed. I'd been thinking about the scalar
> expression case only, and forgot that the same code is used to pass
> parameters to queries, which might evaluate those parameters quite
> a large number of times. So any savings from reducing the setup time
> is sure to be swamped if the hook-function gets called repeatedly.
> Another problem is that for ROW and REC datums, exec_eval_datum()
> pallocs memory that won't get freed till the end of the statement,
> so repeat evaluation would cause memory leaks. So we really need
> evaluate-at-most-once semantics in there.
>
> However, this attempt did confirm that we don't need to create a separate
> ParamListInfo struct for each evaluation attempt. So the attached,
> greatly stripped-down patch just allocates a ParamListInfo once at
> function startup, and then zeroes it each time. This does nothing for
> the memset overhead but it does get rid of the palloc traffic, which
> seems to be good for a few percent on simple-assignment-type statements.
> AFAICS this is an unconditional win compared to HEAD. What's more, it
> provides at least a basis for doing better later: if we could think of
> a reasonably cheap way of tracking which datums get invalidated by an
> assignment, we might be able to reset only selected entries in the array
> rather than blindly blowing away all of them.
>
> I propose to apply this and leave it at that for now.
>

+1

Regards

Pave

>
> regards, tom lane
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-03-11 06:04:50 Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
Previous Message Noah Misch 2015-03-11 05:47:25 Re: In-core regression tests for replication, cascading, archiving, PITR, etc.