Re: COPY FROM WHEN condition

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Surafel Temesgen <surafel3000(at)gmail(dot)com>, alvherre(at)2ndquadrant(dot)com, Adam Berlin <berlin(dot)ab(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: COPY FROM WHEN condition
Date: 2019-01-21 18:51:02
Message-ID: 20190121185102.m7keci23l7q6byou@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-01-21 16:22:11 +0100, Tomas Vondra wrote:
>
>
> On 1/21/19 4:33 AM, Tomas Vondra wrote:
> >
> >
> > On 1/21/19 3:12 AM, Andres Freund wrote:
> >> On 2019-01-20 18:08:05 -0800, Andres Freund wrote:
> >>> On 2019-01-20 21:00:21 -0500, Tomas Vondra wrote:
> >>>>
> >>>>
> >>>> On 1/20/19 8:24 PM, Andres Freund wrote:
> >>>>> Hi,
> >>>>>
> >>>>> On 2019-01-20 00:24:05 +0100, Tomas Vondra wrote:
> >>>>>> On 1/14/19 10:25 PM, Tomas Vondra wrote:
> >>>>>>> On 12/13/18 8:09 AM, Surafel Temesgen wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Dec 12, 2018 at 9:28 PM Tomas Vondra
> >>>>>>>> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Can you also update the docs to mention that the functions called from
> >>>>>>>> the WHERE clause do not see effects of the COPY itself?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Of course, I also add same comment to insertion method selection
> >>>>>>>
> >>>>>>> FWIW I've marked this as RFC and plan to get it committed this week.
> >>>>>>>
> >>>>>>
> >>>>>> Pushed, thanks for the patch.
> >>>>>
> >>>>> While rebasing the pluggable storage patch on top of this I noticed that
> >>>>> the qual appears to be evaluated in query context. Isn't that a bad
> >>>>> idea? ISTM it should have been evaluated a few lines above, before the:
> >>>>>
> >>>>> /* Triggers and stuff need to be invoked in query context. */
> >>>>> MemoryContextSwitchTo(oldcontext);
> >>>>>
> >>>>> Yes, that'd require moving the ExecStoreHeapTuple(), but that seems ok?
> >>>>>
> >>>>
> >>>> Yes, I agree. It's a bit too late for me to hack and push stuff, but I'll
> >>>> fix that tomorrow.
> >>>
> >>> NP. On second thought, the problem is probably smaller than I thought at
> >>> first, because ExecQual() switches to the econtext's per-tuple memory
> >>> context. But it's only reset once for each batch, so there's some
> >>> wastage. At least worth a comment.
> >>
> >> I'm tired, but perhaps it's actually worse - what's being reset currently
> >> is the EState's per-tuple context:
> >>
> >>     if (nBufferedTuples == 0)
> >>     {
> >>         /*
> >>          * Reset the per-tuple exprcontext. We can only do this if the
> >>          * tuple buffer is empty. (Calling the context the per-tuple
> >>          * memory context is a bit of a misnomer now.)
> >>          */
> >>         ResetPerTupleExprContext(estate);
> >>     }
> >>
> >> but the quals are evaluated in the ExprContext's:
> >>
> >>     ExecQual(ExprState *state, ExprContext *econtext)
> >>     ...
> >>         ret = ExecEvalExprSwitchContext(state, econtext, &isnull);
> >>
> >>
> >> which is created with:
> >>
> >>     /* Get an EState's per-output-tuple exprcontext, making it if first use */
> >>     #define GetPerTupleExprContext(estate) \
> >>         ((estate)->es_per_tuple_exprcontext ? \
> >>          (estate)->es_per_tuple_exprcontext : \
> >>          MakePerTupleExprContext(estate))
> >>
> >> and creates its own context:
> >>     /*
> >>      * Create working memory for expression evaluation in this context.
> >>      */
> >>     econtext->ecxt_per_tuple_memory =
> >>         AllocSetContextCreate(estate->es_query_cxt,
> >>                               "ExprContext",
> >>                               ALLOCSET_DEFAULT_SIZES);
> >>
> >> so this is currently just never reset.
> >
> > Actually, no. The ResetPerTupleExprContext boils down to
> >
> > MemoryContextReset((econtext)->ecxt_per_tuple_memory)
> >
> > and ExecEvalExprSwitchContext does this
> >
> > MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
> >
> > So it's resetting the right context, although only on batch boundary.

> >> Seems just using ExecQualAndReset() ought to be sufficient?
> >>
> >
> > That may still be the right thing to do.
> >
>
> Actually, no, because that would reset the context far too early (and
> it's easy to trigger segfaults). So the reset would have to happen after
> processing the row, not this early.

Yea, sorry, I was too tired yesterday evening. I'd spent 10h splitting
up the pluggable storage patch into individual pieces...

> But I think the current behavior is actually OK, as it matches what we
> do for defexprs. And the comment before ResetPerTupleExprContext says this:
>
>     /*
>      * Reset the per-tuple exprcontext. We can only do this if the
>      * tuple buffer is empty. (Calling the context the per-tuple
>      * memory context is a bit of a misnomer now.)
>      */
>
> So the per-tuple context is not quite per-tuple anyway. Sure, we might
> rework that but I don't think that's an issue in this patch.

I'm *not* convinced by this. I think it's bad enough that we do this for
normal COPY, but for WHEN, we could end up *never* resetting before the
end. Consider a case where a single tuple is inserted, and then *all*
rows are filtered. I think this needs a separate econtext that's reset
every round. Alternatively, you could fix the code not to rely on the
per-tuple context not being reset while tuples are buffered - that
actually ought to be fairly simple.
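
To make the scenario concrete, here is a toy model of it. This is not
PostgreSQL code: ToyContext, toy_alloc(), toy_reset(), QUAL_BYTES and
both peak_*() functions are hypothetical names made up for the
illustration, and the 64 bytes of scratch memory per qual evaluation is
an arbitrary assumption. The first row passes the filter and is
buffered, every later row is filtered out, and the buffer is assumed
never to be flushed before the end of input.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a memory context: tracks live bytes and the peak. */
typedef struct ToyContext
{
    size_t used;
    size_t peak;
} ToyContext;

/* Assumed scratch memory left behind by one qual evaluation. */
#define QUAL_BYTES 64

static void
toy_alloc(ToyContext *cxt, size_t nbytes)
{
    assert(cxt != NULL);
    cxt->used += nbytes;
    if (cxt->used > cxt->peak)
        cxt->peak = cxt->used;
}

static void
toy_reset(ToyContext *cxt)
{
    assert(cxt != NULL);
    cxt->used = 0;
}

/*
 * Qual evaluated in the shared per-tuple context, which is only reset
 * while the tuple buffer is empty. Row 0 is buffered, so no later row
 * ever triggers a reset.
 */
size_t
peak_shared_context(size_t nrows)
{
    ToyContext per_tuple = {0, 0};
    size_t     nbuffered = 0;

    for (size_t row = 0; row < nrows; row++)
    {
        if (nbuffered == 0)
            toy_reset(&per_tuple);

        toy_alloc(&per_tuple, QUAL_BYTES);  /* evaluate the filter */

        if (row == 0)
            nbuffered++;    /* only the first row passes the filter */
    }
    return per_tuple.peak;
}

/* Qual evaluated in its own context, reset before every row. */
size_t
peak_separate_context(size_t nrows)
{
    ToyContext qual_cxt = {0, 0};

    for (size_t row = 0; row < nrows; row++)
    {
        toy_reset(&qual_cxt);
        toy_alloc(&qual_cxt, QUAL_BYTES);   /* evaluate the filter */
    }
    return qual_cxt.peak;
}
```

In this model peak_shared_context(n) grows linearly with the number of
input rows, while peak_separate_context(n) stays at one evaluation's
worth of memory regardless of n.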

Greetings,

Andres Freund
