Re: Batch insert in CTAS/MatView code

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Asim R P <apraveen(at)pivotal(dot)io>, Paul Guo <pguo(at)pivotal(dot)io>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Taylor Vesely <tvesely(at)pivotal(dot)io>
Subject: Re: Batch insert in CTAS/MatView code
Date: 2019-09-27 21:46:38
Message-ID: 20190927214638.fow7hpndn7dwymoy@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-09-26 10:43:27 -0300, Alvaro Herrera wrote:
> On 2019-Sep-25, Asim R P wrote:
>
> > I reviewed your patch today. It looks good overall. My concern is that
> > the ExecFetchSlotHeapTuple call does not seem appropriate. In a generic
> > place such as createas.c, we should be using generic tableam API
> > only.

Indeed.

> > However, I can also see that there is no better alternative. We need to
> > compute the size of accumulated tuples so far, in order to decide whether
> > to stop accumulating tuples. There is no convenient way to obtain the
> > length of the tuple, given a slot. How about making that decision solely
> > based on number of tuples, so that we can avoid ExecFetchSlotHeapTuple call
> > altogether?
>
> ... maybe we should add a new operation to slots, that returns the
> (approximate?) size of a tuple?

Hm, I'm not convinced that it's worth adding that as a dedicated
operation. It's not that clear what it'd exactly mean anyway - what
would it measure? As referenced in the slot? As if it were stored on
disk? etc?

I wonder if the right answer wouldn't be to just measure the size of a
memory context containing the batch slots, or something like that.

> That would make this easy. (I'm not sure however what to do about
> TOAST considerations -- is it size in memory that we're worried
> about?)

The in-memory size is probably fine, because in all likelihood the
toasted cols are just going to point to on-disk datums, no?

> Also:
>
> + myState->mi_slots_size >= 65535)
>
> This magic number should be placed in a define next to the other one,
> but I'm not sure that heapam.h is a good location, since surely this
> applies to matviews in other table AMs too.

Right. I think it'd be better to move this into an AM independent place.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-09-27 21:49:51 Re: Batch insert in CTAS/MatView code
Previous Message Peter Geoghegan 2019-09-27 21:43:00 Re: ECPG installcheck tests fail if PGDATABASE is set