Re: COPY FROM WHEN condition

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Surafel Temesgen <surafel3000(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Adam Berlin <berlin(dot)ab(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: COPY FROM WHEN condition
Date: 2019-04-03 07:00:09
Message-ID: CAKJS1f_BTC73ZGX6sFZyps_4bXPV6wtFYQq+b4r=M8LrGujfNQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 3 Apr 2019 at 18:56, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > +/* Class the buffer full if there are >= this many bytes of tuples stored */
> > +#define MAX_BUFFERED_BYTES 65535
>
> Random aside: This seems pretty small (but should be changed separately.

Yeah, my fingers were hovering over this. I was close to making it
1MB, but thought I'd better not.

> > +typedef struct CopyMultiInsertBuffer
> > +{
> > + TupleTableSlot *slots[MAX_BUFFERED_TUPLES]; /* Array to store tuples */
> > + ResultRelInfo *resultRelInfo; /* ResultRelInfo for 'relid' */
> > + BulkInsertState bistate; /* BulkInsertState for this rel */
> > + int nused; /* number of 'slots' containing tuples */
> > + uint64 linenos[MAX_BUFFERED_TUPLES]; /* Line # of tuple in copy
> > + * stream */
> > +} CopyMultiInsertBuffer;
>
> I don't think it's needed now, but you'd probably achieve a bit better
> locality by storing slots + linenos in a single array of (slot,lineno).

You're right, but right now I only zero the slots array and leave the
linenos uninitialised. Merging them to one would double the area of
memory to zero. I'm not sure if that's worse or not, but I do know if
the number of partitions exceeds MAX_PARTITION_BUFFERS and we change
partitions quite a lot during the copy then we could end up
initialising buffers quite often. I wanted to keep that as cheap as
possible.

> > +/*
> > + * CopyMultiInsertBuffer_Init
> > + * Allocate memory and initialize a new CopyMultiInsertBuffer for this
> > + * ResultRelInfo.
> > + */
> > +static CopyMultiInsertBuffer *
> > +CopyMultiInsertBuffer_Init(ResultRelInfo *rri)
> > +{
> > + CopyMultiInsertBuffer *buffer;
> > +
> > + buffer = (CopyMultiInsertBuffer *) palloc(sizeof(CopyMultiInsertBuffer));
> > + memset(buffer->slots, 0, sizeof(TupleTableSlot *) * MAX_BUFFERED_TUPLES);
>
> Is there a reason to not just directly palloc0?

Yeah, I'm too cheap to zero the entire thing, per what I mentioned
above. linenos does not really need anything meaningful put there
during init. It'll get set when we consume the slots.

> > +
> > + /* Remove back-link to ourself */
> > + buffer->resultRelInfo->ri_CopyMultiInsertBuffer = NULL;
> > +
> > + ReleaseBulkInsertStatePin(buffer->bistate);
>
> Hm, afaict this still leaks the bistate itself?

Oops, I forgot about that one. v4 attached.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
david_tableam_copy_v4.patch application/octet-stream 44.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2019-04-03 08:44:32 Re: Checksum errors in pg_stat_database
Previous Message Michael Paquier 2019-04-03 06:59:49 Simplify redability of some tests for toast_tuple_target in strings.sql