Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From: Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits
Date: 2020-08-14 15:52:36
Message-ID: CALtqXTfLBVf=y=LZ=z7tWELq5m+L4sm7z1txB1QkMgMC=koW4g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 3, 2020 at 2:29 PM Anastasia Lubennikova <
a(dot)lubennikova(at)postgrespro(dot)ru> wrote:

> On 31.07.2020 23:28, Robert Haas wrote:
> > On Tue, Jul 14, 2020 at 1:51 PM Anastasia Lubennikova
> > <a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
> >> Questions from the first review pass:
> >>
> >> 1) Do we need XLH_INSERT_ALL_VISIBLE_SET ? IIUC, in the patch it is
> always
> >> implied by XLH_INSERT_ALL_FROZEN_SET.
> > I agree that it looks unnecessary to have two separate bits.
> >
> >> 2) What does this comment mean? Does XXX refers to the lsn comparison?
> >> Since it
> >> is definitely necessary to update the VM.
> >>
> >> + * XXX: This seems entirely unnecessary?
> >> + *
> >> + * FIXME: Theoretically we should only do this after we've
> >> + * modified the heap - but it's safe to do it here I think,
> >> + * because this means that the page previously was empty.
> >> + */
> >> + if (lsn > PageGetLSN(vmpage))
> >> + visibilitymap_set(reln, blkno, InvalidBuffer, lsn,
> >> vmbuffer,
> >> + InvalidTransactionId, vmbits);
> > I wondered about that too. The comment which precedes it was, I
> > believe, originally written by me, and copied here from
> > heap_xlog_visible(). But it's not clear very good practice to just
> > copy the comment like this. If the same logic applies, the code should
> > say that we're doing the same thing here as in heap_xlog_visible() for
> > consistency, or some such thing; after all, that's the primary place
> > where that happens. But it looks like the XXX might have been added by
> > a second person who thought that we didn't need this logic at all, and
> > the FIXME by a third person who thought it was in the wrong place, so
> > the whole thing is really confusing at this point.
> >
> > I'm pretty worried about this, too:
> >
> > + * FIXME: setting recptr here is a dirty dirty hack, to
> prevent
> > + * visibilitymap_set() from WAL logging.
> >
> > That is indeed a dirty hack, and something needs to be done about it.
> >
> > I wonder if it was really all that smart to try to make the
> > HEAP2_MULTI_INSERT do this instead of just issuing separate WAL
> > records to mark it all-visible afterwards, but I don't see any reason
> > why this can't be made to work. It needs substantially more polishing
> > than it's had, though, I think.
> >
> New version of the patch is in the attachment.
>
> New design is more conservative and simpler:
> - pin the visibility map page in advance;
> - set PageAllVisible;
> - call visibilitymap_set() with its own XlogRecord, just like in other
> places.
>
> It allows to remove most of the "hacks" and keep code clean.
> The patch passes tests added in previous versions.
>
> I haven't tested performance yet, though. Maybe after tests, I'll bring
> some optimizations back.
>
> --
> Anastasia Lubennikova
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
> Here are some performance results with a patched and unpatched master
branch.
The table used for the test contains three columns (integer, text,
varchar).
The total number of rows is 10000000 in total.

Unpatched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
COPY: 9069.432 ms vacuum; 2567.961ms
COPY: 9004.533 ms vacuum: 2553.075ms
COPY: 8832.422 ms vacuum: 2540.742ms

Patched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
COPY: 10031.723 ms vacuum: 127.524 ms
COPY: 9985.109 ms vacuum: 39.953 ms
COPY: 9283.373 ms vacuum: 37.137 ms

Time to take the copy slightly increased but the vacuum time significantly
decrease.

--
Ibrar Ahmed

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-08-14 16:09:49 Re: Implement a new data type
Previous Message mohand oubelkacem makhoukhene 2020-08-14 15:48:50 Implement a new data type