|From:||Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>|
|To:||Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>|
|Cc:||lubennikovaav(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, pavan(dot)deolasee(at)gmail(dot)com, ibrar(dot)ahmad(at)gmail(dot)com|
|Subject:||Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
On 30.10.2020 03:42, Tomas Vondra wrote:
> I might be somewhat late to the party, but I've done a bit of
> benchmarking too ;-) I used TPC-H data from a 100GB test, and tried
> different combinations of COPY [FREEZE] and VACUUM [FREEZE], both on
> current master and with the patch.
> So I don't really observe any measurable slowdowns in the COPY part (in
> fact there seems to be a tiny speedup, but it might be just noise). In
> the VACUUM part, there's clear speedup when the data was already frozen
> by COPY (Yes, those are zeroes, because it took less than 1 second.)
> So that looks pretty awesome, I guess.
> For the record, these tests were run on a server with NVMe SSD, so
> hopefully reliable and representative numbers.
Thank you for the review.
> A couple minor comments about the code:
> 2) I find the "if (all_frozen_set)" block a bit strange. It's a matter
> of personal preference, but I'd just use a single level of nesting, i.e.
> something like this:
> /* if everything frozen, the whole page has to be visible */
> Assert(!(all_frozen_set && !PageIsAllVisible(page)));
> * If we've frozen everything on the page, and if we're already
> * holding pin on the vmbuffer, record that in the visibilitymap.
> * If we're not holding the pin, it's OK to skip this - it's just
> * an optimization.
> * It's fine to use InvalidTransactionId here - this is only used
> * when HEAP_INSERT_FROZEN is specified, which intentionally
> * violates visibility rules.
> if (all_frozen_set &&
> visibilitymap_pin_ok(BufferGetBlockNumber(buffer), vmbuffer))
> IMHO it's easier to read, but YMMV. I've also reworded the comment a bit
> to say what we're doing etc. And I've moved the comment from inside the
> if block into the main comment - that was making it harder to read too.
I agree that it's a matter of taste. I've updated comments and left
nesting unchanged to keep assertions simple.
> 3) I see RelationGetBufferForTuple does this:
> * This is for COPY FREEZE needs. If page is empty,
> * pin vmbuffer to set all_frozen bit
> if ((options & HEAP_INSERT_FROZEN) &&
> (PageGetMaxOffsetNumber(BufferGetPage(buffer)) == 0))
> visibilitymap_pin(relation, BufferGetBlockNumber(buffer),
> so is it actually possible to get into the (all_frozen_set) without
> holding a pin on the visibilitymap? I haven't investigated this so
> maybe there are other ways to get into that code block. But the new
> additions to hio.c get the pin too.
I was thinking that GetVisibilityMapPins() can somehow unset the pin. I
gave it a second look. And now I don't think it's possible to get into
this code block without a pin. So, I converted this check into an
> 4) In heap_xlog_multi_insert we now have this:
> if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
> if (xlrec->flags & XLH_INSERT_ALL_FROZEN_SET)
> IIUC it makes no sense to have both flags at the same time, right? So
> what about adding
> Assert(!(XLH_INSERT_ALL_FROZEN_SET &&
> to check that?
I placed this assertion to the very beginning of the function. It also
helped to simplify the code a bit.
I also noticed, that we were not updating visibility map for all_frozen
from heap_xlog_multi_insert. Fixed.
> I wonder what to do about the heap_insert - I know there are concerns it
> would negatively impact regular insert, but is it really true? I suppose
> this optimization would be valuable even for cases where multi-insert is
> not possible.
Do we have something like INSERT .. FREEZE? I only see
TABLE_INSERT_FROZEN set for COPY FREEZE and for matview operations. Can
you explain, what use-case are we trying to optimize by extending this
patch to heap_insert()?
The new version is attached.
I've also fixed a typo in the comment by Tatsuo Ishii suggestion.
Also, I tested this patch with replication and found no issues.
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
|Next Message||Magnus Hagander||2020-11-02 13:46:02||Re: -O switch|
|Previous Message||Thomas Munro||2020-11-02 13:38:31||Re: Collation versioning|