Re: Faster inserts with mostly-monotonically increasing values

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(dot)riggs(at)2ndquadrant(dot)com>
Subject: Re: Faster inserts with mostly-monotonically increasing values
Date: 2018-03-06 03:10:58
Message-ID: CAGTBQpZkfcpSVYLFcri5AJZd8kM1wwnHhAb+MmDBpXJ-9htwWQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 5, 2018 at 10:59 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Mon, Mar 5, 2018 at 5:48 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>> But in thise case there's no right link to follow, so it's a non-issue.
>>
>> BTree doesn't truncate AFAIK, so the cached block number can't be
>> nonexisting (beyond the end of the relation). It's probably fragile to
>> assume BTree will never truncate, so maybe it would be wise to check
>> for that. But if the block exists, I believe it contains a page that
>> can be safely interpreted by that condition. As long as there can be
>> no other pages with the same flags on the index while the cached page
>> is write-locked, that code will be safe.
>
> Introducing any case that allows us to land on a recycled page, and
> reason that it must at least not be the page we were looking for based
> on *any* criteria about the page itself seems very brittle. Yeah, it
> probably won't break in practice, but it's a bad design.

How is this any different from btvacuumscan?

I don't think it can be argued that somehow btvacuumscan has
permission to be brittle and the rest of the code doesn't.

Point is, it's not brittle. The on-disk format of the tree is such
that any block can be catalogued by inspecting its header,
btvacuumscan depends on that. Questions that can be answered looking
at a page header alone, are:

* Is it deleted or new?
* Is it a leaf?
* Is it rightmost?

Only question remaining is whether it's the *only* rightmost leaf, and
that can be guaranteed by locking.

So, can a flag check result in a random outcome? No. That would also
cause a random outcome for btvacuumscan and then vacuum would corrupt
the index just as well.

And now that we mention this... why is the code not using _bt_getbuf?
It already does quite a bit of sanity checking that is IMO quite
desirable here.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-03-06 03:46:32 Re: PATCH: Configurable file mode mask
Previous Message Yugo Nagata 2018-03-06 03:09:31 Re: [PATCH] GET DIAGNOSTICS FUNCTION_NAME