From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru> |
Subject: | Re: WIP: Covering + unique indexes. |
Date: | 2017-04-02 23:40:08 |
Message-ID: | CAH2-WznLFzi49_mvAQOFjdpTCq3jZCXSTDMTVg58tUUqTps5Wg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Mar 31, 2017 at 4:31 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> That's all I have for now. Maybe I can look again later, or tomorrow.
I took another look, this time at code used during CREATE INDEX. More feedback:
* I see no reason to expose _bt_pgaddtup() (to modify it to not be
static, so it can be called during CREATE INDEX for truncated high
key). You could call PageAddItem() directly, just as _bt_pgaddtup()
itself does, and lose nothing. This is the case because the special
steps within _bt_pgaddtup() are only when inserting the first real
item (and only on an internal page). You're only ever using
_bt_pgaddtup() for the high key offset. Would a raw PageAddItem() call
lose anything?
I think I see why you've done this -- the existing CREATE INDEX
_bt_sortaddtup() routine (which is very similar to _bt_pgaddtup(), a
routine used for *insertion*) doesn't do the correct thing were you to
use it, because it assumes that the page is always right most (i.e.,
always has no high key yet).
The reason _bt_sortaddtup() exists is explained here:
* This is almost like nbtinsert.c's _bt_pgaddtup(), but we can't use
* that because it assumes that P_RIGHTMOST() will return the correct
* answer for the page. Here, we don't know yet if the page will be
* rightmost. Offset P_FIRSTKEY is always the first data key.
*/
static void
_bt_sortaddtup(Page page,
Size itemsize,
IndexTuple itup,
OffsetNumber itup_off)
{
...
}
(...thinks some more...)
So, this difference only matters when you have a non-leaf item, which
is never subject to truncation in your patch. So, in fact, it doesn't
matter at all. I guess you should just use _bt_pgaddtup() after all,
rather than bothering with a raw PageAddItem(), even. But, don't
forget to note why this is okay above _bt_sortaddtup().
* Calling PageIndexTupleDelete() within _bt_buildadd(), which
memmove()s all other items on the leaf page, seems wasteful in the
context of CREATE INDEX. Can we do better?
* I also think that calling PageIndexTupleDelete() has a page space
accounting bug, because the following thing happens a second time for
highkey ItemId when new code does this call:
phdr->pd_lower -= sizeof(ItemIdData);
(The first time this happens is within _bt_buildadd() itself, just
before your patch calls PageIndexTupleDelete().)
* I don't think it's okay to let index_truncate_tuple() leak memory
within _bt_buildadd(). It's probably okay for nbtinsert.c callers to
index_truncate_tuple() to not be too careful, though, since those
calls occur in a per-tuple memory context. The same cannot be said for
_bt_buildadd()/CREATE INDEX calls.
* Speaking of memory management: is this really needed?
> @@ -554,7 +580,11 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
> * Save a copy of the minimum key for the new page. We have to copy
> * it off the old page, not the new one, in case we are not at leaf
> * level.
> + * Despite oitup is already initialized, it's important to get high
> + * key from the page, since we could have replaced it with truncated
> + * copy. See comment above.
> */
> + oitup = (IndexTuple) PageGetItem(opage,PageGetItemId(opage, P_HIKEY));
> state->btps_minkey = CopyIndexTuple(oitup);
You didn't modify/truncate oitup in-place -- you effectively made a
(truncated) copy by calling index_truncate_tuple(). Maybe you can
manage the memory by assigning keytup to state->btps_minkey, in place
of a CopyIndexTuple(), just for the truncation case?
I haven't studied this in enough detail to be sure that that would be
correct, but it seems clear that a better strategy is needed for
managing memory within _bt_buildadd().
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2017-04-02 23:44:40 | Re: WIP: Covering + unique indexes. |
Previous Message | Michael Paquier | 2017-04-02 23:38:47 | Re: [BUGS] Bug in Physical Replication Slots (at least 9.5)? |