Re: WIP: Covering + unique indexes.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>
Subject: Re: WIP: Covering + unique indexes.
Date: 2017-04-04 18:47:10
Message-ID: CAH2-Wzm9y59h2m6iZjM4fpdUP5r4bsRVzGbN2gTRCO1j4nZmtw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 4, 2017 at 3:07 AM, Anastasia Lubennikova
<a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
>> * I think that we should store this (the number of attributes), and
>> use it directly when comparing, per my remarks to Tom over on that
>> other thread. We should also use the free bit within
>> IndexTupleData.t_info, to indicate that the IndexTuple was truncated,
>> just to make it clear to everyone that might care that that's how
>> these truncated IndexTuples need to be represented.
>>
>> Doing this would have no real impact on your patch, because for you
>> this will be 100% redundant. It will help external tools, and perhaps
>> another, more general suffix truncation patch that comes in the
>> future. We should try very hard to have a future-proof on-disk
>> representation. I think that this is quite possible.
>
> To be honest, I think that it'll make the patch overcomplified, because this
> exact patch has nothing to do with suffix truncation.
> Although, we can add any necessary flags if this work will be continued in
> the future.

Yes, doing things that way would mean adding a bit more complexity to
your patch, but IMV would be worth it to have the on-disk format be
compatible with what a full suffix truncation patch will eventually
require.

Obviously I disagree with what you say here -- I think that your patch
*does* have plenty in common with suffix truncation. But, you don't
have to even agree with me on that to see why what I propose is still
a good idea. Tom Lane had a specific objection to this patch --
catalog metadata is currently necessary to interpret internal page
IndexTuples [1]. However, by storing the true number of columns in the
case of truncated tuples, we can make the situation with IndexTuples
similar enough to the existing situation with heap tuples, where the
number of attributes is available right in the header as "natts". We
don't have to rely on something like catalog metadata from a great
distance, where some caller may forget to pass through the metadata to
a lower level.

So, presumably doing things this way addresses Tom's exact objection
to the truncation aspect of this patch [2]. We have the capacity to
store something like natts "for free" -- let's use it. The lack of any
direct source of metadata was called "dangerous". As much as anything
else, I want to remove any danger.

> There is already an assertion.
> Assert(IndexTupleSize(newitup) <= IndexTupleSize(olditup));
> Do you think it is not enough?

I think that a "can't happen" check will work better in the future,
when user defined code could be involved in truncation. Any extra
overhead will be paid relatively infrequently, and will be very low.

>> * I see a small bug. You forgot to teach _bt_findsplitloc() about
>> truncation. It does this currently, which you did not update:
>>
>> /*
>> * The first item on the right page becomes the high key of the left
>> page;
>> * therefore it counts against left space as well as right space.
>> */
>> leftfree -= firstrightitemsz;
>>
>> I think that this accounting needs to be fixed.
>
> Could you explain, what's wrong with this accounting? We may expect to take
> more space on the left page, than will be taken after highkey truncation.
> But I don't see any problem here.

Obviously it would at least be slightly better to have the actual
truncated high key size where that's expected -- not the would-be
untruncated high key size. The code as it stands might lead to a bad
choice of split point in edge-cases.

At the very least, you should change comments to note the issue. I
think it's highly unlikely that this could ever result in a failure to
find a split point, which there are many defenses against already, but
I think I would find that difficult to prove. The intent of the code
is almost as important as the code, at least in my opinion.

[1] postgr.es/m/CAH2-Wz=VMDH8pFAZX9WAH9Bn5Ast5vrnA0xSz+GsfRs12bp_sg@mail.gmail.com
[2] postgr.es/m/11895.1490983884%40sss.pgh.pa.us
--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2017-04-04 19:16:55 Re: Re: PATCH: pageinspect / add page_checksum and bt_page_items(bytea)
Previous Message Tomas Vondra 2017-04-04 18:25:38 Re: Re: PATCH: pageinspect / add page_checksum and bt_page_items(bytea)