Re: Heap WARM Tuples - Design Draft

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heap WARM Tuples - Design Draft
Date: 2016-08-11 16:57:09
Message-ID: CAGTBQpZ5GA-GBNW+0JpcQtDBj=P8ouhOSSRTUgw9u9X7QB88wA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 11, 2016 at 11:07 AM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
> On 8/10/16 12:48 PM, Claudio Freire wrote:
>>
>> On Tue, Aug 9, 2016 at 11:39 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
>> wrote:
>>>
>>> On 8/9/16 6:44 PM, Claudio Freire wrote:
>>>>
>>>>
>>>> Since we can lookup all occurrences of k1=a index=0 and k2=a index=0,
>>>> and in fact we probably did so already as part of the update logic
>>>
>>>
>>>
>>> That's a change from what currently happens, right?
>>>
>>> The reason I think that's important is that dropping the assumption that
>>> we
>>> can't safely re-find index entries from the heap opens up other
>>> optimizations, ones that should be significantly simpler to implement.
>>> The
>>> most obvious example being getting rid of full index scans in vacuum.
>>> While
>>> that won't help with write amplification, it would reduce the cost of
>>> vacuum
>>> enormously. Orders of magnitude wouldn't surprise me in the least.
>>>
>>> If that's indeed a prerequisite to WARM it would be great to get that
>>> groundwork laid early so others could work on other optimizations it
>>> would
>>> enable.
>>
>>
>> I can do that. I've been prospecting the code to see what changes it
>> would entail already.
>>
>> But it's still specific to btree, I'm not sure the same optimizations
>> can be applied to GIN (maybe, if the posting list is sorted) or GIST
>> (probably, since it's like a btree, but I don't know the code well
>> enough).
>>
>> Certainly hash indexes won't support it.
>
>
> Why not? If this is all predicated on re-finding index keys based on heap
> data then this is just another index lookup, no?

A hash index cannot be made to support both key-only lookups and
key-ctid lookups; that's a limitation of the data structure.

A key-only lookup can return arbitrarily many entries that don't
match the ctid, so walking all the equal-key item pointers to find
one is out of the question.
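
To make the distinction concrete, here is a toy sketch in Python. It is not
PostgreSQL code: the names (entries, ordered_find, hash_find), the bucket
count, and the (block, offset) tuples standing in for ctids are all made up
for illustration. It assumes a btree-like index keeps entries sorted by
(key, ctid), while a hash index picks the bucket from the key alone. Hashing
(key, ctid) together instead would scatter equal keys across buckets, so
key-only lookups would no longer work.

```python
import bisect
from collections import defaultdict

# Hypothetical index entries: (key, ctid), with ctid modeled as (block, offset).
entries = [("a", (0, 1)), ("a", (0, 2)), ("a", (3, 7)), ("b", (1, 1))]

# Ordered (btree-like) model: sorted by (key, ctid), so one exact pair
# can be located by binary search without visiting the other duplicates.
ordered = sorted(entries)

def ordered_find(key, ctid):
    i = bisect.bisect_left(ordered, (key, ctid))
    return i < len(ordered) and ordered[i] == (key, ctid)

# Hash model: the bucket is derived from the key only, and entries inside
# a bucket have no useful order with respect to ctid.
NBUCKETS = 8  # arbitrary, for illustration
buckets = defaultdict(list)
for key, ctid in entries:
    buckets[hash(key) % NBUCKETS].append((key, ctid))

def hash_find(key, ctid):
    """Return (found, entries_visited) for a key+ctid probe."""
    visited = 0
    for k, c in buckets[hash(key) % NBUCKETS]:
        visited += 1
        if k == key and c == ctid:
            return True, visited
    return False, visited
```

In this model, probing for a ctid that is absent forces hash_find to visit
every entry with an equal key, while ordered_find needs only a logarithmic
number of comparisons.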
