Skip site navigation (1) Skip section navigation (2)

Re: TODO item: Allow data to be pulled directly from indexes

From: Karl Schnaitter <karlsch(at)soe(dot)ucsc(dot)edu>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TODO item: Allow data to be pulled directly from indexes
Date: 2008-06-29 19:33:24
Message-ID: 4867E384.9080406@soe.ucsc.edu (view raw or flat)
Thread:
Lists: pgsql-hackers
Gregory Stark wrote:
>> (1) & (4) require an UPDATE or DELETE to twiddle the old index tuple. Tom has
>> noted (in the linked message) that this is not reliable if the index has any
>> expression-valued columns, because it is not always possible to find the old
>> index entry. For this reason, the proposed patch does not keep visibility
>> metadata for indexes on expressions. This seems like a reasonable limitation
>> --- indexed expressions are just less efficient.
>>     
>
> Or if the index operators and btproc aren't nearly as immutable as they claim.
> Probably less likely than non-immutable index expressions but also possible.
>
>   
Your point is well taken... I'll have to look into that more.

>> (2) & (3) can work for any index, and they are quite elegant in the way that
>> the overhead does not change with the number of indexes. The TODO also notes
>> the benefit of (2) for efficient vacuuming. Thus, I think that (2) is a great
>> idea in general, but it does not serve the intended purpose of this TODO item.
>> Once a page gets marked as requiring visibility checks, it cannot be unmarked
>> until the next VACUUM. The whole point of this feature is that we are willing
>> to be more proactive during updates in order to make index access more
>> efficient.
>>     
>
> Well I think that's precisely the point. If you're trading off work done at
> update time against work done for index accesses then you're only going to win
> if the tuples are relatively static and have lots of accesses done against
> them between updates. In which case having the optimization only kick in when
> the page has been static for long enough that all the tuples are globally
> visible should be good enough
I really don't understand this point. The way I see the visibility map 
working is as follows: we set a page to "requires visibility check" when 
a tuple on the page is inserted, deleted, or non-HOT updated. If the 
only modifications have been inserts, we can reset the status to "all 
tuples visible" when these tuples become universally visible, which 
matches your description. But in the presence of deletes and updates, we 
can only reset the status of a page after a VACUUM (I know that dead HOT 
tuples can be pruned without VACUUM, but I don't think that's the case 
for indexed tuples). We can't reset the status earlier because we don't 
know what indexes still have pointers to the dead tuples. So a page can 
be static indefinitely (after a single modification) without ever 
getting to enjoy the optimization.

This is a really important point, so please let me know if I'm missing 
something.

Thanks for your response!
Karl

In response to

pgsql-hackers by date

Next:From: Tom LaneDate: 2008-06-29 20:57:10
Subject: WIP patch: reducing overhead for repeat de-TOASTing
Previous:From: Gregory StarkDate: 2008-06-29 17:18:25
Subject: Re: TODO item: Allow data to be pulled directly from indexes

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group