Skip site navigation (1) Skip section navigation (2)

Re: TODO item: Allow data to be pulled directly from indexes

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Karl Schnaitter" <karlsch(at)soe(dot)ucsc(dot)edu>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TODO item: Allow data to be pulled directly from indexes
Date: 2008-06-30 07:33:36
Message-ID: 48688C50.2090702@enterprisedb.com (view raw or flat)
Thread:
Lists: pgsql-hackers
Karl Schnaitter wrote:
> The main difference between (1) & (4) is that (1) will sometimes require 
> heap lookups and (4) never will. Moreover, the heap lookups in (1) will 
> be difficult for the optimizer to estimate, unless some special 
> statistics can be maintained for this purpose.

Yeah, we certainly should maintain a statistic for it.

> (2) & (3) can work for any index, and they are quite elegant in the way 
> that the overhead does not change with the number of indexes. The TODO 
> also notes the benefit of (2) for efficient vacuuming. Thus, I think 
> that (2) is a great idea in general, but it does not serve the intended 
> purpose of this TODO item. Once a page gets marked as requiring 
> visibility checks, it cannot be unmarked until the next VACUUM. The 
> whole point of this feature is that we are willing to be more proactive 
> during updates in order to make index access more efficient.

In some cases we can mark a page earlier, as soon as we see that the 
condition is true. Most importantly, when new tuples are inserted, we 
can mark the page as soon as the inserting transaction is visible to all.

Also, the visibility map ought to make vacuums cheaper, as you only need 
to scan the parts of the table that have beem modified since last 
vacuum. You still need to scan all indexes, though. But assuming that 
you somehow solve the correctness issues in the "add visibility fields 
to index tuples" approach, we can use the same solution to perform 
retail vacuums, which would bring vacuuming and the visibility map 
approach on par with that approach anyway.

> Please let me know if you agree/disagree with anything here. I need to get this feature implemented for my research, but I would also love to contribute it to the community so your opinions matter a lot.

Well, I think the visibility map is a much better approach. This has 
been discussed many times before, so I don't really have anything new to 
add.

I've been working adding support for so-called "relation forks" 
(http://archives.postgresql.org/message-id/48651722.8020600@enterprisedb.com), 
to allow attaching metadata to relations, like the visibility map. I'm 
going to use the facility for a new FSM implementation, which I'm 
working on at the moment, but after that's done I'm going to start 
working on the visibility map. And after that's done and working for 
VACUUM, I'm going to work on using it for index-only-scans.

I'm not sure I have enough time to get all that done for 8.4, it's 
looking bad at the moment, so help would be much appreciated. If you 
don't agree with taking the visibility map approach, I would suggest 
working on the indexam API changes first, to allow returning index 
tuples from an index. I believe that part is the same regardless of how 
we check the visibility.

-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

In response to

Responses

pgsql-hackers by date

Next:From: Richard HuxtonDate: 2008-06-30 08:38:08
Subject: Re: Does anything dump per-database config settings? (was Re: ALTER DATABASE vs pg_dump)
Previous:From: Simon RiggsDate: 2008-06-30 06:58:39
Subject: Re: Join Removal/ Vertical Partitioning

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group