Re: the big picture for index-only scans

From:	Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: the big picture for index-only scans
Date:	2011-05-10 15:27:32
Message-ID:	BANLkTi=cefjoKR1hJ5RLEPsvRUVRBxg0dw@mail.gmail.com
Lists:	pgsql-hackers

2011/5/10 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Tue, May 10, 2011 at 10:58 AM, Cédric Villemain
> <cedric(dot)villemain(dot)debian(at)gmail(dot)com> wrote:
>> ANALYZE can do the stats job for 'free' on the pages it collects
>> anyway. So that looks like a good idea.
>> I believe the really lazy vacuum is another topic; even if having
>> tables already vacuumed will improve the performance of index-only
>> scans, the stats should expose that and the function
>> cost_index(_only?)() should take care of it.
>
> I basically agree. The connection is that, as we use the all-visible
> bits for more things, the performance penalty for failing to vacuum
> (say) an insert-only table will continue to grow. Still, as you say,
> clearly a separate topic.
>
>> The temptation is high to estimate the cost of an "index_scan(only) +
>> ordered(by ctid) table pages fetch if heap required" (this is what I
>> understood from Heikki's suggestions 3-4, and it makes sense). It may
>> be easier to implement both at once, but I didn't find the branch in
>> Heikki's git repos (it was probably removed a long time ago).
>
> I was thinking about this as well, at least if I understand you
> correctly. That would be similar to a bitmap index scan, and I think
> it would be a great thing to have, not only because it would allow us
> to get the advantages of index-only scans in situations that are
> well-suited to our current bitmap scans, but also because it could be
> batched. You could allocate a buffer of work_mem bytes and fill it up
> with TIDs; then, when it's full, you sort the buffer and start doing
> the necessary heap fetches in physical order. If you still need more
> rows, you can clear the buffer and go around for another pass.
>
>> Based on ANALYZE stats for the visibility, I believe cost_index and
>> cost_index_only should be very similar functions (well, atm, I don't
>> see the point of splitting it into 2 functions).
>
> Yeah, I would more imagine modifying the existing function.
>
>>> Any thoughts welcome. Incidentally, if anyone else feels like working
>>> on this, feel free to let me know and I'm happy to step away, from all
>>> of it or from whatever part someone else wants to tackle. I'm mostly
>>> working on this because it's something that I think we really need to
>>> get done, more than having a burning desire to be the one who does it.
>>
>> Index-only scans are welcome!
>> I believe I can help on 3 and 4, but (really) not sure about 1 and 2.
>
> Well, I have code for #1, and just need reviews, and #2 shouldn't be
> that hard, and with luck I'll twist Bruce's arm into doing it (*waves
> to Bruce*). So #3 and #4 are the next thing to tackle. Any thoughts
> on what/how you'd like to contribute there?

I can provide initial patches for cost and analyze, at least.
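
To make the costing idea discussed above concrete, here is a rough sketch of one cost function that takes a visibility statistic as input instead of being split in two. It is only an illustration under stated assumptions, not a patch and not PostgreSQL's actual cost_index(): the function name, its parameters, and the simple linear formula are all hypothetical, and all_visible_fraction stands in for whatever ANALYZE would collect.

```python
def index_scan_cost(index_cost: float,
                    tuples_fetched: float,
                    all_visible_fraction: float,
                    heap_fetch_cost: float) -> float:
    """Hypothetical single cost function for plain and index-only scans.

    all_visible_fraction is an assumed ANALYZE-provided estimate of the
    share of tuples on all-visible pages. Passing 0.0 models a scan that
    always visits the heap; 1.0 models a pure index-only scan.
    """
    if not 0.0 <= all_visible_fraction <= 1.0:
        raise ValueError("all_visible_fraction must be within [0, 1]")

    # Only tuples on pages not known to be all-visible need a heap visit.
    heap_fetches = tuples_fetched * (1.0 - all_visible_fraction)
    return index_cost + heap_fetches * heap_fetch_cost
```

With this shape, a never-vacuumed insert-only table (fraction near 0) is costed like an ordinary index scan, which is how the stats would expose the penalty mentioned earlier.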

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

--
Cédric Villemain 2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
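
The batched approach Robert describes in the quoted text (fill a bounded buffer with TIDs, sort it, do the heap fetches in physical order, then go around again) can be sketched roughly as follows. This is a minimal illustration, not code from the thread or from PostgreSQL: batched_heap_fetch, fetch_heap_tuple, and max_tids_per_batch are hypothetical names, a TID is modelled as a (block, offset) tuple, and max_tids_per_batch stands in for however many TIDs fit in work_mem.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

TID = Tuple[int, int]  # (heap block number, offset within the block)


def _drain(batch: List[TID],
           fetch_heap_tuple: Callable[[TID], object]) -> Iterator[object]:
    # Sorting (block, offset) pairs gives physical heap order.
    batch.sort()
    for tid in batch:
        yield fetch_heap_tuple(tid)


def batched_heap_fetch(index_tids: Iterable[TID],
                       fetch_heap_tuple: Callable[[TID], object],
                       max_tids_per_batch: int) -> Iterator[object]:
    """Yield heap tuples, fetching each buffer-sized batch in physical order."""
    if max_tids_per_batch < 1:
        raise ValueError("max_tids_per_batch must be at least 1")

    batch: List[TID] = []
    for tid in index_tids:
        batch.append(tid)
        if len(batch) >= max_tids_per_batch:
            # Buffer is full: sort it, fetch, then clear for another pass.
            yield from _drain(batch, fetch_heap_tuple)
            batch.clear()
    if batch:
        # Final, partially filled buffer.
        yield from _drain(batch, fetch_heap_tuple)
```

Because this is a generator, a caller that needs only a few rows can stop consuming after the first pass. Rows come back in heap order within each batch rather than in index order, so the sketch would not suit a plan that depends on the index's sort order.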

In response to

Re: the big picture for index-only scans at 2011-05-10 15:13:04 from Robert Haas

Responses

Re: the big picture for index-only scans at 2011-05-10 16:46:59 from Robert Haas
