Re: So, is COUNT(*) fast now?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: So, is COUNT(*) fast now?
Date: 2011-10-24 20:01:23
Message-ID: CA+TgmobKHr_p7YXAH+b7NXHb1Vd9QOc34d0iNLE-OjWJofXtpw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 24, 2011 at 3:35 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Your point about people trying to create wider indexes to exploit
> index-only scans is an interesting one, but I think it's premature to
> optimize on the basis of hypotheses about what people might do in
> future.

Well, I don't think it's too much of a stretch to guess that people
will try to use covering indexes; that's common practice on other
products and a frequent and a not-uncommon heartfelt request from
people with large (non-memory-resident) databases they want to migrate
to PostgreSQL. Exactly to what degree they'll do that, and how well
it will work, is another question. But I have little doubt that it
will be tried.

> Not sure about your other idea of returning multiple tuples per
> amgettuple call.  The trouble with that is that it will add complexity
> (and hence cycles) at the nodeIndexscan level, because now nodeIndexscan
> will have to buffer those tuples, keep track of whether it's fetching
> forward or backward, etc etc.  Plus another layer of the same in
> indexam.c (index_getnext etc).  I'm not at all convinced that it's
> likely to be a net win.

I definitely agree that you don't want two layers of caching, but I
don't see why we'd need that. I wasn't thinking of changing
index_getnext() at all, but rather adding a new API that fills a
buffer (or a pair of buffers, maybe) with index tuples and heap TIDs.
It should spit them out in the same order that multiple
index_getnext() calls would have done and leave the scan position
wherever it would have ended up after a number of index_getnext_tid()
calls equal to the number of TIDs returned. Any user of the API (and
it might be just nodeIndexonlyscan.c) would just need to keep track of
the number of tuples returned and the number consumed to date.

This actually gets into a wider architectural discussion, which is
whether the fact that the whole executor (modulo bitmap scans and a
few other special cases) operates on one tuple at a time is a good
design... but my brain hurts just thinking about that.

> I wonder how trustworthy the measure of the visibilitymap_test call site
> as a consumer of cycles really is.  I've frequently noticed that
> oprofile blames remarkably large fractions of the runtime on individual
> statements that appear to be quite trivial.  I'm not sure if that
> represents real hardware-level effects such as cache line switching,
> or whether it's just measurement artifacts.  Keep in mind that
> sampling-based measurements are always subject to sampling artifacts.

I'm not sure either. I guess we could try short-circuiting
visibilitymap_test and see what that does to performance (let's leave
correct answers out of it).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2011-10-24 20:07:41 Re: So, is COUNT(*) fast now?
Previous Message Josh Berkus 2011-10-24 19:46:45 Re: So, is COUNT(*) fast now?