Re: So, is COUNT(*) fast now?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: So, is COUNT(*) fast now?
Date: 2011-10-24 19:35:55
Message-ID: 5921.1319484955@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> But even though Tom's statement that most indexes are one column might
> be a slight exaggeration, I suspect it probably is true that the
> optimizations he's talking about for large numbers of columns won't
> produce any material benefit even for a 3 or 4 column index. Which
> makes me think maybe we should focus our efforts elsewhere.

Right. If we thought the average was something like ten, it might be
worth pursuing optimizations similar to slot_getallattrs. If it's
around two or three, almost certainly not.

Your point about people trying to create wider indexes to exploit
index-only scans is an interesting one, but I think it's premature to
optimize on the basis of hypotheses about what people might do in
future.

Not sure about your other idea of returning multiple tuples per
amgettuple call. The trouble with that is that it will add complexity
(and hence cycles) at the nodeIndexscan level, because now nodeIndexscan
will have to buffer those tuples, keep track of whether it's fetching
forward or backward, etc etc. Plus another layer of the same in
indexam.c (index_getnext etc). I'm not at all convinced that it's
likely to be a net win.

I wonder how trustworthy the measure of the visibilitymap_test call site
as a consumer of cycles really is. I've frequently noticed that
oprofile blames remarkably large fractions of the runtime on individual
statements that appear to be quite trivial. I'm not sure if that
represents real hardware-level effects such as cache line switching,
or whether it's just measurement artifacts. Keep in mind that
sampling-based measurements are always subject to sampling artifacts.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2011-10-24 19:46:45 Re: So, is COUNT(*) fast now?
Previous Message Robert Haas 2011-10-24 18:37:39 Re: So, is COUNT(*) fast now?