Re: So, is COUNT(*) fast now?

From: Wolfgang Wilhelm <wolfgang20121964(at)yahoo(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: So, is COUNT(*) fast now?
Date: 2011-10-25 06:34:23
Message-ID: 1319524463.19453.YahooMailNeo@web28414.mail.ukl.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

my experience is that as soon as index only scans are available they are used - sometimes just because of the simple logic that a user thinks it is faster. Even when the index is so ridiculously long just to have all info in the index...

Regards
Wolfgang Wilhelm

________________________________
Von: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
An: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>; pgsql-hackers(at)postgresql(dot)org
Gesendet: 21:35 Montag, 24.Oktober 2011
Betreff: Re: [HACKERS] So, is COUNT(*) fast now?

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> But even though Tom's statement that most indexes are one column might
> be a slight exaggeration, I suspect it probably is true that the
> optimizations he's talking about for large numbers of columns won't
> produce any material benefit even for a 3 or 4 column index.  Which
> makes me think maybe we should focus our efforts elsewhere.

Right.  If we thought the average was something like ten, it might be
worth pursuing optimizations similar to slot_getallattrs.  If it's
around two or three, almost certainly not.

Your point about people trying to create wider indexes to exploit
index-only scans is an interesting one, but I think it's premature to
optimize on the basis of hypotheses about what people might do in
future.

Not sure about your other idea of returning multiple tuples per
amgettuple call.  The trouble with that is that it will add complexity
(and hence cycles) at the nodeIndexscan level, because now nodeIndexscan
will have to buffer those tuples, keep track of whether it's fetching
forward or backward, etc etc.  Plus another layer of the same in
indexam.c (index_getnext etc).  I'm not at all convinced that it's
likely to be a net win.

I wonder how trustworthy the measure of the visibilitymap_test call site
as a consumer of cycles really is.  I've frequently noticed that
oprofile blames remarkably large fractions of the runtime on individual
statements that appear to be quite trivial.  I'm not sure if that
represents real hardware-level effects such as cache line switching,
or whether it's just measurement artifacts.  Keep in mind that
sampling-based measurements are always subject to sampling artifacts.

            regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2011-10-25 06:44:30 Re: Online base backup from the hot-standby
Previous Message Fujii Masao 2011-10-25 06:33:41 Re: Unreproducible bug in snapshot import code