Re: count * performance issue

From: Albert Cervera Areny <albert(at)sedifa(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>, "Robins Tharakan" <tharakan(at)gmail(dot)com>
Subject: Re: count * performance issue
Date: 2008-03-11 08:34:30
Message-ID: 200803110934.31254.albert@sedifa.com
Lists: pgsql-performance

On Tuesday 11 March 2008 04:11, Scott Marlowe wrote:
> On Mon, Mar 10, 2008 at 7:57 PM, Robins Tharakan <tharakan(at)gmail(dot)com> wrote:
> > Hi,
> >
> > I have been reading this conversation for a few days now and I just
> > wanted to ask this. From the release notes, one of the new additions in
> > 8.3 is (Allow col IS NULL to use an index (Teodor)).
> >
> > Sorry, if I am missing something here, but shouldn't something like this
> > allow us to get a (fast) accurate count ?
> >
> > SELECT COUNT(*) from table WHERE indexed_field IS NULL
> > +
> > SELECT COUNT(*) from table WHERE indexed_field IS NOT NULL
>
> It really depends on the distribution of the null / not nulls in the
> table. If it's 50/50 there's no advantage to using the index, as you
> still have to check visibility info in the table itself.
>
> OTOH, if NULL (or conversely not null) are rare, then yes, the index
> can help. I.e. if 1% of the tuples are null, the select count(*) from
> table where field is null can use the index efficiently.

But the NOT NULL query will still get a sequential scan, so running both
queries ends up taking more time than a plain count(*).
(Seq Scan + Index Scan > Seq Scan)
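
A minimal sketch of how this could be checked, assuming a hypothetical
table count_test in which roughly 1% of indexed_field is NULL (the table
name, index name, and row count are made up for illustration and are not
from this thread). It compares the plan of a single count(*) with the
plans of the two split queries. No plan output is shown here because it
depends on the server version, settings, and data.

```sql
-- Hypothetical test table: every 100th row has a NULL indexed_field.
CREATE TABLE count_test (id integer, indexed_field integer);

INSERT INTO count_test
SELECT i, CASE WHEN i % 100 = 0 THEN NULL ELSE i END
FROM generate_series(1, 1000000) AS g(i);

CREATE INDEX count_test_field_idx ON count_test (indexed_field);
ANALYZE count_test;

-- Baseline: one pass over the table.
EXPLAIN ANALYZE SELECT COUNT(*) FROM count_test;

-- The proposed split: the rare IS NULL half may use the index ...
EXPLAIN ANALYZE SELECT COUNT(*) FROM count_test
WHERE indexed_field IS NULL;

-- ... but the IS NOT NULL half matches ~99% of the rows, so the planner
-- is likely to scan the whole table for it anyway.
EXPLAIN ANALYZE SELECT COUNT(*) FROM count_test
WHERE indexed_field IS NOT NULL;
```

If the last query does show a sequential scan, its runtime alone should
be close to the baseline, and the index scan for the IS NULL half comes
on top of that.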

--
Albert Cervera Areny
Dept. Informàtica Sedifa, S.L.

Av. Can Bordoll, 149
08202 - Sabadell (Barcelona)
Tel. 93 715 51 11
Fax. 93 715 51 12

====================================================================
........................... DISCLAIMER .............................
This message and its attachments are intended exclusively for the
named addressee. If you receive this message in error, please
immediately delete it from your system and notify the sender. You
may not use this message or any part of it for any purpose.
The message may contain information that is confidential or
protected by law, and any opinions expressed are those of the
individual sender. Internet e-mail guarantees neither the
confidentiality nor the proper receipt of the message sent.
If the addressee of this message does not consent to the use
of internet e-mail, please inform us immediately.
====================================================================
