Re: Performance problems with DISTINCT ON

From: imad <immaad(at)gmail(dot)com>
To: Sgarbossa Domenico <domenico(dot)sgarbossa(at)eniac(dot)it>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance problems with DISTINCT ON
Date: 2009-10-04 02:27:46
Message-ID: 1f30b80c0910031927k281d1d96oaeed1b1b8a40c606@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

The index can produce the sorted output. Add a dummy WHERE clause like
articoli > <min_value> and data_ent > <min_value>.

--Imad

On Mon, Sep 28, 2009 at 10:18 PM, Sgarbossa Domenico
<domenico(dot)sgarbossa(at)eniac(dot)it> wrote:
>
> I need to retrieve the most recent prices per products from a price list
> table:
>
> CREATE TABLE listini_anagrafici
> (
>   id character varying(36) NOT NULL,
>   articolo character varying(18),
>   listino character varying(5),
>   data_ent date,
>   data_fin date,
>   prezzo double precision,
>   ultimo boolean DEFAULT false,
>   date_entered timestamp without time zone NOT NULL,
>   date_modified timestamp without time zone NOT NULL,
>   created_by character varying(36),
>   modified_user_id character varying(36) NOT NULL,
>   deleted boolean NOT NULL DEFAULT false,
>   CONSTRAINT listini_anagrafici_id_key UNIQUE (id)
> )
>
> I guess the right query is:
>
> select distinct on (articolo) articolo,data_ent,prezzo from
> listini_anagrafici order by articolo, data_ent desc
>
> but it seems that this query runs slowly... about 5/6 seconds.
> the table contains more or less 500K records, PostgreSQL version is 8.1.11
> and the server has 4gb of RAM entirely dedicate to the db.
>
> I've tried adding this index
>
> CREATE INDEX articolo_data_ent ON listini_anagrafici (articoli, data_ent)
>
> but it doesn't helps.
>
> As you can see from the explain command (below) the query seems to ignore
> the index
>
> 'Unique  (cost=73897.58..76554.94 rows=77765 width=24)'
> '  ->  Sort  (cost=73897.58..75226.26 rows=531472 width=24)'
> '        Sort Key: articolo, data_ent'
> '        ->  Seq Scan on listini_anagrafici  (cost=0.00..16603.72
> rows=531472 width=24)'
>
> anyone knows how to make this query run faster?
>
>
>
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Karl Denninger 2009-10-04 02:35:26 Re: Best suiting OS
Previous Message Robert Haas 2009-10-04 01:02:12 Re: Confusion on shared buffer