Re: Large tables, ORDER BY and sequence/index scans

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Milan Zamazal <pdm(at)brailcom(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Large tables, ORDER BY and sequence/index scans
Date: 2010-01-05 12:12:01
Message-ID: 162867791001050412u59514559o38cb5b39e8d26b8c@mail.gmail.com
Lists: pgsql-general

Hello

Please send the EXPLAIN ANALYZE output for your query, for example:

postgres=# explain analyze declare x cursor for select * from foo;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Seq Scan on foo  (cost=0.00..34.00 rows=2400 width=4) (actual time=0.010..0.012 rows=2 loops=1)
Total runtime: 0.033 ms
(2 rows)
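For cursors specifically, the planner already trades total cost against startup cost via the cursor_tuple_fraction setting (exposed as a configuration parameter in 8.4, default 0.1). Lowering it tells the planner you will fetch only the first rows, which can make it prefer the index scan without disabling seqscans outright. A sketch, assuming a table foo with an index on column id (the index name foo_id_idx is a placeholder):

```sql
-- Tell the planner we expect to fetch only ~1% of the cursor's rows.
SET cursor_tuple_fraction = 0.01;

BEGIN;
-- With a low fraction, the plan should favor cheap startup:
-- an "Index Scan using foo_id_idx on foo" node means the first
-- FETCH can return immediately, with no full sort step.
EXPLAIN DECLARE x CURSOR FOR SELECT * FROM foo ORDER BY id;
COMMIT;
```

This only affects planning of DECLARE'd cursors, so plain SELECTs elsewhere in the application keep their usual total-cost planning.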

regards
Pavel Stehule

2010/1/5 Milan Zamazal <pdm(at)brailcom(dot)org>:
> My problem is that retrieving sorted data from large tables is sometimes
> very slow in PostgreSQL (8.4.1, FWIW).
>
> I typically retrieve the data using cursors, to display them in UI:
>
>  BEGIN;
>  DECLARE ... SELECT ... ORDER BY ...;
>  FETCH ...;
>  ...
>
> On a newly created table of about 10 million rows, the FETCH command
> takes about one minute by default, with an additional delay during the
> subsequent COMMIT command.  This is because PostgreSQL uses a
> sequential scan on the table even when there is an index on the ORDER
> BY column.  When I force PostgreSQL to perform an index scan (e.g. by
> setting one of the options enable_seqscan or enable_sort to off), the
> FETCH response is immediate.
>
> The PostgreSQL manual explains the motivation for sequential scans of
> large tables, and I can understand it.  Nevertheless, such behavior
> leads to unacceptably poor performance in my particular case.  It is
> important to get the first resulting rows quickly, so they can be
> displayed to the user without delay.
>
> My questions are:
>
> - What is your experience with using ORDER BY + indexes on large tables?
>
> - Is there a way to convince PostgreSQL to use index scans automatically
>  in cases where they are much more efficient?  I tried using ANALYZE,
>  VACUUM and SET STATISTICS, but without success.
>
> - Is it a good idea to set enable_seqscan or enable_sort to "off"
>  globally in my case?  Or to set them to "off" just before working with
>  large tables?  My databases contain both short and long tables, often
>  connected through REFERENCES or joined into views, and many of the
>  shorter tables serve as codebooks.  Can setting one of these
>  parameters to off have clearly negative impacts?
>
> - Is there a recommended way to keep indexes in good shape so that the
>  performance of initial row retrieval remains good?  The large tables
>  are typically append-only tables with a SERIAL primary key.
>
> Thanks for any tips.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
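On the question of disabling enable_seqscan globally: a transaction-scoped override is the safer option, since SET LOCAL reverts automatically at COMMIT or ROLLBACK and leaves the planner free to choose sequential scans for the short codebook tables. A sketch (big_table, big_cur and the id column are placeholders for your own names):

```sql
BEGIN;
-- These settings apply only inside this transaction and are
-- discarded automatically at COMMIT/ROLLBACK.
SET LOCAL enable_seqscan = off;
SET LOCAL enable_sort = off;

DECLARE big_cur CURSOR FOR SELECT * FROM big_table ORDER BY id;
FETCH 100 FROM big_cur;
COMMIT;
```

Setting these parameters to off globally is generally discouraged: they do not forbid the plan types, they only penalize them heavily, so small-table queries and joins can end up with clearly worse plans.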
