Re: [HACKERS] Sorting costs (was Caution: tonight's commits force initdb)

From: Hannu Krosing <hannu(at)trust(dot)ee>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hiroshi Inoue <Inoue(at)tpf(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [HACKERS] Sorting costs (was Caution: tonight's commits force initdb)
Date: 1999-08-24 19:10:41
Message-ID: 37C2EE31.7B9A6CDC@trust.ee
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:
>
> "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp> writes:
> > Hmm,Index scan is chosen to select all rows.
> > AFAIK,sequential scan + sort is much faster than index scan in
> > most cases.
> > cost of index scan < cost of sequential scan + cost of sort
> > I have felt that the current cost estimation of index scan is too small,
> > though I have no alternative.

can the optimizer make use of LIMIT, or some other hint that reaction
time is preferred over speed of full query ?

In web apps the index scan may often be fastre than seq scan + sort as
one
may not actually need all the tuples but only a small fraction from near
the beginning.

Getting the beginning fast also gives better responsiveness for other
interactive uses.

> I am also suspicious that indexscan costs are underestimated. The
> cost of reading the index is probably not too far off, but the cost
> of accessing the main table is bogus. Worst case, for a table whose
> tuples are thoroughly scattered, you would have a main-table page fetch
> for *each* returned tuple. In practice it's probably not anywhere near
> that bad, since you may have some clustering of tuples (so that the same
> page is hit several times in a row), and also the Postgres buffers and
> the underlying Unix system's disk cache will save trips to disk if there
> is any locality of reference at all. I have no idea how to estimate
> that effect --- anyone? But cost_index is clearly producing a
> ridiculously optimistic estimate at the moment.

The one way to find out would be actual benchmarking - if current
optimizer prefers index scans it is possible to do a query using
index scan, dro the index, somehow flush disk cache and then do the
same query using seqscan+sort.

If the latter is preferred anyway we would have no way to test ...

------------------
Hannu

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hub.Org News Admin 1999-08-24 19:15:20
Previous Message Brian E Gallew 1999-08-24 17:01:12 Re: [HACKERS] vacuum process size