
Re: [HACKERS] Sorting costs (was Caution: tonight's commits force initdb)

From:	Hannu Krosing <hannu(at)trust(dot)ee>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Hiroshi Inoue <Inoue(at)tpf(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgreSQL(dot)org>
Subject:	Re: [HACKERS] Sorting costs (was Caution: tonight's commits force initdb)
Date:	1999-08-24 19:10:41
Message-ID:	37C2EE31.7B9A6CDC@trust.ee
Lists:	pgsql-hackers

Tom Lane wrote:
>
> "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp> writes:
> > Hmm,Index scan is chosen to select all rows.
> > AFAIK,sequential scan + sort is much faster than index scan in
> > most cases.
> > cost of index scan < cost of sequential scan + cost of sort
> > I have felt that the current cost estimation of index scan is too small,
> > though I have no alternative.

Can the optimizer make use of LIMIT, or some other hint that response
time is preferred over the speed of the full query?

In web apps the index scan may often be faster than seq scan + sort, as
one may not actually need all the tuples but only a small fraction from
near the beginning.

Getting the beginning fast also gives better responsiveness for other
interactive uses.
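
To make the trade-off concrete, here is a minimal sketch of a toy cost
model in Python. All constants and function names are hypothetical
illustrations, not anything taken from the Postgres optimizer; the index
scan is charged the worst case of one random page fetch per tuple.

```python
import math

# Hypothetical cost units; none of these constants come from Postgres.
SEQ_PAGE_COST = 1.0       # reading one heap page sequentially
RANDOM_PAGE_COST = 4.0    # assumed worst case: one random page fetch per tuple
CPU_COMPARE_COST = 0.01   # one comparison during the sort
TUPLES_PER_PAGE = 50      # assumed tuple density


def index_scan_cost(k):
    """Cost to return the first k tuples in index order."""
    return k * RANDOM_PAGE_COST


def seqscan_sort_cost(n, k):
    """Cost to return the first k of n tuples via seq scan + sort.

    k is deliberately unused: in this model no tuple can be returned
    before the whole table has been read and sorted.
    """
    pages = math.ceil(n / TUPLES_PER_PAGE)
    sort = n * math.log2(n) * CPU_COMPARE_COST if n > 1 else 0.0
    return pages * SEQ_PAGE_COST + sort


if __name__ == "__main__":
    n = 100_000
    for k in (20, 1_000, n):
        print(k, index_scan_cost(k), round(seqscan_sort_cost(n, k), 1))
```

Under these assumed numbers the index scan wins easily when only the
first 20 tuples are wanted and loses when all 100,000 are fetched, which
is the case a LIMIT-style hint would let the optimizer tell apart.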

> I am also suspicious that indexscan costs are underestimated. The
> cost of reading the index is probably not too far off, but the cost
> of accessing the main table is bogus. Worst case, for a table whose
> tuples are thoroughly scattered, you would have a main-table page fetch
> for *each* returned tuple. In practice it's probably not anywhere near
> that bad, since you may have some clustering of tuples (so that the same
> page is hit several times in a row), and also the Postgres buffers and
> the underlying Unix system's disk cache will save trips to disk if there
> is any locality of reference at all. I have no idea how to estimate
> that effect --- anyone? But cost_index is clearly producing a
> ridiculously optimistic estimate at the moment.
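
One textbook approximation for the caching effect described above, given
here only as a sketch and not as what cost_index does: assume the tuples
are scattered uniformly at random over the table's pages and that every
page, once read, stays cached. The expected number of distinct pages
fetched is then the standard occupancy formula. The function name and the
uniform-scatter assumption are mine, not from the source.

```python
def expected_page_fetches(num_pages, num_tuples_fetched):
    """Expected distinct heap pages touched.

    Assumes tuples are placed uniformly at random across num_pages and
    that a page already read is never read from disk again (perfect
    caching). Real tables with clustering would touch fewer pages.
    """
    if num_pages <= 0:
        return 0.0
    miss_probability = (1.0 - 1.0 / num_pages) ** num_tuples_fetched
    return num_pages * (1.0 - miss_probability)


if __name__ == "__main__":
    # Worst case would be one fetch per tuple; compare with the estimate.
    for tuples in (10, 1_000, 100_000):
        print(tuples, round(expected_page_fetches(1_000, tuples), 1))
```

The estimate stays close to one fetch per tuple while few tuples are
read and levels off at the table size in pages as the number of fetched
tuples grows, so it sits between the two extremes discussed above.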

One way to find out would be actual benchmarking: if the current
optimizer prefers index scans, it is possible to do a query using an
index scan, drop the index, somehow flush the disk cache, and then do
the same query using seqscan+sort.

If the latter is preferred anyway we would have no way to test ...

------------------
Hannu

In response to

Sorting costs (was Caution: tonight's commits force initdb) at 1999-08-24 14:58:50 from Tom Lane

Responses

RE: [HACKERS] Sorting costs (was Caution: tonight's commits force initdb) at 1999-08-25 00:17:03 from Hiroshi Inoue
