Re: cost_sort() improvements

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cost_sort() improvements
Date: 2018-06-28 19:30:01
Message-ID: CAH2-Wzm17qTRO71UToUqu9Lu64Jx+4rt09Ux8eBcG-_RgQP45A@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 28, 2018 at 9:47 AM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
> Current estimation of sort cost has following issues:
> - it doesn't differ one and many columns sort
> - it doesn't pay attention to comparison function cost and column width
> - it doesn't try to count number of calls of comparison function on per
>   column basis

I've been suspicious of the arbitrary way in which I/O for external
sorts is costed by cost_sort() for a long time. I'm not 100% sure
about how we should think about this question, but I am sure that it
needs to be improved in *some* way. It's really not difficult to show
that external sorts are now often faster than internal sorts, because
they're able to be completed on-the-fly, which can have very good CPU
cache characteristics, and because the I/O latency can be hidden
fairly well much of the time. Of course, as memory is taken away,
external sorts will eventually get slower and slower, but it's
surprising how little difference it makes. (This makes me tempted to
look into a sort_mem GUC, even though I suspect that that will be
controversial.)
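For readers who don't have costsize.c in front of them, here is a rough Python paraphrase of the kind of external-sort charge under discussion. It is a sketch from memory of how cost_sort() looked around this time, not a copy of the source: the constants are the stock defaults for seq_page_cost, random_page_cost and cpu_operator_cost, the function names are made up for illustration, and merge_order is passed in as a plain parameter rather than derived the way tuplesort does it.

```python
import math

# Assumed stock defaults; real values come from GUCs and the build.
BLCKSZ = 8192
SEQ_PAGE_COST = 1.0
RANDOM_PAGE_COST = 4.0
CPU_OPERATOR_COST = 0.0025


def sort_cpu_cost(tuples):
    """Comparison cost: two operator evaluations per comparison, N*log2(N) comparisons."""
    tuples = max(tuples, 2.0)  # avoid log2 of values below 2
    return 2.0 * CPU_OPERATOR_COST * tuples * math.log2(tuples)


def external_sort_io_cost(input_bytes, sort_mem_bytes, merge_order):
    """I/O charge for a sort that spills to disk (hypothetical helper name).

    merge_order is an assumption here: a caller-supplied stand-in for
    however many runs can be merged in one pass.
    """
    npages = math.ceil(input_bytes / BLCKSZ)
    nruns = input_bytes / sort_mem_bytes
    if nruns > merge_order:
        log_runs = math.ceil(math.log(nruns) / math.log(merge_order))
    else:
        log_runs = 1.0
    # Each merge pass writes and then reads every page once.
    npageaccesses = 2.0 * npages * log_runs
    # Fixed guess: 3/4 of accesses sequential, 1/4 random.
    return npageaccesses * (SEQ_PAGE_COST * 0.75 + RANDOM_PAGE_COST * 0.25)
```

In this sketch the I/O term is simply added on top of the same CPU term an internal sort would be charged, so an external sort can never come out cheaper than an internal one over the same input, and the 75/25 split is a fixed guess rather than anything measured.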

Clearly there is a cost to doing I/O even when an external sort is
faster than an internal sort "in isolation"; I/O does not magically
become something that we don't have to worry about. However, the I/O
cost seems more and more like a distributed cost. We don't really have
a way of thinking about that at all. I'm not sure if that much bigger
problem needs to be addressed before this specific problem with
cost_sort() can be addressed.

--
Peter Geoghegan
