Re: [PATCH] Use optimized single-datum tuplesort in ExecSort

From: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: [PATCH] Use optimized single-datum tuplesort in ExecSort
Date: 2021-07-20 11:28:35
Message-ID: CAEudQAq1t4CnqQZK+kh2MF0iwX1MEs+uHk1wTksn=QO0xFKokg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Em ter., 20 de jul. de 2021 às 05:35, David Rowley <dgrowleyml(at)gmail(dot)com>
escreveu:

> On Tue, 20 Jul 2021 at 01:10, James Coleman <jtc331(at)gmail(dot)com> wrote:
> > To be clear up front: I'm in favor of the patch, and I don't want to
> > put unnecessary stumbling blocks up for it getting committed. So if we
> > decide to proceed as is, that's fine with me.
>
> Thanks for making that clear.
>
> > But I'm not sure that the "cost model, unfortunately, does not account
> > for that" is entirely accurate. The end of cost_tuplesort contains a
> > cost "per extracted tuple". It does, however, note that it doesn't
> > charge cpu_tuple_cost, which maybe is what you'd want to fully
> > incorporate this into the model. But given this run_cost isn't about
> > accounting for comparison cost (that's been done earlier) which is the
> > part that'd be the same between tuple and datum sort, it seems to me
> > that we could lower the cpu_operator_cost here by something like 10%
> > if it's byref and 30% if it's byval?
>
> I failed to notice that last part that adds the additional
> cpu_operator_cost.
>
> The default cpu_operator_cost is 0.0025, so with the 10k tuple
> benchmark, that adds an additional charge of 25 to the total cost.
>
> If we take test 1 from my results on v5 as an example:
>
> > Test1 446.1 657.3 147.32%
>
> Looking at explain for that query:
>
> regression=# explain select two from tenk1 order by two offset 1000000;
> QUERY PLAN
> ----------------------------------------------------------------------
> Limit (cost=1133.95..1133.95 rows=1 width=4)
> -> Sort (cost=1108.97..1133.95 rows=9995 width=4)
> Sort Key: two
> -> Seq Scan on tenk1 (cost=0.00..444.95 rows=9995 width=4)
> (4 rows)
>
> If we want the costs to reflect reality again here then we'd have
> reduce 1133.95 by something like 147.32% (the performance difference).
> That would bring the cost down to 769.72, which is way more than we
> have to play with than the 25 that the cpu_operator_cost * tuples
> gives us.
>
> If we reduced the 25 by 30% in this case, we'd get 17.5 and the total
> cost would become 1126.45. That's not great considering the actual
> performance indicates that 769.72 would be a better number.
>
> If we look at John's result for test 1: He saw 588 tps on master and
> 998 on v8. 1133.95 / 998.0 * 588.0 = 668.09, so we'd need even more
> to get close to reality on that machine.
>
> My thoughts are that the small surcharge added at the end of
> cost_tuplesort() is just not enough for us to play with. I think to
> get us closer to fixing this correctly would require a redesign of the
> tuplesort costing entirely. I think that would be about an order of
> magnitude more effort than this patch was, so I really feel like I
> don't want to do this.
>
I understand that redesign would require a lot of work,
but why not do it step by step?

> I kinda feel that since the comparison_cost is always just 2.0 *
> cpu_operator_cost regardless of the number of columns in the sort,
> then if we add too many new smarts to try and properly adjust for this
> new optimization, unless we do a completly new cost modal for this,
> then we might as well be putting lipstick on a pig.
>
I think one first step is naming this 2.0?
Does this magic number don't have a good name?

>
> It sounds like James mostly just mentioned the sorting just to ensure
> it was properly considered and does not really feel strongly that it
> needs to be adjusted. Does anyone else feel that we should be
> adjusting it?
>
I took a look at cost_tuplesort and I think that some small adjustments
could be made as part of the improvement process.
It is attached.
1. long is a very problematic type; better int64?
2. 1024 can be int, not long?
3. 2 changed all to 2.0 (double)?
4. If disk-based is not needed, IMO can we avoid calling relation_byte_size?

Finally, to at least document (add comments) those conclusions,
would be nice, wouldn't it?

regards,
Ranier Vilela

Attachment Content-Type Size
costsize.patch application/octet-stream 1.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Nancarrow 2021-07-20 11:42:50 Re: row filtering for logical replication
Previous Message Denis Hirn 2021-07-20 11:15:30 Re: [PATCH] Allow multiple recursive self-references