Re: BUG #15383: Join Filter cost estimation problem in 10.5

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15383: Join Filter cost estimation problem in 10.5
Date: 2020-04-24 06:26:49
Message-ID: CAApHDvq-e90W0JD1q9U9RPquXq0AZ5pUz2BN8Zxch-QBhJ5ZFQ@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Sun, 1 Dec 2019 at 06:32, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Michael Paquier <michael(at)paquier(dot)xyz> writes:
> >>> One thing we could look at would be to charge an additional
> >>> cpu_tuple_cost per outer row for all joins except semi, anti and
> >>> unique joins. This would account for the additional lookup for
> >>> another matching row which won't be found and cause the planner to
> >>> slightly favour keeping the unique rel on the inner side of the join,
> >>> when everything else is equal.
>
> which'd help break ties in the right direction. It's a bit scary to
> be fixing this issue by changing the cost estimates for non-unique
> joins --- that could have side-effects we don't want. But arguably,
> the above is a correct refinement to the cost model, so maybe it's
> okay.

I wanted to see just how much fallout there'd be in the regression
tests if we did add the small non-unique join surcharge to all joins
which are unable to skip to the next outer tuple after matching the
first inner one. I went and added a cpu_tuple_cost per row to try to
account for the additional cost during execution. In the simple test
that I showed in April last year on this thread, it now does always
prefer to keep the unique rel on the inner side of the join with all 3
join types. However, there's quite a bit of fallout in the regression
tests. These are mostly changes in join order, but I see there's also
a weird failure in join.out on a test that claims that it shouldn't
allow a unique join. The output of that has changed to actually
perform a unique join. The test looks to be broken, since the qual
comparing to the constant should allow detection of unique joins.

Perhaps 1 cpu_tuple_cost per output tuple is far too big a surcharge.
For hash join with the example case from April 2019, the surcharge
adds about 18% to the overall cost of the join, taking it from 552.50
up to 652.50.
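
To make the arithmetic above concrete, here is a rough sketch of the
surcharge rule and the two hash join costs quoted. The function name
and its parameters are hypothetical and are not taken from the patch,
and the row count it derives holds only if the example ran with the
default cpu_tuple_cost of 0.01, which is an assumption.

```python
# Rough sketch only; join_surcharge() is a hypothetical name, not from the patch.
CPU_TUPLE_COST = 0.01  # PostgreSQL's default; assumed unchanged in the example


def join_surcharge(rows, can_skip_after_first_match,
                   cpu_tuple_cost=CPU_TUPLE_COST):
    """Extra cost for joins that must keep looking for further inner matches.

    Joins that can move to the next outer tuple after the first match
    (semi, anti and unique joins) are not charged anything.
    """
    if can_skip_after_first_match:
        return 0.0
    return rows * cpu_tuple_cost


# Hash join costs quoted for the April 2019 example.
base_cost = 552.50
surcharged_cost = 652.50

extra = surcharged_cost - base_cost
relative_increase = extra / base_cost

# Rows implied by the surcharge, valid only under the default-cost assumption.
implied_rows = round(extra / CPU_TUPLE_COST)

print(f"surcharge: {extra:.2f}")
print(f"relative increase: {relative_increase:.1%}")
print(f"implied rows charged: {implied_rows}")
```

Under those assumptions, a smaller per-row charge (some fraction of
cpu_tuple_cost) would shrink the relative increase proportionally while
still breaking ties in the same direction.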

The attached patch is Tom's patch from this thread from March last
year minus his regression test changes plus my join surcharge stuff.
This is just intended as a topic of conversation at the moment and
does not make any adjustments to the expected test outputs.

David

Attachment Content-Type Size
add_non-unique_join_surcharge_experiment.patch application/octet-stream 4.6 KB
