Re: Performance improvement for joins where outer side is unique

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance improvement for joins where outer side is unique
Date: 2017-03-13 22:35:30
Views: Raw Message | Whole Thread | Download mbox
Lists: pgsql-hackers

On 14 March 2017 at 07:50, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> [ getting back to this patch finally... ]
> David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
> > I've attached a patch which implements this, though only for
> > MergeJoin, else I'd imagine we'd also need to ensure all proofs used
> > for testing the uniqueness were also hash-able too. I added some XXX
> > comments in analyzejoin.c around the mergeopfamilies == NIL tests to
> > mention that Merge Join depends on all the unique proof quals having
> > mergeopfamilies. This also assumes we'll never use some subset of
> > mergejoin-able quals for a merge join, which could be an interesting
> > area in the future, as we might have some btree index on a subset of
> > those columns to provide pre-sorted input. In short, it's a great
> > optimisation, but I'm a little scared we might break it one day.
> Umm ... it's broken already isn't it? See the comment for
> generate_mergejoin_paths:
> * We generate mergejoins if mergejoin clauses are available. We have
> * two ways to generate the inner path for a mergejoin: sort the cheapest
> * inner path, or use an inner path that is already suitably ordered for
> the
> * merge. If we have several mergeclauses, it could be that there is no
> inner
> * path (or only a very expensive one) for the full list of mergeclauses,
> but
> * better paths exist if we truncate the mergeclause list (thereby
> discarding
> * some sort key requirements). So, we consider truncations of the
> * mergeclause list as well as the full list. (Ideally we'd consider all
> * subsets of the mergeclause list, but that seems way too expensive.)
> There's another, more subtle, way in which it could fail in
> sort_inner_and_outer():
> * Each possible ordering of the available mergejoin clauses will generate
> * a differently-sorted result path at essentially the same cost. We have
> * no basis for choosing one over another at this level of joining, but
> * some sort orders may be more useful than others for higher-level
> * mergejoins, so it's worth considering multiple orderings.
> *
> * Actually, it's not quite true that every mergeclause ordering will
> * generate a different path order, because some of the clauses may be
> * partially redundant (refer to the same EquivalenceClasses). Therefore,
> * what we do is convert the mergeclause list to a list of canonical
> * pathkeys, and then consider different orderings of the pathkeys.
> I'm fairly sure that it's possible to end up with fewer pathkeys than
> there are mergeclauses in this code path. Now, you might be all right
> anyway given that the mergeclauses must refer to the same ECs in such a
> case --- maybe they're fully redundant and we can take testing the
> included clause as proving the omitted one(s) too. I'm not certain
> right now what I meant by "partially redundant" in this comment.
> But in any case, it's moot for the present purpose because
> generate_mergejoin_paths certainly breaks your assumption.

Thanks for looking at this again.

Yeah confirmed. It's broken. I guess I just need to remember in the Path if
we got all the join quals, although I've not looked in detail what the best
fix is. I imagine it'll require storing something else in the JoinPath.

Here's my test case:

select 'create tablespace slow_disk LOCATION ''' ||
current_setting('data_directory') || ''' WITH (random_page_cost=1000);';
create table ab (a int not null, b int not null);
create unique index ab_pkey on ab (a,b) tablespace slow_disk;
alter table ab add constraint ab_pkey primary key using index ab_pkey;

create index ab_a_idx on ab (a);

insert into ab values(1,1);
insert into ab values(1,2);

set enable_nestloop to 0;
set enable_hashjoin to 0;
set enable_sort to 0;

explain verbose select * from ab ab1 inner join ab ab2 on ab1.a = ab2.a and
ab1.b = ab2.b;

Merge Join (cost=0.26..24.35 rows=1 width=16)
Output: ab1.a, ab1.b, ab2.a, ab2.b
Inner Unique: Yes
Merge Cond: (ab1.a = ab2.a)
Join Filter: (ab1.b = ab2.b)
-> Index Scan using ab_a_idx on public.ab ab1 (cost=0.13..12.16 rows=2
Output: ab1.a, ab1.b
-> Index Scan using ab_a_idx on public.ab ab2 (cost=0.13..12.16 rows=2
Output: ab2.a, ab2.b
(9 rows)

select * from ab ab1 inner join ab ab2 on ab1.a = ab2.a and ab1.b = ab2.b;
-- wrong results
a | b | a | b
1 | 2 | 1 | 2
(1 row)

drop index ab_a_idx;

-- same query again. This time it'll use the PK index on the slow_disk
select * from ab ab1 inner join ab ab2 on ab1.a = ab2.a and ab1.b = ab2.b;
a | b | a | b
1 | 1 | 1 | 1
1 | 2 | 1 | 2
(2 rows)

I had to fix up some conflicts before testing this again on a recent
master, so I've attached my rebased version. No fix yet for the above issue

David Rowley
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
unique_joins_2017-03-14.patch application/octet-stream 91.0 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-03-13 22:36:28 Re: bytea_output vs make installcheck
Previous Message Peter Geoghegan 2017-03-13 22:12:12 Re: tuplesort_gettuple_common() and *should_free argument