Re: Performance improvement for joins where outer side is unique

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: dgrowleyml(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance improvement for joins where outer side is unique
Date: 2015-02-03 09:23:36
Message-ID: 20150203.182336.36467818.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, I had a look on this patch. Although I haven't understood
whole the stuff and all of the related things, I will comment as
possible.

Performance:

I looked on the performance gain this patch gives. For several
on-memory joins, I had gains about 3% for merge join, 5% for hash
join, and 10% for nestloop (@CentOS6), for simple 1-level joins
with aggregation similar to what you mentioned in previous
mail like this.

=# SELECT count(*) FROM t1 JOIN t2 USING (id);

Of course, the gain would be trivial when returning many tuples,
or scans go to disk.

I haven't measured the loss by additional computation when the
query is regarded as not a "single join".

Explain representation:

I don't like that the 'Unique Join:" occupies their own lines in
the result of explain, moreover, it doesn't show the meaning
clearly. What about the representation like the following? Or,
It might not be necessary to appear there.

Nested Loop ...
Output: ....
- Unique Jion: Yes
-> Seq Scan on public.t2 (cost = ...
- -> Seq Scan on public.t1 (cost = ....
+ -> Seq Scan on public.t1 (unique) (cost = ....

Coding:

The style looks OK. Could applied on master.
It looks to work as you expected for some cases.

Random comments follow.

- Looking specialjoin_is_unique_join, you seem to assume that
!delay_upper_joins is a sufficient condition for not being
"unique join". The former simplly indicates that "don't
commute with upper OJs" and the latter seems to indicate that
"the RHS is unique", Could you tell me what kind of
relationship is there between them?

- The naming "unique join" seems a bit obscure for me, but I
don't have no better idea:( However, the member name
"has_unique_join" seems to be better to be "is_unique_join".

- eclassjoin_is_unique_join involves seemingly semi-exhaustive
scan on ec members for every element in joinlist. Even if not,
it might be thought to be a bit too large for the gain. Could
you do the equivelant things by more small code?

regards,

Jan 2015 00:37:19 +1300, David Rowley <dgrowleyml(at)gmail(dot)com> wrote in <CAApHDvod_uCMoUPovdpXbNkw50O14x3wwKoJmZLxkbBn71zdEg(at)mail(dot)gmail(dot)com>
> On 1 January 2015 at 02:47, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> > Hi,
> >
> > I've been hacking a bit at the join code again... This time I've been
> > putting some effort into optimising the case where the inner side of the
> > join is known to be unique.
> > For example, given the tables:
> >
> > create table t1 (id int primary key);
> > create table t2 (id int primary key);
> >
> > And query such as:
> >
> > select * from t1 left outer join t2 on t1.id=t2.id;
> >
> > It is possible to deduce at planning time that "for each row in the outer
> > relation, only 0 or 1 rows can exist in the inner relation", (inner being
> > t2)
> >
>
> I've been hacking at this unique join idea again and I've now got it
> working for all join types -- Patch attached.
>
> Here's how the performance is looking:
>
> postgres=# create table t1 (id int primary key);
> CREATE TABLE
> postgres=# create table t2 (id int primary key);
> CREATE TABLE
> postgres=# insert into t1 select x.x from generate_series(1,1000000) x(x);
> INSERT 0 1000000
> postgres=# insert into t2 select x.x from generate_series(1,1000000) x(x);
> INSERT 0 1000000
> postgres=# vacuum analyze;
> VACUUM
> postgres=# \q
>
> Query: select count(t1.id) from t1 inner join t2 on t1.id=t2.id;
>
> With Patch on master as of 32bf6ee
>
> D:\Postgres\install\bin>pgbench -f d:\unijoin3.sql -T 60 -n postgres
> transaction type: Custom query
> scaling factor: 1
> query mode: simple
> number of clients: 1
> number of threads: 1
> duration: 60 s
> number of transactions actually processed: 78
> latency average: 769.231 ms
> tps = 1.288260 (including connections establishing)
> tps = 1.288635 (excluding connections establishing)
>
> Master as of 32bf6ee
>
> D:\Postgres\install\bin>pgbench -f d:\unijoin3.sql -T 60 -n postgres
> transaction type: Custom query
> scaling factor: 1
> query mode: simple
> number of clients: 1
> number of threads: 1
> duration: 60 s
> number of transactions actually processed: 70
> latency average: 857.143 ms
> tps = 1.158905 (including connections establishing)
> tps = 1.159264 (excluding connections establishing)
>
> That's a 10% performance increase.
>
> I still need to perform more thorough benchmarking with different data
> types.
>
> One weird thing that I noticed before is that in an earlier revision of the
> patch in the executor's join Initialise node code, I had set the
> unique_inner to true for semi joins and replaced the SEMI_JOIN check for a
> unique_join check in the execute node for each join method. With this the
> performance results barely changed from standard... I've yet to find out
> why.

I don't know even what you did precisely but if you do such a
thing, you perhaps should change the name of "unique_inner" to
something representing that it reads up to one row from the inner
for one outer row. (Sorry I have no good idea for this..)

> The patch also has added a property to the EXPLAIN (VERBOSE) output which
> states if the join was found to be unique or not.
>
> The patch also still requires a final pass of comment fix-ups. I've just
> plain run out of time for now.
>
> I'll pick this up in 2 weeks time.
>
> Regards
>
> David Rowley

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-02-03 09:41:21 Re: Hot Standby WAL reply uses heavyweight session locks, but doesn't have enough infrastructure set up
Previous Message Tatsuo Ishii 2015-02-03 08:46:37 Performance difference between kernel 3.10 and 3.12