Re: LEFT JOINs not optimized away when not needed

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Moshe Jacobson <moshe(at)neadwerx(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org, Robert Mowlavi <robert(at)neadwerx(dot)com>, Chris Autry <chris(at)neadwerx(dot)com>
Subject: Re: LEFT JOINs not optimized away when not needed
Date: 2014-07-08 15:19:31
Message-ID: 8401.1404832771@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

Moshe Jacobson <moshe(at)neadwerx(dot)com> writes:
> However, it turns out that Postgres is not optimizing away the left joins
> as I would expect. See the following query:

> SELECT r.reset
> FROM tb_reset r
> LEFT JOIN tb_project pj ON pj.project = r.project
> LEFT JOIN tb_location l ON l.location = r.location
> LEFT JOIN tb_program pg ON pg.program = r.program
> LEFT JOIN tb_reset_survey rsv ON rsv.reset = r.reset
> LEFT JOIN tb_program_location pl ON pl.program = r.program AND
> pl.location = r.location
> LEFT JOIN tb_program_type pgt ON pgt.program_type = pg.program_type
> LEFT JOIN tb_project_department pd ON pd.project = pj.project
> LEFT JOIN tb_department d ON d.department = pd.department
> LEFT JOIN tb_fiscal_calendar fc ON fc.day = r.execution_date
> LEFT JOIN tb_fiscal_month fm ON fm.fiscal_month = fc.month
> LEFT JOIN tb_rollout_schedule rs ON rs.fiscal_week = fc.week AND
> rs.fiscal_year = fc.year AND rs.program = r.program
> LEFT JOIN tb_reset_team rt ON rt.reset_team = r.reset_team
> LEFT JOIN tb_vendor v ON v.vendor = rt.vendor
> LEFT JOIN tb_reset_status rst ON rst.reset_status = r.reset_status
> LEFT JOIN tb_order_location ol ON ol.location = r.location
> LEFT JOIN tb_entity_reset er ON er.reset = r.reset
> LEFT JOIN tb_market m ON m.market = l.market
> LEFT JOIN tb_district dist ON dist.district = l.district
> LEFT JOIN tb_mregion mr ON mr.mregion = l.mregion
> LEFT JOIN tb_region rg ON rg.region = l.region
> LEFT JOIN tb_ogrp ogrp ON ogrp.ogrp = l.ogrp
> LEFT JOIN tb_buying_office byo ON byo.buying_office = l.buying_office
> LEFT JOIN tb_project_participant pp ON pp.project = r.project
> LEFT JOIN tb_project_status ps ON ps.project_status = pj.project_status
> WHERE r.in_scope IS TRUE
> GROUP BY r.reset
> ORDER BY r.reset
> LIMIT 100 OFFSET 0;

Ugh. What's your setting of join_collapse_limit, and if it's less than
27, does raising it fix this query?

> Seeing that there is only one output column, and that the results are
> grouped by this output column, it seems to me that the optimizer should not
> even look at the rest of the tables.

The GROUP BY has nothing to do with it, but if all the other tables' join
keys are primary keys (or at least unique), I'd expect the planner to get
rid of the joins. However, I'm not sure whether it works completely when
there are more than join_collapse_limit relations to worry about.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message David Fetter 2014-07-08 15:45:02 Re: [BUGS] LEFT JOINs not optimized away when not needed
Previous Message Moshe Jacobson 2014-07-08 14:04:21 LEFT JOINs not optimized away when not needed

Browse pgsql-hackers by date

  From Date Subject
Next Message geohas 2014-07-08 15:28:01 How to use Makefile - pgxs without gcc -O2 ?
Previous Message Tom Lane 2014-07-08 14:45:13 Re: query_is_distinct_for does not take into account set returning functions