Re: Patch to support SEMI and ANTI join removal

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch to support SEMI and ANTI join removal
Date: 2014-09-16 10:20:52
Message-ID: CAApHDvox2L2z3iZn23mE+HVpRf3U3dKC2KEpaznCcb-AS1Q9bg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Sep 13, 2014 at 1:38 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> > On Fri, Sep 12, 2014 at 3:35 AM, Robert Haas <robertmhaas(at)gmail(dot)com>
> wrote:
> >> I haven't read the patch, but I think the question is why this needs
> >> to be different than what we do for left join removal.
>
> > I discovered over here ->
> >
> http://www.postgresql.org/message-id/CAApHDvo5wCRk7uHBuMHJaDpbW-b_oGKQOuisCZzHC25=H3__fA@mail.gmail.com
> > during the early days of the semi and anti join removal code that the
> > planner was trying to generate paths to the dead rel. I managed to track
> > the problem down to eclass members still existing for the dead rel. I
> guess
> > we must not have eclass members for outer rels? or we'd likely have seen
> > some troubles with left join removals already.
>
> Mere existence of an eclass entry ought not cause paths to get built.
> It'd be worth looking a bit harder into what's happening there.
>
>
It took me a bit of time to create this problem again as I didn't record
the actual query where I hit the assert failure the first time. Though, now
I have managed to recreate the problem again by removing the code that I
had added which removes eclass members for dead rels.

Using the attached patch, the failing test case is:

create table b (id int primary key);
create table a (id int primary key, b_id int references b);
create index on a (b_id); -- add index to create alternative path

explain select a.* from a inner join b on b.id=a.b_id;

What seems to be happening is that generate_implied_equalities_for_column
generates a RestrictInfo for the dead rel due to the eclass member still
existing. This new rinfo gets matched to the index
by match_clauses_to_index()

The code then later fails in get_loop_count:

TRAP: FailedAssertion("!(outer_rel->rows > 0)", File:
"src\backend\optimizer\path\indxpath.c", Line: 1861)

The call stack looks like:

> postgres.exe!get_loop_count(PlannerInfo * root, Bitmapset * outer_relids)
Line 1861 C
postgres.exe!build_index_paths(PlannerInfo * root, RelOptInfo * rel,
IndexOptInfo * index, IndexClauseSet * clauses, char useful_predicate,
SaOpControl saop_control, ScanTypeControl scantype) Line 938 C
postgres.exe!get_index_paths(PlannerInfo * root, RelOptInfo * rel,
IndexOptInfo * index, IndexClauseSet * clauses, List * * bitindexpaths)
Line 745 C
postgres.exe!get_join_index_paths(PlannerInfo * root, RelOptInfo * rel,
IndexOptInfo * index, IndexClauseSet * rclauseset, IndexClauseSet *
jclauseset, IndexClauseSet * eclauseset, List * * bitindexpaths, Bitmapset
* relids, List * * considered_relids) Line 672 C
postgres.exe!consider_index_join_outer_rels(PlannerInfo * root,
RelOptInfo * rel, IndexOptInfo * index, IndexClauseSet * rclauseset,
IndexClauseSet * jclauseset, IndexClauseSet * eclauseset, List * *
bitindexpaths, List * indexjoinclauses, int considered_clauses, List * *
considered_relids) Line 585 C
postgres.exe!consider_index_join_clauses(PlannerInfo * root, RelOptInfo *
rel, IndexOptInfo * index, IndexClauseSet * rclauseset, IndexClauseSet *
jclauseset, IndexClauseSet * eclauseset, List * * bitindexpaths) Line 485 C
postgres.exe!create_index_paths(PlannerInfo * root, RelOptInfo * rel)
Line 308 C
postgres.exe!set_plain_rel_pathlist(PlannerInfo * root, RelOptInfo * rel,
RangeTblEntry * rte) Line 403 C
postgres.exe!set_rel_pathlist(PlannerInfo * root, RelOptInfo * rel,
unsigned int rti, RangeTblEntry * rte) Line 337 C
postgres.exe!set_base_rel_pathlists(PlannerInfo * root) Line 223 C
postgres.exe!make_one_rel(PlannerInfo * root, List * joinlist) Line 157 C
postgres.exe!query_planner(PlannerInfo * root, List * tlist, void
(PlannerInfo *, void *) * qp_callback, void * qp_extra) Line 236 C
postgres.exe!grouping_planner(PlannerInfo * root, double tuple_fraction)
Line 1289 C
postgres.exe!subquery_planner(PlannerGlobal * glob, Query * parse,
PlannerInfo * parent_root, char hasRecursion, double tuple_fraction,
PlannerInfo * * subroot) Line 573 C
postgres.exe!standard_planner(Query * parse, int cursorOptions,
ParamListInfoData * boundParams) Line 211 C
postgres.exe!planner(Query * parse, int cursorOptions, ParamListInfoData
* boundParams) Line 139 C
postgres.exe!pg_plan_query(Query * querytree, int cursorOptions,
ParamListInfoData * boundParams) Line 750 C
postgres.exe!ExplainOneQuery(Query * query, IntoClause * into,
ExplainState * es, const char * queryString, ParamListInfoData * params)
Line 330 C
postgres.exe!ExplainQuery(ExplainStmt * stmt, const char * queryString,
ParamListInfoData * params, _DestReceiver * dest) Line 231 C
postgres.exe!standard_ProcessUtility(Node * parsetree, const char *
queryString, ProcessUtilityContext context, ParamListInfoData * params,
_DestReceiver * dest, char * completionTag) Line 647 C
postgres.exe!ProcessUtility(Node * parsetree, const char * queryString,
ProcessUtilityContext context, ParamListInfoData * params, _DestReceiver *
dest, char * completionTag) Line 314 C
postgres.exe!PortalRunUtility(PortalData * portal, Node * utilityStmt,
char isTopLevel, _DestReceiver * dest, char * completionTag) Line 1195 C
postgres.exe!FillPortalStore(PortalData * portal, char isTopLevel) Line
1063 C
postgres.exe!PortalRun(PortalData * portal, long count, char isTopLevel,
_DestReceiver * dest, _DestReceiver * altdest, char * completionTag) Line
790 C
postgres.exe!exec_simple_query(const char * query_string) Line 1052 C
postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname,
const char * username) Line 4012 C
postgres.exe!BackendRun(Port * port) Line 4113 C
postgres.exe!SubPostmasterMain(int argc, char * * argv) Line 4618 C
postgres.exe!main(int argc, char * * argv) Line 207 C

So my solution to this was just to add the code that strips out the eclass
members that belong to the newly dead rel in join removal.
The only other solution I can think of would be to add a bitmap set of dead
rels onto the PlannerInfo struct or perhaps just generate one and passing
that in prohibited_rels in generate_implied_equalities_for_column (). I
don't really care for this solution very much as it seems better to make
the join removal code pay for this extra processing rather than (probably)
most queries.

Of course this is my problem as I'm unable to create the same situation
with the existing left join removals. The point here is more to justify why
I added code to strip eclass members of dead rels.

Any thoughts? Or arguments against me keeping the code that strips out the
eclass members of dead rels?

Regards

David Rowley

Attachment Content-Type Size
inner_join_removals_2014-09-16_2bcc9ba.patch application/octet-stream 53.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2014-09-16 10:21:38 Re: WAL format and API changes (9.5)
Previous Message Amit Kapila 2014-09-16 10:13:06 Re: CRC algorithm (was Re: [REVIEW] Re: Compression of full-page-writes)