Re: Performance improvement for joins where outer side is unique

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance improvement for joins where outer side is unique
Date: 2015-02-26 00:15:27
Message-ID: 54EE659F.8050004@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I tried to do an initdb with the patch applied, and seems there's a bug
somewhere in analyzejoins.c:

tomas(at)rimmer ~ $ pg_ctl -D tmp/pg-unidata init
The files belonging to this database system will be owned by user "tomas".
This user must also own the server process.

The database cluster will be initialized with locale "en_US".
The default database encoding has accordingly been set to "LATIN1".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory tmp/pg-unidata ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... sysv
creating configuration files ... ok
creating template1 database in tmp/pg-unidata/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... TRAP:
FailedAssertion("!(index_vars != ((List *) ((void *)0)))", File:
"analyzejoins.c", Line: 414)
sh: line 1: 339 Aborted
"/home/tomas/pg-unijoins/bin/postgres" --single -F -O -c
search_path=pg_catalog -c exit_on_error=true template1 > /dev/null
child process exited with exit code 134
initdb: removing data directory "tmp/pg-unidata"
pg_ctl: database system initialization failed

The problem seems to be the last command in setup_description() at
src/bin/initdb/initdb.c:1843, i.e. this query:

WITH funcdescs AS (
SELECT p.oid as p_oid, oprname,
coalesce(obj_description(o.oid, 'pg_operator'),'') as opdesc
FROM pg_proc p JOIN pg_operator o ON oprcode = p.oid )
INSERT INTO pg_description
SELECT p_oid, 'pg_proc'::regclass, 0,
'implementation of ' || oprname || ' operator'
FROM funcdescs
WHERE opdesc NOT LIKE 'deprecated%' AND
NOT EXISTS (SELECT 1 FROM pg_description
WHERE objoid = p_oid AND classoid = 'pg_proc'::regclass)
And particularly the join in the CTE, i.e. this fails

SELECT * FROM pg_proc p JOIN pg_operator o ON oprcode = p.oid

I'm not quite sure why, but eclassjoin_is_unique_join() never actually
jumps into this part (line ~400):

if (relvar != NULL && candidaterelvar != NULL)
{
...
index_vars = lappend(index_vars, candidaterelvar);
...
}

so the index_vars is NIL. Not sure why, but I'm sure you'll spot the
issue right away.

BTW, I find this coding (first cast, then check) rather strange:

Var *var = (Var *) ecm->em_expr;

if (!IsA(var, Var))
continue; /* Ignore Consts */

It's probably harmless, but I find it confusing and I can't remember
seeing it elsewhere in the code (for example clausesel.c and such) use
this style:

... clause is (Node*) ...

if (IsA(clause, Var))
{
Var *var = (Var*)clause;
...
}

or

Var * var = NULL;

if (! IsA(clause, Var))
// error / continue

var = (Var*)clause;

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Nasby 2015-02-26 00:28:52 Re: Partitioning WIP patch (was: Partitioning: issues/ideas)
Previous Message Stephen Frost 2015-02-25 23:56:03 Re: contrib/fuzzystrmatch/dmetaphone.c license