Re: [INTERFACES] Re: [HACKERS] changes in 6.4

From: David Hartwig <daveh(at)insightdist(dot)com>
To: Hannu Krosing <hannu(at)trust(dot)ee>
Cc: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>, pgsql-interfaces(at)postgreSQL(dot)org, hackers(at)postgreSQL(dot)org
Subject: Re: [INTERFACES] Re: [HACKERS] changes in 6.4
Date: 1998-07-15 22:16:02
Message-ID: 35AD2A22.A5C2D5A1@insightdist.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers pgsql-interfaces


Hannu Krosing wrote:

> > The days where every release fixed server crashes, or added a feature
> > that users were 'screaming for' may be a thing of the past.
>
> Is anyone working on fixing the exploding optimisations for many OR-s,
> at least the canonic case used by access?
>
> My impression is that this has fallen somewhere between
> insightdist and Vadim.

This is really big for the ODBCers. (And I suspect for JDBCers too.) Many
desktop libraries and end-user tools depend on this "record set" strategy to
operate effectively.

I have put together a workable hack that runs just before cnfify(). The
option is activated through the SET command. Once activated, it identifies
queries with this particular multi-OR pattern generated by these RECORD SET
strategies. Qualified query trees are rewritten as multiple UNIONs. (One
for each OR grouping).

The results are profound. Queries that used to scan tables because of the
ORs, now make use of any indexes. Thus, the size of the table has virtually
no effect on performance. Furthermore, queries that used to crash the
backend, now run in under a second.

Currently the down sides are:
1. If there is no usable index, performance is significantly worse. The
patch does not check to make sure that there is a usable index. I could use
some pointers on this.

2. Small tables are actually a bit slower than without the patch.

3. Not very elegant. I am looking for a more generalized solution.
I have lots of ideas, but I would need to know the backend much better before
attempting any of them. My favorite idea is before cnfify(), to factor the
OR terms and pull out the constants into a virtual (temporary) table spaces.
Then rewrite the query as a join. The optimizer will (should) treat the new
query accordingly. This assumes that an efficient factoring algorithm exists
and that temporary tables can exist in the heap.

Illustration:
SELECT ... FROM tab WHERE
(var1 = const1 AND var2 = const2) OR
(var1 = const3 AND var2 = const4) OR
(var1 = const5 AND var2 = const6)

SELECT ... FROM tab, tmp WHERE
(var1 = var_x AND var2 = var_y)

tmp
var_x | var_y
--------------
const1|const2
const3|const4
const5|const6

Comments?

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Bruce Momjian 1998-07-15 23:39:05 Re: [INTERFACES] Re: [HACKERS] changes in 6.4
Previous Message Bruce Momjian 1998-07-15 21:02:40 Re: [INTERFACES] Re: [HACKERS] changes in 6.4

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 1998-07-15 23:39:05 Re: [INTERFACES] Re: [HACKERS] changes in 6.4
Previous Message Bruce Momjian 1998-07-15 21:02:40 Re: [INTERFACES] Re: [HACKERS] changes in 6.4

Browse pgsql-interfaces by date

  From Date Subject
Next Message Ross, Brenton 1998-07-15 23:37:35 Distributed Applications
Previous Message Bruce Momjian 1998-07-15 21:02:40 Re: [INTERFACES] Re: [HACKERS] changes in 6.4