Re: [INTERFACES] Re: [HACKERS] changes in 6.4

From: David Hartwig <daybee(at)bellatlantic(dot)net>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc: hannu(at)trust(dot)ee, pgsql-interfaces(at)postgreSQL(dot)org, hackers(at)postgreSQL(dot)org
Subject: Re: [INTERFACES] Re: [HACKERS] changes in 6.4
Date: 1998-08-23 23:55:29
Message-ID: 35E0ABF0.578694C8@bellatlantic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers pgsql-interfaces

Bruce Momjian wrote:

> >
> > Hannu Krosing wrote:
> >
> > > > The days where every release fixed server crashes, or added a feature
> > > > that users were 'screaming for' may be a thing of the past.
> > >
> > > Is anyone working on fixing the exploding optimisations for many OR-s,
> > > at least the canonic case used by access?
> > >
> > > My impression is that this has fallen somewhere between
> > > insightdist and Vadim.
> >
> > This is really big for the ODBCers. (And I suspect for JDBCers too.) Many
> > desktop libraries and end-user tools depend on this "record set" strategy to
> > operate effectively.
> >
> > I have put together a workable hack that runs just before cnfify(). The
> > option is activated through the SET command. Once activated, it identifies
> > queries with this particular multi-OR pattern generated by these RECORD SET
> > strategies. Qualified query trees are rewritten as multiple UNIONs. (One
> > for each OR grouping).
> >
> > The results are profound. Queries that used to scan tables because of the
> > ORs, now make use of any indexes. Thus, the size of the table has virtually
> > no effect on performance. Furthermore, queries that used to crash the
> > backend, now run in under a second.
> >
> > Currently the down sides are:
> > 1. If there is no usable index, performance is significantly worse. The
> > patch does not check to make sure that there is a usable index. I could use
> > some pointers on this.
> >
> > 2. Small tables are actually a bit slower than without the patch.
> >
> > 3. Not very elegant. I am looking for a more generalized solution.
> > I have lots of ideas, but I would need to know the backend much better before
> > attempting any of them. My favorite idea is before cnfify(), to factor the
> > OR terms and pull out the constants into a virtual (temporary) table spaces.
> > Then rewrite the query as a join. The optimizer will (should) treat the new
> > query accordingly. This assumes that an efficient factoring algorithm exists
> > and that temporary tables can exist in the heap.
> >
> > Illustration:
> > SELECT ... FROM tab WHERE
> > (var1 = const1 AND var2 = const2) OR
> > (var1 = const3 AND var2 = const4) OR
> > (var1 = const5 AND var2 = const6)
> >
> > SELECT ... FROM tab, tmp WHERE
> > (var1 = var_x AND var2 = var_y)
> >
> > tmp
> > var_x | var_y
> > --------------
> > const1|const2
> > const3|const4
> > const5|const6
>
> David, where are we on this? I know we have OR's using indexes. Do we
> still need to look this as a fix, or are we OK. I have not gotten far
> enough in the optimizer to know how to fix the

Bruce,

If the question is, have I come up with a solution for the cnf'ify problem: No

If the question is, is it still important: Very much yes.

It is essential for many RAD tools using remote data objects which make use of key
sets. Your recent optimization of the OR list goes a long way, but inevitably
users are confronted with multi-part keys.

When I look at the problem my head spins. I do not have the experience (yet?)
with the backend to be mucking around in the optimizer. As I see it, cnf'ify is
doing just what it is supposed to do. Boundless boolean logic.

I think hope may lay though, in identifying each AND'ed group associated with a key
and tagging it as a special sub-root node which cnf'ify does not penetrate. This
node would be allowed to pass to the later stages of the optimizer where it will be
used to plan index scans. Easy for me to say.

In the meantime, I still have the patch that I described in prior email. It has
worked well for us. Let me restate that. We could not survive without it!
However, I do not feel that is a sufficiently functional approach that should be
incorporated as a final solution. I will submit the patch if you, (anyone) does
not come up with a better solution. It is coded to be activated by a SET KSQO to
minimize its reach.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Sferacarta Software 1998-08-24 07:54:23 Re[2]: [GENERAL] DELETE statement KILL backend <AGAIN>
Previous Message Rostislav Matl 1998-08-23 19:40:48 inserting 3D objects/scenes into PostgreSQL

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 1998-08-24 00:24:58 Re: [HACKERS] initdb problem
Previous Message Bruce Momjian 1998-08-23 22:30:14 Re: [HACKERS] Autoconf'd test for int64

Browse pgsql-interfaces by date

  From Date Subject
Next Message Bruce Momjian 1998-08-24 01:21:56 Re: [HACKERS] Convert PGconn, PGresult to opaque types?
Previous Message Peter T Mount 1998-08-23 19:47:02 Re: [INTERFACES] jdbc driver compilation under linux