Re: [HACKERS] Re: [INTERFACES] ODBC is slow with M$-Access Report

From: David Hartwig <daveh(at)insightdist(dot)com>
To: Hannu Krosing <hannu(at)trust(dot)ee>
Cc: "Jose' Soares Da Silva" <sferac(at)bo(dot)nettuno(dot)it>, PostgreSQL Interfaces <pgsql-interfaces(at)postgreSQL(dot)org>, hackers postgres <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [HACKERS] Re: [INTERFACES] ODBC is slow with M$-Access Report
Date: 1998-06-02 18:43:14
Message-ID: 357447C1.A41C8257@insightdist.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-interfaces

Hannu Krosing wrote:

> David Hartwig wrote:
> >
> > > 1. The PostgreSQL doesn't use the index with "OR" operator and
> > > so is not possible to define a multiple key to use with M$-Access
> > > and we need to retreat using OID as primary keys (thanks to Byron
> > > Nikolaidis and David Hartwig of insightdist.com that are doing a
> > > really great job with ODBC driver), but with the obvious consequences.
> >
> > I am currently working on a solution as time will allow. Hopefully part of 6.4
> >
>
> Will this solution be in ODBC driver (rewrite ORs to UNION) or in
> the backend (fix the optimiser)?
>

The short answer is that the rewrite on the driver side is problematic.

I had hoped to be further along with my feasibility research before raising the issue
for again discussion. But, now is as good a time as any. Let me first clarify the
problem for any ppl jumping into the middle of this thread.

Many general purpose database clients applications such as MS Access routinely generate
queries with the following signature:

SELECT k1, k2, k3, a4, a5, ... FROM t WHERE
(k1 = const01 AND k2 = const02 AND k3 = const03) OR
(k1 = const11 AND k2 = const12 AND k3 = const13) OR
(k1 = const21 AND k2 = const22 AND k3 = const23) OR
(k1 = const31 AND k2 = const32 AND k3 = const33) OR
(k1 = const41 AND k2 = const42 AND k3 = const43) OR
(k1 = const51 AND k2 = const52 AND k3 = const53) OR
(k1 = const61 AND k2 = const62 AND k3 = const63) OR
(k1 = const71 AND k2 = const72 AND k3 = const73) OR
(k1 = const81 AND k2 = const82 AND k3 = const73) OR
(k1 = const91 AND k2 = const92 AND k3 = const93);

Where k(n) id is the attribute for a multi-part primary key and const(m)(n) is any
constant.

Performance on this kind of a query is crucial to these client side tools. These are
used to maneuver through large tables without having to slurp in the entire table.
Currently the backend optimizer tries to arrange the WHERE clause into conjunctive
normal form (cnfify()). Unfortunatley this process leads to memory exhaustion.

I have come up with 3 methods of attacking the problem.

1. As Mr. Krosing mentioned we could rewrite the query on the driver side. before
sending it to the backend. One could identify the signature of such a query and upon
verification replace all the ORs with a "UNION SELECT k1, k2, k3, a4, ... FROM t
WHERE" I have tested this substitution with up to 30 OR groupings and it performs
like a charm. Thanks Bruce. Here is the kicker. If you do some guestimations using
a table with say 50 attributes, you will see that very quickly you will be bumping into
the 8K message limit. I am finding that this is not unusual in our user community.

2. Use a similar strategy to the first method except you do the rewrite the query in
the backend; some where after parsing and before optimizations. The exact location
can be debated. The basic idea is to pre-qualify the rewrite by requiring only one
table, no subselects, no unions, etc. Then, identify the AND/OR signature in the
qualifier expression tree. For each OR grouping, clone the main query tree (minus the
qualifier clause) onto a list of query trees hanging off the UNION structure element.
All the while, pruning the OR nodes off and attaching them to the cloned query tree.
The code required by this approach is very isolated and should be low risk as a
result. My concern is that this approach is too narrow and does not fit well into the
long term goals of the project. My guess is that performance will be even better than
the first method.

3. Get out of the way and let Vadim do his thing.

Comments?

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 1998-06-02 21:52:17 Re: [HACKERS] Re: [INTERFACES] ODBC is slow with M$-Access Report
Previous Message Bruce Momjian 1998-06-02 17:26:00 Re: [HACKERS] An easier way to upgrade (Was: Lots 'o patches)

Browse pgsql-interfaces by date

  From Date Subject
Next Message Tim Bosinius 1998-06-02 20:26:46 ODBC v243 with MSQuery & Word
Previous Message Tim Bosinius 1998-06-02 17:33:54 RE: [INTERFACES] whoops: VB RDO works just fine with new odbc dri ver....