Re: Serious performance problem

From: Hannu Krosing <hannu(at)tm(dot)ee>
To: "Tille, Andreas" <TilleA(at)rki(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serious performance problem
Date: 2001-10-30 10:20:57
Message-ID: 3BDE7F09.12A1CC4C@tm.ee
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Tille, Andreas" wrote:
>
> On Mon, 29 Oct 2001, Ross J. Reedstrom wrote:
>
> > Here's what I see: Your example touches on what can be an achilles
> > heel for pgsql's current statistical analyzer: selection on data fields
> > that have a few common values. Often, the indices don't get used, since
> > a large fraction of the table needs to be scanned, in any case. In
> > your example, fully 68% of the table fits the where condition.
> > ...
> >
> > I think we have a winner. No it's not sub-second, but I improved the time
> > by 3x just by trying some indices. Note that I _still_ had to force the
> > use of indices for this one. It's also the first time I've personally seen
> > a query/dataset that benefits this much from a two-key index.
> This is true for this example and I also played with indices as you. I also
> enforced the index scan and compared with forbidding the index scan. The
> result was on my more realistic examples that both versions performed quite
> the same. There was no *real* difference. For sure in this simple query there
> is a difference but the real examples showed only 2% - 5% speed increase
> (if not slower with enforcing index scans!).

I studied his dataset and found that a simple count(*) on whole table
took 1.3 sec on my Celeron 375 so I'm sure that the more complex query,
which has to visit 2/3 of tuples will not be able to execute under 1 sec

My playing with indexes / subqueries and query rewriting got the example
query (actually a functional equivalent) to run in ~5 sec with simple
aggregate(group(indexscan))) plan and I suspect that this is how fast
it will be on my hardware

It could probably be soon possible to make it run in ~ 1.5 by using an
aggregate
function that does a sequential scan and returns a rowset.

> > As another poster replied to you, there is limitation with postgresql's
> > use of indices that arises from MVCC: even if the only data requested is
> > that stored in the index itself, the backend must visit the actual tuple
> > in the table to ensure that it is 'visible' to the current transaction.
> Any possibility to switch of this temporarily for certain queries like this
> if the programmer could make sure that it is not necessary? Just a stupid
> idea from a bloody uneducated man in database-engeniering.

There have been plans to set aside a bit in index that would mark the
deleted
tuple. Unfortunately this helps only in cases when there are many
deleted tuples
and all live tuples have to be checked anyway ;(

--------------
Hannu

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jean-Michel POURE 2001-10-30 10:38:48 Re: DROP/CREATE
Previous Message Jean-Michel POURE 2001-10-30 10:17:51 Re: [HACKERS] Serious performance problem