Re: Serious performance problem

From: "Tille, Andreas" <TilleA(at)rki(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serious performance problem
Date: 2001-11-04 16:24:04
Message-ID: Pine.LNX.4.33.0111041723090.15449-100000@wr-linux02.rki.ivbb.bund.de
Lists: pgsql-hackers

On Fri, 2 Nov 2001, Zeugswetter Andreas SB SD wrote:

> This is not true, since the index scan also needs to read the leaf pages
> in MS Sql. The number of leaf pages grows linearly with the number of rows
> that qualify the where restriction.
>
> R = number of rows that qualify
> --> O(R + log(R))
>
> The pg measurements showed that PostgreSQL query performance can be
> expected to stay nearly the same regardless of the number of rows in the
> table, as long as the number of rows that qualify the where restriction
> stays constant. The response time is linear in the number of rows that
> qualify the where restriction, but that linear behavior is also expected
> with MS Sql.
Well, maybe you are right here, but I talked once more with my colleague
about the specifications. We can be sure that the input of data is about
1 GB, because what has to be stored is fixed in the German law on
infectious diseases. We have no online shop system or anything like that.
<sarcastic>If the recent anthrax problem were to grow exponentially we
could be in trouble, but chances are low.</sarcastic> So we can estimate
the amount of data quite well: it is a linear growth of 1 GB per year. If
MS SQL Server is fast enough now, we can keep up with that growth through
normal hardware performance increases over the years. This is a fact I
have to accept.

An additional constraint is that the underlying data model is used by an
Access application run by about 170 clients, each holding about 100 - 500
data sets which they export once a week to our central server. The
developers tried hard to keep the Access application and the MS SQL Server
solution in sync, and maintaining a third application (by rewriting some
500 queries) would be a lot of work. (I'm not afraid of this work, but I
must be sure it would make sense before I start, so I hope for advice from
people who have perhaps done the same.)

I discussed with him the idea of using statistics tables to speed up
certain queries. He told me that this technique is known as OLAP cubes in
MS SQL Server and that there are tools to build such things. Is this a
valid comparison? He did not use it because it would break the Access
solution for our clients. Are there any tools for PostgreSQL for this kind
of thing besides manually creating the tables and triggers?
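
To show what I mean by doing it manually, here is a minimal sketch (the
base table "cases" and its "disease" column are made up and only resemble
our real schema):

  -- summary table holding one precomputed count per disease
  CREATE TABLE case_counts (
      disease  text PRIMARY KEY,
      n_cases  integer NOT NULL DEFAULT 0
  );

  -- trigger function keeping the count current on every insert
  -- (assumes the plpgsql language is installed in the database and that
  --  one row per disease already exists in case_counts)
  CREATE FUNCTION bump_case_count() RETURNS opaque AS '
  BEGIN
      UPDATE case_counts SET n_cases = n_cases + 1
       WHERE disease = NEW.disease;
      RETURN NEW;
  END;
  ' LANGUAGE 'plpgsql';

  CREATE TRIGGER cases_count_trg AFTER INSERT ON cases
      FOR EACH ROW EXECUTE PROCEDURE bump_case_count();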

Currently I see two options to solve my problem:
1. Hoping that 'index coverage' is implemented (perhaps by a
   patch ... somebody asked about it but got no response) in 7.2 or at
   least 7.3.
   In this case I would try to do my best with the statistics tables,
   but I would not want to cope with reworking all that stuff if at
   some stage our data model changed. (A short illustration of what I
   mean by 'index coverage' follows below.)
2. Giving MySQL a trial, because I expect it to solve my problem in
   the fashion I need. (Well - read-only is OK, and without features
   like MVCC it perhaps has faster index scans.) I would definitely
   come back to PostgreSQL once 'index coverage' or some other method
   to speed up index searches is implemented.
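
To make clear what I mean by 'index coverage', here is a rough sketch
(the table and column names are made up and only resemble our schema):

  -- hypothetical reporting table with a two-column index
  CREATE INDEX reports_disease_date_idx ON reports (disease, report_date);

  -- every column the query touches is in the index, so a server with
  -- 'index coverage' could answer it from the index alone; as far as I
  -- understand, PostgreSQL today still has to visit the table rows
  SELECT count(*)
    FROM reports
   WHERE disease = 'measles'
     AND report_date >= '2001-01-01';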

Could somebody give me advice on the best strategy? (Perhaps I should
switch back to pgsql-general for this question, but I definitely want to
hear a statement from the hackers about future implementation plans!)

By the way, in my former postings I forgot to mention a further problem
which stayed unanswered in my questions on pgsql-general: while watching
"top" during a query (over some 30 seconds), the memory used by postgresql
increases heavily while the query executes. I wonder if it would help if
there were some mechanism to keep some of the database resident in memory.
I surely know that memory handling on Linux/UNIX is different from Windows
(and this is a great feature ;-) ), but if I have plenty of free memory
(2 GB) and my box was not swapping at any time, I wonder whether it should
not be possible to hold some information in memory instead of simply
relying on the OS disk cache. Any opinions?
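
Is increasing the shared buffers in postgresql.conf the kind of knob I
should be looking at? For example (the numbers below are only
placeholders, not values I have tested):

  # postgresql.conf -- keep more of the database in PostgreSQL's own
  # shared buffer cache instead of relying only on the OS disk cache
  # (large settings may require raising the kernel shared memory limits)
  shared_buffers = 32768        # 8 kB pages, i.e. roughly 256 MB
  sort_mem = 8192               # per-sort working memory in kB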

Kind regards

Andreas.
