Re: Serious performance problem

From: Brent Verner <brent(at)rcfile(dot)org>
To: "Tille, Andreas" <TilleA(at)rki(dot)de>
Cc: "Claus, Hermann" <ClausH(at)rki(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serious performance problem
Date: 2001-10-30 11:48:40
Message-ID: 20011030064840.A58106@rcfile.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 30 Oct 2001 at 11:44 (+0100), Tille, Andreas wrote:
| On Mon, 29 Oct 2001, Vsevolod Lobko wrote:
|
| > Seems that problem is very simple :))
| > MSSql can do queries from indexes, without using actual table at all.
| > Postgresql doesn't.
| >
| > So mssql avoids sequental scanning of big table, and simply does scan of
| > index which is already in needed order and has very much less size.
| I forewarded this information to my colleague and he replied the following
| (im translating from German into English):
|
| hc> I expected this problem. But what is the purpose of an index: Not
| hc> to look into the table itself. Moreover this means that the expense
| hc> grows linear with the table size - no good prospect at all (the
| hc> good thing is it is not exponential :-)).
| I have to explain that we are in the *beginning* of production process.
| We expect a lot more of data.
|
| hc> In case of real index usage the expense grows only with log(n).
| hc> No matter about the better philosophy of database servers, MS-SQL-Server
| hc> has a consequent index usage and so it is very fast at many queries.
| hc> When performing a query to a field without index, I get a slow
| hc> table scan. This is like measuring the speed of the harddisk and
| hc> the cleverness of the cache.
|
| The consequence for my problem is now: If it is technically possible
| to implement index scans without table lookups please implement it. If
| not we just have to look for another database engine which does so,
| because our applictaion really need the speed on this type of queries.
| I repeat from my initial posting: The choice of the server for our
| application could have importance for many projects in the field of
| medicine in Germany. I really hope that there is a reasonable solution
| which perhaps could give a balance between safety and speed. For
| example I can assure in my application that the index, once created
| will be valid, because I just want to read in a new set of data once
| a day (from the MS-SQL Server which collects data over the day). So
| I could recreate all indices after the import and the database is
| readonly until the next cron job. Is there any chance to speed up
| those applications?

CREATE INDEX idx_meldekategorie_hauptdaten_f
ON hauptdaten_fall(meldekategorie);
CLUSTER idx_meldekategorie_hauptdaten_f ON hauptdaten_fall;

Aggregate (cost=5006.02..5018.90 rows=258 width=16)
-> Group (cost=5006.02..5012.46 rows=2575 width=16)
-> Sort (cost=5006.02..5006.02 rows=2575 width=16)
-> Seq Scan on hauptdaten_fall (cost=0.00..4860.12 rows=2575 width=16)

This looks much nicer, but is still quite slow. I'm quite sure the
slowness is in the sort(), since all queries that don't sort, return
quickly. I hoped the clustered index would speed up the sort, but
that is not the case.

It _seems_ a simple optimization would be to not (re)sort the tuples
when using a clustered index.

if( the_column_to_order_by_is_clustered ){
if( order_by_is_DESC )
// reverse the tuples to handle
}

I haven't looked at the code to see if this is even feasible, but I
do imagine there is enough info available to avoid an unnecessary
sort on the CLUSTERED index. The only problem I see with this is
if the CLUSTERed index is not kept in a CLUSTERed state as more
records are added to this table.

brent

--
"Develop your talent, man, and leave the world something. Records are
really gifts from people. To think that an artist would love you enough
to share his music with anyone is a beautiful thing." -- Duane Allman

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jean-Michel POURE 2001-10-30 12:06:44 Re: [HACKERS] Serious performance problem
Previous Message Zeugswetter Andreas SB SD 2001-10-30 11:29:50 Re: Serious performance problem