Re: planner/optimizer question

From: "Gary Doades" <gpd(at)gpdnet(dot)co(dot)uk>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: planner/optimizer question
Date: 2004-04-29 18:13:51
Message-ID: 409153EF.24647.144E9BF7@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 29 Apr 2004 at 19:03, Manfred Koizar wrote:

> While the storage overhead could be reduced to 1 bit (not a joke) we'd
> still have the I/O overhead of locating and updating index tuples for
> every heap tuple deleted/updated.

But this is what a lot of DBMSs do and seem to do well enough. I can see that the
MVCC system gives additional problems, but maybe it shouldn't be dismissed so lightly.

Coming from a MS SQLServer platform I have spent a lot of time optimising SQL in
PostgreSQL to be comparable to SQLServer. For the most part I have done this, but
some things are just slower in PostgreSQL.

Recently I have been looking at raw performance (CPU, IO) rather than the plans. I
have some test queries that (as far as I can determine) use the same access plans on
PostgreSQL and SQLServer. Getting to the detail, an index scan of an index on a
integer column (222512 rows) takes 60ms on SQLServer and 540ms on PostgreSQL.
A full seq table scan on the same table without the index on the other hand takes 370ms
in SQLServer and 420ms in PostgreSQL.

I know that the platforms are different (windows 2000 vs Linux 2.6.3), but the statement
was executed several times to make sure the index and data was in cache (no disk io)
on both systems. Same data, Same CPU, Same disks, Same memory, Same
motherboards.

The only thing I can think of is the way that the index scan is performed on each
platform, SQLServer can use the data directly from the index. This makes the biggest
difference in multi join statements where several of the intermediate tables do not need
to be accessed at all, the data is contained in the join indexes. This results in almost an
order of magnitude performance difference for the same data.

I would be nice to get a feel for how much performance loss would be incurred in
maintaining the index flags against possible performance gains for getting the data back
out again.

Regards,
Gary.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2004-04-29 18:21:51 Re: Wierd context-switching issue on Xeon
Previous Message Manfred Koizar 2004-04-29 17:59:13 Re: Simply join in PostrgeSQL takes too long