Re: Select count(*), the sequel

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kenneth Marshall <ktm(at)rice(dot)edu>
Cc: Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Select count(*), the sequel
Date: 2010-10-26 22:45:30
Message-ID: AANLkTik6ifMywZq7C1_MiHLafOQ53_OXreHYQdps3GqP@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sat, Oct 16, 2010 at 2:44 PM, Kenneth Marshall <ktm(at)rice(dot)edu> wrote:
> Interesting data points. The amount of rows that you managed to
> insert into PostgreSQL before Oracle gave up the ghost is 95%
> of the rows in the Oracle version of the database. To count 5%
> fewer rows, it took PostgreSQL 24 seconds longer. Or adjusting
> for the missing rows, 52 seconds longer for the entire table
> or 18% longer than the full table scan in Oracle. This seems to
> be well within the table layout size differences, possibly due
> to the fillfactor used --not really bad at all.

I don't think this is due to fillfactor - the default fillfactor is
100, and anyway we ARE larger on disk than Oracle. We really need to
do something about that, in the changes to NUMERIC in 9.1 are a step
in that direction, but I think a lot more work is needed. I think it
would be really helpful if we could try to quantify where the extra
space is going.

Some places to look:

- Bloated representations of individual datatypes. (I know that even
the new NUMERIC format is larger than Oracle's NUMBER.)
- Excessive per-tuple overhead. Ours is 24 bytes, plus the item pointer.
- Alignment requirements. We have a fair number of datatypes that
require 4 or 8 byte alignment. How much is that hurting us?
- Compression. Maybe Oracle's algorithm does better than PGLZ.

If we can quantify where we're losing vs. Oracle - or any other
competitor - that might give us some idea where to start looking.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Robert Haas 2010-10-26 22:50:37 Re: Postgres insert performance and storage requirement compared to Oracle
Previous Message Merlin Moncure 2010-10-26 22:14:15 Re: Postgres insert performance and storage requirement compared to Oracle