Speed Question

From: Noah Silverman <noah(at)allresearch(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Speed Question
Date: 2002-12-20 22:57:28
Message-ID: 69932226-146E-11D7-8943-000393AA8F3C@allresearch.com
Lists: pgsql-performance

Hello,

We are considering switching our systems over from MySQL to PostgreSQL.

Speed is one of our major concerns, so before switching we've decided
to perform some speed tests. From what I understand, PostgreSQL is NOT
as fast as MySQL, but it should be close enough.

We've installed the software and have run some basic insert, index,
and query tests that seem ridiculously slow. I can't help thinking
that we are doing something wrong, or that we don't have things
configured for optimal performance.

We've performed these same tests on MySQL, and they run dramatically
faster.

Here are the initial performance test results and issues...

Table configuration:
speedtest( prop1 integer, prop2 integer, prop3 integer, prop4 integer);
indexes on each of the four individual property fields
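
Concretely, the schema is roughly this (paraphrased from our test
script; the index names p1 through p4 are my shorthand, matching the
p4 that shows up in the query plan below):

    CREATE TABLE speedtest (
        prop1 integer,
        prop2 integer,
        prop3 integer,
        prop4 integer
    );

    -- One single-column index per property field.
    CREATE INDEX p1 ON speedtest (prop1);
    CREATE INDEX p2 ON speedtest (prop2);
    CREATE INDEX p3 ON speedtest (prop3);
    CREATE INDEX p4 ON speedtest (prop4);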

Each record consists of four random integers, uniformly distributed
between 0 and 1000. The integers are computed in the Perl script used
to populate the table, not by an SQL random() function.

Hardware configuration: P3-500, 384MB RAM, *unloaded* system.
Software configuration: Linux 2.4.20, ReiserFS, standard Slackware
install.

Issue #1: Speed of inserts is relatively slow. 100,000 inserts take
roughly 10 minutes. This isn't EVIL, but MySQL appears to be about
ten times faster here. Is there something we could do differently
with the indexes? Disable transactions? Is there a more "raw" insert,
which may not set off triggers?
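
Two variations we are considering but have not yet benchmarked (so
treat these as guesses on my part, not results): batching the whole
load into one transaction so each row doesn't pay for its own commit,
and bulk-loading with COPY to skip per-statement overhead:

    -- Batch the load in one transaction instead of one commit per row.
    BEGIN;
    INSERT INTO speedtest VALUES (1, 2, 3, 4);
    -- ... remaining 99,999 rows ...
    COMMIT;

    -- Or bulk-load from a tab-delimited file with COPY.
    COPY speedtest FROM '/tmp/speedtest.dat';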

Issue #2: It doesn't appear as though multiple indexes are being used.
That is, the query

    SELECT count(*) FROM speedtest
     WHERE (prop1 BETWEEN 100 AND 200)
       AND (prop2 BETWEEN 100 AND 200)
       AND (prop3 BETWEEN 100 AND 200)
       AND (prop4 BETWEEN 100 AND 200);

formulates a query plan that only uses one index. The following is
pasted from the 'explain select' output:

 Aggregate  (cost=17.16..17.16 rows=1 width=0)
   ->  Index Scan using p4 on speedtest  (cost=0.00..17.16 rows=1 width=0)
         Index Cond: ((prop4 >= 100) AND (prop4 <= 200))
         Filter: ((prop1 >= 100) AND (prop1 <= 200) AND (prop2 >= 100) AND
                 (prop2 <= 200) AND (prop3 >= 100) AND (prop3 <= 200))
(4 rows)
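
(To get actual run times rather than just planner estimates, the same
query can be run under EXPLAIN ANALYZE:)

    EXPLAIN ANALYZE
    SELECT count(*) FROM speedtest
     WHERE (prop1 BETWEEN 100 AND 200)
       AND (prop2 BETWEEN 100 AND 200)
       AND (prop3 BETWEEN 100 AND 200)
       AND (prop4 BETWEEN 100 AND 200);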

It appears as though the index on prop4 is being used to determine a
subset of records to fetch, subsequently filtering them with the other
conditions. Unfortunately, since the index condition matches 10% of
the table (due to the random uniform integers from 0-1000), this
results in a large number of record fetches and examinations the db
engine must make. This query takes at least a second to execute,
whereas we would like to be able to drop this into the sub-0.1 second
range, and preferably into the millisecond range. While this would run
faster on the production machines than on my workstation, it is still
a fundamental flaw that multiple indexes aren't being combined to
restrict the record set to fetch.

OTOH, if we could do index combining, we could fetch 10% of 10% of 10%
of the initial 10% of records (0.1^4 = 0.01% of the table, i.e.
roughly 10 of the 100,000 rows), resulting in a microscopic number of
items to retrieve and examine.
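
One partial workaround I can imagine (untested on our side, and only
my guess at the planner's behavior): a single multicolumn index
covering all four properties. My understanding is that only the
leading column's range limits how much of a B-tree gets scanned, with
the other columns checked against index entries during the scan, so
this wouldn't reach the 0.01% figure, but it should at least cut down
on heap fetches:

    -- Sketch: one combined index instead of four separate ones.
    -- Only the range on prop1 (the leading column) bounds the scan;
    -- the prop2-prop4 conditions are checked against index entries,
    -- avoiding heap fetches for rows that fail them.
    CREATE INDEX p_all ON speedtest (prop1, prop2, prop3, prop4);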

Can anybody give me some ideas as to what I am doing wrong???

Thanks,

-Noah
