7.4 performance issue

From: Christophe Musielak <cmusielak(at)akio-software(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: 7.4 performance issue
Date: 2004-06-16 09:07:50
Message-ID: 1087376870.1861.60.camel@fangorn
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

We're working since 4 years on Postgres for a e-crm web based
application that deals with emails.

As our customers bases keep growing (more than one million rows in
sereval tables) we started to face performance issue with 7.2 and
decided to upgrade to 7.4 since one month.

The upgrade process was ok.

But we still face performances problems.

We decided to import the database into SQLServer to make some benchmarks
between the two with same hardware and same data.

Please find included main query we're testing on, the explain plan and
indexes list.

Here is the result :

Total running time Postges: 10 - 11s
Total running time sql Server : < 1s

We must admit that we're quite disappointed with the result and
currently think there is something wrong in the way we're working with
Postgres.

What do you think? is there some wrong with the configuration? with the
way we're building the query?

Thanks a lot for your answers.

Christophe Musielak
Christine Bruzaud
Akio Software

PS : Concerning the seq_scan shown in the explain plan, as there is
indexes on the tables as show below, we think Postgres is choosing
seq_scan versus index_scan to improve performance. We tried to force
using index_scan with 'set enable_seqscan = false) but there is no
improvement in the total duration of the query.

------------------------------------------------------------------------

We're running our tests on :

Table threads rows number : 125 000
Table emails rows number : 650 000
Table operators rows number : 50
Total nb rows returned by the select : 116 000 (without the LIMIT)
explain SELECT t.*, substring(t.subject::text, 0, 30) AS subject,
e.email,
o.lastname AS "operator"
FROM threads t JOIN emails e ON (e.id = t.client_email_id)
JOIN operators o ON (o.id = t.operator_id)
WHERE t.mailbox_id IN ( 2,3,5,20,21,13,22,23,24,25,26,19 )
and t.desktop = 2
ORDER BY t.date_last asc, t.id asc
LIMIT 16 OFFSET 0;

QUERY PLAN
------------------------------------------------------------------

Limit (cost=93539.02..93539.06 rows=16 width=619)
-> Sort (cost=93539.02..93745.69 rows=82669 width=619)
Sort Key: t.date_last, t.id
-> Hash Join (cost=26186.03..73789.00 rows=82669 width=619)
Hash Cond: ("outer".operator_id = "inner".id)
-> Merge Join (cost=26183.07..69487.26 rows=82668
width=609)
Merge Cond: ("outer".id = "inner".client_email_id)
-> Index Scan using emails_pkey on emails e
(cost=0.00..37782.57 rows=654511 width=30)
-> Sort (cost=26183.07..26389.74 rows=82667
width=583)
Sort Key: t.client_email_id
-> Seq Scan on threads t
(cost=0.00..19431.23
rows=82667 width=583)
Filter: (((mailbox_id = 2) OR
(mailbox_id
= 3) OR (mailbox_id = 5) OR (mailbox_id = 20) OR (mailbox_id = 21) OR
(mailbox_id = 13) OR (mailbox_id = 22) OR (mailbox_id =
23)
OR (mailbox_id = 24) OR (mailbox_id = 25) OR (mailbox_id = 26)
OR (mailbox_id = 19)) AND (desktop = 2))
-> Hash (cost=2.85..2.85 rows=44 width=14)
-> Seq Scan on operators o (cost=0.00..2.85
rows=44
width=14)

------------------------------------------------------------------

Indexes on threads :

threads_pkey PRIMARY KEY (id)
CREATE INDEX threads_date_last_idx ON threads (date_last);
CREATE INDEX threads_desktop_idx ON threads (desktop);
CREATE INDEX threads_operator_id_idx ON threads (operator_id);
CREATE UNIQUE INDEX threads_pri_dlast_id_idx ON threads (priority,
date_last, id);

Indexes on emails :

emails_pkey PRIMARY KEY (id)

Indexes on operators :

operators_pkey PRIMARY KEY (id)

-----------------------------------------------------------------
postgresql.conf modified parameters :

shared_buffers = 14000 # = 112 Mo with 10 Mo used by wal_buffers
sort_mem = 51200 # = 50 Mo
vacuum_mem = 102400 # = 100 Mo
# Recommended : 64 MB for 1 - 2 Go RAM
wal_buffers = 1280 # = 10240 Ko = 10 Mo
effective_cache_size = 65536 # Choice : 50% of RAM
# <=> 1 * 1024 * 1024 * .50 / 8 = 65536
random_page_cost = 2 # make the planner favor indexscans
cpu_tuple_cost = 0.042 # got the planner to choose the index

stats_command_string = true
stats_block_level = true
stats_row_level = true

datestyle = 'iso, dmy'
------------------------------------------------------------------

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Brendan Jurd 2004-06-16 09:34:13 Re: Can you help me with this query?
Previous Message Frank van Vugt 2004-06-16 08:42:08 Re: Why is a union of two null-results automatically casted to type text ?