Re: Scaling concerns

From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "tsuraan" <tsuraan(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Scaling concerns
Date: 2006-12-17 07:18:25
Message-ID: C1AA2F41.15D43%llonergan@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Tsuraan,

"Select count(*) from bigtable" is testing your disk drive speed up till
about 300MB/s, after which it is CPU limited in Postgres.

My guess is that your system has a very slow I/O configuration, either due
to faulty driver/hardware or the configuration.

The first thing you should do is run a simple I/O test on your data
directory - write a file twice the size of memory using dd like this:

time bash -c "dd if=/dev/zero of=data_directory/bigfile bs=8k count=(2 *
memory_size / 8192) && sync"

time dd if=data_directory/bigfile of=/dev/null bs=8k

Then report the times here.

- Luke

On 12/16/06 9:26 AM, "tsuraan" <tsuraan(at)gmail(dot)com> wrote:

> I'm writing a webmail-type application that is meant to be used in a
> corporate environment. The core of my system is a Postgres database
> that is used as a message header cache. The two (relevant) tables
> being used are pasted into the end of this message. My problem is
> that, as the messages table increases to tens of millions of rows,
> pgsql slows down considerably. Even an operation like "select
> count(*) from messages" can take minutes, with a totally idle system.
> Postgres seems to be the most scalable Free database out there, so I
> must be doing something wrong.
>
> As for the most common strategy of having a slower (more rows)
> "archival" database and a smaller, faster "live" database, all the
> clients in the company are using their normal corporate email server
> for day-to-day email handling. The webmail is used for access email
> that's no longer on the corporate server, so it's not really simple to
> say which emails should be considered live and which are really
> out-of-date.
>
> My postgres settings are entirely default with the exception of
> shared_buffers being set to 40,000 and max_connections set to 400.
> I'm not sure what the meaning of most of the other settings are, so I
> haven't touched them. The machines running the database servers are
> my home desktop (a dual-core athlon 3200+ with 2GB RAM and a 120GB
> SATA II drive), and a production server with two dual-core Intel
> chips, 4 GB RAM, and a RAID 5 array of SATA II drives on a 3Ware 9550
> controller. Both machines are running Gentoo Linux with a 2.6.1x
> kernel, and both exhibit significant performance degradation when I
> start getting tens of millions of records.
>
> Any advice would be most appreciated. Thanks in advance!
>
> Tables:
>
> CREATE TABLE EmailAddresses (
> emailid SERIAL PRIMARY KEY, -- The unique identifier of this address
> name TEXT NOT NULL, -- The friendly name in the address
> addrspec TEXT NOT NULL, -- The user(at)domain part of the address
> UNIQUE(name, addrspec)
> );
>
> and
>
> CREATE TABLE Messages (
> -- Store info:
> msgkey BIGSERIAL PRIMARY KEY, -- Unique identifier for a message
> path TEXT NOT NULL, -- Where the message is on the file
> system
> inserted TIMESTAMP DEFAULT now(),-- When the message was fetched
> -- Message Info:
> msgid TEXT UNIQUE NOT NULL, -- Message's Message-Id field
> mfrom INTEGER -- Who sent the message
> REFERENCES EmailAddresses
> DEFAULT NULL,
> mdate TIMESTAMP DEFAULT NULL, -- Message "date" header field
> replyto TEXT DEFAULT NULL, -- Message-ID of replied-to message
> subject TEXT DEFAULT NULL, -- Message "subject" header field
> numatch INTEGER DEFAULT NULL, -- Number of attachments
> UNIQUE(path)
> );
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Bruno Wolff III 2006-12-17 11:18:00 Re: File Systems Compared
Previous Message Andreas Kostyrka 2006-12-17 03:25:27 Re: Scaling concerns