Re: Hardware upgrade for a high-traffic database

From: "Jason Coene" <jcoene(at)gotfrag(dot)com>
To: "'Merlin Moncure'" <merlin(dot)moncure(at)rcsonline(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Hardware upgrade for a high-traffic database
Date: 2004-08-11 21:18:27
Message-ID: 200408112118.i7BLIR0E097522@mail.gotfrag.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

>
> Right. The point is: is your i/o bottle neck on the read side or the
> write side. With 10-30 inserts/sec and fsync off, it's definitely on
> the read side. What's interesting is that such a low insert load is
> causing i/o storm problems. How does your app run with fsync on?
>
> With read-bound i/o problems, might want to consider upgrading memory
> first to get better cache efficiency. You may want to consider Opteron
> for > 4GB allocations (yummy!).
>
> The good news is that read problems are usually solvable by being
> clever, whereas write problems require hardware.
>

The difference with fsync being off makes seems to be that it allows the
server to write in groups instead of scattering our INSERT/UPDATE calls all
over - it helps keep things going. When a checkpoint occurs, reads slow
down there. Normal reads are usually quite fast, aside from some reads.

A good example, a comments table where users submit TEXT data. A common
query is to find the last 5 comments a user has submitted. The scan, while
using an index, takes a considerable amount of time (> 0.5 sec is about as
good as it gets). Again, it's using an index on the single WHERE clause
(userid = int). The field that's used to ORDER BY (timestamp) is also
indexed.

I'm wondering why our PG server is using so little memory... The system has
2GB of memory, though only around 200MB of it are used. Is there a PG
setting to force more memory usage towards the cache? Additionally, we use
FreeBSD. I've heard that Linux may manage that memory better, any truth
there? Sorry if I'm grabbing at straws here :)

> > One question I do have though - you specifically mentioned NOW() as
> > something to watch out for, in that it's mutable. We typically use
>
> This is specifically with regards to materialized views. Mutable
> functions cause problems because when they are pushed unto the view,
> they are refreshed...something to watch out for.
>
> The trick with MVs is to increase your filesystem cache efficiency. The
> big picture is to keep frequently read data in a single place to make
> better benefit of cache. Aggregates naturally read multiple rows to
> return a single row's worth of data so you want to target them first.
> This all comes at a cost of update I/O time and some application
> complexity.
>
> > as a subselect to retrieve the number of associated rows to the
> current
> > query. Additionally, we use NOW a lot, primarily to detect the status
> of
> > a
> > date, i.e.:
>
> Might want to check if your application middleware (php?) exposes
> PQntuples()...this is a zero cost way to get the same information.
>

Thanks, I'll look into it. We use C and PHP.

> > Based on feedback, I'm looking at a minor upgrade of our RAID
> controller
> > to
> > a 3ware 9000 series (SATA with cache, battery backup optional), and
> > re-configuring it for RAID 10. It's a damn cheap upgrade at around
> $350
> > and
> > an hour of downtime, so I figure that it's worth it for us to give it
> a
> > shot.
>
> p.s. you can also increase cache efficiency by reducing database size,
> for example use int2/int4 vs. numerics.
>

I've gone through and optimized data types as much as possible. I'll see
what else we can do w/o causing downtime once PG 8 is ready to go and we can
change data types on the fly.

Thanks,

Jason

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Brian Hirt 2004-08-11 21:31:48 Re: Hardware upgrade for a high-traffic database
Previous Message Steve Bergman 2004-08-11 20:51:14 Re: [HACKERS] fsync vs open_sync