Skip site navigation (1) Skip section navigation (2)

Re: Hardware recommendations to scale to silly load

From: Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net>
To: PgSQL Performance ML <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Hardware recommendations to scale to silly load
Date: 2003-08-27 02:59:06
Message-ID: 1061953146.18108.291.camel@haggis (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-performance
On Tue, 2003-08-26 at 20:35, matt wrote:
> I'm wondering if the good people out there could perhaps give me some
> pointers on suitable hardware to solve an upcoming performance issue. 
> I've never really dealt with these kinds of loads before, so any
> experience you guys have would be invaluable.  Apologies in advance for
> the amount of info below...
> 
> My app is likely to come under some serious load in the next 6 months,
> but the increase will be broadly predictable, so there is time to throw
> hardware at the problem.
> 
> Currently I have a ~1GB DB, with the largest (and most commonly accessed
> and updated) two tables having 150,000 and 50,000 rows.
> 
> A typical user interaction with the system involves about 15
> single-table selects, 5 selects with joins or subqueries, 3 inserts, and
> 3 updates.  The current hardware probably (based on benchmarking and
> profiling) tops out at about 300 inserts/updates *or* 2500 selects per
> second.
> 
> There are multiple indexes on each table that updates & inserts happen
> on.  These indexes are necessary to provide adequate select performance.
> 
> Current hardware/software:
> Quad 700MHz PIII Xeon/1MB cache
> 3GB RAM
> RAID 10 over 4 18GB/10,000rpm drives
> 128MB battery backed controller cache with write-back enabled

Much more cache needed.  Say 512MB per controller?

> Redhat 7.3, kernel 2.4.20
> Postgres 7.2.3 (stock redhat issue)

Upgrade to Pg 7.3.4!

> I need to increase the overall performance by a factor of 10, while at
> the same time the DB size increases by a factor of 50.  e.g. 3000

Are you *sure* about that????  3K updates/inserts per second xlates
to 10,800,000 per hour.  That, my friend, is a WHOLE HECK OF A LOT!

> inserts/updates or 25,000 selects per second, over a 25GB database with

Likewise: 90,000,000 selects per hour.

> most used tables of 5,000,000 and 1,000,000 rows.
> 
> Notably, the data is very time-sensitive, so the active dataset at any

During the 1 hour surge, will SELECTs at 10 minutes after the 
hour depend on INSERTs at 5 minutes after the hour?

If not, maybe you could pump the INSERT/UPDATE records into
flat files, to be processed after the 1-hour surge is complete.
That may reduce the h/w requirements.

> hour is almost certainly going to be more on the order of 5GB than 25GB
> (plus I'll want all the indexes in RAM of course).
> 
> Also, and importantly, the load comes but one hour per week, so buying a

Only one hour out of 168?????  May I ask what kind of app it is?

> Starfire isn't a real option, as it'd just sit idle the rest of the
> time.  I'm particularly interested in keeping the cost down, as I'm a
> shareholder in the company!

What a fun exercises.  Ok, lets see:
Postgres 7.3.4
RH AS 2.1
12GB RAM
motherboard with 64 bit 66MHz PCI slots
4 - Xenon 3.0GHz (1MB cache) CPUs
8 - 36GB 15K RPM as RAID10 on a 64 bit 66MHz U320 controller
    having 512MB cache (for database)
2 - 36GB 15K RPM as RAID1 on a 64 bit 66MHz U320 controller
    having 512MB cache (for OS, swap, WAL files)
1 - library tape drive plugged into the OS' SCSI controller.  I
    prefer DLT, but that's my DEC bias.
1 - 1000 volt UPS.

If you know when the flood will be coming, you could perform
SELECT * FROM ... WHERE statements on an indexed field, to
pull the relevant data into Linux's buffers.

Yes, the 8 disks is capacity-overkill, but the 8 high-speed
spindles is what you're looking for.

> So what do I need?  Can anyone who has (or has ever had) that kind of
> load in production offer any pointers, anecdotes, etc?  Any theoretical
> musings also more than welcome.  Comments upon my sanity will be
> referred to my doctor.
> 
> If the best price/performance option is a second hand 32-cpu Alpha
> running VMS I'd be happy to go that way ;-)

I'd love to work on a GS320!  You may even pick one up for a million
or 2.  The license costs for VMS & Rdb would eat you, though.

Rdb *does* have ways, though, using large buffers and hashed indexes,
with the table tuples stored on the same page as the hashed index
keys, to make such accesses *blazingly* fast.

> Many thanks for reading this far.

-- 
-----------------------------------------------------------------
Ron Johnson, Jr. ron(dot)l(dot)johnson(at)cox(dot)net
Jefferson, LA USA

"A C program is like a fast dance on a newly waxed dance floor 
by people carrying razors."
Waldi Ravens


In response to

Responses

pgsql-performance by date

Next:From: Bruce MomjianDate: 2003-08-27 03:25:41
Subject: Re: [GENERAL] Replication Ideas
Previous:From: Bill MoranDate: 2003-08-27 02:11:48
Subject: Re: Hardware recommendations to scale to silly load

pgsql-hackers by date

Next:From: Marc G. FournierDate: 2003-08-27 03:21:23
Subject: Beta2 Tag'd and Bundled ...
Previous:From: Bill MoranDate: 2003-08-27 02:11:48
Subject: Re: Hardware recommendations to scale to silly load

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group