Re: 10 TB database

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: 10 TB database
Date: 2009-06-16 19:33:19
Message-ID: alpine.GSO.2.01.0906161519400.120@westnet.com
Lists: pgsql-general

On Tue, 16 Jun 2009, Michelle Konzack wrote:

> Am 2009-06-16 12:13:20, schrieb Greg Smith:
>> you'll be hard pressed to keep up with 250GB/day unless you write a
>> custom data loader that keeps multiple cores
>
> AFAIK he was talking about 250 GByte/month which are around 8 GByte a
> day or 300 MByte per hour

Right, that was just a typo in my response; the comments reflected what he
meant. Note that your averages here presume you can spread the load out
over a full 24-hour period--which you often can't, as this type of data
tends to arrive in a big clump after market close and needs to be loaded
ASAP to be useful.

It's harder than most people would guess to sustain that sort of rate
against real-world data (some of which fails to import cleanly on any
given day) in PostgreSQL without running into a bottleneck in COPY, WAL
traffic, or database disk I/O (particularly if there's any random-access
work going on concurrently with the load). Just because your RAID array
can write hundreds of MB/s does not mean you'll be able to sustain
anywhere close to that during loading.
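One common way to keep multiple cores busy during a bulk load, as suggested earlier in the thread, is to split the input file into byte ranges and run one COPY per worker process. A minimal sketch of the splitting step follows; the function name and the commented worker layout are my own illustration, not a specific loader from this thread:

```python
# Sketch: carve a large input file into one byte range per worker so each
# process can stream its slice through a separate COPY connection.
# The ranges here are raw byte offsets; a real loader must then advance
# each start (except the first) to the next newline so rows stay intact.

def split_offsets(total_bytes, workers):
    """Return [(start, end), ...] byte ranges, one per worker."""
    step = total_bytes // workers
    ranges = [(i * step, (i + 1) * step) for i in range(workers)]
    ranges[-1] = (ranges[-1][0], total_bytes)  # last range takes the remainder
    return ranges

# Each worker would then do roughly (assumed psycopg2-style API):
#   cur.copy_expert("COPY trades FROM STDIN WITH (FORMAT csv)", slice_stream)
# with one connection per process, so input parsing and per-backend COPY
# work overlap across cores instead of serializing on a single backend.
```

For example, `split_offsets(8_533_000_000, 8)` yields eight slices of roughly 1 GB each, one per core on an 8-core box.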

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
