Re: Streaming large data into postgres [WORM like applications]

From: "Dhaval Shah" <dhaval(dot)shah(dot)m(at)gmail(dot)com>
To: "Lincoln Yeoh" <lyeoh(at)pop(dot)jaring(dot)my>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Streaming large data into postgres [WORM like applications]
Date: 2007-05-13 00:49:29
Message-ID: 565237760705121749r4b331fa5v81cf235f3a371d0@mail.gmail.com
Lists: pgsql-general

Consolidating my responses in one email.

1. The total data expected is some 1-1.5 TB a day. 75% of the data
arrives during a 10-hour window; the remaining 25% arrives over the
other 14 hours. Of course there are ways to smooth the load pattern,
but the current scenario is as described.
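For a rough sense of the sustained write rates those figures imply, a
back-of-envelope query (the numbers are just illustrative, taken from
the 1.5 TB/day and 75%-in-10-hours figures above):

```sql
-- Rough sustained ingest rates implied by 1.5 TB/day with
-- 75% arriving in a 10-hour peak window.
SELECT pg_size_pretty((1.5 * 1024^4 * 0.75 / (10 * 3600))::bigint)
           AS peak_bytes_per_sec,
       pg_size_pretty((1.5 * 1024^4 / (24 * 3600))::bigint)
           AS avg_bytes_per_sec;
```

That works out to roughly 30+ MB/s sustained during the peak window,
which is relevant to the disk-sizing question below.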

2. I do expect the customer to roll in something like a NAS/SAN with
terabytes of disk space. The idea is to retain the data online for a
duration and then offload it to tape.

That leads to the question: can the data be compressed? Since the rows
are very similar, compression should yield something like 6x-10x. Is
there a way to identify which partitions live in which data files, and
keep those files compressed until they are actually read?
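For what it's worth, here is a sketch of how the files backing each
partition could be located, assuming inheritance-based partitioning
under a hypothetical parent table named master_table (substitute the
real name). The relfilenode value names the data file under
$PGDATA/base/<database oid>/:

```sql
-- List each child partition of a parent table, the relfilenode that
-- names its data file on disk, and its current size.
-- 'master_table' is a hypothetical parent table name.
SELECT c.relname                               AS partition,
       c.relfilenode                           AS data_file,
       pg_size_pretty(pg_relation_size(c.oid)) AS on_disk_size
FROM pg_inherits i
JOIN pg_class c ON c.oid = i.inhrelid
WHERE i.inhparent = 'master_table'::regclass
ORDER BY c.relname;
```

Note that compressing those files out from under a running server is
not something Postgres supports; the usual approach is to dump old
partitions to compressed archives and drop them from the database.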

Regards
Dhaval

On 5/12/07, Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my> wrote:
> At 04:43 AM 5/12/2007, Dhaval Shah wrote:
>
> >1. Large amount of streamed rows. In the order of @50-100k rows per
> >second. I was thinking that the rows can be stored into a file and the
> >file then copied into a temp table using copy and then appending those
> >rows to the master table. And then dropping and recreating the index
> >very lazily [during the first query hit or something like that]
>
> Is it one process inserting or can it be many processes?
>
> Is it just a short (relatively) high burst or is that rate sustained
> for a long time? If it's sustained I don't see the point of doing so
> many copies.
>
> How many bytes per row? If the rate is sustained and the rows are big
> then you are going to need LOTs of disks (e.g. a large RAID10).
>
> When do you need to do the reads, and how up to date do they need to be?
>
> Regards,
> Link.
>

--
Dhaval Shah
