Re: > 16TB worth of data question

From: Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net>
To: postgres list <pgsql-general(at)postgresql(dot)org>
Subject: Re: > 16TB worth of data question
Date: 2003-04-29 06:01:39
Message-ID: 1051596099.16230.16.camel@haggis
Lists: pgsql-general

On Mon, 2003-04-28 at 16:59, scott.marlowe wrote:
> On 28 Apr 2003, Ron Johnson wrote:
> [snip]
> > On Mon, 2003-04-28 at 10:42, scott.marlowe wrote:
> > > On 28 Apr 2003, Jeremiah Jahn wrote:
> > >
> > > > On Fri, 2003-04-25 at 16:46, Jan Wieck wrote:
> > > > > Jeremiah Jahn wrote:
> > > > > >
> > > > > > On Tue, 2003-04-22 at 10:31, Lincoln Yeoh wrote:
> > [snip]
> > > Don't shut it down and back up at the file system level; leave it up,
> > > restrict access via pg_hba.conf if need be, and use pg_dump. File
> > > system level backups are not the best way to go, although for quick
> > > recovery they can be added alongside full pg_dumps as an aid. Don't
> > > leave out the pg_dump, though: it's the way you're supposed to back up
> > > postgresql, and it can do so while the database is "hot and in use"
> > > and still provide a consistent backup snapshot.
> >
> > What's the problem with doing a file-level backup of a *cold* database?
>
> There's no problem with doing it; the problem is that in order to get
> anything back you pretty much have to have all of it to make it work
> right, and any subtle problems of a partial copy might not be so obvious.
>
> Plus it sticks you to one major rev of the database. Pulling out
> five-year-old copies of the base directory can involve a fair bit of
> work getting an older flavor of postgresql to run on a newer OS.

Good point...

[snip]
> > The problem with pg_dump is that it's single-threaded, and it would take
> > a whole lotta time to back up 16TB using 1 tape drive...
>
> But, you can run pg_dump against individual databases or tables on the
> same postmaster, so you could theoretically write a script around pg_dump
> to dump the databases or large tables to different drives. We back up
> our main server to our backup server that way, albeit with only one
> backup process at a time; since we can back up about a gig a minute,
> it's plenty fast for us. If we needed to parallelize it, that would be
> pretty easy.

But pg doesn't guarantee consistency across tables unless you pg_dump
the whole database in one command, i.e. "pg_dump db_name > db_yyyymmdd.dmp";
each pg_dump run takes its own snapshot, so separate per-table dumps
aren't guaranteed to be consistent with each other.

Thus, no parallelism unless there are multiple databases, but if there's
only 1 database...
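
For the multiple-database case, something like this rough (untested)
sketch would parallelize the dumps; the /backup path and file naming
are just placeholders:

  #!/bin/sh
  # Untested sketch: one pg_dump per database, run in parallel.
  # Each dump is internally consistent, but the dumps start at slightly
  # different times, so there's no consistency *across* databases.
  DBS=`psql -At -c "SELECT datname FROM pg_database WHERE NOT datistemplate" template1`
  for db in $DBS ; do
      pg_dump "$db" > /backup/${db}_`date +%Y%m%d`.dmp &
  done
  wait

Whether that actually buys you anything depends on the dump files
landing on separate drives, of course.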

--
+-----------------------------------------------------------+
| Ron Johnson, Jr. Home: ron(dot)l(dot)johnson(at)cox(dot)net |
| Jefferson, LA USA http://members.cox.net/ron.l.johnson |
| |
| An ad currently being run by the NEA (the US's biggest |
| public school TEACHERS UNION) asks a teenager if he can |
| find sodium and *chloride* in the periodic table of the |
| elements. |
| And they wonder why people think public schools suck... |
+-----------------------------------------------------------+
