Re: pg_dump additional options for performance

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Dunstan <pgsql(at)tomd(dot)cc>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 17:27:13
Message-ID: Pine.GSO.4.64.0802261209430.204@westnet.com
Lists: pgsql-hackers

On Tue, 26 Feb 2008, Simon Riggs wrote:

> Splitting up the dump is the enabler for splitting up the load.

While the pg_dump split train seems to be leaving the station, I feel
compelled to point out that this focus does nothing to help people who are
bulk-loading data that came from somewhere else. If my data is already in
PostgreSQL, and I'm doing a dump/load, I can usually split the data easily
enough with existing tools to handle that right now via COPY (SELECT...)
TO. Some tools within pg_dump would be nice, but I don't need them that
much. It's gigantic files that came from some other DB I don't even have
access to that I struggle with loading efficiently.
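
As a rough illustration of the "existing tools" route, here is a minimal
sketch that builds one COPY (SELECT ...) TO statement per key range, so each
slice can be dumped and reloaded separately. The function name, the table and
column names, and the output paths are all hypothetical. It assumes an integer
key spread fairly evenly over [lo, hi), and it does no identifier quoting, so
treat it as illustrative only.

```python
def copy_range_statements(table, key, lo, hi, parts, out_pattern):
    """Build COPY (SELECT ...) TO statements covering key values in [lo, hi).

    All arguments are hypothetical; out_pattern is a server-side file path
    containing one '{}' placeholder for the slice number.
    """
    # Ceiling division so that `parts` slices cover the whole range.
    step = -(-(hi - lo) // parts)
    statements = []
    for i in range(parts):
        start = lo + i * step
        end = min(start + step, hi)
        if start >= end:
            break  # range exhausted before using every slice
        path = out_pattern.format(i)
        statements.append(
            f"COPY (SELECT * FROM {table} "
            f"WHERE {key} >= {start} AND {key} < {end}) TO '{path}'"
        )
    return statements


if __name__ == "__main__":
    for stmt in copy_range_statements(
        "orders", "id", 0, 10, 3, "/tmp/orders_{}.copy"
    ):
        print(stmt + ";")
```

Each statement can then be run in its own session. Note that COPY ... TO a
file writes on the server and needs the matching privileges; psql's \copy is
the client-side alternative.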

The work Dimitri is doing is wandering in that direction and that may be
enough. I note that something that addresses loading big files regardless
of source could also work on PostgreSQL dumps, while a pg_dump-focused
effort helps nothing but that specific workflow. I wonder if doing too
much work on the pg_dump path is the best use of someone's time when the
more general case will need to be addressed one day anyway.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
