Populating large DB from Perl script

From: "Kynn Jones" <kynnjo(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Populating large DB from Perl script
Date: 2007-11-01 18:57:36
Message-ID: c2350ba40711011157u48ab291fs2ac4352386887c0b@mail.gmail.com
Lists: pgsql-general

Hi. This is a recurrent problem for which I have not been able to
find a good solution. I have a large database that needs to be built
from scratch roughly once a month. I use a Perl script to do this.

The tables are very large, so I avoid in-memory data structures as
much as possible, and instead rely heavily on temporary flat files.

The problem is the population of tables that refer to "internal" IDs
on other tables. By "internal" I mean IDs that have no meaning
external to the database; they exist only to enable relational
referencing. They are always defined as serial integers. So the
script either must create and keep track of them, or it must populate
the database in stages, letting Pg assign the serial IDs, and query
the database for these IDs during subsequent stages.
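
As a rough illustration of the second option (populate in stages and
let the database assign the IDs), here is a minimal sketch. It is in
Python rather than Perl, and it uses SQLite as a stand-in so that it
runs without a server; the table names (parent, child, child_staging)
and the natural key column "name" are hypothetical. The idea is to
load child rows into a staging table keyed by the parent's natural
key, then resolve the generated IDs with a single INSERT ... SELECT
join instead of looking them up row by row.

```python
import sqlite3


def staged_load(conn, parent_names, child_records):
    """Load parents first, then resolve their generated IDs via a join.

    parent_names:  iterable of natural keys (hypothetical "name" column)
    child_records: iterable of (parent_name, payload) tuples
    """
    cur = conn.cursor()
    cur.executescript("""
        CREATE TABLE parent (
            id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- stand-in for a serial column
            name TEXT UNIQUE NOT NULL
        );
        CREATE TABLE child (
            id        INTEGER PRIMARY KEY AUTOINCREMENT,
            parent_id INTEGER NOT NULL REFERENCES parent(id),
            payload   TEXT NOT NULL
        );
        -- Staging table: refers to the parent by natural key, not by ID.
        CREATE TEMP TABLE child_staging (
            parent_name TEXT NOT NULL,
            payload     TEXT NOT NULL
        );
    """)

    # Stage 1: the database assigns parent IDs.
    cur.executemany("INSERT INTO parent (name) VALUES (?)",
                    ((name,) for name in parent_names))

    # Stage 2: bulk-load the raw child rows (in Pg this could be a COPY
    # from a flat file).
    cur.executemany("INSERT INTO child_staging VALUES (?, ?)", child_records)

    # Stage 3: one set-based statement resolves every parent ID.
    cur.execute("""
        INSERT INTO child (parent_id, payload)
        SELECT p.id, s.payload
        FROM child_staging AS s
        JOIN parent AS p ON p.name = s.parent_name
    """)
    conn.commit()

    return cur.execute("""
        SELECT p.name, c.payload
        FROM child AS c JOIN parent AS p ON p.id = c.parent_id
        ORDER BY p.name, c.payload
    """).fetchall()
```

The script never sees the internal IDs at all in this variant; it only
needs each child row to carry some natural key of its parent.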

I have solved this general problem in various ways, all of them
unwieldy (in the latest version, the script generates the serial IDs
itself and uses Perl's so-called "tied hashes" to retrieve them when
needed).
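
For comparison, the script-generated variant can be sketched as
follows. This is not the actual script: it is a Python approximation
in which the standard dbm module plays the role of a tied hash (a
key-to-ID map kept on disk rather than in memory), and the class and
function names (IdMap, write_copy_file) are made up for illustration.
It assumes a fresh map file for each rebuild and payloads free of
tabs, newlines, and backslashes.

```python
import dbm
import os
import tempfile


class IdMap:
    """Disk-backed map from natural key to a script-assigned integer ID."""

    def __init__(self, path):
        # "n" always creates a new, empty database: one map per rebuild.
        self.db = dbm.open(path, "n")
        self.next_id = 1

    def id_for(self, key):
        """Return the ID for key, assigning the next free one if unseen."""
        raw_key = key.encode("utf-8")
        found = self.db.get(raw_key)
        if found is not None:
            return int(found)
        new_id = self.next_id
        self.db[raw_key] = str(new_id).encode("ascii")
        self.next_id += 1
        return new_id

    def close(self):
        self.db.close()


def write_copy_file(id_map, records, out_path):
    """Write (parent_key, payload) records as tab-delimited lines.

    Each line is "<parent_id>\t<payload>", the default text layout that
    a COPY into a two-column table would accept. No escaping is done,
    so payloads must not contain tabs, newlines, or backslashes.
    """
    with open(out_path, "w", encoding="utf-8") as out:
        for parent_key, payload in records:
            out.write(f"{id_map.id_for(parent_key)}\t{payload}\n")
```

One caveat with this variant: rows loaded with explicit IDs do not
advance the sequence behind a serial column, so it would need to be
reset (for example with setval) after the load if later inserts are
to rely on it.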

But it occurred to me that this is a generic enough problem, and that
I'm probably re-inventing a thoroughly invented wheel. Are there
standard techniques or resources or Pg capabilities to deal with this
sort of situation?

TIA!

kj
