Re: Populating large tables with occasional bad values

From: "John T(dot) Dow" <john(at)johntdow(dot)com>
To: "Craig Ringer" <craig(at)postnewspapers(dot)com(dot)au>
Cc: "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Populating large tables with occasional bad values
Date: 2008-06-11 16:39:36
Message-ID: 200806111658.m5BGwRZX057718@web2.nidhog.com
Lists: pgsql-jdbc

Latency it is.

I just had no idea it would add up so fast. I guess I was thinking you could pump a lot of data over the Internet without appreciating the per-round-trip overhead once the data is broken down into little chunks.
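For what it's worth, a lot of that per-row latency can be amortized with JDBC batching, at the cost of coarser error handling, since one bad row fails the whole batch. Here's a minimal sketch with made-up connection details and table; the reWriteBatchedInserts parameter assumes a reasonably recent driver:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchLoad {
    public static void main(String[] args) throws Exception {
        // reWriteBatchedInserts lets newer drivers collapse a batch into multi-row INSERTs.
        String url = "jdbc:postgresql://db.example.com/clientdb?reWriteBatchedInserts=true";
        try (Connection conn = DriverManager.getConnection(url, "user", "secret")) {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO accounts (acct_no, name) VALUES (?, ?)")) {
                String[][] rows = {{"1001", "Smith"}, {"1002", "Jones"}}; // stand-in for the massaged legacy data
                int n = 0;
                for (String[] row : rows) {
                    ps.setString(1, row[0]);
                    ps.setString(2, row[1]);
                    ps.addBatch();
                    if (++n % 1000 == 0) {
                        ps.executeBatch();  // one round trip per 1000 rows, not one per row
                    }
                }
                ps.executeBatch();          // flush the final partial batch
            }
            conn.commit();
        }
    }
}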

I'm not sure what the best solution is. I do this rarely, usually when first loading the data from the legacy system. When ready to go live, my (remote) client will send the data, I'll massage it for loading, then load it into their (remote) postgres server. This usually takes place over a weekend, but last time it was an evening that lasted until 4 AM.

If I did this regularly, three options would seem easiest.

1 - Load locally to get clean data and then COPY. This requires the server to have local access to the file to be copied, and if the server is hosted by an ISP, whether you can do this easily depends on them. (There's a rough sketch of a client-side COPY alternative after this list.)

2 - Send the data to the client so they can run the Java app and insert over their LAN (this only works if the database server is local to them and not at an ISP).

3 - If the only problem is duplicate keys, load into a special table without the constraint, issue update commands to rewrite the keys as needed, then select/insert into the correct table (sketched after this list).
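On option 1: as I understand it, newer pgJDBC drivers can also stream COPY FROM STDIN from the client side through CopyManager, which would sidestep the server-side file access question. A rough sketch, assuming the file has already been cleaned locally, with hypothetical table and file names:

import java.io.FileReader;
import java.io.Reader;
import java.sql.Connection;
import java.sql.DriverManager;

import org.postgresql.PGConnection;
import org.postgresql.copy.CopyManager;

public class CopyLoad {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:postgresql://db.example.com/clientdb";    // hypothetical
        try (Connection conn = DriverManager.getConnection(url, "user", "secret")) {
            CopyManager copy = conn.unwrap(PGConnection.class).getCopyAPI();
            try (Reader in = new FileReader("accounts_clean.csv")) { // file cleaned locally first
                long rows = copy.copyIn(
                        "COPY accounts (acct_no, name) FROM STDIN WITH CSV", in);
                System.out.println("Loaded " + rows + " rows");
            }
        }
    }
}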
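And for option 3, a rough sketch of the staging-table approach via plain JDBC statements. Table and column names are made up, and the key-rewriting rule is only an illustration:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class StagingLoad {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:postgresql://db.example.com/clientdb";   // hypothetical
        try (Connection conn = DriverManager.getConnection(url, "user", "secret");
             Statement st = conn.createStatement()) {
            // Same columns as the real table, but LIKE does not copy the primary key.
            st.execute("CREATE TABLE accounts_stage (LIKE accounts)");

            // ... bulk-load accounts_stage here (batched INSERTs or COPY) ...

            // Rewrite keys that would collide with rows already in the real table.
            st.execute(
                "UPDATE accounts_stage s SET acct_no = s.acct_no || '-dup' " +
                "WHERE EXISTS (SELECT 1 FROM accounts a WHERE a.acct_no = s.acct_no)");

            // One statement moves everything into the real table.
            st.execute("INSERT INTO accounts SELECT * FROM accounts_stage");
            st.execute("DROP TABLE accounts_stage");
        }
    }
}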

Thanks

John
