Re: VLDB Features

From: Neil Conway <neilc(at)samurai(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: VLDB Features
Date: 2007-12-15 00:18:50
Message-ID: 1197677930.1536.18.camel@dell.linuxdev.us.dell.com
Lists: pgsql-hackers

On Fri, 2007-12-14 at 18:22 -0500, Tom Lane wrote:
> If we could somehow only do a subtransaction per failure, things would
> be much better, but I don't see how.

One approach would be to essentially implement the pg_bulkloader
approach inside the backend. That is, begin by doing a subtransaction
for every k rows (with k = 1000, say). If you get any errors, then
either repeat the process with k/2 until you locate the individual
row(s) causing the trouble, or perhaps just immediately switch to k = 1.
Fairly ugly though, and would be quite slow for data sets with a high
proportion of erroneous data.
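As an illustration only, the batch-halving strategy described above might be sketched like this (this is a hypothetical model, not backend code: `insert_row` stands in for a per-row insert that may fail, and a Python function call stands in for a subtransaction):

```python
def insert_row(row):
    """Stand-in for a per-row insert; rejects anything that isn't an int."""
    if not isinstance(row, int):
        raise ValueError(f"bad row: {row!r}")

def load(rows, k=1000):
    """Load rows in 'subtransaction' batches of size k. On a failure,
    split the batch in half (k/2, k/4, ...) until the offending row is
    isolated. Returns (loaded, bad)."""
    loaded, bad = [], []

    def try_batch(batch):
        # One subtransaction per attempt: all-or-nothing for the batch.
        staged = []
        try:
            for row in batch:
                insert_row(row)
                staged.append(row)
        except ValueError:
            if len(batch) == 1:
                bad.append(batch[0])      # isolated a bad row
            else:
                mid = len(batch) // 2     # retry both halves
                try_batch(batch[:mid])
                try_batch(batch[mid:])
            return
        loaded.extend(staged)             # "commit" the subtransaction

    for i in range(0, len(rows), k):
        try_batch(rows[i:i + k])
    return loaded, bad

good, bad = load([1, 2, "x", 4, 5, "y", 7], k=4)
# good == [1, 2, 4, 5, 7]; bad == ["x", "y"]
```

Note the worst case: with mostly-bad data, nearly every batch degenerates to single-row subtransactions plus the wasted larger attempts, which is the slowness mentioned above.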

Another approach would be to distinguish between errors that require a
subtransaction to recover to a consistent state, and less serious errors
that don't have this requirement (e.g. invalid input to a data type
input function). If all the errors that we want to tolerate during a
bulk load fall into the latter category, we can do without
subtransactions.
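A rough sketch of that second approach, again purely illustrative (`parse_int` is a hypothetical stand-in for a data-type input function that fails cleanly, without touching shared state, so no rollback is needed to continue):

```python
def parse_int(text):
    """Stand-in for a data-type input function: raises on bad input but
    leaves no inconsistent state behind, so the load can simply skip on."""
    return int(text)

def load(lines):
    """Parse each input line; collect 'soft' input errors rather than
    aborting, with no subtransaction machinery involved."""
    rows, rejected = [], []
    for line in lines:
        try:
            rows.append(parse_int(line))
        except ValueError as err:   # recoverable: safe to keep going
            rejected.append((line, str(err)))
    return rows, rejected

rows, rejected = load(["1", "2", "oops", "4"])
# rows == [1, 2, 4]; rejected holds ("oops", ...)
```

The hard part, of course, is deciding which backend errors genuinely fall into this recoverable category.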

-Neil
