Re: error tolerant inserting of data

From: "Jules Alberts" <jules(dot)alberts(at)arbodienst-limburg(dot)nl>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: error tolerant inserting of data
Date: 2003-03-06 08:09:12
Message-ID: 20030306080956.10B011CB1DB@koshin.dsl.systemec.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Op 5 Mar 2003 (14:36), schreef Nigel J. Andrews <nandrews(at)investsystems(dot)co(dot)uk>:
> On Wed, 5 Mar 2003, Jules Alberts wrote:
> > Hello everybody,
> > I'm working on a new pg database that will contain all data that's
> > currently in our old system. The data is already in the db, in tables
> > with names like import_table1, import_table2 etc., now I have to
> > transfer them to the real tables. The real tables have some
> > constraints, like trigger functions to prevent overlapping date-
> > periods.
> >
> > I have written pl/pgsql code to do the importing bit, but there is one
> > big drawback: as soon as anything in a function (or in a FOR ... SELECT
> > ... LOOP) goes wrong, everything is rolled back. Normally this
> > behaviour would be fine, but not in this situation, because my old data
> > contains rows sometimes conflicts with the constraints.
> >
> > What I want is all conflicting rows to be not imported, just plainly be
> > refused, and let everything that's OK be imported. What is the best way
> > to do this?
>
> I think the only way is to do the conflict testing in your code and not try the
> insert if it's going to fail.

Yes, I did tried and it works but this is _very_ slow.

> It'll slow the process down but if you think in terms of issuing a begin at the
> start of a application and you are hoping for a commit at the end then
> obviously you have to protect against situations that you know will give rise
> to an error. For example, if you have some data from a form that a user
> completes when you come to save that data you may not know just from that data
> that it a foreign key constraint will pass so you have to check that before
> trying an insert otherwise or transaction will abort.

I know, and that OK when the system goes live. But right now I'm
importing data and all I want is the good data to be imported as
swiftly as possible while tha bad data is dropped. The data is
somewhere between 500Mb and 1Gb and the transition from the old system
to the new will be during a weekend (big bang).

When I do errorchecking before actually importing the data one weekend
will be _way_ too short :-(

In response to

Browse pgsql-general by date

  From Date Subject
Next Message btober 2003-03-06 08:23:14 Re: My contract has been renewed
Previous Message Oleg Bartunov 2003-03-06 06:36:02 Re: Alpha-2 of contrib/tsearch