From: "blackwater dev" <blackwaterdev(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: [TLM] Re: batch insert/update
Date: 2007-12-31 15:34:05
Message-ID: 34a9824e0712310734i3a412b80ia50dd05e4938f4ac@mail.gmail.com
Lists: pgsql-general
I was also thinking about adding an 'is_new' column to the table, which I
would set to 0 for the existing rows, then do a basic COPY of all the new
rows in with is_new set to 1. I'd then run a delete statement to remove
every row that is duplicated and flagged 0, since the copy should leave me
with two rows for each match: one with is_new of 1 and one with 0. I just
don't know if this would be the best approach.
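A minimal sketch of that flag-based idea, assuming a hypothetical table named items with an id column (the table, column names, and file path are illustrative, not from the thread). Note it only works if id is not under a unique constraint, since the load briefly leaves two rows per matching id:

```sql
-- Mark every existing row as old (is_new = 0 is the column default).
ALTER TABLE items ADD COLUMN is_new integer DEFAULT 0;
UPDATE items SET is_new = 0;

-- Load the nightly file; incoming rows carry is_new = 1.
-- Caveat: id must not be a primary/unique key at this point,
-- or COPY will fail on the duplicate ids.
COPY items (id, name, price, is_new) FROM '/path/to/nightly.csv' WITH CSV;

-- Delete the old copy wherever a new one arrived.
DELETE FROM items o
WHERE o.is_new = 0
  AND EXISTS (SELECT 1 FROM items n
              WHERE n.id = o.id AND n.is_new = 1);
```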
On Dec 26, 2007 3:13 PM, Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it> wrote:
> On Wed, 26 Dec 2007 20:48:27 +0100
> Andreas Kretschmer <akretschmer(at)spamfence(dot)net> wrote:
>
> > blackwater dev <blackwaterdev(at)gmail(dot)com> schrieb:
> >
> > > I have some php code that will be pulling in a file via ftp.
> > > This file will contain 20,000+ records that I then need to pump
> > > into the postgres db. These records will represent a subset of
> > > the records in a certain table. I basically need an efficient
> > > way to pump these rows into the table, replacing matching rows
> > > (based on id) already there and inserting ones that aren't. Sort
> > > of looping through the result and inserting or updating based on
> > > the presents of the row, what is the best way to handle this?
> > > This is something that will run nightly.
>
> > Insert your data into an extra table and use regular SQL to
> > insert/update the destination table. You can use COPY to load the
> > data into the extra table; this is very fast, but you need a
> > suitable file format for it.
>
> What if you know in advance which rows should be inserted
> and you have a batch of rows that should be updated?
>
> Is it still fastest to insert them all into a temp table with
> COPY?
>
> What about the rows that have to be updated, if you have all the
> columns, not just the changed ones?
> Is it faster to delete & insert or to update?
>
> The updates come with the same pk as the destination table.
>
> thx
>
> --
> Ivan Sergio Borgonovo
> http://www.webthatworks.it
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>
>
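For comparison, the staging-table approach Andreas suggests can be sketched as follows, again with illustrative names (items, staging, and the column list are assumptions). INSERT ... ON CONFLICT did not exist in PostgreSQL of this era, so the update-then-insert pattern is shown:

```sql
-- Load the file into a temporary staging table with COPY.
CREATE TEMP TABLE staging (LIKE items);
COPY staging (id, name, price) FROM '/path/to/nightly.csv' WITH CSV;

-- Update rows that already exist, replacing every column.
UPDATE items i
SET name = s.name,
    price = s.price
FROM staging s
WHERE i.id = s.id;

-- Insert the rows that are genuinely new.
INSERT INTO items (id, name, price)
SELECT s.id, s.name, s.price
FROM staging s
WHERE NOT EXISTS (SELECT 1 FROM items i WHERE i.id = s.id);
```

This keeps the primary key on items intact throughout, which the flag-based scheme above cannot do.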