Re: Recommended technique for large imports?

From: Jeff Davis <list-pgsql-general(at)empires(dot)org>
To: Stephen Bacon <sbacon(at)13x(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Recommended technique for large imports?
Date: 2002-09-14 23:24:54
Message-ID: 200209141624.54526.list-pgsql-general@empires.org
Lists: pgsql-general pgsql-jdbc

You didn't seem concerned about the upload time of the data. Wouldn't that
take a while as well?

Anyway, assuming that you can receive the whole uploaded file before the
browser times out, one thing you can do is write all of the data to a temp
file and load it with COPY from there.
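
Here is a rough sketch of the temp-file approach in Java, since the app runs under Tomcat. It writes rows in COPY's default text format (tab-separated columns, \N for NULL, backslash escapes) and then issues a server-side COPY through JDBC. The class name, method names, and table name are made up for illustration. It assumes the temp file lives on the database server where the backend can read it, and that the connecting user is allowed to COPY from a file.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

// Hypothetical helper: not part of PostgreSQL or the JDBC driver.
final class CopyFileSketch {

    /** Escapes one value for COPY text format; a Java null becomes \N. */
    public static String escapeField(String value) {
        if (value == null) {
            return "\\N";
        }
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '\t': sb.append("\\t"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Formats one row as a tab-separated line, without the trailing newline. */
    public static String formatRow(List<String> row) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < row.size(); i++) {
            if (i > 0) {
                sb.append('\t');
            }
            sb.append(escapeField(row.get(i)));
        }
        return sb.toString();
    }

    /** Writes every row to the file, one line per row. */
    public static void writeCopyFile(File file, List<List<String>> rows) throws IOException {
        try (Writer out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8))) {
            for (List<String> row : rows) {
                out.write(formatRow(row));
                out.write('\n');
            }
        }
    }

    /**
     * Runs a server-side COPY. Assumes the table name is trusted and the
     * path contains no backslashes; single quotes in the path are doubled.
     */
    public static void copyIntoTable(Connection conn, String table, File file) throws SQLException {
        String path = file.getAbsolutePath().replace("'", "''");
        try (Statement st = conn.createStatement()) {
            st.execute("COPY " + table + " FROM '" + path + "'");
        }
    }
}
```

If the web server and the database are on different machines, the file would have to be moved to the database host first, so this only fits the simple single-host case.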

You can even do that via an async query so that your application doesn't wait
for the import to finish. Note: async queries can cause problems if you want
to close the connection to the backend when the HTTP connection is finished
(i.e. when your script is done), or if you want to return an error when the
import fails. I suggest reading the docs before relying too much on async
queries.
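
For a Java app, one way to get a similar effect is to run the import on a background thread and let the request return right away. To be clear, this is a stand-in for an async query, not the same mechanism. The sketch below is a minimal version under that assumption: the class and method names are hypothetical, and the import task is expected to open and close its own database connection so nothing depends on the HTTP request staying alive. A later request (a status page, say) can ask whether the import finished or failed.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper for running an import outside the request thread.
final class BackgroundImportSketch {

    // One worker thread, so imports run one at a time.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    /** Queues the import and returns immediately; the task returns a row count. */
    public Future<Integer> start(Callable<Integer> importTask) {
        return executor.submit(importTask);
    }

    /** Describes the job's state so a later request can report success or failure. */
    public static String describe(Future<Integer> job) {
        if (!job.isDone()) {
            return "running";
        }
        try {
            return "imported " + job.get() + " rows";
        } catch (ExecutionException e) {
            // The task threw; the original exception is the cause.
            return "failed: " + e.getCause().getMessage();
        } catch (CancellationException e) {
            return "cancelled";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    /** Stops accepting new jobs; already queued jobs still run. */
    public void shutdown() {
        executor.shutdown();
    }
}
```

The Future would need to be kept somewhere the status page can find it (the user's session, for example), which is left out here.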

You can also try doing what you suggested below. I think that's the second
fastest way (copy being the fastest).
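
A minimal sketch of the transaction-plus-loop part in JDBC, assuming all values can be bound as strings (real code would use typed setters for each column). It covers only the "start transaction / loop of inserts / commit" steps and does not attempt the "turn off indexing" step. The class, method, table, and column names are placeholders.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper: inserts many rows inside a single transaction.
final class BatchInsertSketch {

    /** Builds e.g. "INSERT INTO t (a, b) VALUES (?, ?)" for trusted names. */
    public static String buildInsertSql(String table, List<String> columns) {
        StringBuilder cols = new StringBuilder();
        StringBuilder marks = new StringBuilder();
        for (int i = 0; i < columns.size(); i++) {
            if (i > 0) {
                cols.append(", ");
                marks.append(", ");
            }
            cols.append(columns.get(i));
            marks.append('?');
        }
        return "INSERT INTO " + table + " (" + cols + ") VALUES (" + marks + ")";
    }

    /** Inserts all rows and commits once; rolls back if any insert fails. */
    public static void insertAll(Connection conn, String table,
                                 List<String> columns, List<List<String>> rows)
            throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // one transaction for the whole import
        try (PreparedStatement ps = conn.prepareStatement(buildInsertSql(table, columns))) {
            for (List<String> row : rows) {
                for (int i = 0; i < row.size(); i++) {
                    ps.setString(i + 1, row.get(i)); // JDBC parameters are 1-based
                }
                ps.addBatch();
            }
            ps.executeBatch();
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

Committing once at the end also means a failed import leaves no partial data behind.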

Regards,
Jeff

On Saturday 14 September 2002 02:22 pm, you wrote:
> Hello,
>
> I'm running a tomcat-based web app (Tomcat 3.3 under Linux 7.3) with
> PostgreSQL (7.2.1) as the back end. I need to add new import
> functionality. From previous importer experience with this site, I'm
> worried that it can take so long that the user's browser times out
> waiting for the process to complete (only ever happens when they're
> importing a lot of records when the system is under heavy demand - the
> main set of tables have a lot of indexes, so the loop / insert method
> can take a bit).
>
> Of course the data gets in there, but the user can end up with a
> 404-type of error anyways and no one likes to see that.
>
> Now I know the COPY command is much faster because it doesn't update the
> indexes after every row insert, but building that and passing it via
> jdbc seems iffy (or C, PHP, etc. for that matter).
>
> Can anyone give a recommended technique for this sort of process?
>
> Basically (I think) I need to do something like:
>
> Start transaction
> Turn off indexing for this transaction
> loop 1..n
> insert record X
> end loop
> Turn indexing back on
> Commit / End transaction
>
> thanks,
> -Steve
>
> (apologies for the cross-post, but I figured it's not specifically jdbc
> related)
