Re: Is there any limit on the number of rows to import using copy command

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: "sivapostgres(at)yahoo(dot)com" <sivapostgres(at)yahoo(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: Pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Is there any limit on the number of rows to import using copy command
Date: 2025-07-25 00:48:53
Message-ID: e8fb9b99-40a2-41eb-8932-0959db8356c6@aklaver.com
Lists: pgsql-general

On 7/24/25 16:59, sivapostgres(at)yahoo(dot)com wrote:
> 1.  Test case.  Created a new database, modified the triggers (split
> into three), and populated the required master data and lookup tables.
> Then transferred 86420 records.  Checked whether all 86420 records were
> inserted into table1 and whether the trigger created the required
> records in table2.  Yes, they were.
>
> 2.  In the test case above, the total time taken to insert the 86420
> records was only 1.15 min.  Earlier (before splitting the triggers) we
> waited more than 1.5 hrs the first time and 2.5 hrs the second time,
> with no records inserted.
>
> 3.  Regarding moving the logic to a procedure: won't the trigger
> work?  Will it be a burden for 86420 records?  It works if we insert a
> few thousand records.  After splitting the trigger function, it works
> for 86420 records.  Are triggers overhead even for handling 100000
> records?  In the production system, the same (single) trigger is
> working with 3 million records.  There might be better alternatives to
> triggers, but triggers should also work.  IMHO.

Reread this post, in the thread, from Laurenz Albe:

https://www.postgresql.org/message-id/de08fd016dd9c630f65c52b80292550e0bcdea4c.camel%40cybertec.at
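
For reference, the split described above comes down to something like
this. A rough sketch only: table1/table2 are from your test case, but
the column names and trigger body are placeholders, not your actual
logic:

CREATE OR REPLACE FUNCTION table1_ins_fn() RETURNS trigger AS $$
BEGIN
    -- placeholder for the real INSERT-path logic
    INSERT INTO table2 (ref_id) VALUES (NEW.id);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER table1_ins
AFTER INSERT ON table1
FOR EACH ROW EXECUTE FUNCTION table1_ins_fn();

-- table1_upd_fn()/table1_del_fn() and their own triggers follow the
-- same pattern for UPDATE and DELETE, so each function handles a
-- single event instead of branching on TG_OP.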

>
> 4.  Staging tables.  Yes, I have done that in another case, where
> there was a need to add data / transform a few more columns.  It
> worked like a charm.  In this case, since there was no need for any
> other calculations (transformations), and with just column-to-column
> matching, I thought the copy command would do.

There is a transformation: you are moving data to another table. That is
overhead, especially if the triggers are not optimized.
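
For what it's worth, the staging route Merlin suggested (quoted further
down) looks roughly like this. A minimal sketch, assuming the CSV's
columns match table1 and that table2's rows are built here instead of
by a per-row trigger:

CREATE UNLOGGED TABLE staging (LIKE table1 INCLUDING DEFAULTS);

-- client-side file, so \copy from psql rather than server-side COPY
\copy staging FROM 'data.csv' WITH (FORMAT csv, HEADER true)

BEGIN;
INSERT INTO table1 SELECT * FROM staging;
INSERT INTO table2 (ref_id)  -- stands in for the trigger's derived rows
SELECT id FROM staging;
COMMIT;

DROP TABLE staging;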

>
> Before splitting the trigger into three, we tried:
> 1.  Transferring data using DataWindow / PowerBuilder (that's the tool
> we use to develop our front end).  With the same single trigger, it
> took a few hours (more than 4 hours; exact time not noted down) to
> transfer the same 86420 records.  (DataWindow fires an insert statement
> for every row.)  It works, but the time taken is not acceptable.

Row-by-row INSERTs are going to be slow, especially if the tool is doing
a commit for each one, which I suspect it is. Check the Postgres logs.
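
If the tool can be made to batch, one transaction with multi-row VALUES
removes most of that cost. A sketch with placeholder column names:

BEGIN;
INSERT INTO table1 (col_a, col_b) VALUES
    ('r1a', 'r1b'),
    ('r2a', 'r2b'),
    ('r3a', 'r3b');
-- ...a few thousand rows per statement, one statement per batch...
COMMIT;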

>
> 2.  Next, we split the larger csv file into 8, with each file
> containing 10,000 records and the last one 16420 records.  The copy
> command worked, but the time taken to split the file is not acceptable.
> We wrote a batch file to split the larger csv file; we felt a batch
> file is easier for automating the whole process from PowerBuilder.

I find most GUI tools create extra steps and overhead. My preference is
simpler tools, e.g. using the Python csv module to batch/stream rows
that the Python psycopg Postgres driver can insert or copy into the
database.
See:

https://www.psycopg.org/psycopg3/docs/basic/copy.html
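
Under the hood a driver like psycopg streams the rows through COPY
table1 FROM STDIN. From psql, \copy does the same thing with a
client-side file and no extra code, assuming the CSV matches table1's
column order:

\copy table1 FROM 'data.csv' WITH (FORMAT csv, HEADER true)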

>
> 3.  What we observed here is that the insert statement succeeds and
> the copy command fails if the records exceed a certain number.  We
> haven't arrived at the exact number of rows at which the copy command
> fails.
>
> Will do further works after my return from a holiday.
>
> Happiness Always
> BKR Sivaprakash
>
>
>
> On Thursday 24 July, 2025 at 08:18:07 pm IST, Adrian Klaver
> <adrian(dot)klaver(at)aklaver(dot)com> wrote:
>
>
> On 7/24/25 05:18, sivapostgres(at)yahoo(dot)com wrote:
> > Thanks Merlin, Adrian, Laurenz
> >
> > As a test case, I split the trigger function into three, one each for
> > insert, update, and delete, each called from a separate trigger.
> >
> > IT WORKS!
>
> It worked before; it just slowed down as your cases got bigger. You
> need to provide more information on what test case you used and how
> you define 'worked'.
>
> >
> > Shouldn't we have one trigger function for all three trigger
> > events?  Is it prohibited for bulk inserts like this?
>
> No. Triggers are overhead and they add to the processing that needs
> to be done to move the data into the table. Whether that is an issue
> is a case-by-case determination.
>
> >
> > I tried this in pgAdmin only; I will complete the testing from the
> > program we are developing after my return from holiday.
>
> From Merlin Moncure's post:
>
> "* reconfiguring your logic to a procedure can be a better idea; COPY
> your data into some staging tables (perhaps temp, and indexed), then
> write to various tables with joins, upserts, etc."
>
> I would suggest looking into implementing the above.
>
>
> >
> > Happiness Always
> > BKR Sivaprakash
>
> >
>
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
