Re: procedure using CURSOR to insert is extremely slow

From: Milos Babic <milos(dot)babic(at)gmail(dot)com>
To: Szalontai Zoltán <szalontai(dot)zoltan(at)t-online(dot)hu>
Cc: Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: procedure using CURSOR to insert is extremely slow
Date: 2021-04-08 18:21:59
Message-ID: CAPVD16tVf6bBYSOx7meTZ7HBW+msMwbRnbW84R8=hvDLUiPiOA@mail.gmail.com
Lists: pgsql-performance

Hi Zoltan,

you should try to rethink the logic behind the query.
Numerous if/then/else branches can be transformed into CASE WHEN
expressions, or a bunch of UNIONs, which, I'm 100% certain, will do much
better than row-by-row insertion.
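As a minimal sketch of what I mean (the table names, columns, and branch
conditions below are hypothetical placeholders, not your actual schema),
the per-row if/then/else logic can often be folded into one set-based
INSERT ... SELECT:

```sql
-- Hypothetical example: source_table, rates, target_table and all
-- columns are illustrative stand-ins for your real schema.
INSERT INTO target_table (log_id, customer_id, amount_eur)
SELECT s.log_id,
       s.customer_id,
       CASE                               -- replaces the if/then/else in the loop
         WHEN s.currency = 'EUR' THEN s.amount
         WHEN s.currency = 'USD' THEN s.amount * r.usd_rate
         ELSE 0
       END AS amount_eur
FROM source_table s
JOIN rates r ON r.currency = s.currency   -- joins as in your complicated SELECT
WHERE s.log_id = 42;                      -- the log_id of the current transfer
```

This lets the planner process the whole set at once instead of paying the
per-row round-trip cost of a cursor loop.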

However, this is a general note.
It still doesn't explain why inserting is faster after deletions (?!!)
Is there any chance the set you're inserting in the second run is smaller
(e.g. only a fraction of the original one)?

If possible, you can send over a fragment of the code, and we can look into
it.

regards,
Milos

On Thu, Apr 8, 2021 at 3:56 PM Szalontai Zoltán <
szalontai(dot)zoltan(at)t-online(dot)hu> wrote:

> Hi Milos,
>
>
>
> Inside the loops there are frequently if/else branches and value
> transformations used.
>
> We could not solve it without using a cursor.
>
>
>
> Regards,
>
> Zoltán
>
>
>
> *From:* Milos Babic <milos(dot)babic(at)gmail(dot)com>
> *Sent:* Thursday, April 8, 2021 2:31 PM
> *To:* Szalontai Zoltán <szalontai(dot)zoltan(at)t-online(dot)hu>
> *Cc:* Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
> *Subject:* Re: procedure using CURSOR to insert is extremely slow
>
>
>
> Hi Zoltan,
>
>
>
> is there any particular reason why you don't do a bulk insert as:
>
> insert into target_table
>
> select ... from source_table(s) (with joins etc)
>
>
>
> Regards,
>
> Milos
>
>
>
>
>
>
>
> On Thu, Apr 8, 2021 at 1:24 PM Szalontai Zoltán <
> szalontai(dot)zoltan(at)t-online(dot)hu> wrote:
>
> Hi,
>
>
>
> We have a Class db.t2.medium database on AWS.
>
> We use a procedure to transfer data records from the Source to the Target
> Schema.
>
> Transfers are identified by the log_id field in the target table.
>
>
>
> The procedure is:
>
> 1 all records are deleted from the Target table with the actual log_id
> value
>
> 2 a complicated SELECT (numerous tables are joined) is created on the
> Source system
>
> 3 a cursor is defined based on this SELECT
>
> 4 we go through the CURSOR and insert new records into the Target table
> with this log_id
>
>
>
> (Actually we have about 100 tables in the Target schema and the size of
> the database backup file is about 1GByte. But we do the same for all the
> Target tables.)
>
>
>
> Our procedure is extremely slow for the first run: 3 days for the 100
> tables. For the second and all subsequent runs it is fast enough (15
> minutes).
>
> The only difference between the first run and all the others is that in
> the first run there are no records in the Target schema with this log_id.
>
>
>
> It seems that in the first step the DELETE operation frees up some
> “space”, and the INSERT operation in step 4 can reuse this space. But
> if no records are deleted in the first step, the procedure is extremely
> slow.
>
>
>
> To speed up the first run we found the following workaround:
>
> We inserted dummy records into the Target tables with the proper log_id,
> and really the first run became very fast again.
>
>
>
> Is there any “normal” way to speed up this procedure?
>
> In the production environment there will be only “first runs”; the same
> log_id will never be used again.
>
>
>
>
>
> thanks,
>
> Zoltán
>
>
>
>
>
>
>
>
> --
>
> Milos Babic
>
> http://www.linkedin.com/in/milosbabic
>

--
Milos Babic
http://www.linkedin.com/in/milosbabic
