Re: Need help identifying a periodic performance issue.

From: Robert Creager <robertc(at)spectralogic(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Need help identifying a periodic performance issue.
Date: 2021-11-16 17:40:09
Message-ID: C2E3D6C2-8388-4B0D-A4F8-440A7B6F7530@spectralogic.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Nov 15, 2021, at 10:29 PM, Justin Pryzby <pryzby(at)telsasoft(dot)com<mailto:pryzby(at)telsasoft(dot)com>> wrote:

This message originated outside your organization.

On Tue, Nov 16, 2021 at 04:43:25AM +0000, Robert Creager wrote:
> We’re executing the following copy to fill a table with approximately 5k records, then repeating for a total of 250k records. Normally, this copy executes < 1 second, with the entire set taking a couple of minutes. The problem is not reproducible on command, but usually within a couple of hours of starting some test runs.
>
> COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS ‘|’
>
> But, occasionally we get into a huge performance bottleneck for about 2 hours, where these copy operations are taking 140 seconds or so
>
> Nov 15 22:25:49 sm4u-34 postgres[5799]: [381-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1<http://127.0.0.1> LOG: duration: 145326.293 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'

> I’m logging statements with pgbadger monitoring the logs. There are no apparent auto-vacuum’s running, nor any vacuums, nor anything at all really. Other select queries around that time frame are executing normally.

What about checkpoints ?

Would you show the "^checkpoint starting" and "^checkpoint complete" logs
surrounding a slow COPY ?

Sorry, it was late last night, I meant to include the checkpoint info. I didn’t have enough logs around the one I pointed out above, my tail got aborted by a reboot. Working on a log server… From an earlier one:

Nov 5 03:56:28 sm4u-34 postgres[4934]: [2679-1] db=,user=,app=,client= LOG: checkpoint complete: wrote 247 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=26.279 s, sync=0.002 s, total=26.323 s; sync files=142, longest=0.001 s, average=0.001 s; distance=592 kB, estimate=279087 kB
Nov 5 04:26:03 sm4u-34 postgres[4934]: [2680-1] db=,user=,app=,client= LOG: checkpoint starting: time
Nov 5 04:26:14 sm4u-34 postgres[4934]: [2681-1] db=,user=,app=,client= LOG: checkpoint complete: wrote 115 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=11.880 s, sync=0.003 s, total=11.885 s; sync files=75, longest=0.001 s, average=0.001 s; distance=541 kB, estimate=251232 kB
Nov 5 04:56:03 sm4u-34 postgres[4934]: [2682-1] db=,user=,app=,client= LOG: checkpoint starting: time
Nov 5 04:56:15 sm4u-34 postgres[4934]: [2683-1] db=,user=,app=,client= LOG: checkpoint complete: wrote 103 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=11.011 s, sync=0.002 s, total=11.015 s; sync files=74, longest=0.001 s, average=0.001 s; distance=528 kB, estimate=226162 kB
Nov 5 05:15:28 sm4u-34 postgres[59442]: [24-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 1.059 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 05:26:03 sm4u-34 postgres[4934]: [2684-1] db=,user=,app=,client= LOG: checkpoint starting: time
Nov 5 05:26:27 sm4u-34 postgres[4934]: [2685-1] db=,user=,app=,client= LOG: checkpoint complete: wrote 226 buffers (0.0%); 0 WAL file(s) added, 1 removed, 0 recycled; write=24.000 s, sync=0.006 s, total=24.037 s; sync files=122, longest=0.001 s, average=0.001 s; distance=583 kB, estimate=203604 kB
Nov 5 05:56:03 sm4u-34 postgres[4934]: [2686-1] db=,user=,app=,client= LOG: checkpoint starting: time
Nov 5 05:56:24 sm4u-34 postgres[4934]: [2687-1] db=,user=,app=,client= LOG: checkpoint complete: wrote 199 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=21.212 s, sync=0.004 s, total=21.218 s; sync files=122, longest=0.001 s, average=0.001 s; distance=580 kB, estimate=183302 kB
Nov 5 06:26:03 sm4u-34 postgres[4934]: [2688-1] db=,user=,app=,client= LOG: checkpoint starting: time
Nov 5 06:26:22 sm4u-34 postgres[4934]: [2689-1] db=,user=,app=,client= LOG: checkpoint complete: wrote 178 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=19.078 s, sync=0.005 s, total=19.084 s; sync files=120, longest=0.001 s, average=0.001 s; distance=563 kB, estimate=165028 kB
Nov 5 06:32:27 sm4u-34 postgres[7728]: [213-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 143318.661 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 06:34:56 sm4u-34 postgres[7728]: [214-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 149175.227 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 06:37:27 sm4u-34 postgres[7728]: [215-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 150440.140 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 06:39:56 sm4u-34 postgres[7728]: [216-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 149521.024 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 06:42:26 sm4u-34 postgres[7728]: [217-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 149182.715 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 06:44:59 sm4u-34 postgres[7728]: [218-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 153734.718 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 06:47:26 sm4u-34 postgres[7728]: [219-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 146371.043 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 06:49:59 sm4u-34 postgres[7728]: [220-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 152996.005 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 06:52:29 sm4u-34 postgres[7728]: [221-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 150094.597 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 06:55:03 sm4u-34 postgres[7728]: [222-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 154446.475 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 06:56:03 sm4u-34 postgres[4934]: [2690-1] db=,user=,app=,client= LOG: checkpoint starting: time
Nov 5 06:57:33 sm4u-34 postgres[7728]: [223-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 149823.562 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:00:06 sm4u-34 postgres[7728]: [224-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 152262.349 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:02:37 sm4u-34 postgres[7728]: [225-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 151812.262 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:05:11 sm4u-34 postgres[7728]: [226-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 152992.509 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:07:46 sm4u-34 postgres[7728]: [227-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 155094.565 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:10:19 sm4u-34 postgres[7728]: [228-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 153728.503 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:12:53 sm4u-34 postgres[7728]: [229-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 153031.260 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:15:26 sm4u-34 postgres[7728]: [230-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 153722.550 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:17:49 sm4u-34 postgres[4934]: [2691-1] db=,user=,app=,client= LOG: checkpoint complete: wrote 12310 buffers (0.4%); 0 WAL file(s) added, 10 removed, 0 recycled; write=1305.144 s, sync=0.001 s, total=1305.178 s; sync files=92, longest=0.001 s, average=0.001 s; distance=172759 kB, estimate=172759 kB
Nov 5 07:18:00 sm4u-34 postgres[7728]: [231-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 153736.774 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:20:25 sm4u-34 postgres[7728]: [232-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 145263.582 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:22:53 sm4u-34 postgres[7728]: [233-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 147632.451 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:25:22 sm4u-34 postgres[7728]: [234-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 149081.218 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:26:03 sm4u-34 postgres[4934]: [2692-1] db=,user=,app=,client= LOG: checkpoint starting: time
Nov 5 07:27:51 sm4u-34 postgres[7728]: [235-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 148655.719 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:30:20 sm4u-34 postgres[7728]: [236-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 148677.766 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:32:49 sm4u-34 postgres[7728]: [237-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 149493.666 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:34:15 sm4u-34 postgres[7728]: [238-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 85751.267 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 07:40:50 sm4u-34 postgres[4934]: [2693-1] db=,user=,app=,client= LOG: checkpoint complete: wrote 8356 buffers (0.3%); 0 WAL file(s) added, 3 removed, 0 recycled; write=887.648 s, sync=0.001 s, total=887.660 s; sync files=51, longest=0.001 s, average=0.001 s; distance=47063 kB, estimate=160189 kB
Nov 5 07:56:03 sm4u-34 postgres[4934]: [2694-1] db=,user=,app=,client= LOG: checkpoint starting: time
Nov 5 08:12:46 sm4u-34 postgres[4934]: [2695-1] db=,user=,app=,client= LOG: checkpoint complete: wrote 9436 buffers (0.3%); 0 WAL file(s) added, 5 removed, 0 recycled; write=1002.606 s, sync=0.002 s, total=1002.627 s; sync files=87, longest=0.001 s, average=0.001 s; distance=77658 kB, estimate=151936 kB
Nov 5 08:26:03 sm4u-34 postgres[4934]: [2696-1] db=,user=,app=,client= LOG: checkpoint starting: time
Nov 5 08:38:03 sm4u-34 postgres[7728]: [317-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 206.436 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 08:38:04 sm4u-34 postgres[7728]: [318-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 222.790 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 08:38:04 sm4u-34 postgres[7728]: [319-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 225.146 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 08:38:04 sm4u-34 postgres[7728]: [320-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 217.768 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 08:38:04 sm4u-34 postgres[7728]: [321-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 221.421 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 08:38:05 sm4u-34 postgres[7728]: [322-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 280.209 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 08:38:05 sm4u-34 postgres[7728]: [323-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 224.838 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 08:38:05 sm4u-34 postgres[7728]: [324-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 232.646 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 08:38:06 sm4u-34 postgres[7728]: [325-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 261.916 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 08:38:06 sm4u-34 postgres[7728]: [326-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 252.336 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 08:38:06 sm4u-34 postgres[7728]: [327-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 231.334 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 08:38:06 sm4u-34 postgres[7728]: [328-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 221.924 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'
Nov 5 08:38:07 sm4u-34 postgres[7728]: [329-1] db=tapesystem,user=Administrator,app=PostgreSQL JDBC Driver,client=127.0.0.1 LOG: duration: 222.862 ms statement: COPY ds3.blob (byte_offset, checksum, checksum_type, id, length, object_id) FROM STDIN WITH DELIMITER AS '|'

> We’re coming from PostgreSQL 9.6 on FreeBSD 11 where we did not see this problem, but have a major release upgrade happening. I’m checking to see if this machine was updated or was a fresh install.
> PostgreSQL 13.2 on amd64-portbld-freebsd13.0, compiled by FreeBSD clang version 11.0.1 (git(at)github(dot)com<mailto:git(at)github(dot)com><mailto:git(at)github(dot)com>:llvm/llvm-project.git llvmorg-11.0.1-0-g43ff75f2c3fe), 64-bit
>
> Changes made to the settings in the postgresql.conf file
> checkpoint_timeout | 30min | configuration file
> log_checkpoints | on | configuration file
> log_lock_waits | on | configuration file
...
> shared_buffers | 21679MB | configuration file

> Operating system and version:
> FreeBSD sm4u-34 13.0-STABLE FreeBSD 13.0-STABLE #0: Mon Sep 13 10:11:57 MDT 2021

> These are the system calls made over 30 seconds from Postgres during a slowdown.
...
> fsync 27

--
Justin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Robert Creager 2021-11-16 22:40:43 Re: Need help identifying a periodic performance issue.
Previous Message arjun shetty 2021-11-16 10:10:24 PostgreSQLv14 TPC-H performance GCC vs Clang