Re: Slow WAL recovery for DROP TABLE

From: Sherrylyn Branchaw <sbranchaw(at)gmail(dot)com>
To: jeff(dot)janes(at)gmail(dot)com
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Slow WAL recovery for DROP TABLE
Date: 2018-07-18 17:11:56
Message-ID: CAB_myF7Z+8yYPgaDXBLFDbQvEtPi=0-fzM6+O_izJZpTSvgWOA@mail.gmail.com
Lists: pgsql-general

> There was a recent commit for a similar performance problem, which will
> appear in 9.6.10. But that was specifically for cases where there were
> multiple dropped tables per transaction, and large shared_buffers.

Interesting, and good to know, thanks! I'm not sure we fall under either
(is 8 GB large? It's larger than the default, but I always thought large
was defined as "more than 8GB" for this setting), but it sounds like this
sort of problem is on the developers' radars. It's possible up to 4 tables
were dropped per transaction in prod, but I don't know if that's enough to
count as "multiple", and in testing, I reproduced the problem with 1 drop
per transaction.

> I can't reproduce your single-drop-per-transaction problem. The replica
> has no problem keeping up with the master.

It's possible that the problem only occurs when the replica is on inferior
hardware. I was unable to test equal servers in the time I had. I noticed
that when the superior server was the replica, it was able to keep up with
the inferior primary, but that dropping tables was the only action the
inferior server couldn't keep up with as a standby, and the only action for
which the standalone outperformed the replica. I did not test truncates;
it's possible I would have seen the same problem with them.

> Can you share the reproduction scripts

For the table drops, I prepped by running these:

CREATE TABLE IF NOT EXISTS test1 (id int);
CREATE TABLE IF NOT EXISTS test2 (id int);
CREATE TABLE IF NOT EXISTS test3 (id int);
...
CREATE TABLE IF NOT EXISTS test100000 (id int);

Then I dropped with these:

DROP TABLE IF EXISTS test1;
DROP TABLE IF EXISTS test2;
DROP TABLE IF EXISTS test3;
...
DROP TABLE IF EXISTS test100000;
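
Aside: rather than writing out 100,000 statements by hand, scripts like
these can also be generated and run with psql's \gexec (available in 9.6+).
A rough sketch, not exactly what I ran:

SELECT format('CREATE TABLE IF NOT EXISTS test%s (id int)', i)
FROM generate_series(1, 100000) AS i \gexec

SELECT format('DROP TABLE IF EXISTS test%s', i)
FROM generate_series(1, 100000) AS i \gexec

Under autocommit, \gexec runs each generated statement in its own
transaction, which matches the one-drop-per-transaction case.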

For the inserts, I did a

CREATE TABLE test (content1 TEXT, content2 TEXT, content3 TEXT);

followed by 300 of these statements, which insert a random string into each
of the three columns, in batches of 10000 rows:

INSERT INTO test
SELECT array_to_string(ARRAY(SELECT chr((48 + round(random() * 9)) :: integer)
                             FROM generate_series(1,135)), ''),
       array_to_string(ARRAY(SELECT chr((48 + round(random() * 9)) :: integer)
                             FROM generate_series(1,135)), ''),
       array_to_string(ARRAY(SELECT chr((48 + round(random() * 9)) :: integer)
                             FROM generate_series(1,135)), '')
FROM (SELECT generate_series(1,10000)) foo;

I loaded and timed each script as a file using

time psql -d mydb -f drop_tables.sql

> any non-default config settings? Especially the setting of
> shared_buffers (on both master and replica, if different)

Our shared_buffers setting is 8 GB on all nodes.

In prod, we had a primary and two standbys. One standby was very similar to
the primary in hardware, but older and with fewer CPUs. The other standby
was far inferior in hardware. For testing, I used the two standbys, and was
able to reproduce when I made the superior standby the primary.

These are the non-default settings on the primary and the standby of
comparable hardware (other than changing file and directory paths, which
shouldn't affect anything):

listen_addresses = '*'
log_destination = 'stderr, syslog'
log_filename = 'postgresql-%Y-%m-%d.log'
log_line_prefix = '%p [%m]: %u %d %q %h %a %v %x'
syslog_facility = 'local1'
log_checkpoints = on
log_connections = on
log_disconnections = on
log_lock_waits = on
log_temp_files = 0
log_timezone = 'US/Central'
log_statement = 'ddl'
track_functions = pl
track_activity_query_size = 4096
timezone = US/Central
client_encoding = UTF8
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
max_connections = 1024
tcp_keepalives_idle = 7200
tcp_keepalives_interval = 75
tcp_keepalives_count = 9
work_mem = '48 MB'
maintenance_work_mem = '1 GB'
max_locks_per_transaction = 8192
random_page_cost = 2.0
effective_cache_size = '94GB'
log_autovacuum_min_duration = 10s
autovacuum_naptime = 2min
autovacuum_vacuum_threshold = 500
autovacuum_analyze_threshold = 250
autovacuum_freeze_max_age = 1000000000
autovacuum_vacuum_cost_delay = 10ms
autovacuum_vacuum_scale_factor = 0.01
autovacuum_analyze_scale_factor = 0.01
wal_buffers = 16MB
bgwriter_lru_maxpages = 250
max_wal_senders = 5
wal_keep_segments = 256
hot_standby = on
log_min_duration_statement = 2s
wal_level = replica
wal_log_hints = on
archive_mode = on
archive_command = 'ssh postgres@backupserver "test ! -f /opt/backup/db1/%f" && rsync -q %p postgres@backupserver:/opt/backup/db1/%f'
max_parallel_workers_per_gather = 4
max_worker_processes = 16

The only different setting on the second standby is effective_cache_size,
which is 24 GB instead of 94 GB.

To rule out fetching from the remote archive as the bottleneck, I tried
scp-ing a bunch of WAL files to the standby and resetting the restore
command to replay from the local archive. Same performance problem, and
only when dropping tables.
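
For reference, "replaying from the local archive" here just means pointing
restore_command at the directory the WAL files were scp'd to, along these
lines (the path is only illustrative):

restore_command = 'cp /path/to/local/wal_archive/%f %p'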

Best,
Sherrylyn
