From: | Nikhil Shetty <nikhil(dot)dba04(at)gmail(dot)com> |
---|---|
To: | Andrew Kerber <andrew(dot)kerber(at)gmail(dot)com> |
Cc: | pgsql-admin <pgsql-admin(at)postgresql(dot)org> |
Subject: | Re: Logical Replication Speed |
Date: | 2021-09-29 15:29:40 |
Message-ID: | CAFpL5Vw6sq6D=-0MNUkwF2P+qnFiW5zZnCu_T7et88N71jYDsA@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
Hi Andrew,
Our hardware at the destination is pretty good where all the write is
happening
Source
CPU -20
Mem -180GB
pg_test_fsync -s 5
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 92.364 ops/sec 10827 usecs/op
fdatasync 92.047 ops/sec 10864 usecs/op
fsync 45.893 ops/sec 21790 usecs/op
fsync_writethrough n/a
open_sync 45.849 ops/sec 21811 usecs/op
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 33.290 ops/sec 30039 usecs/op
fdatasync 90.455 ops/sec 11055 usecs/op
fsync 45.314 ops/sec 22068 usecs/op
fsync_writethrough n/a
open_sync 22.740 ops/sec 43976 usecs/op
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
1 * 16kB open_sync write 45.280 ops/sec 22085 usecs/op
2 * 8kB open_sync writes 20.371 ops/sec 49088 usecs/op
4 * 4kB open_sync writes 11.346 ops/sec 88140 usecs/op
8 * 2kB open_sync writes 5.334 ops/sec 187477 usecs/op
16 * 1kB open_sync writes 2.671 ops/sec 374454 usecs/op
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
write, fsync, close 45.814 ops/sec 21828 usecs/op
write, close, fsync 41.236 ops/sec 24251 usecs/op
Non-sync'ed 8kB writes:
write 202147.625 ops/sec 5 usecs/op
Destination
CPU - 60
Mem - 350GB
pg_test_fsync -s 5
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 28146.149 ops/sec 36 usecs/op
fdatasync 26721.581 ops/sec 37 usecs/op
fsync 23500.918 ops/sec 43 usecs/op
fsync_writethrough n/a
open_sync 25170.058 ops/sec 40 usecs/op
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 14390.340 ops/sec 69 usecs/op
fdatasync 19655.191 ops/sec 51 usecs/op
fsync 17476.174 ops/sec 57 usecs/op
fsync_writethrough n/a
open_sync 12840.007 ops/sec 78 usecs/op
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
1 * 16kB open_sync write 19287.311 ops/sec 52 usecs/op
2 * 8kB open_sync writes 12786.778 ops/sec 78 usecs/op
4 * 4kB open_sync writes 7527.042 ops/sec 133 usecs/op
8 * 2kB open_sync writes write failed: Invalid argument
On Wed, Sep 29, 2021 at 8:02 PM Andrew Kerber <andrew(dot)kerber(at)gmail(dot)com>
wrote:
> The speed is going to be governed by the available hardware, so improve
> the hardware to improve the speed of the initial data copy. You should also
> be able to add space to the file system for your wal logs without too much
> difficulty in RHEL7.
>
> On Wed, Sep 29, 2021 at 9:26 AM Nikhil Shetty <nikhil(dot)dba04(at)gmail(dot)com>
> wrote:
>
>> Hi Team,
>>
>> We have a 10 TB database most of which is occupied by a single table.
>> This table is around 9.5TB. Bloat is 3.5TB for this table. So, the total
>> table size without bloat is 6TB.
>>
>> We are using native logical replication to remove the bloat for this
>> table to another instance but it takes more than a day to copy 1.5TB of
>> data. With this speed it will take 4-5 days for the initial sync itself.
>> During this initial sync, the replication slot holds WAL and creates disk
>> usage problems on the primary, so we had to stop the replication. We cannot
>> remove the PK since it is the Replica identity.Other indexes and FK have
>> been removed from the destination(secondary).
>>
>> Other changes we have made in the destination(secondary) instance but
>> still no luck
>> - fsync=off,synchronous_commit=off, archive_mode=off.
>>
>> Since it is a single table, we cannot increase worker_processes. Source
>> database is highly transactional and we do not need any downtime.
>>
>> Is there any other option to increase the speed of initial data copy in
>> logical replication?
>>
>> Postgres Version - 11.7
>> OS - OEL7
>>
>> Thanks and Regards,
>> Nikhil
>>
>
>
> --
> Andrew W. Kerber
>
> 'If at first you dont succeed, dont take up skydiving.'
>
From | Date | Subject | |
---|---|---|---|
Next Message | Sergio Alonso | 2021-09-30 16:42:36 | Barman setup |
Previous Message | Andrew Kerber | 2021-09-29 14:32:31 | Re: Logical Replication Speed |