RE: Recovery performance of DROP DATABASE with many tablespaces

From: "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>
To: 'Fujii Masao' <masao(dot)fujii(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Recovery performance of DROP DATABASE with many tablespaces
Date: 2019-11-19 06:39:53
Message-ID: OSBPR01MB32077D0AF3728A1F92469C6BEF4C0@OSBPR01MB3207.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 13, 2019 5:34PM (GMT+9), Fujii Masao wrote:
> On Wed, Nov 13, 2019 at 3:57 PM k(dot)jamison(at)fujitsu(dot)com <k(dot)jamison(at)fujitsu(dot)com>
> wrote:
> >
> > On Wed, Oct. 2, 2019 5:40 PM, Fujii Masao wrote:
> > > On Tue, Jul 10, 2018 at 3:04 PM Michael Paquier <michael(at)paquier(dot)xyz>
> wrote:
> > > >
> > > > On Thu, Jul 05, 2018 at 01:42:20AM +0900, Fujii Masao wrote:
> > > > > TBH, I have no numbers measured by the test.
> > > > > One question about your test is; how did you measure the
> > > > > *recovery
> > > > > time* of DROP DATABASE? Since it's *recovery* performance,
> > > > > basically it's not easy to measure that.
> > > >
> > > > It would be simple to measure the time it takes to replay this
> > > > single DROP DATABASE record by putting two gettimeofday() calls or
> > > > such things and then take the time difference. There are many
> > > > methods that you could use here, and I suppose that with a shared
> > > > buffer setting of a couple of GBs of shared buffers you would see
> > > > a measurable difference with a dozen of tablespaces or so. You
> > > > could also take a base backup after creating all the tablespaces,
> > > > connect the standby and then drop the database on the primary to
> > > > see the actual time it takes. Your patch looks logically correct
> > > > to me because DropDatabaseBuffers is a
> > > > *bottleneck* with large shared_buffers, and it would be nice to
> > > > see numbers.
> > >
> > > Thanks for the comment!
> > >
> > > I measured how long it takes to replay DROP DATABASE with 1000
> > > tablespaces, in master and patched version. shared_buffers was set to
> 16GB.
> > >
> > > [master]
> > > It took 8 seconds to replay DROP DATABASE with 1000 tablespaces, as follows.
> > > In this case, 16GB shared_buffers was fully scanned 1000 times.
> > >
> > > 2019-10-02 16:50:14 JST LOG: redo starts at 0/2000028
> > > 2019-10-02 16:50:22 JST LOG: redo done at 0/300A298
> > >
> > > [patched]
> > > It took less than 1 second to replay DROP DATABASE with 1000
> > > tablespaces, as follows. In this case, 16GB shared_buffers was scanned
> only one time.
> > >
> > > 2019-10-02 16:47:03 JST LOG: redo starts at 0/2000028
> > > 2019-10-02 16:47:03 JST LOG: redo done at 0/3001588
> > >
> >
> > Hi Fujii-san,
> >
> > It's been a while, so I checked the patch once again.
> > It's fairly straightforward and I saw no problems nor bug in the code.
>
> Thanks for the review!
>
> > > [patched]
> > > It took less than 1 second to replay DROP DATABASE with 1000
> > > tablespaces,
> > The results are good.
> > I want to replicate the performance to confirm the results as well.
> > Could you share how you measured the recovery replay?
>
> I forgot the actual steps that I used for the measurement.
> But I think they are something like
>
> 1. create database "hoge"
> 2. create 1,000 tablespaces
> 3. create 1,000 tables on the database "hoge".
> each table should be placed in different tablespace.
> 4. take a base backup
> 5. drop database "hoge"
> 6. shutdown the server with immediate mode 7. start an archive recovery from
> the backup taken at #4 8. measure how long it takes to apply DROP DATABASE
> record by
> checking the timestamp at REDO start and REDO end.
>
> I think that I performed the above steps on the master and the patched version.
>
> > Did you actually execute a failover?
>
> No.

I'm sorry for the late reply, and thanks for the guide above.
I replicated the same recovery test above on a standalone server
and have confirmed with the logs that the patch made the recovery faster.

[MASTER/UNPATCHED] ~10 seconds
2019-11-19 15:25:23.891 JST [23042] LOG: redo starts at 0/180006A0
...
2019-11-19 15:25:34.492 JST [23042] LOG: redo done at 0/1800A478

[PATCHED] ~less than 1 sec
2019-11-19 15:31:59.415 JST [17625] LOG: redo starts at 0/40005B8
...
2019-11-19 15:32:00.159 JST [17625] CONTEXT: WAL redo at 0/4000668 for Database/DROP: dir 1663/16384 16385/16384...//further details ommitted//...
...
2019-11-19 15:32:00.159 JST [17625] LOG: redo done at 0/4001638

I believe there are no problems, so I am marking this patch now
as "Ready for Committer".

Regards,
Kirk Jamison

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2019-11-19 07:37:04 Re: Hypothetical indexes using BRIN broken since pg10
Previous Message Michael Paquier 2019-11-19 05:40:25 Re: Hypothetical indexes using BRIN broken since pg10