Re: [PATCH] Speedup truncates of relation forks

From: Adrien Nayrat <adrien(dot)nayrat(at)anayrat(dot)info>
To: "Jamison, Kirk" <k(dot)jamison(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Speedup truncates of relation forks
Date: 2019-06-26 09:09:58
Message-ID: e516bb08-e846-739d-886b-d3119b9992e0@anayrat.info
Lists: pgsql-hackers

On 6/12/19 10:29 AM, Jamison, Kirk wrote:
>
>> From a user POV, the main issue with relation truncation is that it can block
>> queries on a standby server during truncation replay.
>>
>> It could be interesting if you could test this case and give the results
>> of your patch.
>> Maybe by performing read queries on the standby server and counting wait
>> events with pg_wait_sampling?
>
> Thanks for the suggestion. I tried using the pg_wait_sampling extension,
> but I wasn't sure that I could replicate the problem of blocked queries on the
> standby server. Could you advise?
> Here's what I did for now, similar to my previous test with a hot standby setup,
> but with additional reads of the wait events on the standby server.
>
> 128MB shared_buffers
> SELECT create_tables(10000);
> SELECT insert_tables(10000);
> SELECT delfrom_tables(10000);
>
> [Before VACUUM]
> Standby: SELECT from the pg_stat_waitaccum view
>
> wait_event_type |   wait_event    | calls | microsec
> -----------------+-----------------+-------+----------
> Client          | ClientRead      |     2 | 20887759
> IO              | DataFileRead    |   175 |     2788
> IO              | RelationMapRead |     4 |       26
> IO              | SLRURead        |     2 |       38
>
> Primary: Execute VACUUM (induces relation truncates)
>
> [After VACUUM]
> Standby:
> wait_event_type |   wait_event    | calls | microsec
> -----------------+-----------------+-------+----------
> Client          | ClientRead      |     7 | 77662067
> IO              | DataFileRead    |   284 |     4523
> IO              | RelationMapRead |    10 |       51
> IO              | SLRURead        |     3 |       57
>

(Sorry for the delay, I forgot to answer you)

As far as I remember, you should see "relation" wait events (of type Lock) on the
standby server. They appear because the startup process acquires an
AccessExclusiveLock to replay the truncation, while the other backends wait to
acquire a lock before they can read the table.
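
If you want to catch it with pg_wait_sampling, something like the following on
the standby should show it while the truncation is being replayed (a minimal
sketch, assuming the extension is installed and its profile was reset before the
test):

    -- lock waits accumulated by pg_wait_sampling
    SELECT event_type, event, count
    FROM pg_wait_sampling_profile
    WHERE event_type = 'Lock' AND event = 'relation';

    -- or, live, the backends stuck behind the startup process
    SELECT pid, wait_event_type, wait_event, query
    FROM pg_stat_activity
    WHERE wait_event_type = 'Lock' AND wait_event = 'relation';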

On the primary server, vacuum instead gives up the truncation rather than block
other backends (this code is from lazy_truncate_heap() in vacuumlazy.c):

    /*
     * We need full exclusive lock on the relation in order to do
     * truncation. If we can't get it, give up rather than waiting --- we
     * don't want to block other backends, and we don't want to deadlock
     * (which is quite possible considering we already hold a lower-grade
     * lock).
     */
    vacrelstats->lock_waiter_detected = false;
    lock_retry = 0;
    while (true)
    {
        if (ConditionalLockRelation(onerel, AccessExclusiveLock))
            break;

        /*
         * Check for interrupts while trying to (re-)acquire the exclusive
         * lock.
         */
        CHECK_FOR_INTERRUPTS();

        if (++lock_retry > (VACUUM_TRUNCATE_LOCK_TIMEOUT /
                            VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL))
        {
            /*
             * We failed to establish the lock in the specified number of
             * retries. This means we give up truncating.
             */
            vacrelstats->lock_waiter_detected = true;
            ereport(elevel,
                    (errmsg("\"%s\": stopping truncate due to conflicting lock request",
                            RelationGetRelationName(onerel))));
            return;
        }

        pg_usleep(VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL * 1000L);
    }
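
For reference, the retry cadence above comes from these defines near the top of
vacuumlazy.c (values from current sources, worth re-checking in your tree), so
vacuum abandons the truncation after about five seconds of conflict:

    #define VACUUM_TRUNCATE_LOCK_CHECK_INTERVAL     20      /* ms */
    #define VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL      50      /* ms */
    #define VACUUM_TRUNCATE_LOCK_TIMEOUT            5000    /* ms */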

To maximize the chances of reproducing it, we can use a big shared_buffers: the
truncation scans the whole buffer pool while the AccessExclusiveLock is held, so
replay takes longer and the conflict is easier to observe. But I am afraid it is
not easy to build reproducible tests to compare results, and unfortunately I
don't have servers to run them on.
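
If someone wants to try, this is roughly the standby-side setup I have in mind
(illustrative values, nothing I have benchmarked):

    # postgresql.conf on the standby
    shared_buffers = 8GB                  # bigger buffer pool -> longer truncation replay
    hot_standby = on
    max_standby_streaming_delay = 30s     # how long replay waits for conflicting queries
    shared_preload_libraries = 'pg_wait_sampling'

Then run read-only queries against the vacuumed tables on the standby while the
primary runs VACUUM, and compare the Lock/relation counters with and without the
patch.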

Regards,
