Re: BUG #14938: ALTER TABLE hang/ poor performance

From: Dipesh Kamdar <dipesh(dot)kamdar(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #14938: ALTER TABLE hang/ poor performance
Date: 2017-12-01 17:37:39
Message-ID: CACLLSiN0HZAK5YBHURXKq6-=Um3cmjFcPVnTrzKRyVS-tEwjLQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

#1 9.5.3 was released more than a year ago, you might be hitting one of
the
bugs fixed since then.

>>> Do you know any bug related ?

#2 What do you mean by deadlock?Moreover, autovacuum certainly should not
deadlock with anything (certainly not with DELETE)

>>> Please take a look of SEQ
deleting TAB1 data of 50K ------ T1
autovacuum check TAB1 ------- T2 ( Since delete is running
skipping table )
delete completed ---- T3
before starting next delete on same table autovacuum started on TAB1
---- T4
DELETE statement is waiting for autovacuum to release lock.
after some time later reported deadlock in log and killed DELETE
process.

#3 I also don't quite understand why you do the delete in batches of 50k
rows, to be honest

>> It is client facing application with close to 1.5 billion records and
has multiple FK other and indexes removing again millions of record in
single statement means
putting entire online application on fire.

#4 I am running following SQL as monitoring point of every 10min.
Let me know if anything is missing above monitor script.

SELECT

COALESCE(blockingl.relation*::*regclass*::*text,blockingl.locktype) as
locked_item,

now() *-* blockeda.query_start AS waiting_duration, blockeda.pid AS
blocked_pid,

blockeda.query as blocked_query, blockedl.mode as blocked_mode,

blockinga.pid AS blocking_pid, blockinga.query as blocking_query,

blockingl.mode as blocking_mode

FROM pg_catalog.pg_locks blockedl

INNER JOIN pg_stat_activity blockeda ON blockedl.pid *=* blockeda.pid

INNER JOIN pg_catalog.pg_locks blockingl ON(

( (blockingl.transactionid*=*blockedl.transactionid) OR

(blockingl.relation*=*blockedl.relation AND
blockingl.locktype*=*blockedl.locktype)

) AND blockedl.pid *!=* blockingl.pid)

INNER JOIN pg_stat_activity blockinga ON blockingl.pid *=*
blockinga.pid AND blockinga.datid *=* blockeda.datid

WHERE NOT blockedl.granted

AND blockinga.datname *=* current_database()

#5 It is production , We will not able to attach process to gdb.

#6 My main concern is not coming in pg_locks table.

Is it advice to use update statement on pg_class.reloptions column rather
than using ALTER TABLE SET option.

-Dipesh

On Fri, Dec 1, 2017 at 6:49 AM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

>
> On 12/01/2017 12:30 AM, dipesh(dot)kamdar(at)gmail(dot)com wrote:
> > The following bug has been logged on the website:
> >
> > Bug reference: 14938
> > Logged by: Dipesh Kamdar
> > Email address: dipesh(dot)kamdar(at)gmail(dot)com
> > PostgreSQL version: 9.5.3
> > Operating system: Linux
> > Description:
> >
> > Postgres : 9.5.3
> > Kernal : 3.10.0-327.13.1.el7.x86_64
> > Linux : x86_64 x86_64 x86_64 GNU/Linux
> > RAM 128GB
> > DISK : 1.5TB
> >
>
> 9.5.3 was released more than a year ago, you might be hitting one of the
> bugs fixed since then.
>
> >
> > We have nightly job that removed millions of records from multiple table.
> >
> > We had following approach.
> > 1. Delete data from table in batch of 50000
> >
> > Problem with above approach many time autovacuum on table and delete
> > statement on table create deadlock.
> > In order to avoid above problem, we have taken following approach.
> >
>
> What do you mean by deadlock? Moreover, autovacuum certainly should not
> deadlock with anything (certainly not with DELETE). In the worst case
> autovacuum should cancel itself automatically.
>
> I also don't quite understand why you do the delete in batches of 50k
> rows, to be honest.
>
> > 1 Turn off autovacuum on table by using ALTER TABLE <tablename> SET
> (
> > autovacuum_enabled=false);
> > 2. Delete data from table in batch of 50000
> > 3. Turn On autovacuum on table by using ALTER TABLE <tablename> SET (
> > autovacuum_enabled=true);
> >
> >
> > Problem with second approach ALTER TABLE tablename SET (
> > autovacuum_enabled=FALSE) get hang very often. I am not seeing
> > anything pg_lock that is waiting for resource and any other process
> > blocking. Process manytime take 12hour , 13hours etc.
> Are you sure there's nothing in pg_locks? What does
>
> SELECT pg_backend_pid();
> ALTER TABLE tablename SET (autovacuum_enabled=FALSE);
>
> and then in ahother session (when the ALTER TABLE gets stuck)
>
> SELECT * FROM pg_locks WHERE pid = $PID; <- pg_backend_pid
> SELECT * FROM pg_locks WHERE NOT granted;
>
> show?
>
> If it really does not show any waiting locks, then you'll need to
> inspect it using gdb. Install debuginfo packages, and then do
>
> gdb -p $PID
> (gdb) bt
>
> FWIW I really doubt you really neet this process of disabling/enabling
> autovacuum. This should work just fine with autovacuum running.
>
> >
> > Found article on net regarding vacuum stuck
> >
> > http://rhaas.blogspot.com/2011/03/troubleshooting-stuck-vacuums.html
> >
>
> That is entirely unrelated. The article is about the autovacuum process
> getting stuck, while you're observing ALTER TABLE getting stuck.
>
> > Database monitoring script is not reporting any waiting, pg_log is not
> > reporting any error or deadlock.
> > Is there anyway we can figure out any process is blocking or waiting for
> > resource etc.
> > My basic understanding about this ALTER TABLE SET command it updated
> record
> > in pg_classs.reloptions column.
> > Does ALTER TABLE SET option block complete table?
> >
>
> It does need a lock on the table, yes. So if there are any long-running
> queries accessing that table, it may need to wait for them to complete.
> But that should be visible in pg_locks, and you claim there's nothing
> (no locks waiting).
>
> regards
>
> --
> Tomas Vondra http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tomas Vondra 2017-12-01 18:30:33 Re: BUG #14938: ALTER TABLE hang/ poor performance
Previous Message Tomas Vondra 2017-12-01 17:17:45 Re: BUG #14940: Duplicated records inspite of primary key and unique constraint