Re: reindex and copy - deadlock?

From: Litao Wu <litaowu(at)yahoo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: reindex and copy - deadlock?
Date: 2004-06-30 16:17:46
Message-ID: 20040630161746.10572.qmail@web13121.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Tom,

Our PG version is 7.3.2.

The copy process is always there. Besides copy
process, there are many select processes wait also
(it is understandable only when reindex,
but how come selects wait when drop/create index?
>From Postgres doc:
Note: Another approach to dealing with a corrupted
user-table index is just to drop and recreate it. This
may in fact be preferable if you would like to
maintain some semblance of normal operation on the
table meanwhile. REINDEX acquires exclusive lock on
the table, while CREATE INDEX only locks out writes
not reads of the table.
)

Each time, whan this happened, it might hang
on the different index.

But one thing is sure:
reindex or create index is granted lock while
others wait. If reindex/create index is not
the perpetrator, how can PG grants it lock
but not others, like COPY?

Forgive me I had not provided the full table and
index names, IP address, etc. for security reason.

Here is the copy of my the first post on June 8:
Hi,

We often experience with the problem that reindex
cannot be finished in our production database.
It's typically done with 30 minutes. However,
sometimes, when there is another "COPY" process,
reindex will not finish. By monitoring the CPU
time reindex takes, it does not increase at all.
That seems a deadlock. But the following query shows
only reindex process (23127)is granted lock while
COPY process (3149) is not.

Last time when we have this problem and kill
reindex process and COPY process does not work.
We had to bounce the database server.

As you know, when reindex is running, nobody can
access the table.
Can someone kindly help?

Thanks,

Here is lock info from database:

replace | database | transaction | pid
| mode | granted
-----------------------+----------+-------------+-------+---------------------+---------
email | 17613 | |
3149 | RowExclusiveLock | f
email_cre_dom_idx | 17613 | |
23127 | ExclusiveLock | t
email_cid_cre_idx | 17613 | |
23127 | ShareLock | t
email_cid_cre_idx | 17613 | |
23127 | AccessExclusiveLock | t
email | 17613 | |
23127 | ShareLock | t
email | 17613 | |
23127 | AccessExclusiveLock | t
email_cid_cre_dom_idx | 17613 | |
23127 | ShareLock | t
email_cid_cre_dom_idx | 17613 | |
23127 | AccessExclusiveLock | t
email_did_cre_idx | 17613 | |
23127 | ShareLock | t
email_did_cre_idx | 17613 | |
23127 | AccessExclusiveLock | t
email_cre_dom_idx | 17613 | |
23127 | AccessExclusiveLock | t
(11 rows)

Here are the processes of 3149 and 23127 from OS:

postgres 3149 1.3 6.4 154104 134444 ? S
Jun03 92:04 postgres: postgres db1 xx.xx.xx.xx COPY
waiting

postgres 23127 3.2 9.3 228224 194512 ? S
03:35 15:03 postgres: postgres db1 [local] REINDEX

Here are queries from database:
23127 | REINDEX table email

3149 | COPY email (...) FROM stdin

--- Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Litao Wu <litaowu(at)yahoo(dot)com> writes:
> > It happened again.
> > This time it hangs when we drop/create index.
> > Here is gdb info with --enable-debug postgres.
>
> Well, that pretty much removes all doubt: something
> has left the buffer
> context lock (cntx_lock) set on a buffer that
> certainly ought to be free.
>
> The problem here is that REINDEX (or CREATE INDEX in
> this case) is the
> victim, not the perpetrator, so we still don't know
> exactly what's
> causing the error. We need to go backwards in time,
> so to speak, to
> identify the code that's leaving the buffer locked
> when it shouldn't.
> I don't offhand have a good idea about how to do
> that. Is there another
> process that is also getting stuck when REINDEX does
> (if so please get
> a backtrace from it too)?
>
> BTW, what Postgres version are you using again? The
> line numbers in
> your trace don't square with any current version of
> bufmgr.c ...
>
> regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose
> an index scan if your
> joining column's datatypes do not match
>



__________________________________
Do you Yahoo!?
New and Improved Yahoo! Mail - 100MB free storage!
http://promotions.yahoo.com/new_mail

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Duane Lee - EGOVX 2004-06-30 16:32:42 Re: postgres 7.4 at 100%
Previous Message Tom Lane 2004-06-30 15:43:38 Re: reindex and copy - deadlock?