Reply: [External] Re: pgadmin--pgagent---the process hang by unknow reasons

From: Zhiyu ZY13 Xu <xuzy13(at)lenovo(dot)com>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: "pgadmin-support(at)postgresql(dot)org" <pgadmin-support(at)postgresql(dot)org>
Subject: Reply: [External] Re: pgadmin--pgagent---the process hang by unknow reasons
Date: 2020-12-01 09:17:24
Message-ID: HK2PR03MB461039B960E865A297219C7AA8F40@HK2PR03MB4610.apcprd03.prod.outlook.com
Lists: pgadmin-support pgsql-bugs

Hi Dave

Thanks for your update. I tried to upgrade pgagent_10-3.4.0, but the upgrade failed.
Could you help me narrow down this upgrade issue? Thanks.

Upgrade path                                  Version upgrade   Status
--------------------------------------------  ----------------  ------
yum install pgagent_10                        3.4 – 4.0         failed
\i pgagent--3.4--4.0.sql
create extension pgagent;

compile and build (install cmake and Boost)   3.4 – 4.2         failed
\i pgagent--3.4--4.2.sql
create extension pgagent;

The current pgagent_10 3.4.0 environment was installed by my teammate. He told me that he installed it with yum and then ran a script,
which created the pgagent tables. No pgagent extension exists, but the pgagent tables do.

Then I tried to upgrade pgagent_10 from 3.4.0 to 4.0.0 with yum. The yum upgrade itself succeeded,
but the pgagent_10 process is unable to start.

[root(at)sltfjfrauxq ~]# yum upgrade pgagent_10

Total download size: 138 k
Is this ok [y/N]: y
Downloading Packages:
pgagent_10-4.0.0-4.rhel6.x86_64.rpm | 138 kB 00:08
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Updating : pgagent_10-4.0.0-4.rhel6.x86_64 1/2
Cleanup : pgagent_10-3.4.0-10.rhel6.x86_64 2/2
Verifying : pgagent_10-4.0.0-4.rhel6.x86_64 1/2
Verifying : pgagent_10-3.4.0-10.rhel6.x86_64 2/2

Updated:
pgagent_10.x86_64 0:4.0.0-4.rhel6

Complete!
Version: 4.0.0

The pgagent 4.0 startup log:
Tue Dec 1 13:57:53 2020 DEBUG: Creating primary connection
Tue Dec 1 13:57:53 2020 DEBUG: Parsing connection information...
Tue Dec 1 13:57:53 2020 DEBUG: user: postgres
Tue Dec 1 13:57:53 2020 DEBUG: password: *****
Tue Dec 1 13:57:53 2020 DEBUG: dbname: postgres
Tue Dec 1 13:57:53 2020 DEBUG: hostaddr: 127.0.0.1
Tue Dec 1 13:57:53 2020 DEBUG: port: 5432
Tue Dec 1 13:57:53 2020 DEBUG: Creating DB connection: user=postgres password=abcd-1234 hostaddr=127.0.0.1 port=5432 dbname=postgres
Tue Dec 1 13:57:53 2020 DEBUG: Database sanity check

I then tried to upgrade it with SQL, but that also failed.

[root(at)slbwcbnos2 sql]# pwd
/data/postgres/new_package/pgAgent-4.0.0-Source/sql
[root(at)slbwcbnos2 sql]# ls -al
total 36
drwxr-xr-x 2 501 games 90 Dec 1 14:25 .
drwxr-xr-x 6 501 games 298 Jul 12 2018 ..
-rw-r--r-- 1 501 games 538 Jul 12 2018 pgagent--3.4--4.0.sql
-rw-r--r-- 1 501 games 27772 Jul 12 2018 pgagent.sql
-rw-r--r-- 1 501 games 2603 Jul 12 2018 pgagent--unpackaged--4.0.sql

postgres=# \i pgagent--3.4--4.0.sql
Use "CREATE EXTENSION pgagent UPDATE" to load this file.
postgres=# CREATE EXTENSION pgagent UPDATE;
ERROR: syntax error at or near "UPDATE"
LINE 1: CREATE EXTENSION pgagent UPDATE;
^
postgres=# CREATE EXTENSION pgagent UPDATE;
ERROR: syntax error at or near "UPDATE"
LINE 1: CREATE EXTENSION pgagent UPDATE;
^
postgres=# CREATE EXTENSION pgagent ;
ERROR: extension "pgagent" has no installation script nor update path for version "3.4"
postgres=#

After that failed, I built and compiled the latest version, 4.2.
The build and install succeeded, but I am still unable to upgrade the current 3.4 installation.
I can create the extension successfully on a new database.

postgres=# \i pgagent--3.4--4.2.sql
Use "CREATE EXTENSION pgagent UPDATE" to load this file.
postgres=# CREATE EXTENSION pgagent UPDATE;
ERROR: syntax error at or near "UPDATE"
LINE 1: CREATE EXTENSION pgagent UPDATE;
^
postgres=# CREATE EXTENSION pgagent ;
ERROR: relation "pga_jobagent" already exists
postgres=#
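
For reference, `CREATE EXTENSION pgagent UPDATE` is not valid PostgreSQL syntax: an already registered extension is updated with `ALTER EXTENSION ... UPDATE`, and PostgreSQL releases before 13 accept `CREATE EXTENSION ... FROM unpackaged` to adopt objects that were created by a plain script (the source tree listed above ships `pgagent--unpackaged--4.0.sql`). A minimal Python sketch of that decision, assuming the three inputs have already been checked by hand; the function name `pgagent_upgrade_sql` and its parameters are hypothetical, and whether the statement then succeeds on this particular installation is not confirmed anywhere in this thread:

```python
def pgagent_upgrade_sql(extension_registered, tables_exist, server_major):
    """Pick the statement to try for a given pgagent state (illustrative only).

    extension_registered: True if this query returns a row (assumed check):
        SELECT extversion FROM pg_extension WHERE extname = 'pgagent';
    tables_exist: True if the pgagent tables (e.g. pga_jobagent) already exist.
    server_major: PostgreSQL major version, e.g. 10.
    """
    if extension_registered:
        # A registered extension is moved to the new default version in place.
        return ["ALTER EXTENSION pgagent UPDATE;"]
    if not tables_exist:
        # Nothing installed yet: a plain CREATE EXTENSION is enough.
        return ["CREATE EXTENSION pgagent;"]
    if server_major >= 13:
        # The FROM clause of CREATE EXTENSION was removed in PostgreSQL 13.
        raise ValueError("CREATE EXTENSION ... FROM is not available on 13+")
    # Script-created tables without an extension entry: adopt them.
    return ["CREATE EXTENSION pgagent FROM unpackaged;"]


# The situation described in this message: PG 10, tables exist, no extension.
print(pgagent_upgrade_sql(False, True, 10))
```

Take a backup of the pgagent schema before trying any of these statements on a database that holds live jobs.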

徐志宇(Jack)
Database Engineer

DB Team,ITS. Lenovo China
Phone: 86-18910860709
Email:xuzy13(at)lenovo(dot)com
No.6 Shangdi West Road, Haidian District Beijing, China, 100085

From: Dave Page <dpage(at)pgadmin(dot)org>
Sent: 2020-11-30 19:29
To: Zhiyu ZY13 Xu <xuzy13(at)lenovo(dot)com>
Cc: pgadmin-support(at)postgresql(dot)org
Subject: Re: [External] Re: pgadmin--pgagent---the process hang by unknow reasons

Hi

On Thu, Nov 26, 2020 at 4:45 PM Zhiyu ZY13 Xu <xuzy13(at)lenovo(dot)com> wrote:
Hi Dave

Thanks for your quick response.
This environment was deployed in January 2019 by my teammate. pgagent currently has 30 jobs running. The version is pgagent_10-3.4.0.
I don't know how to upgrade pgagent. I looked for upgrade documentation but could not find any.
I only found that EDB PPAS can upgrade pgagent:
https://www.enterprisedb.com/edb-docs/d/edb-postgres-advanced-server/installation-getting-started/upgrade-guide/11/EDB_Postgres_Advanced_Server_Upgrade_Guide.1.13.html

The PGAgent that comes with EDB Advanced Server is quite different from the Open Source version. Assuming you're using the RPM packages on RHEL/CentOS 6, you should just be able to use "yum upgrade ..." to upgrade to the latest version. Looking at the postgresql-common repository on yum.postgresql.org, I see that v4.0.0 is available (https://ftp.postgresql.org/pub/repos/yum/common/redhat/rhel-6-x86_64/)

If I reinstall pgagent with the latest version, will the existing pgagent jobs be dropped along with the old version?
Could you guide me on making pgagent use the new Boost package without affecting the pgagent jobs that are currently working?
I don't want to rebuild all the pgagent jobs. Thanks in advance.

Upgrading pgAgent will not affect the jobs you have defined already.

From: Dave Page <dpage(at)pgadmin(dot)org>
Sent: 2020-11-26 19:39
To: Zhiyu ZY13 Xu <xuzy13(at)lenovo(dot)com>
Cc: pgadmin-support(at)postgresql(dot)org
Subject: [External] Re: pgadmin--pgagent---the process hang by unknow reasons

Hi

Given the libwx* references in your stacktrace, you appear to be using an old version of pgagent - we removed the dependency on wxWidgets nearly 2.5 years ago and replaced it with Boost.

Please upgrade and try again.

Thanks.

On Thu, Nov 26, 2020 at 8:05 AM Zhiyu ZY13 Xu <xuzy13(at)lenovo(dot)com> wrote:
Hi Support

I have been using pgagent for over 2 years, and it runs more than 30 jobs. Recently I found that pgagent sometimes hangs for unknown reasons.
From the stack information, it looks like pgagent is hitting a deadlock in its code.
The stack shows many threads stopped in the function “__lll_lock_wait”.
If you need more information, please let me know. I suspect this is a bug.

I have attached the pgagent trace logs and stack information.

pgagent trace log
pg_agent_11_24.log
pg_agent_11_26.log
pgagent process stack
other information

version:
pgagent_10-3.4.0-10.rhel6.x86_64
PG 10.5

A typical stack trace:

[postgres(at)sltfjfrauxq pgagent_pd]$ cat 23389.stark.1
Thread 7 (Thread 0x7ff745f5c700 (LWP 906)):
#0 0x00007ff74b003334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007ff74affe5d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00007ff74affe4a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007ff74ba979c9 in wxMutexInternal::Lock() () from /usr/lib64/libwx_baseu-2.8.so.0
#4 0x00007ff74c15b819 in DBconn::Return() ()
#5 0x00007ff74c161217 in Job::Execute() ()
#6 0x00007ff74c162899 in JobThread::Entry() ()
#7 0x00007ff74ba99021 in wxThreadInternal::PthreadStart(wxThread*) () from /usr/lib64/libwx_baseu-2.8.so.0
#8 0x00007ff74affcaa1 in start_thread () from /lib64/libpthread.so.0
#9 0x00007ff74ad49c4d in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x7ff72ffff700 (LWP 908)):
#0 0x00007ff74b003334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007ff74affe5d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00007ff74affe4a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007ff74ba979c9 in wxMutexInternal::Lock() () from /usr/lib64/libwx_baseu-2.8.so.0
#4 0x00007ff74c15b819 in DBconn::Return() ()
#5 0x00007ff74c161217 in Job::Execute() ()
#6 0x00007ff74c162899 in JobThread::Entry() ()
#7 0x00007ff74ba99021 in wxThreadInternal::PthreadStart(wxThread*) () from /usr/lib64/libwx_baseu-2.8.so.0
#8 0x00007ff74affcaa1 in start_thread () from /lib64/libpthread.so.0
#9 0x00007ff74ad49c4d in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x7ff74695d700 (LWP 910)):
#0 0x00007ff74b003334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007ff74affe5d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00007ff74affe4a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007ff74ba979c9 in wxMutexInternal::Lock() () from /usr/lib64/libwx_baseu-2.8.so.0
#4 0x00007ff74c15b819 in DBconn::Return() ()
#5 0x00007ff74c161217 in Job::Execute() ()
#6 0x00007ff74c162899 in JobThread::Entry() ()
#7 0x00007ff74ba99021 in wxThreadInternal::PthreadStart(wxThread*) () from /usr/lib64/libwx_baseu-2.8.so.0
#8 0x00007ff74affcaa1 in start_thread () from /lib64/libpthread.so.0
#9 0x00007ff74ad49c4d in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x7ff74735e700 (LWP 1565)):
#0 0x00007ff74b003334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007ff74affe5d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00007ff74affe4a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007ff74ba979c9 in wxMutexInternal::Lock() () from /usr/lib64/libwx_baseu-2.8.so.0
#4 0x00007ff74c15b819 in DBconn::Return() ()
#5 0x00007ff74c161217 in Job::Execute() ()
#6 0x00007ff74c162899 in JobThread::Entry() ()
#7 0x00007ff74ba99021 in wxThreadInternal::PthreadStart(wxThread*) () from /usr/lib64/libwx_baseu-2.8.so.0
#8 0x00007ff74affcaa1 in start_thread () from /lib64/libpthread.so.0
#9 0x00007ff74ad49c4d in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7ff74555b700 (LWP 1567)):
#0 0x00007ff74ad40403 in poll () from /lib64/libc.so.6
#1 0x00007ff74bd1c28f in ?? () from /usr/lib64/libpq.so.5
#2 0x00007ff74bd1c310 in ?? () from /usr/lib64/libpq.so.5
#3 0x00007ff74bd178e2 in ?? () from /usr/lib64/libpq.so.5
#4 0x00007ff74bd1865f in PQconnectdb () from /usr/lib64/libpq.so.5
#5 0x00007ff74c15ad71 in DBconn::Connect(wxString const&) ()
#6 0x00007ff74c15af73 in DBconn::DBconn(wxString const&, wxString const&) ()
#7 0x00007ff74c15bfe8 in DBconn::Get(wxString const&, wxString const&) ()
#8 0x00007ff74c16108f in Job::Execute() ()
#9 0x00007ff74c162899 in JobThread::Entry() ()
#10 0x00007ff74ba99021 in wxThreadInternal::PthreadStart(wxThread*) () from /usr/lib64/libwx_baseu-2.8.so.0
#11 0x00007ff74affcaa1 in start_thread () from /lib64/libpthread.so.0
#12 0x00007ff74ad49c4d in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7ff744b5a700 (LWP 1569)):
#0 0x00007ff74b003334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007ff74affe5d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00007ff74affe4a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007ff74ba979c9 in wxMutexInternal::Lock() () from /usr/lib64/libwx_baseu-2.8.so.0
#4 0x00007ff74c15bf6b in DBconn::Get(wxString const&, wxString const&) ()
#5 0x00007ff74c16108f in Job::Execute() ()
#6 0x00007ff74c162899 in JobThread::Entry() ()
#7 0x00007ff74ba99021 in wxThreadInternal::PthreadStart(wxThread*) () from /usr/lib64/libwx_baseu-2.8.so.0
#8 0x00007ff74affcaa1 in start_thread () from /lib64/libpthread.so.0
#9 0x00007ff74ad49c4d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7ff74c3507e0 (LWP 23389)):
#0 0x00007ff74b003334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007ff74affe5d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00007ff74affe4a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007ff74ba979c9 in wxMutexInternal::Lock() () from /usr/lib64/libwx_baseu-2.8.so.0
#4 0x00007ff74c15a99d in DBconn::ClearConnections(bool) ()
#5 0x00007ff74c15e908 in MainRestartLoop(DBconn*) ()
#6 0x00007ff74c15f2a3 in MainLoop() ()
#7 0x00007ff74c15e016 in main ()
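
The thread dump above can be summarized mechanically to see which pgagent call each thread is stuck in. A minimal Python sketch, assuming the dump keeps the layout shown above (`Thread N (...)` headers followed by `#k 0xADDR in function (...)` frames); the names `parse_threads` and `blocking_summary` are illustrative and not part of pgAgent:

```python
import re
from collections import Counter

# "Thread 7 (Thread 0x7ff745f5c700 (LWP 906)):"
THREAD_RE = re.compile(r"^Thread (\d+) \(")
# "#4 0x00007ff74c15b819 in DBconn::Return() ()"
FRAME_RE = re.compile(r"^#\d+\s+0x[0-9a-fA-F]+ in (.+?) \(")


def parse_threads(dump):
    """Return {thread number: [frame names, innermost first]}."""
    threads = {}
    current = None
    for raw in dump.splitlines():
        line = raw.strip()
        header = THREAD_RE.match(line)
        if header:
            current = int(header.group(1))
            threads[current] = []
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            threads[current].append(frame.group(1))
    return threads


def blocking_summary(dump):
    """Count threads by (innermost frame, first DBconn:: frame in the stack)."""
    summary = Counter()
    for frames in parse_threads(dump).values():
        if not frames:
            continue
        dbconn = next((f for f in frames if f.startswith("DBconn::")), None)
        summary[(frames[0], dbconn)] += 1
    return summary


if __name__ == "__main__":
    import sys
    # Usage (file name as in the message above): python summary.py 23389.stark.1
    with open(sys.argv[1]) as fh:
        for (top, dbconn), count in blocking_summary(fh.read()).most_common():
            print(count, top, dbconn)
```

Read this way, six of the seven threads in the dump are waiting in `__lll_lock_wait` beneath a `DBconn::` call, while thread 3 sits in `poll` inside `PQconnectdb`. That pattern is consistent with one thread blocking on a connection attempt while the others wait for a mutex, but the dump alone does not prove which thread holds the lock.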

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: http://www.enterprisedb.com
