BUG #15299: relation does not exist errors

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: jeff(at)pgexperts(dot)com
Subject: BUG #15299: relation does not exist errors
Date: 2018-07-26 16:38:43
Message-ID: 153262312335.1404.10540240090445927053@wrigleys.postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 15299
Logged by: Jeff Frost
Email address: jeff(at)pgexperts(dot)com
PostgreSQL version: 9.5.13
Operating system: Ubuntu 14.04
Description:

We upgraded some production servers to 9.5.13 on Saturday (7/21/2018) and
they ran fine for 2 days. On Wednesday morning, we started seeing strange
issues where postgres was reporting tons of "relation does not exist"
errors. Here are some snippets from the logs:

2018-07-25 14:33:53.243 GMT,"app_db","webapp",34887,"127.0.0.1:56740",5b587d3b.8847,90363,"PARSE",2018-07-25 13:38:03 GMT,287/3021104,0,ERROR,42P01,"relation ""login_informations"" does not exist",,,,,,"SELECT ""login_informations"".* FROM ""login_informations"" WHERE ""login_informations"".""id"" = 735211 LIMIT 1",37,,"sidekiq 4.1.2 procore [0 of 5 busy]"
2018-07-25 14:33:53.251 GMT,"app_db","webapp",34887,"127.0.0.1:56740",5b587d3b.8847,90364,"PARSE",2018-07-25 13:38:03 GMT,287/3021107,0,ERROR,42P01,"relation ""login_informations"" does not exist",,,,,,"SELECT ""login_informations"".* FROM ""login_informations"" WHERE ""login_informations"".""id"" = 735211 LIMIT 1 ",37,,"sidekiq 4.1.2 procore [0 of 5 busy]"
2018-07-25 14:33:53.252 GMT,"app_db","webapp",34887,"127.0.0.1:56740",5b587d3b.8847,90365,"PARSE",2018-07-25 13:38:03 GMT,287/3021109,0,ERROR,42P01,"relation ""login_informations"" does not exist",,,,,,"SELECT ""login_informations"".* FROM ""login_informations"" WHERE ""login_informations"".""id"" = 735211 LIMIT 1 ",37,,"sidekiq 4.1.2 procore [0 of 5 busy]"
2018-07-25 14:33:53.258 GMT,"app_db","webapp",34887,"127.0.0.1:56740",5b587d3b.8847,90366,"PARSE",2018-07-25 13:38:03 GMT,287/3021110,0,ERROR,42P01,"relation ""login_informations"" does not exist",,,,,,"SELECT ""login_informations"".* FROM ""login_informations"" WHERE ""login_informations"".""id"" = 735211 LIMIT 1",37,,"sidekiq 4.1.2 procore [0 of 5 busy]"
2018-07-25 14:33:53.295 GMT,"app_db","webapp",34887,"127.0.0.1:56740",5b587d3b.8847,90367,"PARSE",2018-07-25 13:38:03 GMT,287/3021111,0,ERROR,42P01,"relation ""login_informations"" does not exist",,,,,,"SELECT ""login_informations"".* FROM ""login_informations"" WHERE ""login_informations"".""id"" = 735211 LIMIT 1 ",37,,"sidekiq 4.1.2 procore [0 of 5 busy]"
2018-07-25 14:33:53.296 GMT,"app_db","webapp",34887,"127.0.0.1:56740",5b587d3b.8847,90368,"PARSE",2018-07-25 13:38:03 GMT,287/3021112,0,ERROR,42P01,"relation ""permission_templates"" does not exist",,,,,,"SELECT ""permission_templates"".* FROM ""permission_templates"" WHERE ""permission_templates"".""id"" = 137880 LIMIT 1 ",39,,"sidekiq 4.1.2 procore [0 of 5 busy]"
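
All of the log snippets above carry the same process ID (34887) and session
ID, so one quick check is whether the errors cluster on a few backends. A
minimal sketch of that tally in Python follows. It assumes the log is in
csvlog format with the documented column order (process_id as the 4th
field, sql_state_code as the 13th). The function name and the SAMPLE
constant are made up for illustration; SAMPLE holds two of the records
above, joined back onto single lines.

```python
import csv
import io
from collections import Counter

# Zero-based positions of the csvlog columns used here. These assume the
# column order documented for PostgreSQL's csvlog output.
PID_COL = 3        # process_id
SQLSTATE_COL = 12  # sql_state_code


def undefined_table_errors_by_pid(csvlog_text):
    """Count 42P01 (undefined_table) log records per backend process ID."""
    counts = Counter()
    for row in csv.reader(io.StringIO(csvlog_text)):
        # Skip blank or truncated records instead of raising IndexError.
        if len(row) > SQLSTATE_COL and row[SQLSTATE_COL] == "42P01":
            counts[int(row[PID_COL])] += 1
    return counts


# Two records from the report, unwrapped (hypothetical constant name).
SAMPLE = (
    '2018-07-25 14:33:53.243 GMT,"app_db","webapp",34887,"127.0.0.1:56740",'
    '5b587d3b.8847,90363,"PARSE",2018-07-25 13:38:03 GMT,287/3021104,0,'
    'ERROR,42P01,"relation ""login_informations"" does not exist",,,,,,'
    '"SELECT ""login_informations"".* FROM ""login_informations"" WHERE '
    '""login_informations"".""id"" = 735211 LIMIT 1",37,,'
    '"sidekiq 4.1.2 procore [0 of 5 busy]"\n'
    '2018-07-25 14:33:53.296 GMT,"app_db","webapp",34887,"127.0.0.1:56740",'
    '5b587d3b.8847,90368,"PARSE",2018-07-25 13:38:03 GMT,287/3021112,0,'
    'ERROR,42P01,"relation ""permission_templates"" does not exist",,,,,,'
    '"SELECT ""permission_templates"".* FROM ""permission_templates"" WHERE '
    '""permission_templates"".""id"" = 137880 LIMIT 1 ",39,,'
    '"sidekiq 4.1.2 procore [0 of 5 busy]"\n'
)

if __name__ == "__main__":
    print(undefined_table_errors_by_pid(SAMPLE))  # Counter({34887: 2})
```

Run against a full log file, a result dominated by a handful of PIDs would
fit the idea that only certain long-lived backends were affected, while an
even spread across PIDs would point elsewhere.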

Connecting to the DB via psql and issuing a \d login_informations showed
that the table was there. Running one of the queries from the logs by hand
worked fine.

Since we use pgbouncer as a connection pooler, we then connected using psql
through pgbouncer on one of the affected hosts and were able to reproduce
the issue, so the theory was that some postgresql backends had lost track of
relations. We stopped/started pgbouncer to allow those backends to exit and
the issue was resolved.

A little about the DB in case this is helpful:

DB size: 12TB
DB load: 25k xacts/sec
Characterization: 80% read / 20% write
Replication role: primary with 1 direct replica and a few cascaded
replicas

So far we haven't been able to reproduce this synthetically, but let me know
if there is any more info we can pull from the server if this happens again.
