Re: query hangs out

From: Антон Глушаков <a(dot)glushakov86(at)gmail(dot)com>
To: ikramuddin <ikram(dot)amani815(at)gmail(dot)com>, pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: query hangs out
Date: 2025-05-21 09:06:54
Message-ID: CAHnOmacs_JB_wjA9g=PZyuH5jtbFb_qkf-dS7h7H4bkS4nTQeg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-admin

The problem is with only one table.

As a result, I determined that the problem is on page 5 of the table (I
made SELECT ctid selections until it hangs).
Then I tried to delete rows by ctid (5, 0-100) from the table until I found
the problematic row.

Content through pageinspect:
# SELECT * FROM heap_page_items(get_raw_page('"InboxState"', 5)) where lp =
51;
-[ RECORD 1
]-------------------------------------------------------------------------------------------------------------------------------------------
lp | 51
lp_off | 3760
lp_flags | 1
lp_len | 100
t_xmin | 136269917
t_xmax | 66664135
t_field3 | 0
t_ctid | (47,13)
t_infomask2 | 8203
t_infomask | 4929
t_hoff | 32
t_bits | 1111011000000000
t_oid |
t_data |
\x3e8a7c00000000000100000090877a16b4b308dd9460898784c4af2dab692693d29bdf78bcf5153401000000fd55f20f44ec08dd9460f88969b943ab3cd8020000000000

I couldn't delete it in the standard way (delete from "InboxState" where
ctid = '(5,51)') - it also hangs.

But I can freeze it through pg_surgery.

# select heap_force_freeze('"InboxState"'::regclass, ARRAY['(5,
51)']::tid[]);

Output after freeze:
digitalarchive=# SELECT * FROM heap_page_items(get_raw_page('"InboxState"',
5)) where lp = 51;
-[ RECORD 1
]-------------------------------------------------------------------------------------------------------------------------------------------
lp | 51
lp_off | 3760
lp_flags | 1
lp_len | 100
t_xmin | 2
t_xmax | 0
t_field3 | 0
t_ctid | (5,51)
t_infomask2 | 11
t_infomask | 2817
t_hoff | 32
t_bits | 1111011000000000
t_oid |
t_data |
\x3e8a7c00000000000100000090877a16b4b308dd9460898784c4af2dab692693d29bdf78bcf5153401000000fd55f20f44ec08dd9460f88969b943ab3cd8020000000000

After that, queries to the table started to work normally.
I'll note that there are absolutely no errors in the postgres logs,
checksums are enabled, there are no errors for them either.

It seems that this is a bug.

ср, 21 мая 2025 г. в 02:52, ikramuddin <ikram(dot)amani815(at)gmail(dot)com>:

> Is it taking too long only for this table or other tables also? If the
> issue is with this single table then check when it started to happened ,
> mean after creating one index or whatever change you perform just get back
> to that point and now the query should run fine
>
>
>
>
> On Tue, 20 May 2025 at 9:14 PM, Антон Глушаков <a(dot)glushakov86(at)gmail(dot)com>
> wrote:
>
>> Thanks for the advice.
>> I tried to remove all indexes and constraints from the table - it did not
>> help.
>> I have a copy of the data (before truncate) - I can test any hypothesis
>>
>> вт, 20 мая 2025 г. в 18:25, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>:
>>
>>> On Tue, 2025-05-20 at 16:48 +0300, Антон Глушаков wrote:
>>> > I encountered a very strange behavior.
>>> > For any query (even a simple count(*) to one specific table (a small
>>> 30MB table with 3 indexes,
>>> > without any specific data types - everything is standard out of the
>>> box vanilla Postgres) -
>>> > the query hangs dead. Waited more than 24 hours - the query did not
>>> complete).
>>> >
>>> >
>>> > Similarly, the vacuum process to the table hangs.
>>> > Only Kill -9 with a full restart helps
>>> >
>>> > I get a backtrace, from it - I then examined the pg_multixact
>>> directory, which at the time of
>>> > the problem had swelled to 900MB and had several thousand files.
>>> > I excluded long and inactive transactions, as well as prepared
>>> statements.
>>> >
>>> > The workaround in the end was this - truncate the table (it was
>>> successful), then vacuum freeze
>>> > each DB, and after that the files from pg_multixact disappeared.
>>> >
>>> > What could it be? vacuum\freeze\mulitxact settings are default.
>>> > At the same time, the value pg_database.datminmxid=1
>>> > Could the problem with the hang be related to the many old files in
>>> pg_multixact ? (judging by the backtrace - yes)
>>>
>>> I can't say for certain, but I have seen cases like that where index
>>> corruption sent
>>> processes into an endless loop. Next time you could try to rebuild the
>>> indexes.
>>>
>>> Yours,
>>> Laurenz Albe
>>>
>>

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Laurenz Albe 2025-05-21 12:38:20 Re: query hangs out
Previous Message Антон Глушаков 2025-05-20 15:43:36 Re: query hangs out