Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.

From: Kasahara Tatsuhito <kasahara(dot)tatsuhito(at)gmail(dot)com>
To: tomohiro hiramitsu <hiramit(dot)tm(at)gmail(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, skoposov(at)ed(dot)ac(dot)uk, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.
Date: 2021-02-18 11:29:16
Message-ID: CAP0=ZVKdHgLGvFJHE_6x2PTrw50_FLqPUSdRNzVTKSb_y8naCQ@mail.gmail.com
Lists: pgsql-bugs

Hi,

On Wed, Jan 6, 2021 at 11:13 AM Kasahara Tatsuhito
<kasahara(dot)tatsuhito(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Mon, Dec 28, 2020 at 4:31 PM tomohiro hiramitsu <hiramit(dot)tm(at)gmail(dot)com> wrote:
> >
> > Hi
> >
> > > Oh meh. Yeah, I didn't think about the actual implementation of doing it :/
> > >
> > > That said, I'd really prefer to see *some* kind of counter that could
> > > get people noticing this a bit earlier, rather than just bugging them
> > > in the logfile once it's gone over a threshold. A statistics counter
> > > maybe, but I guess that'd have to be tracked at a per-table level to
> > > be really useful, which would make it potentially fairly expensive to
> > > keep around...
> >
> > I think per-table statistics on OID assignments are a good idea, but how about simply logging the event to notify users as a first step?
> > I think it's easy to implement and helpful for users.
> +1
> It is a simple and quick way for users to know whether the OID assignment
> is taking a long time.
> At least it will be better than the current behavior.
>
> > First of all, I would like to provide a way for the user to know that it is taking a long time to generate the OID, and give the user an opportunity to deal with this problem via statement_timeout etc.
> > This is an interim measure, but I think it's better than the current situation.
> >
> > Since handling a WaitEvent and per-table statistics is complicated, I first considered a patch that simply outputs log messages, giving priority to avoiding the current situation where the backend appears to hang without any log output.
> >
> > * v01-0001-OID_log_output.patch
> > In this patch, GetNewOidWithIndex outputs a log message when it loops too many times while trying to assign a new OID.
> > Also, if at least one such message has been logged, another message is logged once the OID is finally assigned.
> >
> > * This patch does not raise an error even if the loop runs for a very long time.
> > How much time can be tolerated in the OID search loop depends on the user, so I think the user should decide when to abort, using statement_timeout.
> >
> > * The log output threshold doubles each time (exponential backoff), and the number of log messages is capped at 5.
> > The reason for this scheme is to reduce the number of log messages while still emitting them before the statement is stopped by statement_timeout etc. The appropriate value for this threshold is debatable.
> > Since there is no need to keep logging indefinitely, the upper limit is set to 5 messages.
> >
> > * The threshold for the first log message is set to 1 million retries.
> > In my environment, 1 million loop iterations take about 4 seconds, and the 5th message is output after about 2 minutes. I think this threshold is reasonable considering statement_timeout, but it is debatable.
> >
> > Log output does not need to continue throughout the OID assignment. I think the first few messages and the last one are enough.
> >
> > I will add this patch to the commitfest. I look forward to your feedback about the patch.
I tested the patch.
The patch applies cleanly to the current HEAD
(eb42110d952f8d1ad4049b8f2491e9dfba75ffed) and the regression tests
pass.

I actually tried to reproduce OID assignment conflicts using the
following procedure.
================================================================================================
DROP TABLE IF EXISTS t1;

-- Create a table. SET TOAST_TUPLE_TARGET to generate many TOAST records.
SELECT 'CREATE UNLOGGED TABLE t1 ( c1 int,' || string_agg(str, ',') ||
') WITH (TOAST_TUPLE_TARGET = 128);'
FROM (SELECT ' c' || i || ' text' as str FROM generate_series(2,21)i )t;\gexec

-- Generate data and insert. This query generates 20 TOAST records per row,
-- resulting in 16 million TOAST rows (the same as
-- GETNEWOID_NOTIFICATION_LIMIT).
WITH r AS (SELECT string_agg(str, '') as d FROM (SELECT
md5(clock_timestamp()::text) AS str
FROM generate_series(1,4)t)s)
INSERT INTO t1 SELECT generate_series(1,800000)
,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d FROM r;

-- Check min(chunk_id) so the OID counter can be rewound in the following test
SELECT 'SELECT min(chunk_id) FROM pg_toast.' || relname
FROM pg_class WHERE oid = (SELECT reltoastrelid FROM pg_class WHERE
relname = 't1');\gexec

-- Check a normal (non-conflicting) insert's duration
\timing
WITH r AS (SELECT string_agg(str, '') as d FROM (SELECT
md5(clock_timestamp()::text) AS str
FROM generate_series(1,4)t)s)
INSERT INTO t1 SELECT 1,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d FROM r;

INSERT 0 1
Time: 0.899 ms

-- Reset the OID counter to the value of min(chunk_id) using pg_resetwal (-o)
-- Check a problematic (OID-conflicting) insert's duration
\timing
WITH r AS (SELECT string_agg(str, '') as d FROM (SELECT
md5(clock_timestamp()::text) AS str
FROM generate_series(1,4)t)s)
INSERT INTO t1 SELECT 1,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d FROM r;

INSERT 0 1
Time: 36399.478 ms (00:36.399)
================================================================================================

And I got the following log messages.
(The STATEMENT section is long and has been cut)
===================================================================================================
2021-02-18 19:04:59.055 JST [32153] LOG: failed to assign new OID in relation "pg_toast_32018451" after 1000000 retries
2021-02-18 19:04:59.055 JST [32153] STATEMENT: WITH r ...
2021-02-18 19:05:01.299 JST [32153] LOG: failed to assign new OID in relation "pg_toast_32018451" after 2000000 retries
2021-02-18 19:05:01.299 JST [32153] STATEMENT: WITH r ...
2021-02-18 19:05:05.802 JST [32153] LOG: failed to assign new OID in relation "pg_toast_32018451" after 4000000 retries
2021-02-18 19:05:05.802 JST [32153] STATEMENT: WITH r ...
2021-02-18 19:05:14.853 JST [32153] LOG: failed to assign new OID in relation "pg_toast_32018451" after 8000000 retries
2021-02-18 19:05:14.853 JST [32153] STATEMENT: WITH r ...
2021-02-18 19:05:33.185 JST [32153] LOG: failed to assign new OID in relation "pg_toast_32018451" after 16000000 retries
2021-02-18 19:05:33.185 JST [32153] STATEMENT: WITH r ...
2021-02-18 19:05:33.185 JST [32153] LOG: the new OID has been assigned in relation "pg_toast_32018451" after 16000021 retries
2021-02-18 19:05:33.185 JST [32153] STATEMENT: WITH r ...
===================================================================================================

It works as expected, and I think the message is useful.

Here are some comments about the code.

+#define GETNEWOID_NOTIFICATION_INTERVAL_FACTOR 1000000

This constant is more of a minimum retry count before the first notification
than an interval factor.
So it would be better to use another name, such as
GETNEWOID_NOTIFICATION_MINVAL or similar.
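Concretely, the rename would look something like this (just a sketch of the suggestion, keeping the value from the patch):

```c
/* minimum number of retries before the first LOG message is emitted */
#define GETNEWOID_NOTIFICATION_MINVAL 1000000
```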

+	/* retry count and notification limit check */
+	if (retry_count == next_notify && next_notify <= GETNEWOID_NOTIFICATION_LIMIT)
+	{
+		ereport(LOG,
+				(errmsg("failed to assign new OID in relation \"%s\" after " UINT64_FORMAT " retries",
+						RelationGetRelationName(relation), retry_count)));
+		next_notify *= 2;	/* double it for the next notification */
+	}
+	retry_count++;

I think we should add a comment explaining why this LOG message is output.
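For example, something along these lines (the wording is only a sketch of what I have in mind):

```c
/*
 * If we have spun through a large number of OIDs without finding a free
 * one, emit a LOG message so the user can see that the command is stuck
 * in the OID search loop (and can cancel it, e.g. with
 * statement_timeout) rather than silently hanging.  The threshold grows
 * exponentially so that at most a handful of messages are logged even
 * for very long searches.
 */
```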

Best regards,

> >
> >
> > By the way
> > How about distributing OIDs so that the GetNewOidWithIndex function doesn't take so long to find an available OID?
> > For example, skip every other OID when assigning.
> >
> > As a side effect of this method, the OID progresses faster.
> > (Maybe the btree index will be fragmented faster)
> I think the idea of assigning OIDs sparsely is interesting.
> However, we would need to find out how much it affects performance.
> It could be a non-negligible overhead, especially for bulk
> insertion/updating of large amounts of data.
>
> Best regards,
>
> > It may be better to include skips only for TOAST chunk_id, as this method may have other side effects.
> >
> > regards.
> > --
> > Tomohiro Hiramitsu
> > NTT Open Source Software Center
> >
> > On Mon, Dec 21, 2020, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> >>
> >> On Wed, Nov 18, 2020 at 8:27 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> >> >
> >> > Hi,
> >> >
> >> > On 2020-11-18 11:57:05 +0100, Magnus Hagander wrote:
> >> > > This definitely sounds like something that's worth putting out as a
> >> > > wait event. Even before you start traversing millions of OIDs it might
> >> > > gradually start to show up, and being able to monitor that would
> >> > > definitely be useful.
> >> >
> >> > I don't think this is likely to work well as a wait event. All the index
> >> > traversals etc will do IO, acquire locks, etc, which will overwrite the
> >> > wait event and reset it to nothing once done.
> >>
> >> Oh meh. Yeah, I didn't think about the actual implementation of doing it :/
> >>
> >> That said, I'd really prefer to see *some* kind of counter that could
> >> get people noticing this a bit earlier, rather than just bugging them
> >> in the logfile once it's gone over a threshold. A statistics counter
> >> maybe, but I guess that'd have to be tracked at a per-table level to
> >> be really useful, which would make it potentially fairly expensive to
> >> keep around...
> >>
> >> --
> >> Magnus Hagander
> >> Me: https://www.hagander.net/
> >> Work: https://www.redpill-linpro.com/
> >>
> >>
> >>
> >>
>
>
> --
> Tatsuhito Kasahara
> kasahara.tatsuhito _at_ gmail.com

--
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com
