Re: pgbench bug candidate: negative "initial connection time"

From: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, kuroda(dot)hayato(at)fujitsu(dot)com, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgbench bug candidate: negative "initial connection time"
Date: 2021-06-18 15:46:05
Message-ID: 20210619004605.83386a39ffca891f809f38a5@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Horiguchi-san, Fabien,

On Fri, 18 Jun 2021 15:58:48 +0200 (CEST)
Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:

>
> >>> /* must be something wrong */
> >>> pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD);
> >>> goto done;
> >>
> >> Should say such like "thread %d aborted: %s() failed: ...".
>
> After having a lookg, there are already plenty such cases. I'd say not to
> change anything for beta, and think of it for the next round.

Agreed. Basically, I think the existing message should be left as is.

> >> ====
> >> errno = THREAD_BARRIER_INIT(&barrier, nthreads);
> >> if (errno != 0)
> >> + {
> >> pg_log_fatal("could not initialize barrier: %m");
> >> + exit(1);
> >>
> >> This is a run-time error. Maybe we should return 2 in that case.
>
> I think that you are right, but there are plenty such places where exit
> should be 2 instead of 1 if the doc is followed:
>
> """Errors during the run such as database errors or problems in the script
> will result in exit status 2."""
>
> My beta take is to let these as they are, i.e. pretty inconsistent all
> over pgbench, and schedule a cleanup on the next round.

As same as the below Fabian's comment about thread->logfile,

> >> ===
> >> if (thread->logfile == NULL)
> >> {
> >> pg_log_fatal("could not open logfile \"%s\": %m", logpath);
> >> - goto done;
> >> + exit(1);
> >>
> >> Maybe we should exit with 2 this case.
> >
> > Yep.
>
> The bench is not even started, this is not really run time yet, 1 seems
> ok. The failure may be due to a typo in the path which comes from the
> user.

the bench is not started at THREAD_BARRIER_INIT, so I think exit(1) is ok.

>
> >> /* must be something wrong */
> >> - pg_log_fatal("%s() failed: %m", SOCKET_WAIT_METHOD);
> >> + pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD);
> >> goto done;
> >>
> >> Why doesn't a fatal error cause an immediate exit?
> >
> > Good point. I do not know, but I would expect it to be the case, and AFAICR
> > it does not.
> >
> >> (And if we change this to fatal, we also need to change similar errors in
> >> the same function to fatal.)
> >
> > Possibly.
>
> On second look, I think that error is fine, indeed we do not stop the
> process, so "fatal" it is not;

I replaced this 'fatal' with 'error' because we are aborting the client
instead of exit(1). When pgbench was rewritten to use common logging API
by the commit 30a3e772b40, somehow pg_log_fatal was used, but I am
wondering it should have be pg_log_error.

> Attached Yugo-san patch with some updates discussed in the previous mails,
> so as to move things along.

Thank you for update. I agree with this fix.

Regards,
Yugo Nagata

--
Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Egor Rogov 2021-06-18 16:22:51 pg_stats and range statistics
Previous Message Tom Lane 2021-06-18 15:24:00 Re: Centralizing protective copying of utility statements