Re: [HACKERS] kqueue

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Mateusz Guzik <mjguzik(at)gmail(dot)com>, Keith Fiske <keith(at)omniti(dot)com>, Matteo Beccati <php(at)beccati(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Marko Tiikkaja <marko(at)joh(dot)to>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com>
Subject: Re: [HACKERS] kqueue
Date: 2018-09-27 23:09:11
Message-ID: 20180927230911.n2optypc3wrhqcfr@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2018-09-28 10:55:13 +1200, Thomas Munro wrote:
> On Tue, May 22, 2018 at 12:07 PM Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> > On Mon, May 21, 2018 at 7:27 PM, Mateusz Guzik <mjguzik(at)gmail(dot)com> wrote:
> > > I have benchmarked the change on a FreeBSD box and found an big
> > > performance win once the number of clients goes beyond the number of
> > > hardware threads on the target machine. For smaller number of clients
> > > the win was very modest.
> >
> > So to summarise your results:
> >
> > 32 connections: ~445k -> ~450k = +1.2%
> > 64 connections: ~416k -> ~544k = +30.7%
> > 96 connections: ~331k -> ~508k = +53.6%
>
> I would like to commit this patch for PostgreSQL 12, based on this
> report. We know it helps performance on macOS developer machines and
> big FreeBSD servers, and it is the right kernel interface for the job
> on principle.

Seems reasonable.

> Matteo Beccati reported a 5-10% performance drop on a
> low-end Celeron NetBSD box which we have no explanation for, and we
> have no reports from server-class machines on that OS -- so perhaps we
> (or the NetBSD port?) should consider building with WAIT_USE_POLL on
> NetBSD until someone can figure out what needs to be fixed there
> (possibly on the NetBSD side)?

Yea, I'm not too worried about that. It'd be great to test that, but
otherwise I'm also ok to just plonk that into the template.

> @@ -576,6 +592,10 @@ CreateWaitEventSet(MemoryContext context, int nevents)
> if (fcntl(set->epoll_fd, F_SETFD, FD_CLOEXEC) == -1)
> elog(ERROR, "fcntl(F_SETFD) failed on epoll descriptor: %m");
> #endif /* EPOLL_CLOEXEC */
> +#elif defined(WAIT_USE_KQUEUE)
> + set->kqueue_fd = kqueue();
> + if (set->kqueue_fd < 0)
> + elog(ERROR, "kqueue failed: %m");
> #elif defined(WAIT_USE_WIN32)

Is this automatically opened with some FD_CLOEXEC equivalent?

> +static inline void
> +WaitEventAdjustKqueueAdd(struct kevent *k_ev, int filter, int action,
> + WaitEvent *event)
> +{
> + k_ev->ident = event->fd;
> + k_ev->filter = filter;
> + k_ev->flags = action | EV_CLEAR;
> + k_ev->fflags = 0;
> + k_ev->data = 0;
> +
> + /*
> + * On most BSD family systems, udata is of type void * so we could simply
> + * assign event to it without casting, or use the EV_SET macro instead of
> + * filling in the struct manually. Unfortunately, NetBSD and possibly
> + * others have it as intptr_t, so here we wallpaper over that difference
> + * with an unsightly lvalue cast.
> + */
> + *((WaitEvent **)(&k_ev->udata)) = event;

I'm mildly inclined to hide that behind a macro, so the other places
have a reference, via the macro definition, to this too.

> + if (rc < 0 && event->events == WL_POSTMASTER_DEATH && errno == ESRCH)
> + {
> + /*
> + * The postmaster is already dead. Defer reporting this to the caller
> + * until wait time, for compatibility with the other implementations.
> + * To do that we will now add the regular alive pipe.
> + */
> + WaitEventAdjustKqueueAdd(&k_ev[0], EVFILT_READ, EV_ADD, event);
> + rc = kevent(set->kqueue_fd, &k_ev[0], count, NULL, 0, NULL);
> + }

That's, ... not particulary pretty. Kinda wonder if we shouldn't instead
just add a 'pending_events' field, that we can check at wait time.

> diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
> index 90dda8ea050..4bcabc3b381 100644
> --- a/src/include/pg_config.h.in
> +++ b/src/include/pg_config.h.in
> @@ -330,6 +330,9 @@
> /* Define to 1 if you have isinf(). */
> #undef HAVE_ISINF
>
> +/* Define to 1 if you have the `kqueue' function. */
> +#undef HAVE_KQUEUE
> +
> /* Define to 1 if you have the <langinfo.h> header file. */
> #undef HAVE_LANGINFO_H
>
> @@ -598,6 +601,9 @@
> /* Define to 1 if you have the <sys/epoll.h> header file. */
> #undef HAVE_SYS_EPOLL_H
>
> +/* Define to 1 if you have the <sys/event.h> header file. */
> +#undef HAVE_SYS_EVENT_H
> +
> /* Define to 1 if you have the <sys/ipc.h> header file. */
> #undef HAVE_SYS_IPC_H

Should adjust pg_config.win32.h too.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2018-09-28 00:11:21 overflow in snprintf() when printing INT64_MIN
Previous Message Thomas Munro 2018-09-27 22:55:13 Re: [HACKERS] kqueue