Re: pgsql: Add parallel-aware hash joins.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql: Add parallel-aware hash joins.
Date: 2018-01-04 19:20:33
Message-ID: 20180104192033.irv4ppf44i2fesze@alap3.anarazel.de
Lists: pgsql-committers pgsql-hackers

On 2018-01-04 12:11:37 -0500, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Thu, Jan 4, 2018 at 11:00 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Also, what the devil is happening on skink?
>
> > So, skink is apparently dying during shutdown of a user-connected
> > backend, and specifically the one that executed the 'tablespace' test.
>
> Well, yeah, valgrind is burping: the postmaster log is full of
>
> ==10544== VALGRINDERROR-BEGIN
> ==10544== Syscall param epoll_pwait(sigmask) points to unaddressable byte(s)
> ==10544== at 0x7011490: epoll_pwait (epoll_pwait.c:42)
> ==10544== by 0x4BF40B: WaitEventSetWaitBlock (latch.c:1048)
> ==10544== by 0x4BF40B: WaitEventSetWait (latch.c:1000)
> ==10544== by 0x3C0B3B: secure_read (be-secure.c:166)
> ==10544== by 0x3CCD9E: pq_recvbuf (pqcomm.c:963)
> ==10544== by 0x3CDA07: pq_getbyte (pqcomm.c:1006)
> ==10544== by 0x4E2A2D: SocketBackend (postgres.c:339)
> ==10544== by 0x4E444E: ReadCommand (postgres.c:512)
> ==10544== by 0x4E7588: PostgresMain (postgres.c:4085)
> ==10544== by 0x4641D0: BackendRun (postmaster.c:4412)
> ==10544== by 0x467308: BackendStartup (postmaster.c:4084)
> ==10544== by 0x4675F7: ServerLoop (postmaster.c:1757)
> ==10544== by 0x4689D4: PostmasterMain (postmaster.c:1365)
> ==10544== Address 0x0 is not stack'd, malloc'd or (recently) free'd
> ==10544==
> ==10544== VALGRINDERROR-END
>
> But (a) this is happening in multiple branches, and (b) we've not
> changed anything near that code in a while. I think something broke
> in valgrind itself.

Some packages on skink have been upgraded. It appears that either a libc
or a valgrind change made valgrind stop recognizing that a sigmask
pointer of 0 is allowed and need not point anywhere :(

Let me check whether valgrind accepts multiple suppression files, in
which case I could add a suppression for this error to all
branches. I will also check whether I can reproduce it locally.

Greetings,

Andres Freund
