SOLVED: unexpected EIDRM on Linux

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Cc: Michael Fuhr <mike(at)fuhr(dot)org>
Subject: SOLVED: unexpected EIDRM on Linux
Date: 2007-07-02 18:24:18
Message-ID: 20773.1183400658@sss.pgh.pa.us
Lists: pgsql-hackers

It's a plain old Linux kernel bug: it returns EIDRM when it really ought
to say EINVAL, and apparently always has. The surprising part is really
that we've not seen it many times before.

Kudos to Michael Fuhr for thinking to write a test program investigating
whether randomly-chosen IDs would yield EIDRM --- that was what led me
to study the kernel source code closely enough to realize it was just
wrong.

regards, tom lane

------- Forwarded Messages

Date: Mon, 2 Jul 2007 10:59:43 -0600
From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: [GENERAL] shmctl EIDRM preventing startup

I don't know if this is relevant but on both the box that rebooted
and on another box that's been up for several weeks I see a pattern
of shmid's for which shmctl() returns EIDRM (the EACCES errors are
for segments that are in use by another user; I'm not running as root):

$ ./shmctl-test 0 1048576
shmctl(0 / 0): ERROR: Identifier removed
shmctl(1 / 0x1): ERROR: Identifier removed
shmctl(2 / 0x2): ERROR: Identifier removed
shmctl(32768 / 0x8000): ERROR: Identifier removed
shmctl(32769 / 0x8001): ERROR: Identifier removed
shmctl(32770 / 0x8002): ERROR: Identifier removed
shmctl(65536 / 0x10000): ERROR: Permission denied
shmctl(65537 / 0x10001): ERROR: Identifier removed
shmctl(65538 / 0x10002): ERROR: Identifier removed
shmctl(98304 / 0x18000): ERROR: Identifier removed
shmctl(98305 / 0x18001): ERROR: Permission denied
shmctl(98306 / 0x18002): ERROR: Identifier removed
shmctl(131072 / 0x20000): ERROR: Identifier removed
shmctl(131073 / 0x20001): ERROR: Identifier removed
shmctl(131074 / 0x20002): ERROR: Identifier removed
shmctl(163840 / 0x28000): ERROR: Identifier removed
shmctl(163841 / 0x28001): ERROR: Identifier removed
shmctl(163842 / 0x28002): ERROR: Permission denied
[...]
shmctl(983040 / 0xf0000): ERROR: Identifier removed
shmctl(983041 / 0xf0001): ERROR: Identifier removed
shmctl(983042 / 0xf0002): ERROR: Identifier removed
shmctl(1015808 / 0xf8000): ERROR: Identifier removed
shmctl(1015809 / 0xf8001): ERROR: Identifier removed
shmctl(1015810 / 0xf8002): ERROR: Identifier removed
shmctl(1048576 / 0x100000): ERROR: Identifier removed

--
Michael Fuhr

#include <sys/ipc.h>
#include <sys/shm.h>

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(int argc, char *argv[])
{
    int shmid, min_shmid, max_shmid, tmp_shmid;
    struct shmid_ds buf;

    if (argc != 3) {
        fprintf(stderr, "Usage: %s min_shmid max_shmid\n", argv[0]);
        return EXIT_FAILURE;
    }

    min_shmid = atoi(argv[1]);
    max_shmid = atoi(argv[2]);

    if (min_shmid > max_shmid) {
        tmp_shmid = min_shmid;
        min_shmid = max_shmid;
        max_shmid = tmp_shmid;
    }

    for (shmid = min_shmid; shmid <= max_shmid; shmid++) {
        if (shmctl(shmid, IPC_STAT, &buf) == -1 && errno != EINVAL) {
            printf("shmctl(%d / %#x): ERROR: %s\n", shmid, shmid, strerror(errno));
        }
    }

    return EXIT_SUCCESS;
}

------- Message 2

Date: Mon, 02 Jul 2007 14:17:05 -0400
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Fuhr <mike(at)fuhr(dot)org>
Subject: Re: [GENERAL] shmctl EIDRM preventing startup

Michael Fuhr <mike(at)fuhr(dot)org> writes:
> On Mon, Jul 02, 2007 at 01:14:01PM -0400, Tom Lane wrote:
>> Oh, that's pretty durn interesting. I get the same type of pattern on
>> my FC6 box, but not on HPUX.

> I don't get this pattern on FreeBSD 6.2 or Solaris 9 either.

Well, I've just traced through the Linux code, and I find:

1. The low-order 15 bits of the shmid are simply an index into an array
of valid shmem entries. I'm not sure what is in index 0, but there's
apparently a live entry of some sort there. Index 1 is the first actual
shmem segment allocated, and thereafter the first free slot is chosen
whenever you make a new shmem segment.

2. When you try to stat a segment, it takes the low-order 15 bits of the
supplied ID and indexes into this array. If no such entry (out of
range, or NULL entry) you get EINVAL as expected. If there's an entry
but its high-order ID bits don't match the supplied ID, you get EIDRM.

This is why the set of EIDRM IDs moves around as you create and delete
valid segments.
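
For illustration, the lookup described in points 1 and 2 can be modeled
in user space roughly as follows. This is only a toy sketch, not the
kernel's code: the names toy_entry, toy_slots, toy_install and toy_stat
are made up, and the table size of 8 is arbitrary. The only parts taken
from the description above are the 15-bit index and the EINVAL/EIDRM
split.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SLOT_BITS 15
#define SLOT_MASK ((1 << SLOT_BITS) - 1)    /* low-order 15 bits of an ID */
#define N_SLOTS   8                         /* toy table size (hypothetical) */

/* Toy stand-in for a shmem entry; it records only the full ID. */
struct toy_entry
{
    int id;
};

static struct toy_entry *toy_slots[N_SLOTS];    /* NULL means "no entry" */

/* Put an entry into the slot selected by the low-order bits of its ID. */
static void
toy_install(struct toy_entry *e)
{
    int slot = e->id & SLOT_MASK;

    assert(slot < N_SLOTS);
    toy_slots[slot] = e;
}

/*
 * Model of the stat lookup: returns 0 if the ID matches a live entry,
 * otherwise the errno value the described logic would produce.
 */
static int
toy_stat(int shmid)
{
    int slot;

    if (shmid < 0)
        return EINVAL;
    slot = shmid & SLOT_MASK;
    if (slot >= N_SLOTS || toy_slots[slot] == NULL)
        return EINVAL;          /* out of range, or NULL entry */
    if (toy_slots[slot]->id != shmid)
        return EIDRM;           /* slot in use, high-order bits differ */
    return 0;
}
```

With a single entry whose ID is 65537 (0x10001) installed, toy_stat(1)
and toy_stat(32769) both come back as EIDRM in this model, while
toy_stat(2) is EINVAL. That matches the shape of Michael's output: every
ID whose low-order bits land on an occupied slot reports EIDRM unless it
is the one exact match.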

As near as I can tell, this is flat out a case of the kernel returning
the wrong error code. It should say EINVAL when there's a mismatch.

It's a bit surprising that we have not seen a lot more reports of this
problem, because AFAICS the probability of a collision is extremely high
if there's more than one creator of shmem segments on a system.

I can reproduce the bug as follows:

1. Start postmaster 1.
2. Start postmaster 2 (different data directory and port).
3. Manually kill -9 both postmasters.
4. Manually ipcrm both shmem segments.
5. Start postmaster 2.
6. (Try to) start postmaster 1 --- it will fail because of EIDRM,
   because its saved shmem id points at slot 1 which is now in use
   by postmaster 2.
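
The six steps can be replayed against a small simulation of the slot
table. This is a sketch under assumptions, not the test program
mentioned below and not kernel code: all sim_* names are made up, the
table has only 4 slots, and it assumes the high-order ID bits come from
a counter that is bumped on every allocation (the message above does not
say how those bits are chosen). Slot 0 is pre-filled to stand for the
"live entry of some sort".

```c
#include <assert.h>
#include <errno.h>

#define SLOT_BITS 15
#define SLOT_MASK ((1 << SLOT_BITS) - 1)
#define N_SLOTS   4             /* toy table size (hypothetical) */

static int sim_ids[N_SLOTS];    /* full ID held in each slot, -1 = free */
static int sim_seq;             /* ASSUMED counter feeding the high-order bits */

static void
sim_reset(void)
{
    int i;

    sim_ids[0] = 0;             /* slot 0 is already occupied */
    for (i = 1; i < N_SLOTS; i++)
        sim_ids[i] = -1;
    sim_seq = 1;
}

/* Take the first free slot; returns the new ID, or -1 if the table is full. */
static int
sim_create(void)
{
    int i;

    for (i = 0; i < N_SLOTS; i++)
    {
        if (sim_ids[i] == -1)
        {
            sim_ids[i] = (sim_seq++ << SLOT_BITS) | i;
            return sim_ids[i];
        }
    }
    return -1;
}

/* Free the slot of a live segment (the ipcrm step). */
static void
sim_remove(int id)
{
    int slot = id & SLOT_MASK;

    assert(slot < N_SLOTS && sim_ids[slot] == id);
    sim_ids[slot] = -1;
}

/* Same lookup rule as described: EINVAL if no entry, EIDRM on mismatch. */
static int
sim_stat(int id)
{
    int slot;

    if (id < 0)
        return EINVAL;
    slot = id & SLOT_MASK;
    if (slot >= N_SLOTS || sim_ids[slot] == -1)
        return EINVAL;
    if (sim_ids[slot] != id)
        return EIDRM;
    return 0;
}

/* Replay steps 1-6; returns what postmaster 1 sees for its saved ID. */
static int
sim_replay(void)
{
    int pm1, pm2;

    sim_reset();
    pm1 = sim_create();         /* step 1: postmaster 1 gets slot 1 */
    pm2 = sim_create();         /* step 2: postmaster 2 gets slot 2 */
    sim_remove(pm1);            /* steps 3-4: kill -9, then ipcrm both */
    sim_remove(pm2);
    pm2 = sim_create();         /* step 5: postmaster 2 now gets slot 1 */
    (void) pm2;
    return sim_stat(pm1);       /* step 6: stale ID from step 1 */
}
```

In this model sim_replay() returns EIDRM, because slot 1 is occupied by
an entry with different high-order bits. If nothing had been created at
step 5, the same stale ID would have produced EINVAL.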

I'm going to generate a smaller test program showing this and file
a bug report at Red Hat.

In the meantime, it looks like we should assume EIDRM means EINVAL
on Linux, because AFAICS there is not actually anyplace in that code
that should return EIDRM; their data structure doesn't really have
any state that would justify returning such a code.
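
A minimal sketch of what "assume EIDRM means EINVAL on Linux" could look
like in a caller. The helper name shmid_looks_nonexistent is made up for
illustration, and using the compiler's __linux__ macro as the platform
test is an assumption; this is not presented as the actual change to the
PostgreSQL sources.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Given the errno left by a failed shmctl(shmid, IPC_STAT, ...), decide
 * whether it means "no such segment".  EINVAL is the expected answer;
 * on Linux, EIDRM is treated the same way because of the behavior
 * described above.  Anything else (EACCES, for instance) is not taken
 * as proof that the segment is gone.
 */
static bool
shmid_looks_nonexistent(int err)
{
    if (err == EINVAL)
        return true;
#ifdef __linux__
    if (err == EIDRM)
        return true;
#endif
    return false;
}
```

A caller would pass errno right after shmctl() returns -1, and keep
treating other error codes such as EACCES as "segment may still exist".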

regards, tom lane

------- End of Forwarded Messages
