BUG #17949: Adding an index introduces serialisation anomalies.

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: artem(dot)anisimov(dot)255(at)gmail(dot)com
Subject: BUG #17949: Adding an index introduces serialisation anomalies.
Date: 2023-05-28 12:26:31
Message-ID: 17949-a0f17035294a55e2@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 17949
Logged by: Artem Anisimov
Email address: artem(dot)anisimov(dot)255(at)gmail(dot)com
PostgreSQL version: 15.3
Operating system: fedora 38
Description:

Hello dear pg authors,

I have come across a behaviour in pg that contradicts the Serializable
isolation guarantees documented at
https://www.postgresql.org/docs/15/transaction-iso.html#XACT-SERIALIZABLE.

I've minimised my testcase to a scenario that essentially duplicates one you
describe in the following paragraph:
> In particular, it is possible to see unique constraint violations
> caused by conflicts with overlapping Serializable transactions
> even after explicitly checking that the key isn't present
> before attempting to insert it.
> This can be avoided by making sure that all Serializable transactions
> that insert potentially conflicting keys explicitly check if they can
> do so first.

At the end of the report there is a C reproducer that highlights the
problem. Let me give a high-level overview of the scenario first.

I have N threads each trying to "acquire an exclusive lock" this way:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM locks WHERE path = $1;
-- COMMIT and quit if the SELECT found an existing lock
INSERT INTO locks(path, ...) VALUES($1, ...);
COMMIT;

Once all threads have attempted to acquire the lock, I count how many of them
succeeded with their INSERT. Normally exactly one thread succeeds, which is
what I expect. However, if I add a HASH or a BTREE index on "path", multiple
threads can INSERT and COMMIT successfully, which violates serialisability.

The problem can be reproduced with the default postgresql.conf, but it takes
some time. If I increase "shared_buffers" to 1024MB, the issue appears almost
immediately (perhaps because there are fewer chances for predicate locks to be
promoted to locks on the whole table?).
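
To check whether that guess holds, one can look at the granularity of the
SIRead predicate locks while the test loop is running. The helper below is
only a sketch and not part of the reproducer further down (the name
dump_predicate_locks is mine); it uses the same libpq calls as the reproducer
and could, for instance, be pasted into it and called from test_once():

static void dump_predicate_locks(PGconn *c)
{
    // Count the SIRead predicate locks currently held, grouped by
    // granularity (tuple / page / relation), so that promotion to
    // coarser locks becomes visible.
    PGresult *res = PQexec(c,
        "SELECT locktype, count(*) FROM pg_locks "
        "WHERE mode = 'SIReadLock' GROUP BY locktype");
    assert(PQresultStatus(res) == PGRES_TUPLES_OK);
    for (int i = 0; i < PQntuples(res); ++i)
        printf("SIReadLock %s: %s\n",
               PQgetvalue(res, i, 0), PQgetvalue(res, i, 1));
    PQclear(res);
}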

I've seen this behaviour with PostgreSQL 13, 15 and 16 (commit 503b055).

Now let us see the reproducer. It has two primary components:
1. test_once() spawns 32 threads that try to acquire a lock, waits for them,
and counts the number of "acquired exclusive locks",
2. try_acquire_lock() executes the transaction described above.

To build the reproducer:
$ gcc -std=c99 -o test test.c -pthread `pkg-config libpq --cflags --libs`

To run the reproducer:
$ ./test
or
$ DB_CONNSTR="dbname=abc host=def user=ghi" ./test

The reproducer:

#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

#include <libpq-fe.h>

// Comment this out to make the serialisation anomalies go away.
#define WITH_INDEX

// I have seen the problem with as few as 3 threads. 32 threads make
// the issue appear much sooner.
#define NR_THREADS (32)
#define NR_RUNS (1024 * 1024)

static PGconn *conns[NR_THREADS];

static void* try_acquire_lock(void *arg)
{
    PGconn *c = arg;
    PGresult *res;
    int ntuples;

    res = PQexec(c, "BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE");
    assert(PQresultStatus(res) == PGRES_COMMAND_OK);
    PQclear(res);

    res = PQexec(c, "SELECT * FROM locks WHERE path = 'xyz'");
    assert(PQresultStatus(res) == PGRES_TUPLES_OK);
    ntuples = PQntuples(res);
    PQclear(res);

    if (ntuples > 0) {
        // someone else already has a lock
        res = PQexec(c, "COMMIT");
        assert(PQresultStatus(res) == PGRES_COMMAND_OK);
        PQclear(res);
        return NULL;
    }

    // The INSERT and COMMIT results are deliberately not asserted:
    // conflicting transactions are expected to fail here with
    // serialisation errors, and only one thread should succeed.
    res = PQexec(c, "INSERT INTO locks(path) VALUES('xyz')");
    PQclear(res);

    res = PQexec(c, "COMMIT");
    PQclear(res);

    return NULL;
}

static void test_once(void)
{
    PGconn *c = conns[0];
    PGresult *res;
    int ntuples;

    pthread_t thrs[NR_THREADS];
    for (int i = 0; i < NR_THREADS; ++i)
        pthread_create(&thrs[i], NULL, &try_acquire_lock, conns[i]);
    for (int i = 0; i < NR_THREADS; ++i)
        pthread_join(thrs[i], NULL);

    res = PQexec(c, "SELECT * FROM locks WHERE path = 'xyz'");
    assert(PQresultStatus(res) == PGRES_TUPLES_OK);
    ntuples = PQntuples(res);
    PQclear(res);

    if (ntuples != 1)
        printf("ntuples = %d\n", ntuples);
    assert(ntuples == 1);

    res = PQexec(c, "TRUNCATE TABLE locks");
    assert(PQresultStatus(res) == PGRES_COMMAND_OK);
    PQclear(res);
}

static void prepare_db(void)
{
    PGconn *c = conns[0];
    PGresult *res;

    // result intentionally unchecked: the table may not exist yet
    res = PQexec(c, "DROP TABLE locks");
    PQclear(res);

    res = PQexec(c, "CREATE TABLE locks (path TEXT NOT NULL)");
    assert(PQresultStatus(res) == PGRES_COMMAND_OK);
    PQclear(res);

#ifdef WITH_INDEX
    res = PQexec(c, "CREATE INDEX ON locks USING HASH(path)");
    assert(PQresultStatus(res) == PGRES_COMMAND_OK);
    PQclear(res);
#endif
}

int main(void)
{
    const char *connstr = getenv("DB_CONNSTR");
    if (connstr == NULL)
        connstr = "dbname=postgres";

    for (int i = 0; i < NR_THREADS; ++i) {
        conns[i] = PQconnectdb(connstr);
        assert(PQstatus(conns[i]) == CONNECTION_OK);
    }
    prepare_db();

    for (int i = 0; i < NR_RUNS; ++i)
        test_once();

    for (int i = 0; i < NR_THREADS; ++i)
        PQfinish(conns[i]);

    return 0;
}
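
As noted in try_acquire_lock(), failures of the INSERT or the COMMIT are the
expected outcome for all but one thread. If one wanted to verify that those
failures really are serialisation failures and not something else, libpq
exposes the SQLSTATE of a failed statement, and serialization_failure is
SQLSTATE 40001. A minimal sketch (the helper name is mine; it also needs
<string.h> in addition to the reproducer's includes):

// Hypothetical helper: returns 1 if a failed PGresult carries SQLSTATE 40001
// (serialization_failure), i.e. the documented way for a conflicting
// SERIALIZABLE transaction to be rejected, and 0 otherwise.
static int is_serialization_failure(const PGresult *res)
{
    const char *sqlstate = PQresultErrorField(res, PG_DIAG_SQLSTATE);
    return sqlstate != NULL && strcmp(sqlstate, "40001") == 0;
}

In a correct run, every thread except one should either see the existing row
in its SELECT or have its INSERT/COMMIT rejected with 40001; the anomaly
reported here is that, with the index in place, several threads occasionally
commit without hitting either case.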
