Table locking oddness in PLpgSQL function

From: Dave Harkness <daveh(at)MEconomy(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Table locking oddness in PLpgSQL function
Date: 2001-10-03 00:38:40
Message-ID: 5.1.0.14.2.20011002171345.050d6b20@mail.meconomy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello all,

I just discovered PLpgSQL yesterday (yay!) as we began development of our
more robust database backend, but I'm seeing some odd behavior with table
locking. The problem I'm seeing is that two database transactions,
initiated via two separate JDBC client connections, are able to obtain
simultaneous exclusive table locks on the same table. If I lock the table
in PSQL, the JDBC calls do indeed block until I end my transaction in PSQL.
However, two JDBC calls don't appear to block each other. It looks as if
PLpgSQL only ensures that it is allowed to obtain the locks without
actually holding them.

I used PLpgSQL to implement a next_id function that allows IDs to be
allocated in continuous blocks (thus I believe I cannot use sequences). The
client application generates its own IDs but tracks them in the database.
To minimize DB calls the client allocates IDs in blocks and doles them out
one-at-a-time. Thus, 20 calls to IDFactory.nextID() in Java will result in
only 2 calls to the database function next_id_block(10).

As well each object type (there are 6) that needs IDs gets its own
IDFactory, *and* there are multiple Java clients accessing the same
database. Therefore, it's quite possible for two IDFactories to call
next_id_block(count) at the same time. Thus I'm locking the table in
exclusive mode to force synchronous access. I've also tried access
exclusive mode to no avail. I then changed it to use SELECT ... FOR UPDATE
rather than locking the entire table; same result.

Here's the table definition:

create table idfactory
(
name varchar(20) not null primary key,
next_id integer not null default 1,
change_num smallint not null default 1
) ;

This is the psuedo-code algorithm I've implemented in PLpgSQL:

next_id_block ( count )
(1) lock idfactory table

(2) read idfactory row
(3) update idfactory row
increment next_id by count
increment change_num by 1
where change_num is equal to that read in (2)

(4) FAIL if (3) updated 0 rows

(5) return next_id read in (2)

My intent is that by locking the idfactory table, I can assure that no one
else can update the row between it being read in (2) and updated in (3).
I've tried calling this function in both read-committed and serializable
transaction level. The reality, however, is that some threads are trying to
update the row concurrently and failing (0 rows updated since the
change_num value no longer matches). In serializable mode, I get a Postgres
error:

ERROR: Can't serialize access due to concurrent update

I'll try to illustrate what seems to be happening in the case of two threads.

Time Thread A Thread B
1 lock
2 read 1, 1
3 lock
4 read 1, 1
5 write 11, 2
6 write 11, 2
7 return 1
8 FAIL

It's my understanding that thread B should block at T3 since thread A
locked the table at T1. Thread B shouldn't continue until thread A's
transaction is ended (either by commit or abort). True? The truly odd part
is that if I start up PSQL, begin a transaction, and then lock the table,
all the threads calling the function block until I end the transaction,
just as I'd expect. However, the threads won't block each other!

Here's the stored function itself:

create function next_id_block (
varchar , integer
)
returns bigint
as '
DECLARE
-- Parameters
name_key alias for $1 ;
block_size alias for $2 ;

-- Constants
FIRST_ID constant bigint := 1 ;

-- Locals
id_rec record ;
new_last_id bigint ;
num_rows integer ;
BEGIN
-- To avoid a retry-loop, lock the whole table for the transaction
lock table idfactory in access exclusive mode ;

-- Read the current value of next_id
select into id_rec * from idfactory where name = name_key ;

-- Increment it by block_size
new_last_id := id_rec.next_id + block_size ;
update idfactory
set next_id = new_last_id,
change_num = change_num + 1
where name = name_key and change_num = id_rec.change_num ;

-- Error if filter not found
get diagnostics num_rows = ROW_COUNT ;
if num_rows != 1 then
raise exception ''Failed to update idfactory.next_id to % for % at
%'',
new_last_id, name_key, id_rec.change_num;
return -1 ;
end if ;

return id_rec.next_id ;
END ;
' language 'plpgsql' with (isstrict) ;

Finally, here's the JDBC code I'm using to call it:

protected void allocate ( int count )
{
PreparedStatement stmt = null;
ResultSet result = null;
long newNextID = INVALID_ID;

try
{
stmt = conn.prepareStatement("select next_id_block(?, ?)");

stmt.setString(1, name);
stmt.setInt(2, count);

// conn.setAutoCommit(false);
result = stmt.executeQuery();

if ( ! result.next() )
throw new SQLException("Function next_id_block failed");

// Pull out the new value and close the result set.
newNextID = Nulls.getLong(result, 1);

try { result.close(); result = null; }
catch ( SQLException ignore ) { }

// conn.commit();

// Null values are not allowed.
if ( Nulls.is(newNextID) )
throw new SQLException("Function next_id_block returned null");

nextID = newNextID;
lastID = nextID + count;
}
catch ( SQLException e )
{
e.printStackTrace();
}
finally
{
if ( result != null )
{
try { result.close(); }
catch ( SQLException ignore ) { }
}
}
}

Anyway, this was rather long, but I wanted to provide all the information
necessary up front. Thank you for any thoughts or ideas you might have.

PostgreSQL: 7.1.3
JDBC: 7.1-1.2
JDK: 1.3.0_02
Client Platform: Win2k 5.00.2195 sp2
Server Platform: RedHat 7.1

Peace,
Dave

--
David Harkness
MEconomy, Inc.

Browse pgsql-general by date

  From Date Subject
Next Message Tatsuo Ishii 2001-10-03 01:01:16 Re: [GENERAL] Problem with the accents
Previous Message Dave Harkness 2001-10-02 22:41:13 Re: LOCK TABLE oddness in PLpgSQL function called via