vary read_only in SPI calls? or poke at the on-entry snapshot?

From: Chapman Flack <chap(at)anastigmatix(dot)net>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: vary read_only in SPI calls? or poke at the on-entry snapshot?
Date: 2018-09-20 02:29:42
Message-ID: 5BA30616.70207@anastigmatix.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

SOME BACKGROUND:

The code for PL/Java functions resides in 'class images'. These are made
available, in advance, by loading them from 'jar files' into some tables
in the database.

When a PL/Java function is called, if the JVM hasn't loaded its class image
yet, the JVM calls PL/Java's loader, which does an SPI query to retrieve
the image. This SPI query, only needed on the first reference, is
effectively part of the activity of the PL/Java function that has been
called. Among other things, that means the read_only SPI parameter is
passed as true if the PL/Java function in question is declared IMMUTABLE.

(A purist reaction to this might be to say no PL/Java function ever deserves
to be declared IMMUTABLE then, as the first call can generate a database
query to fetch the function code. But it has been that way a long time,
and as a practical matter, I'm not sure how much less "immutable" that
is than whatever happens to load/map/retrieve the needed code in other PLs.
It still strikes me as reasonable to declare a function immutable as long as
that's a fair description of what the function itself does.)

A headache does result in one corner case though.

THE HEADACHE:

The operation of making class images available in advance, from a jar file,
is done by the install_jar() function. The function reads images from the
jar file and stores them in the database. It can then also execute SQL
commands contained in the file. These can be declarations of the functions
and types and any other actions appropriate when installing the jar. They
can include queries that use the types and functions that just got declared.

A query that uses a newly-declared type will illustrate the headache,
because it results in a nested call into PL/Java for the type I/O
function, which is (typically) declared immutable. If it's the first
use of the class, the JVM will need to call the loader. The loader does
an SPI query to retrieve the image. Because this takes place 'during'
a call to an 'immutable' function, read_only => true is passed to SPI.
One effect of read_only => true in SPI is to rely on a snapshot from
the very start of the operation---when the jar file just loaded wasn't
there yet! Therefore, the loader fails to find the class.

So, this headache only affects references to the classes in a jar from
the install commands contained in the same jar, and only when some of
those install commands refer to immutable functions, and only when such
commands occur before any call to a stable or volatile function that
happens to be defined in the same class.

It can be worked around by reordering commands in the jar or by adding
a no-op function declared stable or volatile and adding an early query
that calls it. But these are hacks, and I would like to address the root
issue so hacks aren't necessary.

PASS read_only => false FROM THE CLASS LOADER?

One obvious approach would be to just arrange for read_only always to
be false in SPI queries originating from the class loader, regardless
of the volatility declaration on the function on whose behalf the loader
is operating.

As to the convention of setting read_only based on the function's
volatility, the manual says "While authors of C functions are able to
violate this convention, it's unlikely to be a good idea to do so."

Maybe this is an example where an exception is reasonable. So, a
function declared immutable would generate read_only SPI queries for
all of its normal behavior, but its first call might first generate
a non-read_only query from the class loader, followed by read_only
ones doing the function's own business.

SOMEHOW ADVANCE THE ON-ENTRY SNAPSHOT DURING install_jar?

An alternative could be to make the install_jar function smarter,
since it is only during the operation of install_jar that there is
any issue here. The function does two things in sequence:

1. Load the class images from the jar file
2. Execute the SQL commands in the file

install_jar has not much control or insight into step 2; it is just
reading query text and passing it to SPI, and we'll see the problem
if one of those commands refers to a not-yet-loaded class from step 1.

But install_jar knows exactly what it's doing in step 1. It populates
the tables of class images, using ordinary, non-read_only SPI calls.
It could even CommandCounterIncrement the snapshot those calls are
using---only that's futile, because that snapshot gets popped at the
end of step 1, leaving the former ActiveSnapshot, where the counter
is unchanged.

Any SPI calls made in step 2 will either get a new snapshot (in the
read_only false case), or rely on the current ActiveSnapshot (in the
read_only true case), which is the one that's too old to succeed.

Would it be unprecedented / be unreasonable / break anything for the
install_jar function to simply force a CommandCounterIncrement
at the end of step 1 (after its temporary snapshot has been popped,
so the former/on-entry ActiveSnapshot gets the increment)?

This really strikes me as the less blunt, more surgical solution to
the headache. The crux of the problem is just the persistence, during
install_jar, of a snapshot where the jar you told it to install is
still missing. This proposed fix would violate (in this one narrow
circumstance) the documented visibility rule for read_only SPI calls,
but the documented rule just isn't useful for this case.

But is there any reason this wouldn't work, or could break something?

DECISION TIME ...

1. fiddle the loader to always pass read_only => false to SPI calls,
regardless of the volatility of the function it is loading for.

2. leave the loader alone, and adjust install_jar (an infrequent
operation) to do something heretical with its on-entry snapshot.

I'm leaning toward 2. What say you? Is there a 3 or a 4 that I'm
overlooking?

Thanks,
-Chap

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message amul sul 2018-09-20 03:09:33 Re: [HACKERS] Bug in to_timestamp().
Previous Message Michael Paquier 2018-09-20 00:03:31 Re: heap_sync seems rather oblivious to partitioned tables (wal_level=minimal)