Re: SQL:2011 application time

From: Paul Jungwirth <pj(at)illuminatedcomputing(dot)com>
To: Peter Eisentraut <peter(at)eisentraut(dot)org>, jian he <jian(dot)universality(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: SQL:2011 application time
Date: 2024-02-29 21:16:56
Message-ID: d1a793bd-d1f1-46d7-90b6-8b89b6feb1cc@illuminatedcomputing.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

Here is another patch series for application time. It addresses the feedback from the last few
emails. Details below:

On 1/28/24 16:00, jian he wrote:
> + <para>
> + In a temporal foreign key, the delete/update will use
> + <literal>FOR PORTION OF</literal> semantics to constrain the
> + effect to the bounds being deleted/updated in the referenced row.
> + </para>
>
> in v24-0003-Add-temporal-FOREIGN-KEYs.patch
> <literal>FOR PORTION OF</literal> not yet implemented, so we should
> not mention it.

Fixed.

> + <para>
> + If the last column is marked with <literal>PERIOD</literal>,
> + it must be a period or range column, and the referenced table
> + must have a temporal primary key.
> can we change "it must be a period or range column" to "it must be a
> range column", maybe we can add it on another patch.

Rewrote this section to be clearer.

On 2/1/24 21:53, jian he wrote:
> I've attached a new patch that further simplified the tests. (scope
> v24 patch's 0002 and 0003)
> Please ignore previous email attachments.

Thanks, I've pulled in most of these changes to the tests.

> I've only applied the v24, 0002, 0003.
> seems in doc/src/sgml/ref/create_table.sgml
> lack the explanation of `<replaceable
> class="parameter">temporal_interval</replaceable>`

You're right. Actually I think it is clearer without adding a separate name here, so I've updated
the docs to use `column_name | period_name`.

> since foreign key ON {UPDATE | DELETE} {CASCADE,SET NULL,SET DEFAULT}
> not yet supported,
> v24-0003 create_table.sgml should reflect that.

Updated.

> within the function ATAddForeignKeyConstraint, you called
> FindFKPeriodOpersAndProcs,
> but never used the computed outputs: periodoperoid, periodprocoid, opclasses.
> We validate these(periodoperoid, periodprocoid) at
> lookupTRIOperAndProc, FindFKPeriodOpersAndProcs.
> I'm not sure whether FindFKPeriodOpersAndProcs in
> ATAddForeignKeyConstraint is necessary.

This is explained in the comment above: we will do the same lookup when the foreign key is checked,
but we should make sure it works now so we can report the problem to the user.

> + * Check if all key values in OLD and NEW are "equivalent":
> + * For normal FKs we check for equality.
> + * For temporal FKs we check that the PK side is a superset of its old value,
> + * or the FK side is a subset.
> "or the FK side is a subset." is misleading, should it be something
> like "or the FK side is a subset of X"?

Okay, changed.

> + if (indexStruct->indisexclusion) return i - 1;
> + else return i;
>
> I believe our style should be (with proper indent)
> if (indexStruct->indisexclusion)
> return i - 1;
> else
> return i;

Fixed.

> in transformFkeyCheckAttrs
> + if (found && is_temporal)
> + {
> + found = false;
> + for (j = 0; j < numattrs + 1; j++)
> + {
> + if (periodattnum == indexStruct->indkey.values[j])
> + {
> + opclasses[numattrs] = indclass->values[j];
> + found = true;
> + break;
> + }
> + }
> + }
>
> can be simplified:
> {
> found = false;
> if (periodattnum == indexStruct->indkey.values[numattrs])
> {
> opclasses[numattrs] = indclass->values[numattrs];
> found = true;
> }
> }

Changed.

> Also wondering, at the end of the function transformFkeyCheckAttrs `if
> (!found)` part:
> do we need another error message handle is_temporal is true?

I think the existing error message works well for both temporal and non-temporal cases.

> @@ -212,8 +213,11 @@ typedef struct NewConstraint
> ConstrType contype; /* CHECK or FOREIGN */
> Oid refrelid; /* PK rel, if FOREIGN */
> Oid refindid; /* OID of PK's index, if FOREIGN */
> + bool conwithperiod; /* Whether the new FOREIGN KEY uses PERIOD */
> Oid conid; /* OID of pg_constraint entry, if FOREIGN */
> Node *qual; /* Check expr or CONSTR_FOREIGN Constraint */
> + Oid *operoids; /* oper oids for FOREIGN KEY with PERIOD */
> + Oid *procoids; /* proc oids for FOREIGN KEY with PERIOD */
> ExprState *qualstate; /* Execution state for CHECK expr */
> } NewConstraint;
> primary key can only one WITHOUT OVERLAPS,
> so *operoids and *procoids
> can be replaced with just
> `operoids, procoids`.
> Also these two elements in struct NewConstraint not used in v24, 0002, 0003.

I've removed these entirely. Sorry, they were leftover from an earlier revision.

On 2/12/24 01:55, Peter Eisentraut wrote:
> (Also, the last (0007) patch has some compiler warnings and also
> causes the pg_upgrade test to fail. I didn't check this further, but
> that's why the cfbot is all red.)

Fixed the pg_upgrade problem. I'm not seeing compiler warnings. If they still exist can you point me
to those?

> As a general comment, we need to figure out the right terminology
> "period" vs. "temporal", especially if we are going to commit these
> features incrementally. But I didn't look at this too hard here yet.

Agreed. I think it is okay to use "temporal" in the docs for the feature in general, if we clarify
that non-temporal values are also supported. That is what the rest of the world calls this kind of
thing.

The word "period" is confusing because it can be the `PERIOD` keyword used in temporal FKs, or also
the SQL:2011 `PERIOD` object that is like our range types. And then we also have ranges, etc. In the
past I was using "interval" to mean "range or PERIOD" (and "interval" is used by Date in his
temporal book), but perhaps that is too idiosyncratic. I've removed "interval" from the FK docs, and
instead I've tried to be very explicit and avoid ambiguity. (I haven't given as much attention to
cleaning up the later patches' docs yet.)

> * v24-0002-Add-GiST-referencedagg-support-func.patch
>
> Do we really need this level of generality? Are there examples not
> using ranges that would need a different aggregate function? Maybe
> something with geometry (points and lines)? But it seems to me that
> then we'd also need some equivalent to "without portion" support for
> those types and a multirange equivalent (basically another gist
> support function wrapped around the 0004 patch).

I'm not sure how else to do it. The issue is that `range_agg` returns a multirange, so the result
type doesn't match the inputs. But other types will likely have the same problem: to combine boxes
you may need a multibox. The combine mdranges you may need a multimdrange.

I agree we need something to support "without portion" too. The patches here give implementations
for ranges and multiranges. But that is for `FOR PORTION OF`, so it comes after the foreign key
patches (part 5 here).

Btw that part changed a bit since v24 because as jian he pointed out, our type system doesn't
support anyrange inputs and an anyrange[] output. So I changed the support funcs to use SETOF. I
could alternately add anyrangearray and anymultirangearray pseudotypes. It's not the first time I've
wanted those, so I'd be happy to go that way if folks are open to it. It seems like it should be a
totally separate patch though.

> * v24-0003-Add-temporal-FOREIGN-KEYs.patch
>
> - contrib/btree_gist/expected/without_overlaps.out
> - contrib/btree_gist/sql/without_overlaps.sql
>
> typo "exusts"

Fixed.

> - doc/src/sgml/ref/create_table.sgml
>
> This mentions FOR PORTION OF from a later patch.
>
> It is not documented that SET NULL and SET DEFAULT are not supported,
> even though that is added in a later patch. (So this patch should say
> that it's not supported, and then the later patch should remove that.)

All fixed.

> - src/backend/commands/indexcmds.c
>
> The changes to GetOperatorFromWellKnownStrategy() don't work for
> message translations. We had discussed a similar issue for this
> function previously. I think it's ok to leave the function as it was.
> The additional context could be added with location pointers or
> errcontext() maybe, but it doesn't seem that important for now.

Okay I've tried a different approach here that should fit better with t9n. Let me know if it still
needs work.

> - src/backend/commands/tablecmds.c
>
> The changes in ATAddForeignKeyConstraint(), which are the meat of the
> changes in this file, are very difficult to review in detail. I tried
> different git-diff options to get a sensible view, but it wasn't
> helpful. Do we need to do some separate refactoring here first?

I moved the FindFKComparisonOperators refactor into a separate patch, and that seems to confuse git
less. Your suggestion to group the PERIOD attribute with the others (below) also helped a lot to cut
down the diff here. In fact it means I only call FindFKComparisonOperators once, so pulling it into
a separate method is not even necessary anymore. But I do think it helps simplify what's already a
very long function, so I've left it in. Let me know if more work is needed here.

> The error message "action not supported for temporal foreign keys"
> could be more detailed, mention the action. Look for example how the
> error for the generated columns is phrased. (But note that for
> generated columns, the actions are impossible to support, whereas here
> it is just something not done yet. So there should probably still be
> different error codes.)

Fixed.

> - src/backend/nodes/outfuncs.c
> - src/backend/nodes/readfuncs.c
>
> Perhaps you would like to review my patch 0001 in
> <https://www.postgresql.org/message-id/859d6155-e361-4a05-8db3-4aa1f007ff28@eisentraut.org>,
> which removes the custom out/read functions for the Constraint node.
> Then you could get rid of these changes.

That is a nice improvement!

> - src/backend/utils/adt/ri_triggers.c
>
> The added #include "catalog/pg_range.h" doesn't appear to be used for
> anything.

Removed.

> Maybe we can avoid the added #include "commands/tablecmds.h" by
> putting the common function in some appropriate lower-level module.

Moved to pg_constraint.{c,h}.

> typo "PEROID"

Fixed.

> Renaming of ri_KeysEqual() to ri_KeysStable() doesn't improve clarity,
> I think. I think we can leave the old name and add a comment (as you
> have done). There is a general understanding around this feature set
> that "equal" sometimes means "contained" or something like that.

Okay.

> The function ri_RangeAttributeNeedsCheck() could be documented better.
> It's bit terse and unclear. From the code, it looks like it is used
> instead of row equality checks. Maybe a different function name would
> be suitable.

I realized I could simplify this a lot and reuse ri_AttributesEqual, so the whole method is gone now.

> Various unnecessary reformatting in RI_FKey_check().

Fixed, sorry about that.

> When assembling the SQL commands, you need to be very careful about
> fully quoting and schema-qualifying everything. See for example
> ri_GenerateQual().

Went through everything and added quoting & schemes to a few places that were missing it.

> Have you checked that the generated queries can use indexes and have
> suitable performance? Do you have example execution plans maybe?

The plans look good to me. Here are some tests:

-- test when inserting/updating the FK side:

regression=# explain analyze select 1
from (
select valid_at as r
from only temporal_rng x
where id = '[8,8]'
and valid_at && '[2010-01-01,2012-01-01)'
for key share of x
) x1
having '[2010-01-01,2012-01-01)'::tsrange <@ range_agg(x1.r);
QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=8.19..8.20 rows=1 width=4) (actual time=0.165..0.167 rows=0 loops=1)
Filter: ('["2010-01-01 00:00:00","2012-01-01 00:00:00")'::tsrange <@ range_agg(x1.r))
Rows Removed by Filter: 1
-> Subquery Scan on x1 (cost=0.14..8.18 rows=1 width=32) (actual time=0.152..0.153 rows=0 loops=1)
-> LockRows (cost=0.14..8.17 rows=1 width=38) (actual time=0.151..0.151 rows=0 loops=1)
-> Index Scan using temporal_rng_pk on temporal_rng x (cost=0.14..8.16 rows=1
width=38) (actual time=0.150..0.150 rows=0 loops=1)
Index Cond: ((id = '[8,9)'::int4range) AND (valid_at && '["2010-01-01
00:00:00","2012-01-01 00:00:00")'::tsrange))
Planning Time: 0.369 ms
Execution Time: 0.289 ms
(9 rows)

-- test when deleting/updating from the PK side:

regression=# explain analyze select 1 from only temporal_rng x where id = '[8,8]' and valid_at &&
'[2010-01-01,2012-01-01)'
for key share of x;
QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------
LockRows (cost=0.14..8.17 rows=1 width=10) (actual time=0.079..0.079 rows=0 loops=1)
-> Index Scan using temporal_rng_pk on temporal_rng x (cost=0.14..8.16 rows=1 width=10)
(actual time=0.078..0.078 rows=0 loops=1)
Index Cond: ((id = '[8,9)'::int4range) AND (valid_at && '["2010-01-01
00:00:00","2012-01-01 00:00:00")'::tsrange))
Planning Time: 0.249 ms
Execution Time: 0.123 ms
(5 rows)

I will do some further tests with more rows, but I haven't yet.

> - src/backend/utils/adt/ruleutils.c
>
> This seems ok in principle, but it's kind of weird that the new
> argument of decompile_column_index_array() is called "withPeriod"
> (which seems appropriate seeing what it does), but what we are passing
> in is conwithoutoverlaps. Maybe we need to reconsider the naming of
> the constraint column? Sorry, I made you change it from "contemporal"
> or something, didn't I? Maybe "conperiod" would cover both meanings
> better?

Certainly conperiod is easier to read. Since we are using it for PK/UNIQUE/FKs, conperiod also seems
like a better match. FKs don't use WITHOUT OVERLAPS syntax, and OTOH PK/UNIQUEs will still accept a
PERIOD (eventually, also a range/etc now). I've renamed it, but since the old name was already
committed with the PK patch, I've broken the renaming into a separate patch that could be committed
without anything else.

> - src/backend/utils/cache/lsyscache.c
>
> get_func_name_and_namespace(): This function would at least need some
> identifier quoting. There is only one caller (lookupTRIOperAndProc),
> so let's just put this code inline there; it's not worth a separate
> global function. (Also, you could use psprintf() here to simplify
> palloc() + snprintf().)

Removed.

> - src/include/catalog/pg_constraint.h
>
> You are changing in several comments "equality" to "comparison". I
> suspect you effectively mean "equality or containment"? Maybe
> "comparison" is too subtle to convey that meaning? Maybe be more
> explicit.

Okay, changed.

> You are changing a foreign key from DECLARE_ARRAY_FOREIGN_KEY to
> DECLARE_ARRAY_FOREIGN_KEY_OPT. Add a comment about it, like the one
> just above has.

I don't need this change at all now that we're using GENERATED columns for PERIODs, so I've taken it
out.

> - src/include/catalog/pg_proc.dat
>
> For the names of the trigger functions, maybe instead of
>
> TRI_FKey_check_ins
>
> something like
>
> RI_FKey_period_check_ins
>
> so that all RI trigger functions group under a common prefix.

Renamed.

> On second thought, do we even need separate functions for this?
> Looking at ri_triggers.c, the temporal and non-temporal functions are
> the same, and all the differences are handled in the underlying
> implementation functions.

My thinking was to avoid making the non-temporal functions suffer in performance and complexity.
What do you think? I've kept the separate functions here but I can combine them if you like.

> - src/include/nodes/parsenodes.h
>
> The constants FKCONSTR_PERIOD_OP_CONTAINED_BY and
> FKCONSTR_PERIOD_PROC_REFERENCED_AGG could use more documentation here.

Removed. They are obsolete now (and were already in v24---sorry!).

> For the Constraint struct, don't we just need a bool field saying
> "this is a period FK", and then we'd know that the last column is the
> period? Like we did for the primary keys (bool without_overlaps).

Okay, changed. Also in ATExecAddConstraint we can treat the PERIOD element like any other FK
element, which simplifies the changes there a lot.

> - src/include/parser/kwlist.h
>
> For this patch, the keyword PERIOD can be unreserved. But it
> apparently will need to be reserved later for the patch that
> introduces PERIOD columns. Maybe it would make sense to leave it
> unreserved for this patch and upgrade it in the later one.

I tried doing this but got a shift/reduce conflict, so it's still reserved here.

Thanks,

--
Paul ~{:-)
pj(at)illuminatedcomputing(dot)com

Attachment Content-Type Size
v25-0001-Rename-conwithoutoverlaps-to-conperiod.patch text/x-patch 9.3 KB
v25-0002-Add-GiST-referencedagg-support-func.patch text/x-patch 11.6 KB
v25-0003-Refactor-FK-operator-lookup.patch text/x-patch 15.3 KB
v25-0004-Add-temporal-FOREIGN-KEYs.patch text/x-patch 167.7 KB
v25-0005-Add-support-funcs-for-FOR-PORTION-OF.patch text/x-patch 43.3 KB
v25-0006-Add-UPDATE-DELETE-FOR-PORTION-OF.patch text/x-patch 162.0 KB
v25-0007-Add-CASCADE-SET-NULL-SET-DEFAULT-for-temporal-fo.patch text/x-patch 141.2 KB
v25-0008-Add-PERIODs.patch text/x-patch 151.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Melanie Plageman 2024-02-29 21:19:43 Re: BitmapHeapScan streaming read user and prelim refactoring
Previous Message Daniel Gustafsson 2024-02-29 21:08:44 Re: [PoC] Federated Authn/z with OAUTHBEARER