Re: SQL:2011 application time

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Paul Jungwirth <pj(at)illuminatedcomputing(dot)com>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: SQL:2011 application time
Date: 2024-01-09 06:46:34
Message-ID: CALDaNm0ZC8R9818WkLUp4-LYGVFr9KBZHxhyJyaBUFUXUHYiYw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 6 Jan 2024 at 05:50, Paul Jungwirth <pj(at)illuminatedcomputing(dot)com> wrote:
>
> Getting caught up on reviews from November and December:
>
> On 11/19/23 22:57, jian he wrote:
> >
> > I believe the following part should fail. Similar tests on
> > src/test/regress/sql/generated.sql. line begin 347.
> >
> > drop table if exists gtest23a,gtest23x cascade;
> > CREATE TABLE gtest23a (x int4range, y int4range,
> > CONSTRAINT gtest23a_pk PRIMARY KEY (x, y WITHOUT OVERLAPS));
> > CREATE TABLE gtest23x (a int4range, b int4range GENERATED ALWAYS AS
> > ('empty') STORED,
> > FOREIGN KEY (a, PERIOD b ) REFERENCES gtest23a(x, PERIOD y) ON UPDATE
> > CASCADE); -- should be error?
>
> Okay, I've added a restriction for temporal FKs too. But note this will
> change once the PERIODs patch (the last one here) is finished. When the
> generated column is for a PERIOD, there will be logic to "reroute" the
> updates to the constituent start/end columns instead.
>
> > begin;
> > drop table if exists fk, pk cascade;
> > CREATE TABLE pk (id int4range, valid_at int4range,
> > CONSTRAINT pk_pk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
> > );
> > CREATE TABLE fk (
> > id int4range,valid_at tsrange, parent_id int4range,
> > CONSTRAINT fk FOREIGN KEY (parent_id, valid_at)
> > REFERENCES pk
> > );
> > rollback;
> > --
> > the above query will return an error: number of referencing and
> > referenced columns for foreign key disagree.
> > but if you look at it closely, primary key and foreign key columns both are two!
> > The error should be saying valid_at should be specified with "PERIOD".
>
> Ah okay, thanks for the clarification! This is tricky because the user
> left out the PERIOD on the fk side, and left out the entire pk side, so
> those columns are just implicit. So there is no PERIOD anywhere.
> But I agree that if the pk has WITHOUT OVERLAPS, we should expect a
> corresponding PERIOD modifier on the fk side and explain that that's
> what's missing. The attached patches include that.
>
> > I found out other issues in v18.
> > I first do `git apply` then `git diff --check`, there is a white
> > space error in v18-0005.
>
> Fixed, thanks!
>
> > You also need to change update.sgml and delete.sgml <title>Outputs</title> part.
> > Since at most, it can return 'UPDATE 3' or 'DELETE 3'.
>
> This doesn't sound correct to me. An UPDATE or DELETE can target many
> rows. Also I don't think the inserted "leftovers" should be included in
> these counts. They represent the rows updated/deleted.
>
> > --the following query should work?
> > drop table pk;
> > CREATE table pk(a numrange PRIMARY key,b text);
> > insert into pk values('[1,10]');
> > create or replace function demo1() returns void as $$
> > declare lb numeric default 1; up numeric default 3;
> > begin
> > update pk for portion of a from lb to up set b = 'lb_to_up';
> > return;
> > end
> > $$ language plpgsql;
> > select * from demo1();
>
> Hmm this is a tough one. It is correct that the `FROM __ TO __` values cannot be column references.
> They are computed up front, not per row. One reason is they are used to search the table. In fact
> the standard basically allows nothing but literal strings here. See section 14.14, page 971 then
> look up <point in time> on page 348 and <datetime value expression> on page 308. The most
> flexibility you get is you can add/subtract an interval to the datetime literal. We are already well
> past that by allowing expressions, (certain) functions, parameters, etc.
>
> OTOH in your plpgsql example they are not really columns. They just get represented as ColumnRefs
> and then passed to transformColumnRef. I'm surprised plpgsql does it that way. As a workaround you
> could use `EXECUTE format(...)`, but I'd love to make that work as you show instead. I'll keep
> working on this one but it's not done yet. Perhaps I can move the restriction into
> analysis/planning. If anyone has any advice it'd be welcome.
>
> On 12/6/23 05:22, jian he wrote:
> > this TODO:
> > * TODO: It sounds like FOR PORTION OF might need to do something here too?
> > based on comments on ExprContext. I refactor a bit, and solved this TODO.
>
> The patch looks wrong to me. We need to range targeted by `FROM __
> TO __` to live for the whole statement, not just one tuple (see just
> above). That's why it gets computed in the Init function node.
>
> I don't think that TODO is needed anymore at all. Older versions of the
> patch had more expressions besides this one, and I think it was those I
> was concerned about. So I've removed the comment here.
>
> > tring to the following TODO:
> > // TODO: Need to save context->mtstate->mt_transition_capture? (See
> > comment on ExecInsert)
> >
> > but failed.
> > I also attached the trial, and also added the related test.
> >
> > You can also use the test to check portion update with insert trigger
> > with "referencing old table as old_table new table as new_table"
> > situation.
>
> Thank you for the test case! This is very helpful. So the problem is
> `referencing new table as new_table` gets lost. I don't have a fix yet
> but I'll work on it.
>
> On 12/11/23 00:31, jian he wrote:
> > - false); /* quiet */
> > + false); /* quiet */
> >
> > Is the above part unnecessary?
>
> Good catch! Fixed.
>
> > I am confused. so now I only apply v19, 0001 to 0003.
> > period_to_range function never used. maybe we can move this part to
> > 0005-Add PERIODs.patch?
> > Also you add change in Makefile in 0003, meson.build change in 0005,
> > better put it on in 0005?
>
> You're right, those changes should have been in the PERIODs patch. Moved.
>
> > +/*
> > + * We need to handle this shift/reduce conflict:
> > + * FOR PORTION OF valid_at FROM INTERVAL YEAR TO MONTH TO foo.
> > + * This is basically the classic "dangling else" problem, and we want a
> > + * similar resolution: treat the TO as part of the INTERVAL, not as part of
> > + * the FROM ... TO .... Users can add parentheses if that's a problem.
> > + * TO just needs to be higher precedence than YEAR_P etc.
> > + * TODO: I need to figure out a %prec solution before this gets committed!
> > + */
> > +%nonassoc YEAR_P MONTH_P DAY_P HOUR_P MINUTE_P
> > +%nonassoc TO
> >
> > this part will never happen?
> > since "FROM INTERVAL YEAR TO MONTH TO"
> > means "valid_at" will be interval range data type, which does not exist now.
>
> It appears still needed to me. Without those lines I get 4 shift/reduce
> conflicts. Are you seeing something different? Or if you have a better
> solution I'd love to add it. I definitely need to fix this before that
> patch gets applied.
>
> > for all the refactor related to ri_PerformCheck, do you need (Datum) 0
> > instead of plain 0?
>
> Casts added.
>
> > + <para>
> > + If the table has a range column or
> > + <link linkend="ddl-periods-application-periods"><literal>PERIOD</literal></link>,
> > + you may supply a <literal>FOR PORTION OF</literal> clause, and
> > your delete will
> > + only affect rows that overlap the given interval. Furthermore, if
> > a row's span
> >
> >
> https://influentialpoints.com/Training/basic_statistics_ranges.htm#:~:text=A%20range%20is%20two%20numbers,or%20the%20difference%20between%20them
> > So "range" is more accurate than "interval"?
>
> I don't think we should be using R to define the terms "range" and
> "interval", which both already have meanings in Postgres, SQL, and the
> literature for temporal databases. But I'm planning to revise the docs'
> terminology here anyway. Some temporal database texts use "interval"
> in this sense, and I thought it was a decent term to mean "range or
> PERIOD". But now we need something to mean "range or multirange or
> custom type or PERIOD". Actually "portion" seems like maybe the best
> term, since the SQL syntax `FOR PORTION OF` reinforces that term. If you
> have suggestions I'm happy for ideas.
>
> > +/* ----------
> > + * ForPortionOfState()
> > + *
> > + * Copies a ForPortionOfState into the current memory context.
> > + */
> > +static ForPortionOfState *
> > +CopyForPortionOfState(ForPortionOfState *src)
> > +{
> > + ForPortionOfState *dst = NULL;
> > + if (src) {
> > + MemoryContext oldctx;
> > + RangeType *r;
> > + TypeCacheEntry *typcache;
> > +
> > + /*
> > + * Need to lift the FOR PORTION OF details into a higher memory context
> > + * because cascading foreign key update/deletes can cause triggers to fire
> > + * triggers, and the AfterTriggerEvents will outlive the FPO
> > + * details of the original query.
> > + */
> > + oldctx = MemoryContextSwitchTo(TopTransactionContext);
> >
> > should it be "Copy a ForPortionOfState into the TopTransactionContext"?
>
> You're right, the other function comments here use imperative mood. Changed.
>
> New patches attached, rebased to 43b46aae12. I'll work on your feedback from Jan 4 next. Thanks!

One of the test has failed in CFBot at [1] with:

diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/generated.out
/tmp/cirrus-ci-build/src/test/recovery/tmp_check/results/generated.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/generated.out
2024-01-06 00:34:48.078691251 +0000
+++ /tmp/cirrus-ci-build/src/test/recovery/tmp_check/results/generated.out
2024-01-06 00:42:08.782292390 +0000
@@ -19,7 +19,9 @@
table_name | column_name | dependent_column
------------+-------------+------------------
gtest1 | a | b
-(1 row)
+ pt | de | p
+ pt | ds | p
+(3 rows)

More details of the failure is available at [2].

[1] - https://cirrus-ci.com/task/5739983420522496
[2] - https://api.cirrus-ci.com/v1/artifact/task/5739983420522496/log/src/test/recovery/tmp_check/log/regress_log_027_stream_regress

Regards,
Vignesh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2024-01-09 06:50:45 Re: Relation bulk write facility
Previous Message vignesh C 2024-01-09 06:32:04 Re: Skipping schema changes in publication