Re: record identical operator - Review

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Steve Singer <steve(at)ssinger(dot)info>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: record identical operator - Review
Date: 2013-09-20 12:38:45
Message-ID: 1379680725.74598.YahooMailNeo@web162902.mail.bf1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Steve Singer <steve(at)ssinger(dot)info> wrote:
> On 09/12/2013 06:27 PM, Kevin Grittner wrote:
>> Attached is a patch for a bit of infrastructure I believe to be
>> necessary for correct behavior of REFRESH MATERIALIZED VIEW
>> CONCURRENTLY as well as incremental maintenance of matviews.
>
> Here is attempt at a review of the v1 patch.

Thanks for looking at it.

> There has been a heated discussion on list about if we even want this
> type of operator.

It's not a question of whether we want to allow operators which
define comparisons between objects in a non-standard way.  We
already have 11 such cases; this would add one more to make 12.  In
all cases there are different operators for making a comparison
that return potentially different results from the default
operators, to support uses that are specific to that type.

> I think there is agreement that better (as in more obscure) operators
> than === and !== need to be picked  and we need to find a place in the
> user-docs to warn users of the behaviour of this operators.  Hannu has
> proposed
>
> *==*      "binary equal, surely very equal by any other definition as
> wall"
> !==?      "maybe not equal" -- "binary inequal, may still be
> equal by
> other comparison methods"
>
> and no one has yet objected to this.

I do.  The issue is that there are a great many places that there
are multiple definitions of equality.  We generally try to use
something which tries to convey something about the nature of the
operation, since the fact that it can have different results is a
given.  There is nothing in those operators that says "binary image
comparison".  I thought about prepending a hash character in front
of normal operators, because that somehow suggests binary
operations to my eye (although I have no idea whether it does so
for anyone else), but those operators are already used for an
alternative set of comparisons for intervals, with a different
meaning.  I'm still trying to come up with something I really like.

> My vote would be to update the patch to
> use those operator names and then figure out where we can document them.  It it
> means we have to section describing the equal operator for records as well then
> maybe we should do that so we can get on with things.

> Code Review
> --------------------
>
> The record_image_eq and record_image_cmp functions are VERY similar.  It seems
> to me that you could have the meet of these functions into a common function
> (that isn't exposed via SQL) that then behaves differently  in 2 or 3 spots
> based on a parameter that controls if it detoasts the tuple for the compare or
> returns false on the equals.

Did you look at the record_eq and record_cmp functions which exist
before this patch?  Is there a reason we should do it one way for
the default operators and another way for this alternative?  Do you
think we should change record_eq and record_cmp to do things the
way you suggest?

> Beyond that I don't see any issues with the code.
>
> You call out a question about two records being compared have a different number
> of columns should it always be an error, or only an error when they match on the
> columns they do have in common.
>
> The current behaviour is
>
>   select * FROM r1,r2 where r1<==r2;
>   a | b | a | b | c
> ---+---+---+---+---
>   1 | 1 | 1 | 2 | 1
> (1 row)
>
> update r2 set b=1;
> UPDATE 1
> test=# select * FROM r1,r2 where r2<==r1;
> ERROR:  cannot compare record types with different numbers of columns
>
> This seems like a bad idea to me.  I don't like that I get a type comparison
> error only sometimes based on the values of the data in the column.  If I'm
> a user testing code that compares two records with this operator and I test my
> query with 1 value pair then and it works then I'd naively expect to get a
> true or false on all other value sets of the same record type.

Again, this is a result of following the precedent set by the
existing record comparison operators.

test=# create table r1 (c1 int, c2 int);
CREATE TABLE
test=# create table r2 (c1 int, c2 int, c3 int);
CREATE TABLE
test=# insert into r1 values (1, 2);
INSERT 0 1
test=# insert into r2 values (3, 4, 5);
INSERT 0 1
test=# select * from r1 join r2 on r1 < r2;
 c1 | c2 | c1 | c2 | c3
----+----+----+----+----
  1 |  2 |  3 |  4 |  5
(1 row)

test=# update r1 set c1 = 3, c2 = 4;
UPDATE 1
test=# select * from r1 join r2 on r1 < r2;
ERROR:  cannot compare record types with different numbers of columns

Would be in favor of forcing a change to the record comparison
operators which have been in use for years?  If not, is there a
good reason to have them behave differently in this regard?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-09-20 12:40:31 Re: Freezing without write I/O
Previous Message Dimitri Fontaine 2013-09-20 12:35:31 Re: Where to load modules from?