Re: record identical operator - Review

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Steve Singer <steve(at)ssinger(dot)info>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: record identical operator - Review
Date: 2013-10-01 02:46:20
Message-ID: 20131001024620.GB13385@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 27, 2013 at 03:34:20PM -0700, Kevin Grittner wrote:
> >> The arguments for this patch are
> >> * We want the materialized view to return the same value as
> >> would be returned if the query were executed directly.  This not
> >> only means that the values should be the same according to a
> >> datatypes = operator but that they should appear the same 'to
> >> the eyeball'.
>
> And to functions the user can run against the values.  The example
> with the null bitmap for arrays being included when not necessary
> is something that isn't directly apparent to the eye, but queries
> which use pg_column_size would not get equal results.

pg_column_size() is by definition internal details, so I am not worried
about that not matching.

> >> * Supporting the materialized view refresh check via SQL makes a
> >> lot of sense but doing that requires exposing something via SQL
> >> * If we adequately document what we mean by
> >> record_image_identical and the operator we pick for this then
> >> users shouldn't be surprised at what they get if they use this
>
> We first need to document the existing record comparison operators.
> If they read the docs for comparing "row_constructors" and expect
> that to be the behavior they get when they compare records, they
> will be surprised.

Well, if they appear in \do, I am thinking they should be documented.

> > This is a good summary.  I think there are a few things that make
> > this issue difficult to decide:
> >
> > 1.  We have to use an operator to give the RMVC (REFRESH
> > MATERIALIZED VIEW CONCURRENTLY) SPI query the ability to optimize
> > this query.  If we could do this with an SQL or C function, there
> > would be less concern about the confusion caused by adding this
> > capability.
>
>  (a)  We can't.
>
>  (b)  Why would that be less confusing?

Because function names, especially long ones, are more clearly
self-documenting than operators.

> > Question:  If we are comparing based on some primary key, why do
> > we need this to optimize?
>
> Because comparing primary keys doesn't tell us whether the old and
> new values in the row all match.

OK, but my question was about why we need a full set of operators rather
than just equal, and maybe not equal. I thought you said we needed
others, e.g. >, so we could do merge joins, but I thought we would just
be doing comparisons after primary keys are joined, and if that is true,
we could just use a function.

Actually, I am now realizing you have to use the non-binary-level equals
comparison on keys, then the binary-level equals on rows for this to
work --- that's pretty confusing. Is that true?

> > 3.  Our type casting and operators are already complex, and
> > adding another set of operators only compounds that.
>
> It cannot have any effect on any of the existing operators, so I'm
> not sure whether you are referring to the extra operators and
> functions, or something else.  It does not, for example, introduce
> any risk of "ambiguous operators".

It makes our system more complex for the user to understand.

> > One interesting approach would be to only allow the operator to
> > be called from SPI queries.
>
> Why would that be a good idea?

Because then it would be more of an "internal" operator.

> > It would also be good to know about similar non-default entries
> > in pg_opclass so we can understand the expected impact.
>
> A quick query (lacking schema information and schema qualification)
> shows what is there by default:

OK, the unique list is:

opcname
---------------------
varchar_ops
kd_point_ops
cidr_ops
text_pattern_ops
varchar_pattern_ops
bpchar_pattern_ops
(6 rows)

Do these all have operators defined too?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2013-10-01 03:04:21 Re: Completing PL support for Event Triggers
Previous Message Bruce Momjian 2013-10-01 02:19:31 Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block