Re: jsonpath

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Oleg Bartunov <obartunov(at)postgrespro(dot)ru>, Michael Paquier <michael(at)paquier(dot)xyz>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, David Steele <david(at)pgmasters(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: jsonpath
Date: 2019-02-24 09:03:19
Message-ID: CAPpHfdvov6kWfT3hk8KMSah+kzNuN2AXaZfBqnSU0YkzDVkR2Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

Attached is revised version of jsonpath. Assuming that jsonpath have
problem places, I decided to propose partial implementation.
Following functionality was cut from jsonpath:
1) Support of datetime datatypes. Besides error suppression, this
functionality have troubles with timezones. According to standard we
should support timezones in jsonpath expressions. But that would
prevent jsonpath functions from being immutable, that in turn limits
their usage in expression indexes.
2) Suppression of numeric errors. I will post it as a separate patch.
Pushing this even this partial implementation of jsonpath to PG 12 is
still very useful. Also it will simplify a lot pushing other parts of
SQL/JSON to future releases.

Besides this changes, there is some refactoring and code beautification.

On Wed, Jan 30, 2019 at 5:28 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> > +/*
> > + * Make datetime type from 'date_txt' which is formated at argument 'fmt'.
> > + * Actual datatype (returned in 'typid', 'typmod') is determined by
> > + * presence of date/time/zone components in the format string.
> > + */
> > +Datum
> > +to_datetime(text *date_txt, text *fmt, char *tzname, bool strict, Oid *typid,
> > + int32 *typmod, int *tz)
>
> Given other to_<type> functions being SQL callable, I'm not a fan of the
> naming here.

I've removed that for now.

> > +{
> > + struct pg_tm tm;
> > + fsec_t fsec;
> > + int fprec = 0;
> > + int flags;
> > +
> > + do_to_timestamp(date_txt, fmt, strict, &tm, &fsec, &fprec, &flags);
> > +
> > + *typmod = fprec ? fprec : -1; /* fractional part precision */
> > + *tz = 0;
> > +
> > + if (flags & DCH_DATED)
> > + {
> > + if (flags & DCH_TIMED)
> > + {
> > + if (flags & DCH_ZONED)
> > + {
> > + TimestampTz result;
> > +
> > + if (tm.tm_zone)
> > + tzname = (char *) tm.tm_zone;
> > +
> > + if (tzname)
> > + {
> > + int dterr = DecodeTimezone(tzname, tz);
> > +
> > + if (dterr)
> > + DateTimeParseError(dterr, tzname, "timestamptz");
>
>
> Do we really need 6/7 indentation levels in new code?

Also, removed that.

> > +jsonpath_scan.c: FLEXFLAGS = -CF -p -p
>
> Why -CF, and why is -p repeated?

BTW, for our SQL grammar we have

> scan.c: FLEXFLAGS = -CF -p -p

Is it kind of default?

> > - JsonEncodeDateTime(buf, val, DATEOID);
> > + JsonEncodeDateTime(buf, val, DATEOID, NULL);
>
> ISTM API changes like this ought to be done in a separate patch, to ease
> review.

Also, removed that for now.

> > }
> >
> > +
> > /*
> > * Compare two jbvString JsonbValue values, a and b.
> > *
>
> Random WS change.

Reverted

> No binary input support? I'd suggest adding that, but keeping the
> representation the same. Otherwise just adds unnecesary complexity for
> driver authors wishing to use the binar protocol.

Binary support is added. I decided to make it in jsonb manner.
Format version 1 is text format, but it's possible we would have other
versions in future.

> > +/********************Support functions for JsonPath**************************/
> > +
> > +/*
> > + * Support macroses to read stored values
> > + */
>
> s/macroses/macros/

Fixed.

> > @@ -0,0 +1,2776 @@
> > +/*-------------------------------------------------------------------------
> > + *
> > + * jsonpath_exec.c
> > + * Routines for SQL/JSON path execution.
> > + *
> > + * Copyright (c) 2019, PostgreSQL Global Development Group
> > + *
> > + * IDENTIFICATION
> > + * src/backend/utils/adt/jsonpath_exec.c
> > + *
> > + *-------------------------------------------------------------------------
> > + */
>
> Somewhere there needs to be higher level documentation explaining how
> this stuff is supposed to work on a code level.

High level comments are added to jsonpath.c (about jsonpath binary
representation) jsonpath_exec.c (about jsonpath execution).

> > +
> > +/* strict/lax flags is decomposed into four [un]wrap/error flags */
> > +#define jspStrictAbsenseOfErrors(cxt) (!(cxt)->laxMode)
> > +#define jspAutoUnwrap(cxt) ((cxt)->laxMode)
> > +#define jspAutoWrap(cxt) ((cxt)->laxMode)
> > +#define jspIgnoreStructuralErrors(cxt) ((cxt)->ignoreStructuralErrors)
> > +#define jspThrowErrors(cxt) ((cxt)->throwErrors)
>
> What's the point?

These are convenience macros, which improves code readability. For
instance, instead of direct checking for laxMode, we check for
particular aspects of its behavior. For code reader, it becomes
easier to understand why do we behave one way or another.

> > +#define ThrowJsonPathError(code, detail) \
> > + ereport(ERROR, \
> > + (errcode(ERRCODE_ ## code), \
> > + errmsg(ERRMSG_ ## code), \
> > + errdetail detail))
> > +
> > +#define ThrowJsonPathErrorHint(code, detail, hint) \
> > + ereport(ERROR, \
> > + (errcode(ERRCODE_ ## code), \
> > + errmsg(ERRMSG_ ## code), \
> > + errdetail detail, \
> > + errhint hint))
>
> These seem ill-advised, just making it harder to understand the code,
> and grepping for it, without actually meaningfully simplifying anything.

Removed. Instead, I've introduced RETURN_ERROR() macro, which either
returns jperError or issues ereport given in the argument.

> > +/*
> > + * Find value of jsonpath variable in a list of passing params
> > + */
>
> What is that comment supposed to mean?

Comment is rephrased.

> > +/*
> > + * Get the type name of a SQL/JSON item.
> > + */
> > +static const char *
> > +JsonbTypeName(JsonbValue *jb)
> > +{
>
> Wasn't there pretty similar infrastructure elsewhere?

Yes, but it wasn't exposed. Moved to jsonb.c

> > +/*
> > + * Cross-type comparison of two datetime SQL/JSON items. If items are
> > + * uncomparable, 'error' flag is set.
> > + */
>
> Sounds like it'd not raise an error, but it does in a bunch of scenarios.

Removed as well.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-Jsonpath-engine-and-docs-v33.patch application/octet-stream 258.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2019-02-24 09:21:10 Re: pg_dump multi VALUES INSERT
Previous Message David Rowley 2019-02-24 07:49:34 Re: Ordered Partitioned Table Scans