Re: jsonpath

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Oleg Bartunov <obartunov(at)postgrespro(dot)ru>, Michael Paquier <michael(at)paquier(dot)xyz>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, David Steele <david(at)pgmasters(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: jsonpath
Date: 2019-01-30 02:28:26
Message-ID: 20190130022826.of7w6xv5lbvzqxfp@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> +/*
> + * Make datetime type from 'date_txt' which is formated at argument 'fmt'.
> + * Actual datatype (returned in 'typid', 'typmod') is determined by
> + * presence of date/time/zone components in the format string.
> + */
> +Datum
> +to_datetime(text *date_txt, text *fmt, char *tzname, bool strict, Oid *typid,
> + int32 *typmod, int *tz)

Given other to_<type> functions being SQL callable, I'm not a fan of the
naming here.

> +{
> + struct pg_tm tm;
> + fsec_t fsec;
> + int fprec = 0;
> + int flags;
> +
> + do_to_timestamp(date_txt, fmt, strict, &tm, &fsec, &fprec, &flags);
> +
> + *typmod = fprec ? fprec : -1; /* fractional part precision */
> + *tz = 0;
> +
> + if (flags & DCH_DATED)
> + {
> + if (flags & DCH_TIMED)
> + {
> + if (flags & DCH_ZONED)
> + {
> + TimestampTz result;
> +
> + if (tm.tm_zone)
> + tzname = (char *) tm.tm_zone;
> +
> + if (tzname)
> + {
> + int dterr = DecodeTimezone(tzname, tz);
> +
> + if (dterr)
> + DateTimeParseError(dterr, tzname, "timestamptz");

Do we really need 6/7 indentation levels in new code?

> +jsonpath_scan.c: FLEXFLAGS = -CF -p -p

Why -CF, and why is -p repeated?

> - JsonEncodeDateTime(buf, val, DATEOID);
> + JsonEncodeDateTime(buf, val, DATEOID, NULL);

ISTM API changes like this ought to be done in a separate patch, to ease
review.

> }
>
> +
> /*
> * Compare two jbvString JsonbValue values, a and b.
> *

Random WS change.

> +/*****************************INPUT/OUTPUT*********************************/

Why do we need this much code to serialize data that we initially got in
serialized manner? Couldn't we just keep the original around?

> +Datum
> +jsonpath_in(PG_FUNCTION_ARGS)
> +{

No binary input support? I'd suggest adding that, but keeping the
representation the same. Otherwise just adds unnecesary complexity for
driver authors wishing to use the binar protocol.

> +/********************Support functions for JsonPath**************************/
> +
> +/*
> + * Support macroses to read stored values
> + */

s/macroses/macros/

> +++ b/src/backend/utils/adt/jsonpath_exec.c

Gotta say, I'm unenthusiastic about yet another execution engine in some
PG subsystem.

> @@ -0,0 +1,2776 @@
> +/*-------------------------------------------------------------------------
> + *
> + * jsonpath_exec.c
> + * Routines for SQL/JSON path execution.
> + *
> + * Copyright (c) 2019, PostgreSQL Global Development Group
> + *
> + * IDENTIFICATION
> + * src/backend/utils/adt/jsonpath_exec.c
> + *
> + *-------------------------------------------------------------------------
> + */

Somewhere there needs to be higher level documentation explaining how
this stuff is supposed to work on a code level.

> +
> +/* strict/lax flags is decomposed into four [un]wrap/error flags */
> +#define jspStrictAbsenseOfErrors(cxt) (!(cxt)->laxMode)
> +#define jspAutoUnwrap(cxt) ((cxt)->laxMode)
> +#define jspAutoWrap(cxt) ((cxt)->laxMode)
> +#define jspIgnoreStructuralErrors(cxt) ((cxt)->ignoreStructuralErrors)
> +#define jspThrowErrors(cxt) ((cxt)->throwErrors)

What's the point?

> +#define ThrowJsonPathError(code, detail) \
> + ereport(ERROR, \
> + (errcode(ERRCODE_ ## code), \
> + errmsg(ERRMSG_ ## code), \
> + errdetail detail))
> +
> +#define ThrowJsonPathErrorHint(code, detail, hint) \
> + ereport(ERROR, \
> + (errcode(ERRCODE_ ## code), \
> + errmsg(ERRMSG_ ## code), \
> + errdetail detail, \
> + errhint hint))

These seem ill-advised, just making it harder to understand the code,
and grepping for it, without actually meaningfully simplifying anything.

> +/*
> + * Find value of jsonpath variable in a list of passing params
> + */

What is that comment supposed to mean?

> +/*
> + * Get the type name of a SQL/JSON item.
> + */
> +static const char *
> +JsonbTypeName(JsonbValue *jb)
> +{

Wasn't there pretty similar infrastructure elsewhere?

> +/*
> + * Cross-type comparison of two datetime SQL/JSON items. If items are
> + * uncomparable, 'error' flag is set.
> + */

Sounds like it'd not raise an error, but it does in a bunch of scenarios.

> @@ -206,6 +206,22 @@ Section: Class 22 - Data Exception
> 2200N E ERRCODE_INVALID_XML_CONTENT invalid_xml_content
> 2200S E ERRCODE_INVALID_XML_COMMENT invalid_xml_comment
> 2200T E ERRCODE_INVALID_XML_PROCESSING_INSTRUCTION invalid_xml_processing_instruction
> +22030 E ERRCODE_DUPLICATE_JSON_OBJECT_KEY_VALUE duplicate_json_object_key_value
> +22031 E ERRCODE_INVALID_ARGUMENT_FOR_JSON_DATETIME_FUNCTION invalid_argument_for_json_datetime_function
> +22032 E ERRCODE_INVALID_JSON_TEXT invalid_json_text
> +22033 E ERRCODE_INVALID_JSON_SUBSCRIPT invalid_json_subscript
> +22034 E ERRCODE_MORE_THAN_ONE_JSON_ITEM more_than_one_json_item
> +22035 E ERRCODE_NO_JSON_ITEM no_json_item
> +22036 E ERRCODE_NON_NUMERIC_JSON_ITEM non_numeric_json_item
> +22037 E ERRCODE_NON_UNIQUE_KEYS_IN_JSON_OBJECT non_unique_keys_in_json_object
> +22038 E ERRCODE_SINGLETON_JSON_ITEM_REQUIRED singleton_json_item_required
> +22039 E ERRCODE_JSON_ARRAY_NOT_FOUND json_array_not_found
> +2203A E ERRCODE_JSON_MEMBER_NOT_FOUND json_member_not_found
> +2203B E ERRCODE_JSON_NUMBER_NOT_FOUND json_number_not_found
> +2203C E ERRCODE_JSON_OBJECT_NOT_FOUND object_not_found
> +2203F E ERRCODE_JSON_SCALAR_REQUIRED json_scalar_required
> +2203D E ERRCODE_TOO_MANY_JSON_ARRAY_ELEMENTS too_many_json_array_elements
> +2203E E ERRCODE_TOO_MANY_JSON_OBJECT_MEMBERS
> too_many_json_object_members

Are all of these invented by us?

Greetings,

Andres Freund

In response to

  • Re: jsonpath at 2019-01-29 01:00:19 from Alexander Korotkov

Responses

  • Re: jsonpath at 2019-01-30 04:34:00 from Alexander Korotkov
  • Re: jsonpath at 2019-02-24 09:03:19 from Alexander Korotkov

Browse pgsql-hackers by date

  From Date Subject
Next Message David Fetter 2019-01-30 02:35:13 Re: Early WIP/PoC for inlining CTEs
Previous Message Hubert Zhang 2019-01-30 02:26:52 Re: Control your disk usage in PG: Introduction to Disk Quota Extension