| From: | Haibo Yan <tristan(dot)yim(at)gmail(dot)com> |
|---|---|
| To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
| Cc: | jian he <jian(dot)universality(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, Vik Fearing <vik(at)postgresfriends(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: implement CAST(expr AS type FORMAT 'template') |
| Date: | 2026-07-02 00:00:58 |
| Message-ID: | CABXr29Ed4NW4Uk=TBTBwWpJT1YbwYifHSQ7Ny_=o95YR=-c7RQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi David,
Thanks, this is helpful.
On Wed, Jul 1, 2026 at 1:59 PM David G. Johnston
<david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
> On Wed, Jul 1, 2026 at 9:45 AM Haibo Yan <tristan(dot)yim(at)gmail(dot)com> wrote:
>>
>> Hi Jian,
>>
>> Thanks for looking at this.
>>
>> On Wed, Jul 1, 2026 at 6:45 AM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
>> >
>> > On Wed, Jul 1, 2026 at 1:39 AM Haibo Yan <tristan(dot)yim(at)gmail(dot)com> wrote:
>> > >
>> > > Thanks for looking at the patches.
>> > > I liked your suggestion to call these “format casts” rather than
>> > > “formatters”, so I changed the series in that direction. The DDL is
>> > > now:
>> > > ------------------------------------------------------
>> > > CREATE FORMAT CAST (...)
>> > > DROP FORMAT CAST (...)
>> > > ------------------------------------------------------
>> > > rather than CREATE/DROP FORMATTER, so the patch no longer adds
>> > > FORMATTER as a keyword.
>> > > I also renamed the related catalog and node names, so this now uses
>> > > pg_format_cast and CoerceViaFormatCast. The
>> > > pg_dump/object-address/comment/extension code and the regression tests
>> > > have been updated to use the new terminology as well.
>> > > The rest of the design is the same as before: format casts are still
>> > > catalog-driven, and both built-in and user-defined cases go through
>> > > the same lookup path.
>> > >
>> >
>> > In an earlier thread [1], [2], I proposed introducing:
>> >
>> > CREATE CAST (source_type AS target_type)
>> > WITH [SAFE] FUNCTION function_name [ (argument_type [, ...]) ]
>> >
>> > That drew some pushback over non-standard conformance.
>> > Here, we are introducing
>> >
>> > CREATE FORMAT CAST (source_type AS target_type)
>> > WITH FUNCTION function_name [ (argument_type [, ...]) ]
>> >
>> > This syntax is quite close to CREATE CAST.
>> > per https://www.postgresql.org/docs/devel/sql-createcast.html
>> > CREATE FORMAT CAST would be an extension to the standard, though we
>> > already have non-standard precedent: AS IMPLICIT.
>> >
>> > Before proceeding, do we need to reach consensus on the syntax itself?
>>
>> yes, I agree that we should get agreement on the DDL syntax before going
>> much further.
>>
>> I see the analogy with the earlier SAFE discussion, but I think this case is a
>> little different. This patch creates a separate object kind rather than adding
>> another option to ordinary CREATE CAST
>>
>> >
>> > Does the following alternative syntax make sense for CAST FORMAT?
>>
>> My reason for preferring CREATE FORMAT CAST is that a format cast is not
>> really an ordinary cast with one more attribute. Ordinary casts live in
>> pg_cast and are tied to things like cast context and cast method. A format
>> cast is only considered for
>>
>> CAST(expr AS type FORMAT fmt)
>>
>>
>> and not for ordinary casts without a FORMAT clause, assignment casts, or
>> implicit casts. It also always needs a function whose second argument is the
>> FORMAT expression, not the destination typmod argument used by ordinary cast
>> functions.
>> So I think
>>
>> CREATE CAST (...) WITH FUNCTION ... AS FORMAT
>>
>> would make this look more like a pg_cast entry than it really is.
>
>
> I think the fact the standard put this inside the 'cast(...)' means it's quite reasonable to consider the aspect part of a cast definition as opposed to something wholly different.
>
> When we issue "create table" both pg_class and pg_attribute are modified. It seems quite reasonable that executing "create cast" causes both pg_cast and pg_cast_format to be populated.
>
> With two separate commands a cast could exist on pg_cast_format that doesn't exist in pg_cast. I'm unsure whether this is allowed or desired...but seems a bit unexpected to me at least.
>
I see what you mean. I had been thinking of format casts as separate because
they are only considered for
CAST(expr AS type FORMAT fmt)
and not for ordinary casts without a FORMAT clause. That is why the patch
currently uses a separate catalog and a separate CREATE FORMAT CAST command.
But your pg_class / pg_attribute analogy is a good point. A separate
catalog does not necessarily imply a separate DDL command, and it may be
surprising if a format-cast entry can exist without any corresponding ordinary
cast relationship.
Maybe the question for the thread is whether a format cast should be modeled as
a separate object, or as additional information associated with an ordinary
cast. The latter could still use a separate catalog internally, but be created
through some form of CREATE CAST.
>>
>> >
>> > CREATE CAST (source_type AS target_type)
>> > WITH FUNCTION function_name [ (argument_type [, ...]) ]
>> > AS FORMAT
>> > ----------------------------------------------------------------------------------------
>> > As I mentioned previously [3], there's an open question around this case:
>> >
>> > CAST('{2022-01-01, 2022-21-01}' AS DATE[] FORMAT 'YYYY-DD-MM');
>> >
>> > does this mean the format template ('YYYY-DD-MM') should be applied to each
>> > element (2022-01-01, 2022-21-01) individually via the cast format function?
>
>
> Yes, the internal array formatting structure used by PostgreSQL isn't (needn't be) user-configurable; the useful thing for the format to apply to is the value of the elements of the array.
>
> IMO the only change the addition of a format clause should make when performing the cast is allowing a cast that would fail due to a syntax error to succeed.
>
> postgres=# select cast(cast('{"2025-01-01","2026-02-02","2027-03-03"}'::text as
> text[]) as date[]);
> date
> ------------------------------------
> {2025-01-01,2026-02-02,2027-03-03}
> (1 row)
>
> postgres=# select cast(cast('{"2025T01-01","2026T02-02","2027T03-03"}'::text as
> text[]) as date[]);
> ERROR: invalid input syntax for type date: "2025T01-01"
If PostgreSQL would normally cast an array by applying the element cast to each
element, then it is natural to expect the FORMAT clause to affect that element
cast rather than the array container syntax.
CAST('{2025T01-01,2026T02-02}' AS date[] FORMAT 'YYYY"T"MM-DD')
could reasonably mean: parse the array structure as usual, and apply the
formatted text -> date conversion to each element.
The current patch does not implement that. It does an exact source/target
lookup in pg_format_cast, so an array target would currently require a format
cast for the array type itself. But I agree that this may not be the behavior
users would expect, especially given the existing array coercion behavior.
If we decide that element-wise array formatting is required, one possible
implementation would be to keep the scalar format-cast function signature as
function(source_element_type, text) returns target_element_type
and add a format-aware path in the array coercion logic. The array expression
would iterate over elements and pass the same FORMAT expression to the scalar
format cast. In that design, the array container itself would not need a
separate format-cast entry.
>
> A format clause would make the second query not error with an appropriate format specification - in which case it then behaves identically to the first query.
I agree that this is the important case for input parsing. I think FORMAT is
also useful in the output direction, though. For example,
CAST(date '2026-07-01' AS text FORMAT 'YYYY/MM/DD')
would succeed without FORMAT, but the FORMAT clause controls the textual
representation.
>
> If the SQL Standard somehow contradicts this intent I'd be curious to understand what it does intend.
>
> David J.
>
So I think we have two related design questions to settle:
whether format casts are separate objects or additional information
associated with ordinary casts;
whether element-wise array formatting should be part of the initial design.
Regards,
Haibo
| From | Date | Subject | |
|---|---|---|---|
| Previous Message | Chao Li | 2026-07-01 23:51:03 | Re: doc: fix pg_stat_autovacuum_scores threshold wording |