Re: formatting.c cleanup

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: formatting.c cleanup
Date: 2025-10-20 13:31:16
Message-ID: F6D35E15-B49D-4CF2-8094-293F2907E137@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Oct 20, 2025, at 17:50, Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
>
> The file formatting.c contains some hard to read and understand code. For the attached patch series, I made a few more or less mechanical passes over it to modernize the code a bit, renamed some symbols to be clearer, adjusted some types to make things more self-documenting, removed some redundant code to make some things more compact. I hope this helps a bit.
> <v1-0001-formatting.c-cleanup-Remove-dashes-in-comments.patch><v1-0002-formatting.c-cleanup-Move-loop-variables-definiti.patch><v1-0003-formatting.c-cleanup-Use-size_t-for-string-length.patch><v1-0004-formatting.c-cleanup-Add-some-const-char-qualifie.patch><v1-0005-formatting.c-cleanup-Use-array-syntax-instead-of-.patch><v1-0006-formatting.c-cleanup-Remove-unnecessary-extra-par.patch><v1-0007-formatting.c-cleanup-Remove-unnecessary-extra-lin.patch><v1-0008-formatting.c-cleanup-Remove-unnecessary-zeroize-m.patch><v1-0009-formatting.c-cleanup-Improve-formatting-of-some-s.patch><v1-0010-formatting.c-cleanup-Change-TmFromChar.clock-fiel.patch><v1-0011-formatting.c-cleanup-Change-several-int-fields-to.patch><v1-0012-formatting.c-cleanup-Rename-DCH_S_-to-DCH_SUFFIX_.patch><v1-0013-formatting.c-cleanup-Change-fill_str-return-type-.patch>

0001 - Only removes dashes from comments. No logic change. LGTM.

0002 - Moves loop variable into for, which makes function local variable list shorter, and makes it easier to see which variables will still be used after for loop. So I think that improves readability. I searched for missing occurrence, and didn’t find. So, LGTM.

0003 - Replace int with size_t wherever possible. LGTM.

0004 - Changes “char *” to “const char *” wherever possible.

I found some “text *” can be “const text *”:

```
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 4c217d0e825..cc290903bb6 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -1043,7 +1043,7 @@ typedef struct NUMProc
read_post, /* to_number - number of dec. digit */
read_pre; /* to_number - number non-dec. digit */

- char *number, /* string with number */
+ char *number, /* string with number */
*number_p, /* pointer to current number position */
*inout, /* in / out buffer */
*inout_p, /* pointer to current inout position */
@@ -1110,11 +1110,11 @@ static bool from_char_seq_search(int *dest, const char **src,
const char *const *array,
char **localized_array, Oid collid,
FormatNode *node, Node *escontext);
-static bool do_to_timestamp(text *date_txt, text *fmt, Oid collid, bool std,
+static bool do_to_timestamp(text *date_txt, const text *fmt, Oid collid, bool std,
struct pg_tm *tm, fsec_t *fsec, struct fmt_tz *tz,
int *fprec, uint32 *flags, Node *escontext);
static void fill_str(char *str, int c, size_t maxlen);
-static FormatNode *NUM_cache(int len, NUMDesc *Num, text *pars_str, bool *shouldFree);
+static FormatNode *NUM_cache(int len, NUMDesc *Num, const text *pars_str, bool *shouldFree);
static char *int_to_roman(int number);
static int roman_to_int(NUMProc *Np, size_t input_len);
static void NUM_prepare_locale(NUMProc *Np);
@@ -3902,7 +3902,7 @@ DCH_cache_fetch(const char *str, bool std)
* for formatting.
*/
static text *
-datetime_to_char_body(TmToChar *tmtc, text *fmt, bool is_interval, Oid collid)
+datetime_to_char_body(TmToChar *tmtc, const text *fmt, bool is_interval, Oid collid)
{
FormatNode *format;
char *fmt_str,
@@ -4394,7 +4394,7 @@ datetime_format_has_tz(const char *fmt_str)
* struct 'tm', 'fsec', struct 'tz', and 'fprec'.
*/
static bool
-do_to_timestamp(text *date_txt, text *fmt, Oid collid, bool std,
+do_to_timestamp(text *date_txt, const text *fmt, Oid collid, bool std,
struct pg_tm *tm, fsec_t *fsec, struct fmt_tz *tz,
int *fprec, uint32 *flags, Node *escontext)
{
@@ -4952,7 +4952,7 @@ NUM_cache_fetch(const char *str)
* Cache routine for NUM to_char version
*/
static FormatNode *
-NUM_cache(int len, NUMDesc *Num, text *pars_str, bool *shouldFree)
+NUM_cache(int len, NUMDesc *Num, const text *pars_str, bool *shouldFree)
{
FormatNode *format = NULL;
char *str;
```

0005 - Replaces pointer+offset with array syntax. LGMT.

0006 - Removes unnecessary (). LGTM.

0007 - I am not convinced with this change. Combining two constant string into one causes the line too long, exceeding 80 columns.

0008 - Removes some macros. LGTM.

0009 - Improves some structure definition. LGTM.

0010 - Changes TmFromChar.clock from int to bool.

A suggestion is that maybe we can move the new field “clock_12_hour” next to “bool has_tz”, so that size of the structure is reduced by 4 bytes.

0011 - Changes some int to enum. LGTM.

0012 - Renames DCH_S_* to DCH_SUFFIX_*, make the symbols more descriptive. LGTM.

0013 - Changes fill_str() to return void.

```
-static char *
+static void
fill_str(char *str, int c, size_t maxlen)
{
memset(str, c, maxlen);
str[maxlen] = '\0';
- return str;
}
```

This function has no comment, so I am not sure what “maxlen” exactly mean. Usually, we do “str[maxlen-1] = ‘\0’ because maxlen usually implies max length of the buffer. But this function seems to fill maxlen of c into str, then “maxlen” might not be a good name, “count” could be better.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2025-10-20 13:32:26 Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master
Previous Message Zhenghua Lyu 2025-10-20 13:04:54 Re: Question on ThrowErrorData