Preliminary results for proposed new pgindent implementation

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Cc: Piotr Stefaniak <postgres(at)piotr-stefaniak(dot)me>
Subject: Preliminary results for proposed new pgindent implementation
Date: 2017-05-19 03:00:40
Message-ID: 30527.1495162840@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Over in this thread:
https://www.postgresql.org/message-id/flat/E1dAmxK-0006EE-1r%40gemulon.postgresql.org
we've been discussing switching to FreeBSD's version of "indent",
specifically the updated version that Piotr Stefaniak is working on,
as the basis for pgindent. There seem to be a small number of bugs
that Piotr needs to fix, but overall it is looking remarkably good
to me.

To create a diff with not too much noise in it, I applied the hacks
(fixes?) already mentioned in the other thread, and I tweaked things
slightly to suppress changes in alignment of comments on #else and
#endif lines. We might well want to reconsider that later, because it's
quite random right now (basically, #else comments are indented to column
33, #endif comments are indented to column 10, and for no reason at all
the latter do not have standard tab substitution). But again, I'm trying
to drill down to the changes we want rather than the incidental ones.

At this point, the changes look pretty good, although still bulky.
I've attached a diff between current HEAD and re-indented HEAD
for anybody who wants to peruse all the details, but basically
what I'm seeing is:

* Improvements in formatting around sizeof and related constructs,
for example:

- *phoned_word = palloc(sizeof(char) * strlen(word) +1);
+ *phoned_word = palloc(sizeof(char) * strlen(word) + 1);

* Likewise, operators after casts work better than before:

res = shm_mq_send_bytes(mqh, sizeof(Size) - mqh->mqh_partial_bytes,
- ((char *) &nbytes) +mqh->mqh_partial_bytes,
+ ((char *) &nbytes) + mqh->mqh_partial_bytes,
nowait, &bytes_written);

* Sane formatting of function typedefs, for example:

* use buf as work area if NULL in-place copy
*/
int (*pull) (void *priv, PullFilter *src, int len,
- uint8 **data_p, uint8 *buf, int buflen);
+ uint8 **data_p, uint8 *buf, int buflen);
void (*free) (void *priv);
};

* Non-typedef struct pointers are now formatted consistently, for example:

-mdcbuf_finish(struct MDCBufData * st)
+mdcbuf_finish(struct MDCBufData *st)

* Better handling of pointers with const/volatile qualifiers, for example:

-static bool shm_mq_wait_internal(volatile shm_mq *mq, PGPROC *volatile * ptr,
+static bool shm_mq_wait_internal(volatile shm_mq *mq, PGPROC *volatile *ptr,

* Better handling of PG_USED_FOR_ASSERTS_ONLY declarations, for example

BlockNumber blkno;
Buffer buf;
- Bucket new_bucket PG_USED_FOR_ASSERTS_ONLY = InvalidBucket;
+ Bucket new_bucket PG_USED_FOR_ASSERTS_ONLY = InvalidBucket;
bool bucket_dirty = false;

* Corner cases where no space was left before a comment are fixed:

@@ -149,12 +149,12 @@ typedef struct RewriteStateData
bool rs_logical_rewrite; /* do we need to do logical rewriting */
TransactionId rs_oldest_xmin; /* oldest xmin used by caller to
* determine tuple visibility */
- TransactionId rs_freeze_xid;/* Xid that will be used as freeze cutoff
- * point */
+ TransactionId rs_freeze_xid; /* Xid that will be used as freeze cutoff
+ * point */
TransactionId rs_logical_xmin; /* Xid that will be used as cutoff
* point for logical rewrites */
- MultiXactId rs_cutoff_multi;/* MultiXactId that will be used as cutoff
- * point for multixacts */
+ MultiXactId rs_cutoff_multi; /* MultiXactId that will be used as cutoff
+ * point for multixacts */
MemoryContext rs_cxt; /* for hash tables and entries and tuples in
* them */
XLogRecPtr rs_begin_lsn; /* XLogInsertLsn when starting the rewrite */

There are also changes that are not really wins, just different, but they
do arguably improve consistency. One is that comments following "else"
and "case" are now always indented at least as far as column 33 (or
indent's -c parameter), whereas the old code would sometimes let them
slide to the left of that:

@@ -723,7 +723,7 @@ dblink_record_internal(FunctionCallInfo fcinfo, bool is_async)
/* shouldn't happen */
elog(ERROR, "wrong number of arguments");
}
- else /* is_async */
+ else /* is_async */
{
/* get async result */
conname = text_to_cstring(PG_GETARG_TEXT_PP(0));

This also happens for comments on a function's "}" close brace:

@@ -722,7 +722,7 @@ _metaphone(char *word, /* IN */
End_Phoned_Word;

return (META_SUCCESS);
-} /* END metaphone */
+} /* END metaphone */

It's hard to see these as anything except bug fixes, because the
old indent code certainly looks like it intends to always force
same-line columns to at least the -c column. I haven't quite figured
out why it fails to, or where the new version fixed that. Still,
it's a difference that isn't a particular benefit to us.

Another set of changes is slightly different handling of unrecognized
typedef names:

@@ -250,7 +250,7 @@ typedef enum
PGSS_TRACK_NONE, /* track no statements */
PGSS_TRACK_TOP, /* only top level statements */
PGSS_TRACK_ALL /* all statements, including nested ones */
-} PGSSTrackLevel;
+} PGSSTrackLevel;

The reason PGSSTrackLevel is "unrecognized" is that it's not in
typedefs.list, which is a deficiency in our typedef-collection
technology not in indent. (I believe the problem is that there
are no variables declared with that typename, causing there to
not be any of the kind of symbol table entries we are looking for.)
The handling of such names was already slightly wonky, though;
note that the previous version was already differently indented
from what would happen if PGSSTrackLevel were known.

Another issue I'm noticing is that "extern C" declarations are
getting reformatted:

diff --git a/src/interfaces/ecpg/include/ecpgtype.h b/src/interfaces/ecpg/include/ecpgtype.h
index 7cc47e9..236d028 100644
--- a/src/interfaces/ecpg/include/ecpgtype.h
+++ b/src/interfaces/ecpg/include/ecpgtype.h
@@ -34,7 +34,7 @@
#define _ECPGTYPE_H

#ifdef __cplusplus
-extern "C"
+extern "C"
{
#endif

The pgindent Perl script is fooling around with these already, and
I think what it's doing is probably not quite compatible with what
the new indent version does, but I haven't tracked it down yet.

All in all, this looks pretty darn good from here, and I'm thinking
we should push forward on it.

regards, tom lane

Attachment Content-Type Size
new-pgindent-changes-1.patch.gz application/x-gzip 112.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-05-19 03:03:51 Re: [Proposal] Allow users to specify multiple tables in VACUUM commands
Previous Message Bossart, Nathan 2017-05-19 02:44:22 Re: [Proposal] Allow users to specify multiple tables in VACUUM commands