Consider \v to the list of whitespace characters in the parser

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Evan Jones <evan(dot)jones(at)datadoghq(dot)com>
Subject: Consider \v to the list of whitespace characters in the parser
Date: 2023-06-21 06:45:32
Message-ID: ZJKcjNwWHHvw9ksQ@paquier.xyz
Lists: pgsql-hackers

Hi all,
(Adding Evan in CC as he has reported the original issue with hstore.)

$subject has shown up as a topic for discussion when looking at the
set of whitespace characters that we use in the parsers:
https://www.postgresql.org/message-id/CA+HWA9bTRDf52DHyU+JOoqEALgRGRo5uHUYTFuduoj3cBfer+Q@mail.gmail.com

On HEAD, these are \t, \n, \r and \f, which is consistent with the list
that we use in scanner_isspace().
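For illustration, a fixed-list check of this kind can be sketched as below. This is a standalone approximation and not the code in the tree: the name sketch_isspace_head is hypothetical, and the plain space character is included on the assumption that the scanner's whitespace class covers it in addition to the four escapes listed above.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the whitespace check on HEAD.
 *
 * It compares against a fixed list of characters instead of calling
 * isspace(), so the result does not depend on the current locale.
 * Note that '\v' is absent from the list.
 */
static bool
sketch_isspace_head(char ch)
{
	return ch == ' ' ||
		ch == '\t' ||
		ch == '\n' ||
		ch == '\r' ||
		ch == '\f';
}
```

With this list, a vertical tab is treated like any other non-whitespace character, which is the behavior the rest of this message is about.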

This has quite some history, starting with 9ae2661, which dealt with an
old issue with BSD's isspace() where whitespace may not be detected
correctly. hstore has recently been changed to fix the same problem
with d522b05. However, depending on scanner_isspace() for the job makes
the handling of \v kind of strange, since \v is not in that list.

That's not the end of the story. There is an inconsistency with the
way array values are handled for the same problem, where 95cacd1 added
handling for \v in the list of what's considered a whitespace.

Attached is a patch to bring a bit more consistency across the board,
by adding \v to the set of characters that are considered as
whitespace by the parser. Here are a few things that I have noticed
in passing:
- JSON should not escape \v, as defined in RFC 7159.
- syncrep_scanner.l already considered \v as a whitespace. Its
neighbor repl_scanner.l did not do that.
- There are a few more copies that would need a refresh of what is
considered as a whitespace in their respective lex scanners:
psqlscan.l, psqlscanslash.l, cubescan.l, segscan.l, ECPG's pgc.l.
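As a rough sketch of what adding \v means in practice, the snippet below extends a fixed-list check with '\v' and compares it against the C library's isspace() over the ASCII range. The names sketch_isspace_patched and sketch_count_mismatches are hypothetical and are not taken from the attached patch; the comparison assumes the program runs in the default "C" locale, where isspace() accepts exactly space, \t, \n, \v, \f and \r.

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Hypothetical fixed-list whitespace check with '\v' added.
 * Like the list on HEAD, it never consults the locale.
 */
static bool
sketch_isspace_patched(char ch)
{
	return ch == ' ' ||
		ch == '\t' ||
		ch == '\n' ||
		ch == '\r' ||
		ch == '\v' ||
		ch == '\f';
}

/*
 * Count the ASCII characters (0..127) on which the fixed list and
 * isspace() disagree.  Only meaningful in the "C" locale, which is
 * what a program starts in unless it calls setlocale().
 */
static int
sketch_count_mismatches(void)
{
	int			mismatches = 0;

	for (int c = 0; c < 128; c++)
	{
		bool		fixed = sketch_isspace_patched((char) c);
		bool		libc = isspace(c) != 0;

		if (fixed != libc)
			mismatches++;
	}
	return mismatches;
}
```

Under those assumptions the two agree on every ASCII character, whereas a list without '\v' would disagree on exactly that one.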

One thing I was wondering: does the SQL specification say anything
specific about the way vertical tabs should be parsed?

Thoughts and comments are welcome.
Thanks,
--
Michael

Attachment Content-Type Size
isspace_v.patch text/x-diff 9.3 KB
