| From: | David Zhang <david(dot)zhang(at)highgo(dot)ca> | 
|---|---|
| To: | Daniel Verite <daniel(at)manitou-mail(dot)org> | 
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org | 
| Subject: | Re: Making psql error out on output failures | 
| Date: | 2020-02-18 18:28:26 | 
| Message-ID: | a950d923-057c-ae22-2c0e-634160d02ead@highgo.ca | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
hi,
I did some further research on this bug. Here is the summary:
1. Tried to wrap "fputs" similar to "fprintf" redefined by "pg_fprintf", 
but it ended up with too much error due to "fputs" is called everywhere 
in "print_unaligned_text". If add an extra static variable to track the 
output status, then it will be an overkill and potentially more bugs may 
be introduced.
2. Investigated some other libraries such as libexplain 
(http://libexplain.sourceforge.net/) but it definitely will introduce 
the complexity.
3. I think a better way to resolve this issue will still be the solution 
with an extra %m, which can make the error message much more informative 
to the end users. The reason is that,
3.1 Providing the reasons for errors is required by PostgreSQL document, 
https://www.postgresql.org/docs/12/error-style-guide.html
     "Reasons for Errors
     Messages should always state the reason why an error occurred. For 
example:
     BAD:    could not open file %s
     BETTER: could not open file %s (I/O failure)
     If no reason is known you better fix the code."
3.2 Adding "%m" can provide a common and easy to understand reasons 
crossing many operating systems. The "%m" fix has been tested on 
platforms: CentOS 7, RedHat 7, Ubuntu 18.04, Windows 7/10, and macOS 
Mojave 10.14, and it works.
3.3 If multiple errors happened after the root cause "No space left on 
device", such as "No such file or directory" and "Permission denied", 
then it make sense to report the latest one. The end users suppose to 
know the latest error and solve it first. Eventually, "No space left on 
device" error will be showing up.
3.4 Test log on different operating systems.
### CentOS 7
[postgres(at)localhost ~]$ uname -a
Linux localhost.localdomain 3.10.0-1062.9.1.el7.x86_64 #1 SMP Fri Dec 6 
15:49:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[postgres(at)localhost ~]$ sudo mount -t tmpfs -o rw,size=1M tmpfs /mnt/ramdisk
[postgres(at)localhost ~]$ df -h
tmpfs                    1.0M     0  1.0M   0% /mnt/ramdisk
[postgres(at)localhost ~]$ psql
psql (12.2)
Type "help" for help.
postgres=# select repeat('111', 1000000) \g /mnt/ramdisk/file
Error printing tuples (No space left on device)
[postgres(at)localhost ~]$ psql -d postgres  -At -c "select repeat('111', 
1000000)" -o /mnt/ramdisk/file
Error printing tuples (No space left on device)
[postgres(at)localhost ~]$ psql -d postgres  -At -c "select repeat('111', 
1000000)" > /mnt/ramdisk/file
Error printing tuples (No space left on device)
### RedHat 7
[root(at)rh7 postgres]# uname -a
Linux rh7 3.10.0-1062.el7.x86_64 #1 SMP Thu Jul 18 20:25:13 UTC 2019 
x86_64 x86_64 x86_64 GNU/Linux
[postgres(at)rh7 ~]$ sudo mkdir -p /mnt/ramdisk
We trust you have received the usual lecture from the local System
Administrator. It usually boils down to these three things:
     #1) Respect the privacy of others.
     #2) Think before you type.
     #3) With great power comes great responsibility.
[sudo] password for postgres:
[postgres(at)rh7 ~]$ sudo mount -t tmpfs -o rw,size=1M tmpfs /mnt/ramdisk
[postgres(at)rh7 ~]$ psql -d postgres
psql (12.2)
Type "help" for help.
postgres=# select repeat('111', 1000000) \g /mnt/ramdisk/file
Error printing tuples (No space left on device)
[postgres(at)rh7 ~]$ psql -d postgres  -At -c "select repeat('111', 
1000000)" > /mnt/ramdisk/file
Error printing tuples (No space left on device)
[postgres(at)rh7 ~]$ psql -d postgres  -At -c "select repeat('111', 
1000000)" -o /mnt/ramdisk/file
Error printing tuples (No space left on device)
### Ubuntu 18.04
postgres=# select repeat('111', 10000000) \g /mnt/ramdisk/file
Error printing tuples (No space left on device)
postgres=# \q
david(at)david-VM:~$ psql -d postgres  -At -c "select repeat('111', 
1000000)" > /mnt/ramdisk/file
Error printing tuples (No space left on device)
david(at)david-VM:~$ psql -d postgres  -At -c "select repeat('111', 
1000000)" -o /mnt/ramdisk/file
Error printing tuples (No space left on device)
### Windows 7
postgres=# select repeat('111', 1000000) \g E:/file
Error printing tuples (No space left on device)
postgres=# \q
C:\pg12.1\bin>psql -d postgres -U postgres -h 172.20.14.29 -At -c 
"select repeat
('111', 100000)" >> E:/file
Error printing tuples (No space left on device)
C:\pg12.1\bin>psql -d postgres -U postgres -h 172.20.14.29 -At -c 
"select repeat
('111', 100000)" -o E:/file
Error printing tuples (No space left on device)
### Windows 10
postgres=# select repeat('111', 1000000) \g E:/file
Error printing tuples (No space left on device)
postgres=# \q
C:\>psql -d postgres -U postgres -h 192.168.0.19 -At -c "select 
repeat('111', 10000000)" -o E:/file
Error printing tuples (No space left on device)
C:\>psql -d postgres -U postgres -h 192.168.0.19 -At -c "select 
repeat('111', 2000000)" >> E:/file
Error printing tuples (No space left on device)
### macOS Mojave 10.14
postgres=# select repeat('111', 1000000) \g  /Volumes/sdcard/file
Error printing tuples (No space left on device)
postgres=# \q
MacBP:bin david$ psql -d postgres -h 192.168.0.10 -At -c "select 
repeat('111', 3000000)" > /Volumes/sdcard/file
Error printing tuples (No space left on device)
MacBP:bin david$ psql -d postgres -h 192.168.0.10 -At -c "select 
repeat('111', 3000000)" -o /Volumes/sdcard/file
Error printing tuples (No space left on device)
On 2020-01-29 1:51 a.m., Daniel Verite wrote:
> 	David Zhang wrote:
>
>>> Are you sure? I don't find that redefinition. Besides
>>> print_aligned_text() also calls putc and puts.
>> Yes, below is the gdb debug message when psql first time detects the
>> error "No space left on device". Test case, "postgres=# select
>> repeat('111', 1000000) \g /mnt/ramdisk/file"
>> bt
>> #0  flushbuffer (target=0x7ffd6a709ad0) at snprintf.c:313
> Indeed. For some reason gdb won't let me step into these fprintf()
> calls, but you're right they're redefined (through include/port.h):
>
> #define vsnprintf		pg_vsnprintf
> #define snprintf		pg_snprintf
> #define vsprintf		pg_vsprintf
> #define sprintf 		pg_sprintf
> #define vfprintf		pg_vfprintf
> #define fprintf 		pg_fprintf
> #define vprintf 		pg_vprintf
> #define printf(...)		pg_printf(__VA_ARGS__)
>
> Anyway, I don't see it leading to an actionable way to reliably keep
> errno, as discussed upthread.
>
> Best regards,
-- 
David
Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca
| Attachment | Content-Type | Size | 
|---|---|---|
| Fix-psql-to-report-IO-errors-on-writing-query-results_26-2400.patch | text/plain | 961 bytes | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Stephen Frost | 2020-02-18 18:33:33 | Re: DB running out of memory issues after upgrade | 
| Previous Message | Nagaraj Raj | 2020-02-18 18:10:08 | Re: DB running out of memory issues after upgrade |