New "-b slim" option in 2019b zic: should we turn that on?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: New "-b slim" option in 2019b zic: should we turn that on?
Date: 2019-07-17 22:42:07
Message-ID: 24998.1563403327@sss.pgh.pa.us
Lists: pgsql-hackers

I just finished updating our timezone code to match IANA release
2019b. There's an interesting new switch in zic: if you say
"-b slim", it generates zone data files that have only 64-bit
data (not the 32-bit plus 64-bit data that it's been emitting
for years), and it drops other space-wasting hacks that are needed
only for backwards compatibility with old timezone libraries.

We don't depend on any such old library, so I wonder whether we
shouldn't turn on that switch.
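For anyone who wants to check what a given zone file actually contains, here is a minimal sketch that reads the fixed 44-byte TZif header (magic, version byte, 15 reserved bytes, six big-endian counts). The function name and the example path are illustrative assumptions, not anything in our tree. The expectation that a "-b slim" file shows a nearly empty first (32-bit) data block is my reading of the switch's description, so verify it against real output.

```python
import struct

def read_tzif_header(data: bytes) -> dict:
    """Parse the leading TZif header; the counts describe the first (32-bit) data block."""
    if len(data) < 44 or data[:4] != b"TZif":
        raise ValueError("not a TZif file")
    # Version is a single byte: NUL for version 1, otherwise an ASCII digit.
    raw_version = data[4:5]
    version = "1" if raw_version == b"\x00" else raw_version.decode("ascii")
    # Bytes 5..19 are reserved; six 4-byte big-endian counts follow.
    isutcnt, isstdcnt, leapcnt, timecnt, typecnt, charcnt = struct.unpack(
        ">6L", data[20:44]
    )
    return {
        "version": version,
        "isutcnt": isutcnt,
        "isstdcnt": isstdcnt,
        "leapcnt": leapcnt,
        "timecnt": timecnt,
        "typecnt": typecnt,
        "charcnt": charcnt,
    }

# Hypothetical usage; the path is an assumption:
# with open("timezone.slim/America/New_York", "rb") as f:
#     print(read_tzif_header(f.read()))
```

Comparing "timecnt" between the fat and slim versions of the same zone should show whether the 32-bit transition table was dropped.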

I did a quick comparison of the file sizes, and indeed there are
noticeable per-file savings, e.g.

$ ls -l timezone.fat/America/New_York timezone.slim/America/New_York
-rw-r--r--. 3 postgres postgres 3536 Jul 17 18:08 timezone.fat/America/New_York
-rw-r--r--. 3 postgres postgres 1744 Jul 17 18:07 timezone.slim/America/New_York

$ ls -l timezone.fat/Europe/Paris timezone.slim/Europe/Paris
-rw-r--r--. 1 postgres postgres 2962 Jul 17 18:08 timezone.fat/Europe/Paris
-rw-r--r--. 1 postgres postgres 1105 Jul 17 18:07 timezone.slim/Europe/Paris

Now, since the files are pretty much all under 4K, that translates
to exactly no disk space savings on my ext4 filesystem :-(

$ du -hs timezone.fat timezone.slim
1.6M timezone.fat
1.6M timezone.slim

But other filesystems that are smarter about small files would
probably benefit. Also, there's a significant difference in
the size of a compressed tarball:

-rw-rw-r--. 1 postgres postgres 148501 Jul 17 18:09 timezone.fat.tgz
-rw-rw-r--. 1 postgres postgres 80511 Jul 17 18:09 timezone.slim.tgz

not that that really helps us, because we don't include these
generated files in our tarballs.
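To illustrate why the per-file savings disappear in the "du" totals above, here is a rough sketch of block rounding. It assumes a 4096-byte allocation unit (my assumption for a typical ext4 setup) and ignores filesystem details such as hard links or inline data; the function name is made up for illustration. The sizes are the ones from the "ls -l" output above.

```python
def allocated_bytes(size: int, block_size: int = 4096) -> int:
    """Round a file size up to a whole number of blocks (assumed 4096 bytes each)."""
    if size <= 0:
        return 0
    # Ceiling division, then scale back to bytes.
    return -(-size // block_size) * block_size

# (fat size, slim size) pairs taken from the listings in this message.
sizes = {
    "America/New_York": (3536, 1744),
    "Europe/Paris": (2962, 1105),
}

for zone, (fat, slim) in sizes.items():
    print(zone, allocated_bytes(fat), allocated_bytes(slim))
```

Under that assumption both versions of each file occupy one 4096-byte block, which is consistent with the identical 1.6M totals.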

Despite the marginal payoff, I'm strongly tempted to enable this
switch. The only reason I can think of not to do it is if somebody
is using a Postgres installation's share/timezone tree as tzdata
for some other program with not-up-to-date timezone library code.
But who would that be?

A possible compromise is to turn it on only in HEAD, though I'd
rather keep all the branches working the same as far as the
timezone code goes.

Thoughts?

regards, tom lane
