Re: Uh-oh: documentation PDF output no longer builds in HEAD

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Uh-oh: documentation PDF output no longer builds in HEAD
Date: 2015-11-10 12:50:39
Message-ID: CA+TgmoZRjnM6tmFovyCN-V7N598O3vZpf378gbWs7bTzHJO_eA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 9, 2015 at 7:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
>> Curiously though, that gets us down to this:
>> 30615 strings out of 245828
>> 397721 string characters out of 1810780
>> which implies that indeed FlowObjectSetup *is* the cause of most of
>> the strings being entered. I'm not sure how that squares with the
>> observation that there are fewer than 5000 \pagelabel entries in the
>> postgres-US.aux file. Time for more digging.
>
> Well, after much digging, I've found what seems a workable answer.
> It turns out that the original form of FlowObjectSetup is just
> unbelievably awful when it comes to handling of hyperlink anchors:
> it will put a hyperlink anchor into the PDF for every "flow object",
> that is, everything in the document that could possibly have a link
> to it, whether or not it actually is linked to. And aside from bloating
> the PDF file, it turns out that the hyperlink stuff also consumes some
> control sequence names, which is why we're running out of strings.
>
> There already is logic (probably way older than the hyperlink code)
> in jadetex to avoid generating page-number labels for objects that have
> no cross-references. So what I did to fix this was to piggyback on
> that code: with the attached jadetex.cfg, both a page-number label
> and a hyperlink anchor will be generated for all and only those flow
> objects that have either a page-number reference or a hyperlink reference.
> (We could try to separate those things, but then we'd need two control
> sequence names per object rather than one for tracking purposes, and
> anyway many objects will have both kinds of reference if they have either.)
>
> This gets us down to ~135000 strings to build HEAD, and not incidentally,
> the resulting PDF is about half the size it was before. I think I've
> also fixed a number of formerly unexplainable broken hyperlinks in the
> PDF; some are still broken, but they were that way before. (It looks
> like <xref> with endterm doesn't work very well in jadetex; all the
> remaining bad links seem to be associated with uses of that.)
>
> Barring objection I'll commit this tomorrow. I'm inclined to back-patch
> it at least into 9.5, maybe further, because I'm afraid we may be closer
> than we realized to exceeding the strings limit in the back branches too.
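
The idea in the quoted message can be illustrated with a minimal sketch. This is
Python rather than TeX and is not the jadetex.cfg change itself; FlowObject,
anchors_for_all, and anchors_for_referenced are hypothetical names chosen for
illustration. It only models the bookkeeping: emitting an anchor for every flow
object versus only for those that have a page-number or hyperlink reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowObject:
    """Hypothetical stand-in for a flow object (anything that could be linked to)."""
    obj_id: str


def anchors_for_all(objects):
    """Behaviour described as the original: one anchor per flow object."""
    return {o.obj_id for o in objects}


def anchors_for_referenced(objects, page_refs, link_refs):
    """Behaviour described as the fix: anchor only objects that have
    a page-number reference or a hyperlink reference (or both)."""
    referenced = set(page_refs) | set(link_refs)
    return {o.obj_id for o in objects if o.obj_id in referenced}


# Ten linkable objects, of which only three are ever referenced.
objects = [FlowObject(f"obj{i}") for i in range(10)]
page_refs = ["obj1", "obj4"]   # targets of page-number references
link_refs = ["obj4", "obj7"]   # targets of hyperlink references

all_anchors = anchors_for_all(objects)
needed_anchors = anchors_for_referenced(objects, page_refs, link_refs)
print(len(all_anchors), len(needed_anchors))
```

Using a single set of referenced IDs mirrors the point made in the quoted
message that one tracking name per object suffices when the two kinds of
reference are not separated.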

I am in awe.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
