Re: ISN patch that applies cleanly with git apply

From: Peter Geoghegan <peter(dot)geoghegan86(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Jan Otto <asche(at)me(dot)com>
Subject: Re: ISN patch that applies cleanly with git apply
Date: 2010-10-18 15:07:02
Message-ID: AANLkTimcyvx47XuJq0vy7jVjXL0QwJSSMOst0YU8SuJe@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> I may be in the minority here, but I'm inclined to just apply this and
> move on.  I agree with your concerns that this whole module is badly
> designed and that the approach of hard-coding the list of ISBN ranges
> is fundamentally unscalable, but unless we're going to rip it out or
> rearchitect it, we don't really get any benefit out of digging in our
> heels and refusing to apply occasional band-aids.

I don't think you'll be in the minority. I suspect that we'll end up
applying this patch, because your reasoning is sound. As long as we're
stuck with this broken design, we might as well cover up the cracks.
We should at least acknowledge that the entire approach is
wrong-headed though.

> 1. Stop storing the list of ISBN ranges in an array and put it in a
> table.  That way, users without C programming skills can still update
> the list of ranges, and users with C programming skills can do it
> without recompiling.

+1. We might even be able to include a function written in an
untrusted pl, that keeps the ranges current by downloading and parsing
the XML file that Jan Otto linked to. Can it be considered a stable
"web service" (a term that I use loosely)? Is that the kind of thing
we'd ever consider doing? After all, I suppose that the International
ISBN Agency are the highest authority in the world on this matter.

Otherwise, we could encourage the user to implement their own function
to keep the table current (which would be called by a psql cron job or
something), but in practice most would use a separately distributed
untrusted pl function. That would prevent us from having to take full
responsibility for someone else's XML file/ "web service" becoming
unavailable. ISBN international don't make any promises about the
file, so perhaps even this is a bad idea.

> 2. Don't try to validate the list of ISBN ranges at all.

Actually, we just use the range information to hyphenate ISBNs...they
won't actually be rejected unless they have an invalid check digit,
or, in the case of ISBN-13s, if their country codes (first 3 digits)
aren't "bookland", either 978 or 979. Perhaps the problem of new
ranges being introduced over time isn't all that much of a problem.

Could this patch be backported as a bug fix? It adds the new bookland
country code of 979. Prior versions of the patch will outright reject
these correct ISBN-13s, so I think that is a good idea.

--
Regards,
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Cave-Ayland 2010-10-18 15:13:04 Re: knngist - 0.8
Previous Message Tom Lane 2010-10-18 14:52:21 Re: WIP: extensible enums