Hostnames in pg_hba.conf

From: Bart Samwel <bart(at)samwel(dot)tk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Hostnames in pg_hba.conf
Date: 2010-02-11 13:13:09
Message-ID: ded01eb21002110513n296d60b7me5255820a69a4bff@mail.gmail.com
Lists: pgsql-hackers

Hi there,

I've been working on a patch to add hostname support to pg_hba.conf. It's
not ready for public display yet, but I would just like to run a couple of
issues / discussion points past everybody.

ISSUE #1: Performance / caching

At present, I've simply not added caching. The reasoning for this is as
follows:
(a) getaddrinfo doesn't tell us about expiry, so when do you refresh?
(b) If you put the cache in the postmaster, it will not work for exec-based
backends as opposed to fork-based backends, since those read pg_hba.conf
every time they are exec'ed.
(c) If you put this in the postmaster, the postmaster will have to update
the cache every once in a while, which may be slow and which may prevent new
connections while the cache update takes place.
(d) Outdated cache entries may inexplicably and without any logging choose
the wrong rule for some clients. Big aargh: people will start using this to
specify 'deny' rules based on host names.

If you COULD get expiry info out of getaddrinfo you could potentially store
this info in a table or something like that, and have it updated by the
backends? But that's way over my head for now. ISTM that this may be
better handled by a locally running caching DNS server, if people have
performance issues with the lack of caching. Such a local caching DNS server
can also handle expiry correctly, et cetera.

We should of course still take care to look up a given hostname only once
for each connection request.
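
To illustrate the "only once per connection request" point, here is a
minimal Python sketch (not the patch itself, which would be C in the
backend). The names system_resolve and PerConnectionResolver are made up
for illustration. The memo lives only as long as one connection check, so
it never needs expiry information.

```python
import socket
from typing import Callable, Dict, FrozenSet


def system_resolve(hostname: str) -> FrozenSet[str]:
    """Forward-resolve a hostname to the set of its IP address strings."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return frozenset()  # an unresolvable name simply matches nothing
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return frozenset(info[4][0] for info in infos)


class PerConnectionResolver:
    """Memoizes lookups for the lifetime of ONE connection request only."""

    def __init__(self, resolve: Callable[[str], FrozenSet[str]] = system_resolve):
        self._resolve = resolve
        self._seen: Dict[str, FrozenSet[str]] = {}

    def addresses(self, hostname: str) -> FrozenSet[str]:
        key = hostname.lower()  # DNS names are case-insensitive
        if key not in self._seen:
            self._seen[key] = self._resolve(key)
        return self._seen[key]
```

A fresh PerConnectionResolver would be created for each incoming
connection and thrown away afterwards, so a hostname that appears on
several lines costs one lookup, and nothing stale survives into the next
connection.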

ISSUE #2: Reverse lookup?

There was a suggestion on the TODO list on the wiki, which basically said
that maybe we could use reverse lookup to find "the" hostname and then check
for that hostname in the list. I think that won't work, since IPs can go by
many names and may not support reverse lookup for some hostnames (/etc/hosts
anybody?). Furthermore, due to the top-to-bottom processing of pg_hba.conf,
you CANNOT SKIP entries that might possibly match. For instance, if the
third line is for host "foo.example.com" and the fifth line is for
"bar.example.com", both lines may apply to the same IP, and you still HAVE to
check the first one, even if reverse lookup turns up the second host name.
So it doesn't save you any lookups, it just costs an extra one.
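
The top-to-bottom argument can be sketched in a few lines of Python. This
is only an illustration under assumptions: the rule format is a simplified
(hostname, method) pair rather than a real pg_hba.conf line, and
SAMPLE_DNS, resolve_from_table and first_matching_rule are made-up names
with made-up data.

```python
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple

Rule = Tuple[str, str]  # (hostname, auth method); simplified stand-in for a config line
Resolver = Callable[[str], FrozenSet[str]]

# Made-up forward DNS data: both names point at the same address.
SAMPLE_DNS: Dict[str, FrozenSet[str]] = {
    "foo.example.com": frozenset({"192.0.2.10"}),
    "bar.example.com": frozenset({"192.0.2.10"}),
}


def resolve_from_table(hostname: str) -> FrozenSet[str]:
    """Stand-in for a forward lookup, backed by the sample table above."""
    return SAMPLE_DNS.get(hostname, frozenset())


def first_matching_rule(
    client_ip: str, rules: List[Rule], resolve: Resolver
) -> Optional[Rule]:
    """Walk the rules in file order and return the first one that matches."""
    for rule in rules:
        hostname, _method = rule
        if client_ip in resolve(hostname):
            return rule
    return None


RULES: List[Rule] = [
    ("foo.example.com", "reject"),
    ("bar.example.com", "md5"),
]
```

Even if a reverse lookup of 192.0.2.10 returned "bar.example.com",
first_matching_rule("192.0.2.10", RULES, resolve_from_table) has to stop
at the foo.example.com line, because that line comes first and its forward
lookup matches.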

ISSUE #3: Multiple hostnames?

Currently, a pg_hba entry lists an IP / netmask combination. I would suggest
allowing lists of hostnames in the entries, so that you can at least mimic
the "match multiple hosts by a single rule" behavior that a netmask gives
you. Any reason not to do this?
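
As a rough sketch of what a hostname list could look like, assuming a
comma-separated syntax (that syntax is purely hypothetical here, as are
the names parse_host_list and entry_matches):

```python
from typing import Callable, FrozenSet, List

Resolver = Callable[[str], FrozenSet[str]]


def parse_host_list(field: str) -> List[str]:
    """Split a hypothetical comma-separated host field into clean names."""
    return [name.strip().lower() for name in field.split(",") if name.strip()]


def entry_matches(client_ip: str, field: str, resolve: Resolver) -> bool:
    """True if any listed hostname forward-resolves to the client's address.

    any() over a generator stops at the first hit, so the remaining
    hostnames in the list are not looked up at all.
    """
    return any(client_ip in resolve(name) for name in parse_host_list(field))
```

The resolver is passed in as a parameter so that the same per-connection
lookup function can be shared across all entries.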

Comments / bright ideas are welcome, especially regarding issue #1.

Cheers,
Bart
