Re: Password identifiers, protocol aging and SCRAM protocol

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Fetter <david(at)fetter(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Julian Markwort <julian(dot)markwort(at)uni-muenster(dot)de>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Valery Popov <v(dot)popov(at)postgrespro(dot)ru>
Subject: Re: Password identifiers, protocol aging and SCRAM protocol
Date: 2016-12-09 08:06:24
Message-ID: 0ec34d07-dc25-a2e5-2e85-f7a26c4d3423@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12/09/2016 05:58 AM, Michael Paquier wrote:
> On Thu, Dec 8, 2016 at 10:05 PM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
>> On Thu, Dec 8, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>>> Actually, we don't give away that information currently. If you try to log
>>> in with password or MD5 authentication, and the user doesn't exist, you get
>>> the same error as with an incorrect password. So, I think we do need to give
>>> the client a made-up salt and iteration count in that case, to hide the fact
>>> that the user doesn't exist. Furthermore, you can't just generate random
>>> salt and iteration count, because then you could simply try connecting
>>> twice, and see if you get the same salt and iteration count. We need to
>>> deterministically derive the salt from the username, so that you get the
>>> same salt/iteration count every time you try connecting with that username.
>>> But it needs indistinguishable from a random salt, to the client. Perhaps a
>>> SHA hash of the username and some per-cluster secret value, created by
>>> initdb. There must be research papers out there on how to do this..
>>
>> A simple idea would be to use the system ID when generating this fake
>> salt? That's generated by initdb, once per cluster. I am wondering if
>> it would be risky to use it for the salt. For the number of iterations
>> the default number could be used.

I think I'd feel better with a completely separate randomly-generated
value for this. System ID is not too difficult to guess, and there's no
need to skimp on this. Yes, default number of iterations makes sense.

We cannot completely avoid leaking information through this,
unfortunately. For example, if you have a user with a non-default number
of iterations, and an attacker probes that, he'll know that the username
was valid, because he got back a non-default number of iterations. But
let's do our best.

> I have been thinking more about this part quite a bit, and here is the
> most simple thing that we could do while respecting the protocol.
> That's more or less what I think you have in mind by re-reading
> upthread, but it does not hurt to rewrite the whole flow to be clear:
> 1) Server gets the startup packet, maps pg_hba.conf and moves on to
> the scram authentication code path.
> 2) Server sends back sendAuthRequest() to request user to provide a
> password. This maps to the plain/md5 behavior as no errors would be
> issued to user until he has provided a password.
> 3) Client sends back the password, and the first message with the user name.
> 4) Server receives it, and checks the data. If a failure happens at
> this stage, just ERROR on PG-side without sending back a e= message.
> This includes the username-mismatch, empty password and end of
> password validity. So we would never use e=unknown-user. This sticks
> with what you quoted upthread that the server may end the exchange
> before sending the final message.

If we want to mimic the current behavior with MD5 authentication, I
think we need to follow through with the challenge, and only fail in the
last step, even if we know the password was empty or expired. MD5
authentication doesn't currently give away that information to the user.

But it's OK to bail out early on OOM, or if the client sends an outright
broken message. Those don't give away any information on the user account.

> 5) Server sends back the challenge, and client answers back with its
> reply to it.

> Then enters the final stage of the exchange, at which point the server
> would issue its final message that would be e= in case of errors. If
> something like an OOM happens, no message would be sent so failing on
> an OOM ERROR on PG side would be fine as well.

> 6) Read final message from client and validate.
> 7) issue final message of server.
>
> On failure at steps 6) or 7), an e= message is returned instead of the
> final message. Does that look right?

Yep.

> One thing is: when do we look up at pg_authid? After receiving the
> first message from client or before beginning the exchange? As the
> first message from client has the user name, it would make sense to do
> the lookup after receiving it, but from PG prospective it would just
> make sense to use the data already present in the startup packet. The
> current patch does the latter. What do you think?

Let's see what fits the program flow best. Probably best to do it before
beginning the exchange. I'm hacking on this right now...

> By the way, I have pushed the extra patches you sent into this branch:
> https://github.com/michaelpq/postgres/tree/scram

Thanks! We had a quick chat with Michael, and agreed that we'd hack
together on that github repository, to avoid stepping on each other's
toes, and cut rebased patch sets from there to pgsql-hackers every now
and then.

- Heikki

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2016-12-09 08:11:17 Re: Password identifiers, protocol aging and SCRAM protocol
Previous Message Michael Paquier 2016-12-09 07:24:38 Re: `array_position...()` causes SIGSEGV