Re: Missing checks when malloc returns NULL...

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Missing checks when malloc returns NULL...
Date: 2016-08-31 07:30:34
Message-ID: CAB7nPqTfSDR7T4aPxELJbw1W9qR5U3L4=i+KgR11vH+smxZ0Yg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 31, 2016 at 7:47 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
>> [ malloc-nulls-v5.patch ]
>
> I've committed some form of all of these changes

Thanks!

> except the one in
> adt/pg_locale.c. I'm not entirely sure whether we need to do anything
> about that at all, but if we do, this doesn't cut it:
>
> thousands_sep = db_encoding_strdup(encoding, extlconv->thousands_sep);
> grouping = strdup(extlconv->grouping);
>
> + if (!grouping)
> + elog(ERROR, "out of memory");
> +
> #ifdef WIN32
> /* use monetary to set the ctype */
> setlocale(LC_CTYPE, locale_monetary);
>
> There are multiple things wrong with that:
>
> 1. The db_encoding_strdup calls can also return NULL on OOM, and aren't
> being checked likewise. (And there's another plain strdup further down,
> too.)
>
> 2. You can't safely throw an elog right there, because you need to restore
> the backend's prevailing setlocale state first.

Arg I haven't though of this one.

> 3. Also, if you do it like that, you'll leak whatever strings were already
> strdup'd. (This is a relatively minor problem, but still, if we're
> trying to be 100% correct then we're not there yet.)
>
> Also, now that I'm looking at it, I see there's another pre-existing bug:
>
> 4. An elog exit is possible, due to either OOM or encoding conversion
> failure, inside db_encoding_strdup(). This means we have problems #2 and
> #3 even in the existing code.
>
> Now, I believe that the coding intention here was that assigning NULL
> to the respective fields of CurrentLocaleConv is okay and better than
> failing the operation completely. One argument against that is that
> it's unclear whether everyplace that uses any of those fields is checking
> for NULL first; and in any case, silently falling back to nonlocalized
> behavior might not be better than reporting OOM. Still, it's certainly
> better than risking problem #2, which could cause all sorts of subsequent
> malfunctions.
>
> I think that a complete fix for this might go along the lines of
>
> 1. While we are setlocale'd to the nonstandard locales, collect all the
> values by strdup'ing into a local variable of type "struct lconv".
> (We must strdup for the reason noted in the comment, that localeconv's
> output is no longer valid once we do another setlocale.) Then restore
> the standard locale settings.

The one at the top of the file... That's really platform-dependent.

> 2. Use db_encoding_strdup to replace any strings that need to be
> converted. (If it throws an elog, we have no damage worse than
> leaking the already strdup'd strings.)
>
> 3. Check for any nulls in the struct; if so, use free_struct_lconv
> to release whatever we did get, and then throw elog("out of memory").
>
> 4. Otherwise, copy the struct to CurrentLocaleConv.
>
> If we were really feeling picky, we could probably put in a PG_TRY
> block around step 2 to release the strdup'd storage after a conversion
> failure. Not sure if it's worth the trouble.

It doesn't sound that much complicated to do that, I'll see about it,
but I guess that we could just do it in another thread..

> BTW, I marked the commitfest entry closed, but that may have been
> premature --- feel free to reopen it if you submit additional patches
> in this thread.

There are still two things that could be worked I guess:
- plperl and pltcl cleanup, and their abusive use of malloc. I'll
raise a new thread about that after brushing up a bit what I have and
add a new entry in the CF as that's not directly related to this
thread.
- ShmemAlloc and its missing checks. And we can do something here.

I have slept on it, and looked at the numbers. There are 11 calls to
ShmemAlloc in the code, and 4 of them are performing checks. And in
one of them there is this pattern in ShmemInitStruct(), which is also
something ShmemInitHash relies on:

/* It isn't in the table yet. allocate and initialize it */
structPtr = ShmemAlloc(size);
if (structPtr == NULL)
{
/* out of memory; remove the failed ShmemIndex entry */
hash_search(ShmemIndex, name, HASH_REMOVE, NULL);
LWLockRelease(ShmemIndexLock);
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),
errmsg("not enough shared memory for data structure"
" \"%s\" (%zu bytes requested)",
name, size)));
}
Which means that processes have an escape window when initializing
shared memory by cleaning up the index if an entry cannot be found and
then cannot be created properly. I don't think that it is a good idea
to change that by forcing ShmemAlloc to fail. So I would tend to just
have the patch attached and add those missing NULL-checks on all the
existing ShmemAlloc() calls.

Opinions?
--
Michael

Attachment Content-Type Size
malloc-nulls-v6.patch application/x-patch 2.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2016-08-31 07:35:41 Re: Declarative partitioning - another take
Previous Message Robert Haas 2016-08-31 07:17:55 Re: Declarative partitioning - another take