Re: My Experiment of PG crash when dealing with huge amount of data

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: 高健 <luckyjackgao(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: My Experiment of PG crash when dealing with huge amount of data
Date: 2013-09-02 19:06:58
Message-ID: CAMkU=1xFGpLW323BCFSQYDuJDmQy54ruefci47evmAatkP8NHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sun, Sep 1, 2013 at 6:25 PM, 高健 <luckyjackgao(at)gmail(dot)com> wrote:
>>To spare memory, you would want to use something like:
>
>>insert into test01 select generate_series,
>>repeat(chr(int4(random()*26)+65),1024) from
>>generate_series(1,2457600);
>
> Thanks a lot!
>
> What I am worrying about is that:
> If data grows rapidly, maybe our customer will use too much memory ,

The size of the data has little to do with it. Take your example as
an example. The database could have been nearly empty before you
started running that query. A hostile or adventurous user can craft
queries that will exhaust the server's memory without ever needing any
particular amount of data in data_directory, except maybe in the temp
tablespace.

So it is a matter of what kind of users you have, not how much data
you anticipate having on disk.

The parts of PostgreSQL that might blow up memory based on ordinary
disk-based tables are pretty well protected by shared_buffers,
temp_buffers, work_mem, maintenance_work_mem, etc. already. It is the
things that don't directly map to data already on disk which are
probably more vulnerable.

> Is
> ulimit command a good idea for PG?

I've used ulimit -v on a test server (which was intentionally used to
test things to limits of destruction), and was happy with the results.
It seemed like it would error out the offending process, or just the
offending statement, in a graceful way; rather than having random
processes other than the culprit be brutally killed by OOM, or having
the machine just swap itself into uselessness. I'd be reluctant to
use it on production just on spec that something bad *might* happen
without it, but if I started experiencing problems caused by a single
rogue process using outrageous amounts of memory, that would be one of
my first stops.

Experimentally, shared memory does count against the -v limit, and the
limit has to be set rather higher than shared_buffers, or else your
database won't even start.

Cheers,

Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Fabrízio de Royes Mello 2013-09-02 20:35:08 Re: [GENERAL] currval and DISCARD ALL
Previous Message Tim Kane 2013-09-02 15:53:12 Re: psql --command option ignores --variable's