Re: Memory Issue with array_agg?

From: Robert Sosinski <rsosinski(at)ticketevolution(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org >> PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Memory Issue with array_agg?
Date: 2013-08-19 15:06:47
Message-ID: CAH1hji2D7DM417r0icB307r+fSQ-Z11nBS5PJnd+7Hr7FTK5cw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Pavel,

What kind of example do you need? I cant give you the actual data I have
in the table, but I can give you an example query and the schema attached
below. From there, I would just put in 2 million rows worth 1.2 Gigs of
data. Average size of the the extended columns (using the pg_column_size
function) in bytes are:

guid: 33
name: 2.41
currency: 4
fields: 120.32

example query:

-- find duplicate records using a guid
select guid, array_agg(id) from orders group by guid;

example schema:
Table "public.things"

Column | Type | Modifiers
| Storage | Stats target | Description
------------+-----------------------------+-----------------------------------------------------+----------+--------------+-------------
id | integer | not null default
nextval('things_id_seq'::regclass) | plain | |
version | integer | not null
| plain | |
created_at | timestamp without time zone | not null
| plain | |
updated_at | timestamp without time zone | not null
| plain | |
foo_id | integer | not null
| plain | |
bar_id | integer | not null
| plain | |
baz_id | integer | not null
| plain | |
guid | character varying | not null
| extended | |
name | character varying | not null
| extended | |
price | numeric(12,2) | not null
| main | |
currency | character varying | not null
| extended | |
amount | integer | not null
| plain | |
the_date | date | not null
| plain | |
fields | hstore |
| extended | |
Indexes:
"things_pkey" PRIMARY KEY, btree (id)
"things_foo_id_idx" btree (foo_id)
"things_bar_id_idx" btree (bar_id)
"things_baz_id_idx" btree (baz_id)
"things_guid_uidx" UNIQUE, btree (guid)
"things_lpad_lower_name_eidx" btree (lpad(lower(name::text), 10,
'0'::text))
"things_price_idx" btree (price)

Foreign-key constraints:
"things_foo_id_fkey" FOREIGN KEY (foo_id) REFERENCES foos(id)
"things_bar_id_fkey" FOREIGN KEY (bar_id) REFERENCES bars(id)
"things_baz_id_fkey" FOREIGN KEY (baz_id) REFERENCES bazs(id)
Triggers:
timestamps_trig BEFORE INSERT OR UPDATE ON things FOR EACH ROW EXECUTE
PROCEDURE timestamps_tfun()

Let me know if you need anything else.

Thanks,

On Mon, Aug 19, 2013 at 3:29 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>wrote:

> Hello
>
> please, can you send some example or test?
>
> Regards
>
> Pavel Stehule
>
>
> 2013/8/19 Robert Sosinski <rsosinski(at)ticketevolution(dot)com>
>
>> When using array_agg on a large table, memory usage seems to spike up
>> until Postgres crashes with the following error:
>>
>> 2013-08-17 18:41:02 UTC [2716]: [2] WARNING: terminating connection
>> because of crash of another server process
>> 2013-08-17 18:41:02 UTC [2716]: [3] DETAIL: The postmaster has commanded
>> this server process to roll back the current transaction and exit, because
>> another server process exited abnormally and possibly corrupted shared
>> memory.
>> 2013-08-17 18:41:02 UTC [2716]: [4] HINT: In a moment you should be able
>> to reconnect to the database and repeat your command.
>>
>> I've definitely isolated it down to using array_agg, as when I changed
>> the query to use string_agg, it worked fine. I also tried using array_agg
>> on a few different queries, all yielding the same issue. Swapping in
>> string_agg fixed the issue once more.
>>
>> This particular table has over 2 million rows and is 1.2 Gigs, and when I
>> ran the query while viewing htop, the virtual size of the Postgres
>> process ballooned to 13.9G until crashing.
>>
>> The version of Postgres I am using is: PostgreSQL 9.2.4 on
>> x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.7.2-2ubuntu1)
>> 4.7.2, 64-bit
>>
>> Any help would be much appreciated, thanks!
>>
>> -Robert
>>
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Pavel Stehule 2013-08-19 15:12:15 Re: Memory Issue with array_agg?
Previous Message Heikki Linnakangas 2013-08-19 11:48:21 Re: [JDBC] Incorrect response code after XA recovery