Re: Best way to construct PostgreSQL ArrayType (_int4) from C int array

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Adrian Schreyer <ams214(at)cam(dot)ac(dot)uk>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Best way to construct PostgreSQL ArrayType (_int4) from C int array
Date: 2011-04-28 04:34:20
Message-ID: BANLkTiki7G-O2DcGGGJno-S=q+HE+hD00A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

2011/4/27 Adrian Schreyer <ams214(at)cam(dot)ac(dot)uk>:
> The largest arrays I expect at the moment are more or less sparse
> vectors of around 4.8k elements and I have noticed that the
> input/output (C/C++ extension) does not scale well with the number of
> elements in the array.
>
> Using a function that sums all elements in the array, this is the time
> it takes for ~150k arrays of various sizes (including ordering desc
> and limit 10):
>

PostgreSQL doesn't use a index for access to array fields. So access
to fields of packed arrays can be slower for higher subscripts.

Regards

Pavel Stehule

> 128: 61ms
> 256: 80ms
> 512: 681ms
> 1024 1065ms
> 2048 7682ms
> 4096 21332ms
>
> That's why I thought that the construction of the PostgreSQL array was
> not optimal.
>
> On Wed, Apr 27, 2011 at 14:49, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> On Wed, Apr 27, 2011 at 6:02 AM, Adrian Schreyer <ams214(at)cam(dot)ac(dot)uk> wrote:
>>> At the moment I am using the following code to construct a PostgreSQL
>>> array from a C array in my C extension but I am not so sure if this is
>>> really the best solution:
>>>
>>> const int *data = array.data(); // C array
>>> Datum *d = (Datum *) palloc(sizeof(Datum) * size);
>>>
>>> for (int i = 0; i < size; i++) d[i] = Int32GetDatum(data[i]);
>>>
>>> ArrayType *a = construct_array(d, size, INT4OID, sizeof(int4), true, 'i');
>>>
>>> Is this okay or is there a better solution (existing function in the
>>> PostgreSQL source for example)?
>>
>> That's pretty much the best way AFAIK. Int32GetDatum doesn't do
>> anything fancy -- it's just a 32 bit mask/assignment.  constructing
>> the array at once is going to be a lot better than incrementally
>> creating it.  Do you expect the arrays to be large, say bigger than
>> 10k elements?
>>
>> merlin
>>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Palle Girgensohn 2011-04-28 05:25:41 Re: [ANNOUNCE] PostgreSQL Core Team
Previous Message Phoenix Kiula 2011-04-28 04:12:34 Re: Partitioning an existing table