Skip site navigation (1) Skip section navigation (2)

Re: Performance of a large array access by position (tested version 9.1.3)

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>
Cc: Marc Mamin <M(dot)Mamin(at)intershop(dot)de>, Jesper Krogh <jesper(at)krogh(dot)cc>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance of a large array access by position (tested version 9.1.3)
Date: 2012-06-26 09:03:14
Message-ID: CAFj8pRA03UgeHmpau+Ze-V8dmpSxhJbtk87Va3_9pbHT2UeHTA@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
2012/6/26 Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>:
>
>
> On Tue, Jun 26, 2012 at 6:04 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
>>
>> 2012/6/26 Marc Mamin <M(dot)Mamin(at)intershop(dot)de>:
>> >
>> >>> On 22/06/12 09:02, Maxim Boguk wrote:
>> >
>> >>> May be I completely wrong but I always assumed that the access speed
>> >>> to the array element in PostgreSQL should be close to constant time.
>> >>> But in tests I found that access speed degrade as O(N) of array size.
>> >
>> >>> Is that behaviour is correct?
>>
>> yes - access to n position means in postgresql - skip n-1 elements
>>
>
> Hi,
>
> I understand what you mean, but in my test for all values of N test
> performed access only to first 10000 elements of the array independent of
> the array size....
> So i still can't see why access time degrade soo fast for N>10000...:
>
> WITH
> --GENERATE single entry table with single ARRAY field of size N
> t AS (SELECT ARRAY(SELECT * FROM generate_series(1,N)) AS _array)
> --iterate over first 10000 elements of that ARRAY
> SELECT count((SELECT _array[i] FROM t)) FROM generate_series(1,10000) as
> g(i);
>
> ... if access time depend only on position then after 10k there should not
> be any serious degradation, in fact a perfromance degradation is almost
> linear.
> 10k        34ms
> 50k       177ms
> 100k      321ms
> 500k     4100ms
> 1M       8100ms
> 2M      22000ms
> 5M      61000ms
> 10M    220000ms = 22ms to sinlge array element access.

in this use case TOAST/DETOAST is not used, but probably there are
problem with array copy. Taking n element of array means calling some
function, and postgresql uses only passing parameters by value - so
copy of large array needs higher time.

this issue is solved partially in 9.2, where you can use FOR EACH
cycle http://www.postgresql.org/docs/9.1/interactive/plpgsql-control-structures.html#PLPGSQL-FOREACH-ARRAY

Regards

Pavel

>
> And I think no toasting/detoasting happen in my test case.
>
> Kind Regards,
> Maksym

In response to

pgsql-performance by date

Next:From: Merlin MoncureDate: 2012-06-26 13:36:54
Subject: Re: Can I do better than this heapscan and sort?
Previous:From: Pavel StehuleDate: 2012-06-26 08:22:41
Subject: Re: Performance of a large array access by position (tested version 9.1.3)

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group