Skip site navigation (1) Skip section navigation (2)

Re: Index Size

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Nick Raj <nickrajjain(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Index Size
Date: 2011-05-31 06:22:38
Message-ID: 4DE4892E.5070304@postnewspapers.com.au (view raw or flat)
Thread:
Lists: pgsql-general
On 30/05/11 20:53, Nick Raj wrote:
> Hi,
> 
> Cube code provided by postgres contrib folder. It uses the NDBOX structure.
> On creating index, it's size increase at a high rate.

Here's what I get on 8.4 with a cleaned up test case. It uses the
original poster's data ( output_vehicle.sql ) and just automates adding
rows in powers of 2 and checking the table size.

(Quoted because it is the only way to stop Thunderbird wrapping text in
plain text editing mode - argh!)

>  nrows | tablesize | totalsize | indexsize | bpr_row | bpr_total | bpr_index 
> -------+-----------+-----------+-----------+---------+-----------+-----------
>      0 |         0 |      8192 |      8192 |         |           |          
>      2 |      8192 |     16384 |      8192 |    4096 |      8192 |      4096
>      4 |      8192 |     16384 |      8192 |    2048 |      4096 |      2048
>      8 |      8192 |     16384 |      8192 |    1024 |      2048 |      1024
>     16 |      8192 |     16384 |      8192 |     512 |      1024 |       512
>     32 |      8192 |     16384 |      8192 |     256 |       512 |       256
>     64 |      8192 |     16384 |      8192 |     128 |       256 |       128
>    128 |     16384 |     65536 |     49152 |     128 |       512 |       384
>    256 |     24576 |    122880 |     98304 |      96 |       480 |       384
>    512 |     49152 |    229376 |    180224 |      96 |       448 |       352
>   1024 |     90112 |    327680 |    237568 |      88 |       320 |       232
>   2048 |    180224 |   1376256 |   1196032 |      88 |       672 |       584
>   4096 |    352256 |   2228224 |   1875968 |      86 |       544 |       458
>   8192 |    696320 |   3751936 |   3055616 |      85 |       458 |       373
>  16384 |   1384448 |  13254656 |  11870208 |      84 |       809 |       724
>  19875 |   1679360 |  15466496 |  13787136 |      84 |       778 |       693
> (16 rows)

As expected, the bytes cost per table row sans index (bpr_row) tends
down slowly toward a stable value. The index size per row (bpr_index) is
all over the place, but seems consistently pretty huge.

At 19875 rows the index is 90% of the total size, or over 8 times the
size of the table its self.

While I realize that a general-purpose index for multi-dimensional
structures like cubes may not be hugely efficient, is this kind of huge
storage consumption expected? The memory hit will be so bad that it's
likely to be faster to use full table scans.

Run the attached test as:

   psql -f test.sql -q regress

where "regress" is the name of the database to do the work in. The
original poster's "output_vehicle.sql" must be in the same directory.

--
Craig Ringer

Attachment: test.sql
Description: text/x-sql (1.9 KB)

In response to

pgsql-general by date

Next:From: AsiaDate: 2011-05-31 07:40:54
Subject: Universal certificate for verify-full ssl connection
Previous:From: Pavel StehuleDate: 2011-05-31 04:09:18
Subject: Re: trigger - dynamic WHERE clause

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group