Re: pg_stats.avg_width differs by a factor of 4 on different machines

From: Craig de Stigter <craig(dot)destigter(at)koordinates(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg_stats.avg_width differs by a factor of 4 on different machines
Date: 2009-06-01 23:57:24
Message-ID: e260aad40906011657m9251f99g5cefbcd37c919464@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

This query was giving us good-enough results on our old system. The
estimates don't have to be absolutely accurate, just ballpark figures.
Also we are estimating the size of zipped shapefiles, not textual
geometries.

Our tests show that such sizes are quite accurate for medium/large
datasets when we multiply the avg_width by a precalculated constant
factor.

Does anyone have any idea why these numbers would be 4 times as big in
Postgres 8.3.7 ?

If not I may end up with a dirty hack along the lines of:
if POSTGRES_VERSION >= '8.3':
size_estimate /= 4.0

;)

On 5/29/09, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Craig de Stigter <craig(dot)destigter(at)koordinates(dot)com> writes:
>> We are using the PostgreSQL pg_stats view to estimate file sizes for some
>> geodata exports. However, the following query gives us totally different
>> results on different servers:
>
>> select avg_width from pg_stats where tablename='some_geodata' and attname
>> =
>> 'GEOMETRY';
>
> I'm afraid that query is pretty much completely useless for what you
> want to do. What it should be giving you is the average width of the
> field values on-disk, which is to say after compression and toasting.
> It would probably be all right for narrow columns but it's likely to be
> a huge underestimate of the external textual size for wide field values.
>
> Having said that, though, these numbers make no sense to me:
>
>> PostgreSQL 8.3.7 on i486-pc-linux-gnu, compiled by GCC gcc-4.3.real
>> (Ubuntu
>> 4.3.2-1ubuntu11) 4.3.2
>>> 81803
>
>> PostgreSQL 8.2.9 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2
>> (Ubuntu 4.1.2-0ubuntu4)
>>> 20450
>
> It should be impossible to get a value larger than the block size, or
> even more than about a quarter of the block size because that's where
> TOAST will start doing its thing. Are you running modified source code?
>
> regards, tom lane
>

--
Koordinates Ltd
PO Box 1604, Shortland St, Auckland, New Zealand
Phone +64-9-966 0433 Fax +64-9-969 0045
Web http://www.koordinates.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2009-06-02 00:30:48 Re: pg_stats.avg_width differs by a factor of 4 on different machines
Previous Message Tom Lane 2009-06-01 23:16:32 Re: waiting for ExclusiveLock on extension of relation