Re: Curious about dead rows.

From: Jean-David Beyer <jeandavid8(at)verizon(dot)net>
To: pgsql performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Curious about dead rows.
Date: 2007-11-14 02:26:08
Message-ID: 473A5CC0.3060003@verizon.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Merlin Moncure wrote:
> On Nov 10, 2007 1:38 PM, Jean-David Beyer <jeandavid8(at)verizon(dot)net> wrote:
>> Tom Lane wrote:
>>> Jean-David Beyer <jeandavid8(at)verizon(dot)net> writes:
>>>> I am doing lots of INSERTs on a table that starts out empty (I did a
>>>> TRUNCATE on it). I am not, AFAIK, doing DELETEs or UPDATEs. Autovacuum is
>>>> on. I moved logging up to debug2 level to see what was going on, and I get
>>>> things like this:
>>>> "vl_as": scanned 3000 of 5296 pages, containing 232944 live rows and 1033
>>>> dead rows; 3000 rows in sample, 411224 estimated total rows
>>>> A little later, it says:
>>>> "vl_as": scanned 3000 of 6916 pages, containing 233507 live rows and 493
>>>> dead rows; 3000 rows in sample, 538311 estimated total rows
>>> Well, *something* is doing deletes or updates in that table. Better
>>> look a bit harder at your application ...
>>>
>> OK, you agree that if I am doing only INSERTs, that there should not be any
>> dead rows. Therefore, I _must_ be doing deletes or updates.
>>
>> But the program is pretty simple, and I see no UPDATEs or DELETEs. I
>> searched all the program source files (that contain none of them) and all
>> the libraries I have written, and they have none either. Right now the
>> programs are not to the state where UPDATEs or DELETEs are required (though
>> they will be later). I am still developing them and it is easier to just
>> restore from backup or start over from the beginning since most of the
>> changes are data laundering from an ever-increasing number of spreadsheets.
>>
>> Am I right that TRUNCATE deletes all the rows of a table. They may then be
>> still there, but would not autovacuum clean out the dead rows? Or maybe it
>> has not gotten to them yet? I could do an explicit one earlier.
>
> what does pg_stat_all_tables say (assuming row level stats are on)?

# - Query/Index Statistics Collector -

#stats_start_collector = on
stats_start_collector = on

#stats_command_string = off
#stats_block_level = off

#stats_row_level = off
stats_row_level = on

#stats_reset_on_server_start = off

>
It says stuff like this:

relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins |
n_tup_upd | n_tup_del
----------+----------+--------------+----------+---------------+-----------+-
ibd | 75 | 9503850 | 11 | 2350555 | 2416845 |
0 | 0
vl_cf | 139 | 38722575 | 22 | 5392609 | 5692814 |
0 | 0
vl_li | 139 | 39992838 | 22 | 5569855 | 5885516 |
0 | 0

I removed the relid and schemaname and squeezed the other columns so it
would not be quite so wide. Is this what you might like to know?

--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key: 9A2FC99A Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 21:10:01 up 21 days, 14:28, 3 users, load average: 6.20, 5.69, 5.11

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Merlin Moncure 2007-11-14 03:20:41 Re: dell versus hp
Previous Message Merlin Moncure 2007-11-14 00:28:24 Re: Curious about dead rows.