Skip site navigation (1) Skip section navigation (2)

Re: SELECT INTO large FKyed table is slow

From: Mario Splivalo <mario(dot)splivalo(at)megafon(dot)hr>
To: Pierre C <lists(at)peufeu(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: SELECT INTO large FKyed table is slow
Date: 2010-11-30 23:22:33
Message-ID: 4CF58739.7050904@megafon.hr (view raw or flat)
Thread:
Lists: pgsql-performance
On 11/29/2010 05:47 PM, Pierre C wrote:
>> realm_51=# vacuum analyze verbose drones;
>> INFO: vacuuming "public.drones"
>> INFO: scanned index "drones_pk" to remove 242235 row versions
>> DETAIL: CPU 0.02s/0.11u sec elapsed 0.28 sec.
>> INFO: "drones": removed 242235 row versions in 1952 pages
>> DETAIL: CPU 0.01s/0.02u sec elapsed 0.03 sec.
>> INFO: index "drones_pk" now contains 174068 row versions in 721 pages
>> DETAIL: 107716 index row versions were removed.
>> 0 index pages have been deleted, 0 are currently reusable.
>
> As you can see your index contains 174068 active rows and 242235 dead
> rows that probably should have been removed a long time ago by
> autovacuum, but you seem to have it turned off. It does not take a long
> time to vacuum this table (only 0.3 sec) so it is not a high cost, you
> should enable autovacuum and let it do the job (note that this doesn't
> stop you from manual vacuuming after big updates).

Yes, you're right. I was doing some testing and I neglected to enable 
vacuuming after inserts. But what this shows is that table drones is 
having dead rows, and that table does get updated a lot. However, I 
don't have any performance problems here. The UPDATE takes no more than 
10 seconds even if I update 50k (out of 150k) rows.

I disabled autovacuum because I got a lot of "WARNING:  pgstat wait 
timeout" and I could see the autovacuum job (pg_stat_activity) running 
during the run of the plpgsql function that handles inserts.

I left the autovacuum off but I do VACUUM after each CSV insert.

> good
>
>> 0 index pages have been deleted, 0 are currently reusable.
>> CPU 0.38s/0.12u sec elapsed 16.56 sec.
>> INFO: "drones_history": found 0 removable, 16903164 nonremovable row
>> versions in 129866 out of 195180 pages
>> DETAIL: 0 dead row versions cannot be removed yet.
>> There were 0 unused item pointers.
>> 0 pages are entirely empty.
>> CPU 2.00s/1.42u sec elapsed 49.24 sec.
>
> good
>
>> INFO: vacuuming "pg_toast.pg_toast_2695510"
>> INFO: index "pg_toast_2695510_index" now contains 0 row versions in 1
>> pages
>> DETAIL: 0 index row versions were removed.
>> 0 index pages have been deleted, 0 are currently reusable.
>> CPU 0.00s/0.00u sec elapsed 0.00 sec.
>> INFO: "pg_toast_2695510": found 0 removable, 0 nonremovable row
>> versions in 0 out of 0 pages
>> DETAIL: 0 dead row versions cannot be removed yet.
>> There were 0 unused item pointers.
>> 0 pages are entirely empty.
>> CPU 0.00s/0.00u sec elapsed 0.00 sec.
>
> same as above, no toast

Yes. Just to make things clear, I never update/delete drones_history. I 
just INSERT, and every now and then I'll be doing SELECTs.

>
>
>> realm_51=# select version();
>> version
>> ---------------------------------------------------------------------------------------------
>>
>> PostgreSQL 8.4.2 on i686-pc-linux-gnu, compiled by GCC gcc (Debian
>> 4.3.2-1.1) 4.3.2, 32-bit
>> (1 row)
>>
>>
>> Mario
>
> ok
>
> Try this :
>
> CLUSTER drones_pkey ON drones;
>
> Then check if your slow query gets a bit faster. If it does, try :
>
> ALTER TABLE drones SET ( fillfactor = 50 );
> ALTER INDEX drones_pkey SET ( fillfactor = 50 );
> CLUSTER drones_pkey ON drones; (again)
>
> This will make the updates on this table less problematic. VACUUM it
> after each mass update.

Is this going to make any difference considering slow insert on 
drones_history? Because INSERTs/UPDATEs on drones tables are fast. The 
only noticable difference is that drones is 150k rows 'large' and 
drones_history has around 25M rows:

realm_51=# select count(*) from drones_history ;
   count
----------
  25550475
(1 row)

	Mario

In response to

pgsql-performance by date

Next:From: T.H.Date: 2010-11-30 23:23:56
Subject: Re: Question about subselect/IN performance
Previous:From: Kevin GrittnerDate: 2010-11-30 22:54:55
Subject: Re: Question about subselect/IN performance

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group