Quick Links

Re: Long Running Update

From:	Vitalii Tymchyshyn <tivv00(at)gmail(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Long Running Update
Date:	2011-06-24 11:45:15
Message-ID:	4E0478CB.8080002@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

24.06.11 14:16, Harry Mantheakis написав(ла):
>
> > EXPLAIN the statement
>
> Here is the EXPLAIN result:
>
> ----------------------------------------------------------------------
> QUERY PLAN
> ----------------------------------------------------------------------
> Hash Join (cost=2589312.08..16596998.47 rows=74558048 width=63)
> Hash Cond: (table_A.id = table_B.id)
> -> Seq Scan on table_A(cost=0.00..1941825.05 rows=95612705 width=47)
> -> Hash (cost=1220472.48..1220472.48 rows=74558048 width=20)
> -> Seq Scan on table_B(cost=0.00..1220472.48 rows=74558048 width=20)
> ----------------------------------------------------------------------
>
> The documentation says the 'cost' numbers are 'units of disk page
> fetches'.
>
> Do you, by any chance, have any notion of how many disk page fetches
> can be processed per second in practice - at least a rough idea?
>
> IOW how do I convert - guesstimate! - these numbers into (plausible)
> time values?
No chance. This are "virtual values" for planner only.
If I read correctly, your query should go into two phases: build hash
map on one table, then update second table using the map. Not that this
all valid unless you have any constraints (including foreign checks,
both sides) to check on any field of updated table. If you have, you'd
better drop them.
Anyway, this is two seq. scans. For a long query I am using a tool like
ktrace (freebsd) to get system read/write calls backend is doing. Then
with catalog tables you can map file names to relations
(tables/indexes). Then you can see which stage you are on and how fast
is it doing.
Note that partially cached tables are awful (in FreeBSD, dunno for
linux) for such a query - I suppose this is because instead on
sequential read, you get a lot of random reads that fools prefetch
logic. "dd if=table_file of=/dev/null bs=8m" helps me a lot. You can see
it it helps if CPU time goes up.

Best regards, Vitalii Tymchyshyn

In response to

Re: Long Running Update at 2011-06-24 11:16:32 from Harry Mantheakis

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Harry Mantheakis	2011-06-24 12:39:16	Re: Long Running Update
Previous Message	Claudio Freire	2011-06-24 11:39:54	Re: Long Running Update