| From: | Kenneth Marshall <ktm(at)rice(dot)edu> | 
|---|---|
| To: | david(at)lang(dot)hm | 
| Cc: | Vitalii Tymchyshyn <tivv00(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>, Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> | 
| Subject: | Re: [HACKERS] Slow count(*) again... | 
| Date: | 2011-02-03 13:41:42 | 
| Message-ID: | 20110203134142.GO24931@aart.is.rice.edu | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers pgsql-performance | 
On Thu, Feb 03, 2011 at 02:11:58AM -0800, david(at)lang(dot)hm wrote:
> On Thu, 3 Feb 2011, Vitalii Tymchyshyn wrote:
>
>> 02.02.11 20:32, Robert Haas ???????(??):
>>> Yeah.  Any kind of bulk load into an empty table can be a problem,
>>> even if it's not temporary.  When you load a bunch of data and then
>>> immediately plan a query against it, autoanalyze hasn't had a chance
>>> to do its thing yet, so sometimes you get a lousy plan.
>>
>> May be introducing something like 'AutoAnalyze' threshold will help? I 
>> mean that any insert/update/delete statement that changes more then x% of 
>> table (and no less then y records) must do analyze right after it was 
>> finished.
>> Defaults like x=50 y=10000 should be quite good as for me.
>
> If I am understanding things correctly, a full Analyze is going over all 
> the data in the table to figure out patterns.
>
> If this is the case, wouldn't it make sense in the situation where you are 
> loading an entire table from scratch to run the Analyze as you are 
> processing the data? If you don't want to slow down the main thread that's 
> inserting the data, you could copy the data to a second thread and do the 
> analysis while it's still in RAM rather than having to read it off of disk 
> afterwords.
>
> this doesn't make sense for updates to existing databases, but the use case 
> of loading a bunch of data and then querying it right away isn't _that_ 
> uncommon.
>
> David Lang
>
+1 for in-flight ANALYZE. This would be great for bulk loads of
real tables as well as temp tables.
Cheers,
Ken
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Kenneth Marshall | 2011-02-03 13:55:58 | Re: LIKE, CHAR(), and trailing spaces | 
| Previous Message | Thom Brown | 2011-02-03 13:32:51 | Re: Issues with generate_series using integer boundaries | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Jon Nelson | 2011-02-03 14:20:01 | Re: [HACKERS] Slow count(*) again... | 
| Previous Message | Laszlo Nagy | 2011-02-03 11:40:12 | Get master-detail relationship metadata |