Josh Berkus <josh(at)agliodbs(dot)com> writes:
> Greg, Tom,
> > But for most users analyze doesn't really have to run as often as
> > vacuum. One sequential scan per night doesn't seem like that big a deal
> > to me.
> Clearly you don't have any 0.5 TB databases.
Actually I did not so long ago.
Sequential scans in an OLTP query would be a disaster. But a single sequential
scan run at a controlled time wouldn't concern me as long as *all* of the
following constraints are met:
a) You can run them at your leisure at off-peak times when your i/o bandwidth
isn't in short supply.
b) You don't need the results urgently, so you don't care if it takes a while.
c) You don't need many of them at the same time.
Even on your production system, surely you occasionally, say, take a full
backup or run "select count(*)" or other sanity checks on the data?
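As a sketch of constraint (a), a nightly table scan can simply be scheduled in an off-peak window; this hypothetical crontab entry (database name and log path are placeholders) runs a database-wide ANALYZE at 03:00:

```shell
# Hypothetical crontab entry: run ANALYZE at 03:00, when i/o bandwidth
# isn't in short supply. "mydb" and the log path are placeholders.
0 3 * * *  psql -d mydb -c 'ANALYZE;' >> /var/log/nightly_analyze.log 2>&1
```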
> > > I'd still be worried about the CPU pain though. ANALYZE can afford to
> > > expend a pretty fair number of cycles per sampled tuple, but with a
> > > whole-table sample that's going to add up.
> Agreed. Despite conventional wisdom, most PostgreSQL databases ... even
> those with high level OLTP or very large DW ... are CPU-bound. We
> really don't want an ANALYZE which is an order-of-magnitude increase in
> CPU activity.
I don't think Tom was actually expressing concern about ANALYZE becoming more
expensive, but about tying ANALYZE and VACUUM together and making VACUUM more
expensive. VACUUM is something we want to encourage people to think they can
run all day long, not just occasionally.
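The "run all day long" mode relies on cost-based vacuum delay, which throttles VACUUM's i/o rather than letting it run flat out; a minimal postgresql.conf sketch (values are illustrative, not recommendations):

```
# Cost-based vacuum delay: VACUUM sleeps after accumulating
# vacuum_cost_limit worth of page costs, spreading its i/o out
# so it can coexist with normal traffic.
vacuum_cost_delay = 10        # milliseconds to sleep; 0 disables throttling
vacuum_cost_limit = 200       # accumulated cost before each sleep
```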