Re: Re: RC2 and open issues

From: <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: RC2 and open issues
Date: 2004-12-21 09:38:01
Message-ID: 28292295$110362168341c7ee33e502b3.21907066@config3.schlund.de
Lists: pgsql-hackers


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote on 21.12.2004, 07:32:52:
> Gavin Sherry writes:
> > I was also thinking of benchmarking the effect of changing the algorithm

"changing the algorithm" is a phrase that sends shivers up my spine. My
own preference is towards some change, but as minimal as possible.

> > in StrategyDirtyBufferList(): currently, for each iteration of the loop we
> > read a buffer from each of T1 and T2. I was wondering what effect reading
> > T1 first then T2 and vice versa would have on performance.
>
> Looking at StrategyGetBuffer, it definitely seems like a good idea to
> try to keep the bottom end of both T1 and T2 lists clean. But we should
> work at T1 a bit harder.
>
> The insight I take away from today's discussion is that there are two
> separate goals here: try to keep backends that acquire a buffer via
> StrategyGetBuffer from being fed a dirty buffer they have to write,
> and try to keep the next upcoming checkpoint from having too much work
> to do. Those are both laudable goals but I hadn't really seen before
> that they may require different strategies to achieve. I'm liking the
> idea that bgwriter should alternate between doing writes in pursuit of
> the one goal and doing writes in pursuit of the other.

Agreed: there are two different goals for buffer list management.

I like the way the current algorithm searches both T1 and T2 in
parallel, since that works no matter how long each list is. Always
cleaning one list in preference to the other would not work well since
ARC fluctuates. At any point in time, cleaning one list will have more
benefit than cleaning the other, but which one is best switches
backwards and forwards as ARC fluctuates.
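
A minimal sketch of the parallel search described above, written in Python
rather than the backend's C. The function name and the (buffer_id, is_dirty)
tuples are illustrative assumptions, not the actual StrategyDirtyBufferList()
code; the point is only that one candidate is taken from each list per
iteration, so the scan keeps working however long either list is.

```python
from itertools import zip_longest


def dirty_buffers_interleaved(t1, t2, max_buffers):
    """Collect up to max_buffers dirty buffer ids.

    t1 and t2 are lists of (buffer_id, is_dirty) ordered LRU end first
    (a hypothetical representation). Each loop iteration looks at one
    entry from T1 and then one from T2.
    """
    found = []
    if max_buffers <= 0:
        return found
    # zip_longest pads the shorter list with None, so the longer list
    # is still scanned after the shorter one runs out.
    for b1, b2 in zip_longest(t1, t2):
        for candidate in (b1, b2):
            if candidate is None:
                continue
            buf_id, is_dirty = candidate
            if is_dirty:
                found.append(buf_id)
                if len(found) >= max_buffers:
                    return found
    return found


t1 = [(1, True), (2, False), (3, True)]
t2 = [(10, False), (11, True)]
print(dirty_buffers_interleaved(t1, t2, 10))  # [1, 11, 3]
```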

Perhaps the best way would be to concentrate on the list that, at this
point in time, most needs to be clean. I *think* that means we should
concentrate on the LRU end of the *longer* list, since that is the
direction in which ARC is trying to move. (I agree that seems
counter-intuitive, but a few pairs of eyes should confirm which way
round it is.)

By observation, DBT2 ends up with T2 >> T1, but that is a result of its
fairly static nature, i.e. DBT2 would benefit from T2 LRU cleaning.
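
As a sketch only, the "clean the longer list" idea reduces to a comparison of
the two list lengths. The function name is hypothetical, the tie-break towards
T1 is an arbitrary assumption, and whether "longer" is even the right way
round is exactly the open question raised above.

```python
def list_to_clean_first(t1_len, t2_len):
    """Return which list's LRU end to concentrate on: the longer one.

    Ties go to T1; that choice is an assumption made for this sketch.
    """
    return "T1" if t1_len >= t2_len else "T2"


# A DBT2-like state where T2 is much longer than T1 (lengths are made up).
print(list_to_clean_first(100, 900))  # T2
```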

ISTM it would be good to have:
1) very frequent but small cleaning actions on the lists, say every 50ms,
to avoid backends having to write a buffer
2) less frequent, deeper cleaning actions, to minimise the effect of
checkpoints, which could be done every 10th cycle, e.g. every 500ms
(numbers would vary according to workload...)
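
The two-tier schedule above can be sketched as follows. This is only an
illustration of the timing, assuming a 50ms cycle with a deep pass on every
10th cycle as suggested; the function and parameter names are hypothetical and
no actual buffer writing is modelled.

```python
def cleaning_plan(cycles, cycle_ms=50, deep_every=10):
    """Return a list of (time_ms, kind) for each bgwriter cycle.

    kind is "light" for the small, frequent cleaning action and "deep"
    for the less frequent one aimed at reducing checkpoint work.
    """
    plan = []
    for n in range(1, cycles + 1):
        kind = "deep" if n % deep_every == 0 else "light"
        plan.append((n * cycle_ms, kind))
    return plan


plan = cleaning_plan(20)
print([t for t, kind in plan if kind == "deep"])  # [500, 1000]
```

With these defaults the deep pass replaces the light pass on every 10th
cycle; running both on that cycle would be an equally plausible reading.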

But, like I said: change, but minimal change, seems best to me for now.

Best Regards, Simon Riggs
