Re: Distributing data over "spindles" even on AWS EBS, (followup to the work queue saga)

From: Sam Gendler <sgendler(at)ideasculptor(dot)com>
To: Gunther <raj(at)gusw(dot)net>
Cc: Jeremy Schneider <schnjere(at)amazon(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Distributing data over "spindles" even on AWS EBS, (followup to the work queue saga)
Date: 2019-03-19 15:18:10
Message-ID: CAEV0TzALNhT0UFo7BRz_XFCCd9wQJh4twW6gRpR=mrN=-Mo3oQ@mail.gmail.com
Lists: pgsql-performance

You do have a finite amount of bandwidth per instance. On c5.xlarge, it is
3500 Mbit/sec, no matter how many IOPS you buy. Keep an eye on your overall
EBS bandwidth utilization.
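A back-of-envelope sketch of why this matters, assuming the 3500 Mbit/s per-instance figure above and gp2's 16 KiB I/O accounting size: past roughly 26,700 IOPS of 16 KiB I/O, the instance bandwidth cap (not provisioned IOPS) becomes the bound.

```python
# When does instance EBS bandwidth, rather than IOPS, become the bound?
# Assumes the 3500 Mbit/s c5.xlarge figure quoted above.
MBIT = 1_000_000
bandwidth_bytes = 3500 * MBIT // 8   # 437,500,000 bytes/s
io_size = 16 * 1024                  # gp2 counts I/O in up-to-16 KiB units
max_iops = bandwidth_bytes // io_size
print(max_iops)                      # 26702 -- beyond this, bandwidth caps you
```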

On Sun, Mar 17, 2019 at 11:42 AM Gunther <raj(at)gusw(dot)net> wrote:

> On 3/14/2019 11:11, Jeremy Schneider wrote:
> > On 3/14/19 07:53, Gunther wrote:
> >> 2. build a low-level "spreading" scheme which is to take the partial
> >> files 4653828 and 4653828.1, .2, _fsm, etc. and move each to another
> >> device and then symlink it back to that directory (I come back to this!)
> > ...
> >> To 2. I find that it would be a nice feature of PostgreSQL if we could
> >> just use symlinks and a symlink rule, for example, when PostgreSQL finds
> >> that 4653828 is in fact a symlink to /otherdisk/PG/16284/4653828, then
> >> it would
> >>
> >> * by default also create 4653828.1 as a symlink and place the actual
> >> data file on /otherdisk/PG/16284/4653828.1
> > How about if we could just specify multiple tablespaces for an object,
> > and then PostgreSQL would round-robin new segments across the presently
> > configured tablespaces? This seems like a simple and elegant solution
> > to me.
>
> Very good idea! I agree.
>
> Very important also would be to revive the existing patch someone
> contributed to allow TOAST tables to be assigned to different tablespaces.
>
> >> 4. maybe I can configure in AWS EBS to reserve more IOPS -- but why
> >> would I pay for more IOPS if my cost is by volume size? I can just
> >> make another volume? or does AWS play a similar trick on us with
> >> IOPS being limited on some "credit" system???
> > Not credits, but if you're using gp2 volumes then pay close attention to
> > how burst balance works. A single large volume is the same price as two
> > striped volumes at half size -- but the striped volumes will have double
> > the burst speed and take twice as long to refill the burst balance.
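That burst arithmetic can be checked with the published gp2 figures (a 5.4M-credit bucket per volume, a 3000 IOPS burst ceiling, and a baseline of 3 IOPS/GiB with a 100 IOPS floor):

```python
# gp2 burst arithmetic: one 100 GiB volume vs two striped 50 GiB volumes.
BUCKET, BURST = 5_400_000, 3000

def baseline(gib):
    return max(100, 3 * gib)    # 3 IOPS/GiB, 100 IOPS floor

single_burst = BURST                    # 3000 IOPS
striped_burst = 2 * BURST               # 6000 IOPS -- double the burst speed
single_refill = BUCKET / baseline(100)  # 18000 s to refill an empty bucket
striped_refill = BUCKET / baseline(50)  # 36000 s per volume -- twice as long
print(striped_burst, striped_refill)
```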
>
> Yes, I learned that too. It seems a very interesting "bug" in the Amazon
> gp2 IOPS allocation scheme. They say it's 3 IOPS per GiB, so with 100 GiB
> I get 300 IOPS. But there is also a minimum of 100 IOPS per volume. That
> means with 10 volumes of 10 GiB each I get at least 1000 IOPS between
> them all, while a single 100 GiB volume gets only 300 IOPS.
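The per-volume minimum Gunther describes works out like this (ignoring burst):

```python
# gp2 baseline: 3 IOPS per GiB, but never less than 100 IOPS per volume,
# so many small volumes beat one large volume of the same total size.
def gp2_baseline_iops(size_gib):
    return max(100, 3 * size_gib)

one_big = gp2_baseline_iops(100)                           # 300 IOPS
ten_small = sum(gp2_baseline_iops(10) for _ in range(10))  # 1000 IOPS
print(one_big, ten_small)
```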
>
> I wonder if Amazon is aware of this. I hope they are and think that's
> just fine. Because I like it.
>
> It is also a clear sign to me that I want file system block sizes larger
> than 4k. I tried using an 8k block size for XFS on Amazon Linux, but I
> cannot mount such a filesystem, since Linux currently only supports block
> sizes up to the 4k page size. This is another reason I am considering
> switching the database server(s) to FreeBSD. OTOH, who knows, maybe the
> 4k limit comes from the AWS EBS infrastructure itself. After all, if I am
> already scraping the 300 or 1000 IOPS limit and can double my block size
> per IO, I double my IO throughput.
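The Linux constraint here is the kernel page size: mkfs.xfs will happily format with `-b size=8192`, but mount refuses any XFS block size larger than the page size. A quick way to check what your kernel allows:

```python
# Linux can only mount XFS when the filesystem block size <= the kernel
# page size; report that limit (typically 4096 on x86_64 EC2 instances).
import os

page_size = os.sysconf("SC_PAGE_SIZE")
print(page_size)  # largest mountable XFS block size on this kernel
```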
>
> regards,
> -Gunther
>
>
>
