Re: CUDA Sorting

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Gaetano Mendola <mendola(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: CUDA Sorting
Date: 2012-02-12 12:13:33
Message-ID: Pine.LNX.4.64.1202121611440.19065@sn.sai.msu.ru
Lists: pgsql-hackers

I'm wondering if CUDA will win in geometry operations, for example,
testing point <@ complex_polygon
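
For reference, a point <@ polygon check is a containment test applied to one point at a time, so a batch of points can be tested independently of each other, which is the property that would make it a candidate for a GPU. Below is a minimal sketch in C of the even-odd (ray casting) rule, assuming a simple polygon stored as an array of vertices. The Pt type and both function names are hypothetical, and this is not PostgreSQL's implementation of <@.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type for illustration; not PostgreSQL's Point struct. */
typedef struct { double x, y; } Pt;

/*
 * Even-odd (ray casting) test: cast a horizontal ray from p toward +x
 * and count the polygon edges it crosses. An odd count means "inside".
 * Points lying exactly on an edge are not treated specially.
 */
bool point_in_polygon(Pt p, const Pt *poly, size_t n)
{
    bool inside = false;

    if (n < 3)
        return false;           /* not a polygon */

    for (size_t i = 0, j = n - 1; i < n; j = i++) {
        const Pt a = poly[i];
        const Pt b = poly[j];

        /* Does edge (a, b) straddle the horizontal line y = p.y? */
        if ((a.y > p.y) != (b.y > p.y)) {
            /* x coordinate where the edge meets that line */
            double x_cross = a.x + (p.y - a.y) * (b.x - a.x) / (b.y - a.y);

            if (p.x < x_cross)
                inside = !inside;
        }
    }
    return inside;
}

/*
 * Batch form: every out[k] depends only on pts[k] and the polygon, so
 * each iteration could in principle be handled by its own GPU thread.
 */
void points_in_polygon(const Pt *pts, size_t npts,
                       const Pt *poly, size_t n, bool *out)
{
    for (size_t k = 0; k < npts; k++)
        out[k] = point_in_polygon(pts[k], poly, n);
}
```

The cost per point grows linearly with the number of polygon vertices, so whether a GPU helps would depend on how many points are tested against each polygon and on the cost of copying the data to the device.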

Oleg
On Sun, 12 Feb 2012, Gaetano Mendola wrote:

> On 19/09/2011 16:36, Greg Smith wrote:
>> On 09/19/2011 10:12 AM, Greg Stark wrote:
>>> With the GPU I'm curious to see how well
>>> it handles multiple processes contending for resources, it might be a
>>> flashy feature that gets lots of attention but might not really be
>>> very useful in practice. But it would be very interesting to see.
>>
>> The main problem here is that the sort of hardware commonly used for
>> production database servers doesn't have any serious enough GPU to
>> support CUDA/OpenCL available. The very clear trend now is that all
>> systems other than gaming ones ship with motherboard graphics chipsets
>> more than powerful enough for any task but that. I just checked the 5
>> most popular configurations of server I see my customers deploy
>> PostgreSQL onto (a mix of Dell and HP units), and you don't get a
>> serious GPU from any of them.
>>
>> Intel's next generation Ivy Bridge chipset, expected for the spring of
>> 2012, is going to add support for OpenCL to the built-in motherboard
>> GPU. We may eventually see that trickle into the server hardware side of
>> things too.
>
>
> The trend is toward servers that can run CUDA by attaching GPUs through
> external hardware (a PCI Express interface with PCI Express switches); look,
> for example, at the PowerEdge C410x PCIe Expansion Chassis from Dell.
>
> I did some experiments timing the sort done with CUDA and the sort done with
> pg_qsort:
>
> Input size            CUDA        pg_qsort
> 33 million integers   ~ 900 ms    ~ 6000 ms
> 1 million integers    ~ 21 ms     ~ 162 ms
> 100k integers         ~ 2 ms      ~ 13 ms
>
> The CUDA times already include the copy operations (host->device, device->host).
>
> The GPU was a C2050, and the CPU doing the pg_qsort was an Intel(R)
> Xeon(R) CPU X5650 @ 2.67GHz.
>
> Copy operations and kernel runs (the sort for instance) can run in parallel,
> so while you are sorting a batch of data, you can copy the next batch in
> parallel.
>
> As you can see the boost is not negligible.
>
> The next Nvidia hardware (the Kepler family) is PCI Express 3 ready, so
> expect the bottleneck of the host->device and device->host copies to have
> less impact in the near future.
>
> I strongly believe there is room to give a modern database engine
> a way to offload sorts to the GPU.
>
>> I've never seen a PostgreSQL server capable of running CUDA, and I
>> don't expect that to change.
>
> That sounds like:
>
> "I think there is a world market for maybe five computers."
> - IBM Chairman Thomas Watson, 1943
>
> Regards
> Gaetano Mendola
>
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
