Re: GSoC - Idea Discussion

From: hitesh ramani <hiteshramani(at)hotmail(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GSoC - Idea Discussion
Date: 2015-03-20 11:26:13
Message-ID: BAY176-W2178F5333B792CA7753536DC0E0@phx.gbl
Lists: pgsql-hackers

Hello devs,
Thank you so much for the feedback. To answer your questions:
Tomas:
> So you've created an array of 1M integers, and it's 7x faster on GPU
> compared to pg_qsort(), correct?
No, I meant general sorting, not pg_qsort().
> Well, it might surprise you, but PostgreSQL almost never sorts numbers
> like this. PostgreSQL sorts tuples, which is way more complicated and,
> considering the variable length of tuples (causing issues with memory
> access), rather unsuitable for GPU devices. I might be missing
> something, of course.
>
> Also, it often needs additional information, like collations when
> sorting by a text field, for example.
I totally agree with you on this point. My current target area is very narrow because this is only the beginning: I'm only considering integer values in one row.
> Why don't you show us the source code? Would be simpler than explaining
> what it does.
You can have a look at the code here: https://github.com/hiteshramani/Postgres-CUDA
This code compiles. You can see the call to the CUDA function in src/port/qsort.c and in the header files qsort_normal.h and qsort_cuda.h. The hello world program is in src/port/qsort_cuda.cu. Compilation happens in two phases, compile and link: I compiled the CUDA file with nvcc, and for linking I edited the makefile of src/timezone/ because the zic build needed the CUDA file linked in.
Suggestions are welcome.
> I'd recommend discussing the code here. It's certainly quite complex,
> especially if this is your first encounter with it.
Yes, I found it a little complex and couldn't find many resources online, so I'm looking for help.
> PostgreSQL uses adaptive sort - in-memory when it fits into work_mem,
> on-disk when it does not. This is decided at runtime.
>
> You'll have to do the same thing, because the amount of memory available
> on GPUs is limited to a few GBs, and it needs to work for datasets
> exceeding that limit (the amount of data is uncertain at planning time).
Yes, I thought of that too. A call could be made with the integer array as input to the GPU, and the GPU would then return the sorted array. I want to proceed step by step, as there are methods to sort amounts of data that exceed the GPU memory.
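A minimal sketch of that call boundary, in plain C so it can be compiled anywhere. It is not the code from the repository: the "device" is only a staging buffer, qsort() stands in for the GPU kernel, and the names sort_ints_offload, device_sort_ints and DEVICE_CAPACITY_INTS are hypothetical. It only illustrates deciding at run time whether the array fits on the device, and falling back to a host sort when it does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device capacity, in ints. A real implementation would
 * query the free device memory at run time instead of hard-coding it. */
#define DEVICE_CAPACITY_INTS 1024

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/* Stand-in for the GPU path: copy in, sort, copy out.
 * Returns false if the input does not fit or the buffer cannot be allocated. */
static bool device_sort_ints(int *keys, size_t n)
{
    int *staging;

    if (n > DEVICE_CAPACITY_INTS)
        return false;

    staging = malloc(n * sizeof(int));
    if (staging == NULL)
        return false;

    memcpy(staging, keys, n * sizeof(int));   /* host -> "device" */
    qsort(staging, n, sizeof(int), cmp_int);  /* kernel stand-in */
    memcpy(keys, staging, n * sizeof(int));   /* "device" -> host */
    free(staging);
    return true;
}

/* Sorts keys in place. Returns true if the device path was used,
 * false if the host fallback was used (or there was nothing to sort). */
bool sort_ints_offload(int *keys, size_t n)
{
    assert(keys != NULL);

    if (n < 2)
        return false;
    if (device_sort_ints(keys, n))
        return true;

    qsort(keys, n, sizeof(int), cmp_int);     /* host fallback */
    return false;
}
```

The caller only sees sort_ints_offload(); whether the device was used is a run-time decision, in the same spirit as the work_mem check described in the quoted text.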
Álvaro Herrera:
I downloaded the zip of the latest custom_join repo I saw 2 days ago. I'll check once again. Thank you. :)
KaiGai Kohei:
> Let me say CUDA is better than OpenCL :-)
> Because of software quality of OpenCL runtime drivers provided by each vendor,
> I've often faced mysterious problems. Only nvidia's runtime are enough reliable
> from my point of view. In addition, when we implement using OpenCL is a feature
> fully depends on hardware characteristics, so we cannot ignore physical hardware
> underlying the abstraction layer.
> So, I'm now reworking the code to move CUDA from OpenCL.
That's great. I'd love to help you with that and contribute to it.
> It seems to me you are a little bit optimistic.
> Unlike CPU code, GPU-Sorting logic has to reference device memory space,
> so all the data to be compared needs to be transferred to GPU devices.
> Any pointer on host address space is not valid on GPU calculation.
> Amount of device memory is usually smaller than host memory, so your code
> needs a capability to combined multiple chunks that is partially sorted...
> Probably, it is not all here.
Aren't there algorithms that help when the device memory is limited and the data is massive? My memory of this is rough, but I took an online course where I believe I saw algorithms for dealing with such problems.
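One family of such algorithms sorts the data in chunks small enough for the device and then merges the sorted chunks on the host, much like an external merge sort. The sketch below is an illustration under assumptions, not the approach of any existing PostgreSQL or GPU code: chunked_sort_ints and merge_runs are hypothetical names, chunk_len stands for however many integers fit in device memory, and qsort() stands in for the per-chunk GPU sort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/* Merge the sorted runs src[start, mid) and src[mid, end) into dst[start, end). */
static void merge_runs(const int *src, size_t start, size_t mid, size_t end, int *dst)
{
    size_t i = start, j = mid, k = start;

    while (i < mid && j < end)
        dst[k++] = (src[j] < src[i]) ? src[j++] : src[i++];
    while (i < mid)
        dst[k++] = src[i++];
    while (j < end)
        dst[k++] = src[j++];
}

/* Sort keys in place using chunks of at most chunk_len ints.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int chunked_sort_ints(int *keys, size_t n, size_t chunk_len)
{
    int *scratch, *src, *dst, *tmp;
    size_t start, width;

    assert(keys != NULL && chunk_len > 0);
    if (n < 2)
        return 0;

    /* Phase 1: sort each chunk on its own (one "device" call per chunk). */
    for (start = 0; start < n; start += chunk_len)
    {
        size_t len = (n - start < chunk_len) ? n - start : chunk_len;

        qsort(keys + start, len, sizeof(int), cmp_int);
    }
    if (chunk_len >= n)
        return 0;               /* everything fit in a single chunk */

    /* Phase 2: merge neighbouring runs on the host, doubling the run width. */
    scratch = malloc(n * sizeof(int));
    if (scratch == NULL)
        return -1;

    src = keys;
    dst = scratch;
    for (width = chunk_len; width < n; width *= 2)
    {
        for (start = 0; start < n; start += 2 * width)
        {
            size_t mid = (start + width < n) ? start + width : n;
            size_t end = (start + 2 * width < n) ? start + 2 * width : n;

            merge_runs(src, start, mid, end, dst);
        }
        tmp = src; src = dst; dst = tmp;   /* the merged data is now in src */
    }

    if (src != keys)
        memcpy(keys, src, n * sizeof(int));
    free(scratch);
    return 0;
}
```

Only one chunk has to be resident on the device at a time, so the device memory bounds the chunk size rather than the dataset size. The merge here runs on the CPU; whether it could also be offloaded is left open.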
Thanks and Regards,
Hitesh Ramani