
> MapD uses Vega to drive the rendering engine directly on the result set of a SQL query without ever requiring the data to leave the GPU

How big is the dataset? If it can never leave the GPU, then it's at most a few GB, or N * (a few GB) if several GPUs are at play. With only a few GPUs, a dataset that size would fit into DDR3 RAM on a single mainstream Xeon node, or entirely into MCDRAM on a Xeon Phi node.

Please correct me if I'm wrong.




MapD customers typically run our product on multiple servers with multiple GPUs per node. So 4 servers with 8 Nvidia P40s each have 4 x 192GB = 768GB of VRAM. Note that MapD compresses data and also keeps data in CPU RAM as needed. Even two servers with these GPUs, or 4 servers with gamer GPUs, are enough to query and visualize an 11B-record shipping dataset without a hitch (https://www.mapd.com/demos/ships); that demo runs on four servers with 8 Nvidia 1080 Tis each.
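
For anyone sanity-checking the numbers, here's a quick back-of-the-envelope sketch. The per-card VRAM figures are the published specs (Tesla P40 = 24 GB, GTX 1080 Ti = 11 GB); the compression factor is just an illustrative placeholder, since real ratios depend on the data:

```cpp
// Back-of-the-envelope cluster VRAM math. Per-card VRAM figures are the
// published specs (Tesla P40 = 24 GB, GTX 1080 Ti = 11 GB); the compression
// factor is only a placeholder -- real ratios depend on the columns and data.
#include <cstdio>

int main() {
    const int servers = 4;
    const int gpus_per_server = 8;
    const double p40_gb = 24.0;
    const double gtx1080ti_gb = 11.0;
    const double compression = 3.0;  // illustrative assumption only

    const double p40_total = servers * gpus_per_server * p40_gb;          // 768 GB
    const double gamer_total = servers * gpus_per_server * gtx1080ti_gb;  // 352 GB

    printf("P40 cluster:     %4.0f GB raw VRAM, ~%4.0f GB with compression\n",
           p40_total, p40_total * compression);
    printf("1080 Ti cluster: %4.0f GB raw VRAM, ~%4.0f GB with compression\n",
           gamer_total, gamer_total * compression);
    return 0;
}
```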

Other customers with smaller datasets (i.e. less than a few hundred million records) are able to run with a single GPU.

We're not going after petabyte-size datasets (where <100ms querying is rarely important), so the ability to scale has rarely been an issue.


I would love to see some comparisons to other MPP in-memory databases, as, and I mean this with all due respect, it's difficult to gauge what the impact of the GPUs is. Have you benchmarked against anything like MemSQL? Also, do you mind if I ask where you got the ship data from?


For some benchmarks, take a look here: http://tech.marksblogg.com/benchmarks.html


Note that although we can run on CPU, CPUs do not have the graphics pipeline and the memory bandwidth necessary for interactive server-side visualizations like this.


Cool, thanks for the explanation.


Do you use Pascal's unified memory for over-provisioning? Maxwell only supports unified virtual addressing with the host up to the VRAM limit.
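
For context, "over-provisioning" here means that on Pascal a managed allocation can exceed physical VRAM and get demand-paged between host and device, whereas on Maxwell it had to fit in device memory. A minimal generic CUDA illustration of the idea (not MapD code):

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Each thread increments one element, forcing its page to migrate to the GPU.
__global__ void touch(float* data, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

int main() {
    size_t free_bytes = 0, total_bytes = 0;
    cudaMemGetInfo(&free_bytes, &total_bytes);

    // Deliberately over-subscribe: ask for ~1.5x the card's physical VRAM.
    size_t n = (size_t)(1.5 * (double)total_bytes) / sizeof(float);
    float* data = nullptr;
    if (cudaMallocManaged(&data, n * sizeof(float)) != cudaSuccess) {
        printf("managed allocation failed\n");
        return 1;
    }

    for (size_t i = 0; i < n; ++i) data[i] = 0.0f;  // pages start out host-resident

    const int threads = 256;
    const unsigned blocks = (unsigned)((n + threads - 1) / threads);
    touch<<<blocks, threads>>>(data, n);  // pages fault over to the GPU on demand
    cudaDeviceSynchronize();

    printf("touched %zu floats (~%.1f GB) through a single managed allocation\n",
           n, (double)(n * sizeof(float)) / 1e9);
    cudaFree(data);
    return 0;
}
```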


MapD predates good virtual (unified) memory support on GPUs, so we built our own caching mechanism where each GPU has its own buffer pool and pulls from a CPU buffer pool (i.e., a network of buffer pools). This approach still has the advantage of giving us a lot of control over where we put data and how much we leave for other processes, as well as allowing us to pin data in VRAM on a specific GPU.
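
For readers unfamiliar with the pattern, here's a rough host-side sketch of what such a network of buffer pools can look like: a per-GPU pool serves hits locally, pulls misses from a shared CPU pool, evicts LRU chunks, and lets callers pin hot chunks. Class names and structure are illustrative only, not MapD's actual implementation:

```cpp
#include <cstdint>
#include <list>
#include <unordered_map>
#include <utility>
#include <vector>

using ChunkKey = uint64_t;

struct Chunk {
    std::vector<uint8_t> data;  // stand-in for a real host or device allocation
    bool pinned = false;
};

// Shared host-side pool; in a real system this would be backed by disk.
class CpuBufferPool {
public:
    Chunk* get(ChunkKey key) {
        auto it = cache_.find(key);
        return it == cache_.end() ? nullptr : &it->second;
    }
    Chunk* load(ChunkKey key, std::vector<uint8_t> bytes) {
        return &(cache_[key] = Chunk{std::move(bytes), false});
    }
private:
    std::unordered_map<ChunkKey, Chunk> cache_;
};

// Per-GPU pool that pulls misses from the CPU pool and evicts LRU chunks.
class GpuBufferPool {
public:
    GpuBufferPool(CpuBufferPool& parent, size_t capacity_bytes)
        : parent_(parent), capacity_(capacity_bytes) {}

    // Return a "device-resident" chunk, copying it from the CPU pool on a miss
    // (a real implementation would cudaMemcpy into device memory here).
    Chunk* get(ChunkKey key) {
        if (auto it = cache_.find(key); it != cache_.end()) return &it->second;
        Chunk* host = parent_.get(key);
        if (!host) return nullptr;  // caller must load it into the CPU pool first
        evictUntilFits(host->data.size());
        used_ += host->data.size();
        lru_.push_back(key);
        return &(cache_[key] = Chunk{host->data, false});
    }

    // Keep a hot chunk resident on this GPU regardless of LRU pressure.
    void pin(ChunkKey key) {
        if (auto it = cache_.find(key); it != cache_.end()) it->second.pinned = true;
    }

private:
    void evictUntilFits(size_t needed) {
        for (auto it = lru_.begin(); it != lru_.end() && used_ + needed > capacity_;) {
            auto found = cache_.find(*it);
            if (found != cache_.end() && !found->second.pinned) {
                used_ -= found->second.data.size();
                cache_.erase(found);
                it = lru_.erase(it);
            } else {
                ++it;  // leave pinned chunks alone
            }
        }
    }

    CpuBufferPool& parent_;
    size_t capacity_;
    size_t used_ = 0;
    std::list<ChunkKey> lru_;
    std::unordered_map<ChunkKey, Chunk> cache_;
};
```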


AMD is working[1][2] on a graphics card that includes a 1TB SSD, which appears to your application as 1TB of GPU memory and just swaps to the onboard SSD. I would say that's a pretty good approach to enabling virtual memory on the GPU - you get really fast swapping and you don't tax the PCIe bus and CPU with moving data to and from GPU RAM. No release date AFAIK, sadly.

[1] https://www.amd.com/en-us/press-releases/Pages/amd-radeon-pr...

[2] https://www.youtube.com/watch?v=g-8pMM2wV7k



