Snowflake seems like a unique product and I can only imagine the complex math th...

ruang · on Oct 1, 2020

What kind of math is involved in building a faster database? Genuinely curious. I would guess maybe linear algebra, indirectly.

acidbaseextract · on Oct 1, 2020

Not at all. I'd highly recommend CMU's 15-445/645 Intro to Database Systems course (sponsored by Snowflake lol) because they put all their lectures online on YouTube [1]! Here's what's involved in making fast databases from the syllabus [2]:

This course is on the design and implementation of database management systems. Topics include data models (relational, document, key/value), storage models (n-ary, decomposition), query languages (SQL, stored procedures), storage architectures (heaps, log-structured), indexing (order preserving trees, hash tables), transaction processing (ACID, concurrency control), recovery (logging, checkpoints), query processing (joins, sorting, aggregation, optimization), and parallel architectures (multi-core, distributed). Case studies on open-source and commercial database systems are used to illustrate these techniques and trade-offs. The course is appropriate for students that are prepared to flex their strong systems programming skills.

[1] https://www.youtube.com/playlist?list=PLSE8ODhjZXjbohkNBWQs_...

[2] https://15445.courses.cs.cmu.edu/fall2020/syllabus.html

ponker · on Oct 1, 2020

Oof... CMU courses directly sponsored by Snowflake. Gross.

red_admiral · on Oct 1, 2020

Please elaborate? I can see a lot of ways a sponsored course could go badly, but I can't immediately see which ones apply here.

ponker · on Oct 1, 2020

I'm not qualified to evaluate this particular course. But any time there is a corporate sponsor of a course, it provides strong incentives to the professor to not harm that sponsor at a minimum. If there's a methodology that the professor would like to teach, but that sidesteps, or calls into question, the sponsor's main offering, then that content is in jeopardy. The corruption will always take root given enough time, so that's why editorial and advertising, or academic content and corporate sponsors, etc. should always be at arm's length. Snowflake should give money to CMU to fund "database-related research and teaching" and the university should decide what to do with it. There's still a possibility of improper influence, but it's harder to achieve. This is particularly bad because it's CMU and not University of Phoenix... CMU is in the highest echelon of computer science universities, so it's sad to see it so debased.

What if Kodak sponsored an imaging class in 1990... what do you think they would have said about film vs. digital photography?

Rochetshipz · on Oct 1, 2020

A lot of ML classes at CMU (and probably other prestigious campuses) are sponsored by AWS or GCP through cloud credit donation, including the popular Cloud Computing class. Is that any different ?

ponker · on Oct 1, 2020

Not really. Cloud computing has a lot of benefits, but a lot of risks and drawbacks. Who is sponsoring a class to teach about those? About keeping users’ data private by building your own infrastructure? CMU is actively tilting their students, who are the top CS students in the world, towards cloud computing, based on the choices of these sponsors.

mensetmanusman · on Oct 1, 2020

Sounds kind of conspiratorial.

I think any increase in educational content is good, even if ‘bad actors’ are funding it.

ponker · on Oct 1, 2020

Bad actors funding it always leads to bad actors writing it. Then it's hard to argue that an increase in its quantity is good.

javajosh · on Sept 30, 2020

>I can only imagine the complex math they're doing under the hood to achieve these incredible query times

Maybe its cynical/paranoid, but in this age of Theranos I must ask: is it possible their algorithm excels at showing you a reasonable looking number, rather than an accurate one?

dumbfounder · on Sept 30, 2020

It's SQL, if they were giving wrong answers people would notice.

kgbdrop1 · on Oct 1, 2020

It's not too terribly difficult to load test Snowflake to get a sense of scaling. Jmeter does the job well. Heck I can pass you along some sample projects I've done against them if you really wanted.