ML Code Exercises (deep-ml.com)
221 points by mchab 6 months ago | 64 comments



No one (well, few professionals at least) will reinvent the wheel when it comes to standard scientific computations and methods. Like numerical math, linear algebra, etc.

Looking through the problem sets in the link, the majority seems to be asking for just that.

If you're wondering whether or not someone knows how to transpose a matrix, or find the eigenvalues, let them do that on the whiteboard. No need to leetcode-ify such problems, because with 99.99% probability they'll provide you with solutions that are subpar compared to industry standard packages. There's more than time and space complexity when it comes to these problems.

EDIT: Also, you'll potentially lose a lot of high-quality candidates if you suddenly start to test people on methods they haven't worked with or seen in quite a while.

If you ask something like "please show us the equations for a support vector machine, and how you can compute an SVM" you could fail even world-class ML scientists, if they haven't touched those for 10 years. Which is a very real possibility in the current ML scene.

I'd say that almost every ML interview I've had, or been part of, has been more of a big-picture whiteboard interview. Specific programming questions have ranked quite low on the list of things to prioritize.


I really enjoyed Andrej Karpathy's Zero to Hero videos, and I like the idea that you don't know something until you build it, so I made this site. I probably should have come up with a better title, because it is meant as a learning tool, not as interview prep like Leetcode.


First off, thanks! This does look like a fun way to learn things.

Secondly, FWIW, when I read the term 'exercises' in the HN title I interpreted that to mean exactly a learning tool and not interview prep. The term "Challenges" in the website title is maybe a little less specific.


I can appreciate that. It wasn't until I implemented a few matrix factorization routines that I appreciated the decisions that go into Eigen, etc. It wasn't until I tried it with SIMD that I appreciated the speedups and knew where to look to coax them out.


> No one (well, few professionals at least) will reinvent the wheel when it comes to standard scientific computations and methods. Like numerical math, linear algebra, etc.

> because with 99.99% probability they'll provide you with solutions that are subpar compared to industry standard packages.

Somebody did last week with only a modest amount of effort: https://news.ycombinator.com/item?id=40870345


Most orgs need drivers but they interview like mechanics. If I am a driver, I am expected to drive different vehicles. Sure, I know how to do basic stuff like changing tires or oil, but I am not going to know how to fix the engine or something else under the hood, right?


So it is a leetcode equivalent ;)


lol


These kinds of interview questions come from the minds of software developers, because that's the only thing they know how to do. When faced with some new area of knowledge, their instinct is to try to implement it in Python or some other language and imagine they have "learned" it. It doesn't occur to them that implementing things is not that helpful when it comes to most math topics.


This is quite a nice way to learn about ML, props for this!

Edit: I see a lot of people complaining about interviews, but instead I consider this a good resource for checking you understand fundamental principles.


Exactly, I think putting leetcode in the title triggered a lot of people


The hard part about ML isn't the implementation but the theory. If you're not sure what SVD is, how is this going to help? https://www.deep-ml.com/problem/12


It gives you an impetus to learn and a question to test your understanding. I'd say there's a pretty good track record for this style of teaching.


The learn section should help, but I think I need to spend more time improving it.


I would say the "learn" button is a little unclear. It might be better to just have the whole learning section beneath the question, always visible. That would also help drive home the intent of the page, since so many people think this is some weird interview-prep site.


Who the fuck asks about this useless garbage in an ML job interview. This is such a waste of time and gives you absolutely zero insight into the candidate, how they think, how they’re able to dissect and handle complex issues, their seniority, etc. Whoever expects people to regurgitate this garbage during a job interview is a loser themselves, and will only end up recruiting similar losers to hang out with and get NOTHING done ever. ML job interviews specifically are bottom of the barrel standard.


Thank you for your work! Is there something wrong with: https://www.deep-ml.com/problem/7 ?


Yes, it seems like there is an issue with that question. I will try to fix it as soon as possible. Thank you for the catch!


Looks like a decent problem set to accompany an introductory ML class. No need to get so defensive. However, I thought leetcode meant algorithmic problem solving while the problems here simply ask to implement the various elementary operations.


Yeah, I think I mistitled my post. It is more of a learning tool and less of a Leetcode/interview prep site.


I think as a learning tool this is pretty great! I want to implement the most common ML and stats algos over the next few months to review how they work on a deeper level and your website will help a lot. I like that you explain all terms in your equations.

Personally, I would probably enjoy even more explanations and/or links to good resources, e.g., visualizations, etc. as well as more information in the solutions (e.g., via comments or doc strings). Good job anyway!


Ok, we've changed the title above. I hope that helps!

(Submitted title was "Leetcode but for ML".)


I would, but I do not see the option to change the title.


We already did! I was just letting you know.


Nice, thank you


The issue with leetcode type questions is that formally trained and experienced people often could not answer these questions without specifically practicing for them. Most of the topics on this list could be covered in an introduction course.


If you have to "study" something for interviews every single time because it's absolutely not relevant to your day job - it's probably bullshit.

Everyone copies the FAANG interview process because it looks cool - except that FAANG is just a welfare program for recent graduates, who indulge in peer interview hazing because they are not doing anything else. They don't study for Leetcode because they want to DO something - they study because of the money. But in a real company you have to DO things.

What has Google done in the last decade that is REALLY useful? Google Gmail and Docs can be maintained by probably 50 people, their search has gotten useless and all they do is kill their own products because maintenance toil is a total drag.

Like the dumb brain teasers that Google "pioneered" in the 2000s. How many golf balls can fit in a 747? I don't know, but I can estimate how many can fit up your a...

This Leetcode nonsense will go the way of THAT, in time.

Just no.


It was Microsoft who started with the “golf balls in a plane” style questions.

Google iterated to the standard DSA questions that are common now.

And I don’t think they’re entirely without merit. However, people think you should be testing to find the ceiling. That’s impossible. Not only do you have the issue of whether or not the candidate just got lucky by getting a question they just happen to know, if you are hiring for a more junior position, it’s likely you don’t need them to know it in the first place.

Our goal should be to test the floor, not the ceiling. Find questions that can be answered by anyone with the skill set you desire. Sometimes that floor is: can you write runnable code.

We’ve just completed a hiring cycle where several candidates couldn’t transform a simple circuit diagram into a Boolean statement. One candidate who professed SQL knowledge couldn’t write a simple query. And I mean “how many buckets do you have?” level of simple.

On paper, these candidates seemed good. Several even had GitHub repositories. But, end of the day, I’m going to ask you to do a task. I’m going to need it by a date. I’m going to need that completed without having to comb over it and possibly rewrite chunks of it.

I don’t need the next Linus Torvalds, but so many candidates come with greatly exaggerated resumes and we have to winnow somehow.


They're very busy reinventing the same product over and over, so they can kill it again next month!


Google invented AI


Machine learning? They did not. They iterated on it, and then dropped the ball, losing the race to OpenAI.

My point exactly.


Generative AI came from efforts to improve search via text embeddings.


Nice project! I have a few qualms with the instructions (sometimes misleading or unclear) and the implementation. For instance, some problems fail because 0. is considered different from 0.0.

Using np.testing.assert_allclose in your asserts would solve this I think (https://numpy.org/doc/stable/reference/generated/numpy.testi...).

Happy to contribute / elaborate if you think it'd be useful! :)
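
A minimal sketch of what that could look like, assuming the grader currently compares a submission's output to an expected value by exact or string equality (that is a guess about the site, and the `check_output` helper and its tolerances are hypothetical, not the site's actual code):

```python
import numpy as np


def check_output(actual, expected, rtol=1e-7, atol=1e-8):
    """Hypothetical grader helper: compare values numerically, not by their text form."""
    try:
        np.testing.assert_allclose(
            np.asarray(actual, dtype=float),
            np.asarray(expected, dtype=float),
            rtol=rtol,
            atol=atol,
        )
    except AssertionError:
        # assert_allclose raises AssertionError on value or shape mismatch.
        return False
    return True


# Comparing text is one way "0." and "0.0" can end up looking different.
print(str(np.float64(0.0)) == "0.")      # False
# Numeric comparison with a tolerance does not care about formatting or dtype.
print(check_output([0.0, 1.0], [0, 1]))  # True
print(check_output([0.1 + 0.2], [0.3]))  # True
print(check_output([1.0], [1.1]))        # False
```

The tolerances here are arbitrary; a real grader would probably want them tuned per problem.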


Thank you for the help! I will definitely try this instead of my current method. If you’d like, you could join the Discord https://discord.gg/s4uVTQwk and let me know if you have any other ideas.


I like the Code Kata approach; it lets you learn and practice.

But I dislike siloed websites like Leetcode, where they ask you to bear with their awful web experience. I want to keep my code and notes offline and close at hand in case I need them in a year or 10 years.

An approach with simple test files and exercises is more appealing to me: https://github.com/dabeaz-course/python-mastery

So what is the goal here: to be like Leetcode, or to spread knowledge? If the latter, put the material as plain markdown and .py files in a GitHub repo, and we will say thank you.


Originally I started this as an open source project, and I'm currently thinking of a system similar to what you shared, where I make the problems open source and keep the site closed source. Here was my original project: https://github.com/moe18/DeepMLeet


While this might be helpful for gaining a deeper understanding, adding a time constraint and making it something that can be asked in an interview sounds painful. Please make this a GitHub repo instead, like python_koans.


Typically what happens for ML engineering roles is that you have a regular Leetcode round as for any other SWE position and an additional round with ML questions without coding; there are no ML-specific LC questions. Which is nice as a candidate, because ML-specific LC questions would be yet another thing to prepare for, even if the questions are relevant and being able to solve them is kind of neat.


I've definitely had ML questions involving coding e.g. implement k-means


Created a Discord for anyone who has recommendations or wants to stay up to date on new questions we are working on: https://discord.gg/s4uVTQwk


The first example is a bit confusing.

Example: input: a = [[1,2],[2,4]], b = [1,2] output: [5, 10] reasoning: 1\*1 + 2\*2 = 5; 1\*2 + 2\*4 = 10

Which 1 and 2 correspond to the 1 and 2 from a and b?
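
For what it's worth, here is a plain-Python sketch of the usual row-times-vector reading, which reproduces the example's output. The function name `matvec` is just illustrative and this is not the site's reference solution:

```python
def matvec(a, b):
    """Multiply matrix a (a list of rows) by vector b, one dot product per row."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("each row of a must have the same length as b")
    # Entry i of the result pairs a[i][j] with b[j] and sums over j.
    return [sum(a_ij * b_j for a_ij, b_j in zip(row, b)) for row in a]


a = [[1, 2], [2, 4]]
b = [1, 2]
print(matvec(a, b))  # [5, 10]
```

Because the first row of a happens to equal b, the example alone cannot show which factor comes from where; a vector like b = [3, 5] would make the pairing obvious.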


That is a good point, thank you for the input. I will change the example problem to clear things up.


I haven’t seen anyone ask these types of questions when interviewing for ML positions. They feel like they came from ChatGPT or straight from a textbook. Can you share how you arrived at these questions?


I created these questions from a mix of resources: some from library docs like the numpy linalg and sklearn docs, some from textbooks like https://www.deeplearningbook.org/ and others from asking ChatGPT.


Edit: previous title was "Leetcode for ML" or somesuch...

I like the idea and might try some! But as a warning: leetcode is specifically aimed at prepping for interviews, and I've never seen questions like these in an interview (I'm somewhere between an MLE and ML researcher FWIW). The most common kinds of ML-specific things in my experience are:

- ML system design (basically everyone does this)

- ML knowledge questions ("explain ADAM etc.")

- probability + statistics knowledge

- ML problem solving in a notebook (quite rare, but some do it)


I probably should have titled it something else. I made it more as a learning platform for people to get better at ML by implementing algorithms from scratch. I’m currently a data scientist but want to become a machine learning researcher or engineer, and I thought these types of questions would help.


I saw the k-means one a couple times


This website is super buggy. Sign up with Google doesn't work. The code editor keeps running into tabs vs spaces issues. It defaults to 2-space tabs like it is JavaScript.


Thank you for the feedback, I will look into that.

Edit: the sign up works for me, but the spacing is an issue


I'm curious how you run the python code in the browser


Is it down for anyone else too?


Can you not get to the site, or does your code fail to run when you run it?


Great resource!


It's sad how a lot of people see this as "a bad way to test job candidates" rather than a "fun way to practice ML skills".


Those comments are based on the original title introducing it as an ML Leetcode. The title is more accurate now.


Thanks! I think having Leetcode in the title angered a lot of people.


It doesn't matter. I would have preferred that the title mentions that it is Leetcode-like anyway.

But thanks for giving Leetcode yet another idea for testing AI engineers who do not know how to write a multi-layer perceptron or a softmax activation function from scratch, along with yet another repository of already-solved puzzles to make it easier for interviewers. I'd say it's pretty useful myself.

And so it begins with the complaints of "The AI interview is broken", "We are the only industry that does this" frequently being preached here.


Please don't.

Leetcode already ruined so many coding interviews by asking people to do bullshit like

"Output data from a stream in order, make the solution performant"

Why would you ruin ML for us too?

Looking at your site, problem #1 is to multiply a matrix by a vector... in no universe is that a legitimate ML interview question.

Also, ML is such a huge field (everything from statistical learning through to transformer neural networks) that I fail to see how you could say your solution tests core skills. If I'm hiring for an ASR role, it's going to be very different than for a CV role.


> in no universe is that a legitimate ML interview question

Why not? This seems like the ML equivalent of FizzBuzz. If you don't know how matrix multiplication works well enough to implement it, I would argue that you don't know what you're doing at all.


My nightmare has finally come true.


Ok, but please don't post unsubstantive comments to HN, and especially not shallow dismissals of someone's work.

https://news.ycombinator.com/showhn.html

https://news.ycombinator.com/newsguidelines.html


Sorry for the judgement of the lack of "substance" of the comment, but in my defense I see this kind of comment all the time under almost every post (including this one), and it is not always obvious unless pointed out.

And this is in no way dismissive of the work. I can definitely see the value in this -- I am just saying many people don't wish to see this, and many people apparently agree, based on the number of votes.


Yes, too many people post that sort of unsubstantive comment—the cheap one-liner is maybe the biggest forum cliché there is—but that doesn't make it ok.

I believe you that you intended something more thoughtful, but the rest of us don't have access to your intention (or the real meaning of the comment in your head). We can only go by what you actually post, so if you want to make a more thoughtful point, you need to do so explicitly.

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...


Inverting a binary tree became implementing SVD with arrays only.



