HLearn: A Machine Learning Library for Haskell (2013) [pdf]

jackpirate · on May 24, 2017

Hi everyone, author of HLearn here :)

This is a bit awkward for me as I've paused the development of HLearn and emphatically do not recommend anyone use it. The main problem is that Haskell (which I otherwise love) has poor support for numerical computing. I've tried developing an alternative standard library to improve the situation (https://github.com/mikeizbicki/subhask), but Haskell's type system isn't yet powerful enough to do what I want. I'm sure the type system will have the needed features in 5-10 years, and I'd rather wait and do it right.

If you have any questions, I'd be happy to answer them.

bos · on May 24, 2017

Let me put Mike's comment into what I think is its proper context. "Poor support for numerical computing" really means "relative to Mike's dream, which is not actually realisable by any programming language today" :-)

Most readers seem to be misinterpreting Mike as anchoring off other popular programming languages of today, whereas he's looking for language features for which there's (a) no consensus that they'll actually be good when they exist, and (b) don't yet exist. (I'm highly skeptical of dependently typed programming.)

I think that there's a case to be made that numeric programming in Haskell, relative to the state of the art of today rather than the year 2100, really isn't so great – but my concerns are very different than Mike's, and revolve around libraries rather than type system features.

Source: have done a bit of Haskell in my day.

jackpirate · on May 24, 2017

You're 95% correct about my view.

I do think that matlab/python are a bit better numerical programming languages than Haskell as-is, but only marginally. This is not just due to the library ecosystem, but also because I think that dynamic languages really are better than the best Haskell2010/GHC8.2 library theoretically possible. There are just some things that the existing type system makes a bit more awkward.

YeGoblynQueenne · on May 25, 2017

That's kind of sad that you've given up on this project. I visited this thread to comment that I'm working on a similar project, but for Swi-Prolog (a statistical NLP module - it's my own initiative and about a month away from sharing with the world).

I think it's a big shame that traditional AI and computer-sciency languages like Haskell and Prolog have lagged so far behind the mainstream ones in terms of machine learning and as machine learning gets more popular I'm worried this will cause them to fall by the wayside even more than they have already.

What is it that's making Haskell bad at numerical computing? I would have thought it's not much worse than e.g. Julia or Python but even if it is, I always figured there's other benefits to programming in Haskell- otherwise we'd all be geeking over FORTRAN, I guess.

With Prolog the big issue is that statistical AI algorithms tend to go a lot faster with mutable, indexable data structures and those don't have a lot of support in Prolog. What is it that's really bothering you with Haskell? Could you give an example?

[Note: I'm a Haskell noob, but I should be able to handle code examples]

bjz_ · on May 24, 2017

> Haskell's type system isn't yet powerful enough to do what I want.

What still needs to be done? Does Idris have enough of the power that you need?

jackpirate · on May 24, 2017

I've not used Idris too much, so I can't say for sure. My guess is that the answer is that it could do everything I want type-system-wise, but there are things I want outside of the type sytem that I don't think it could do.

For example, I want the compiler to automatically rewrite my code to be much more efficient and numerically stable (see the HerbiePlugin to the GHC compiler which goes this https://github.com/mikeizbicki/HerbiePlugin). My understanding is that the Idris compiler gets much less engineering work done on it (outside of the type system), and so getting efficient running code will be too difficult.

juxncxrlos · on May 24, 2017

I think it's pretty interesting considering the level of optimization that Hlearn has that the author mentions the poor support for numerical computing. I have two questions. 1. Which are the things that Haskell is missing for numerical computing. Is it something related to the language standard or to the compiler? 2. I have read the info for SubHask but I haven't got enough context to really understand why the alternative Prelude might help with numerical computing. Could you explain it a bit more, please?

jackpirate · on May 24, 2017

It's common in machine learning to define a parameter space $\Theta$ that is a subset of Euclidean space with a number of constraints. For a simple example, $Theta$ could be an elipse embedded in $R^2$. In existing Haskell, it is easy to make $R^2$ correspond to a a type, and then do automatic differentiation (i.e. backpropagation) over the space to learn the model. If, however, I want to learn over $\Theta$ instead, then I need to completely rewrite all my code. In my ideal language, it would be easy to define complex types like $\Theta$ that are subtypes of $\R^2$, and have all my existing code automatically work on this constrained parameter space.

readpunch · on May 24, 2017

I am learning Haskell. Can you explain why you think Haskell has poor support for numerical computing?

jackpirate · on May 24, 2017

See bos's comment above. It's mostly just poor relative to what I wish it had. You can see the subhask library for a small taste of what I wish it had.

supernumerary · on May 24, 2017

What are you using instead?

jackpirate · on May 24, 2017

Everything (octave, python, julia, c++, various probabalistic programming langugaes) depending on the task and my mood.

supernumerary · on May 24, 2017

Thanks and go UCR: http://positron.ucr.edu/ I bet doing ML in this context would be fun...

neutronicus · on May 24, 2017

I remember repa being a pretty awkward experience when I tried it out

platz · on May 24, 2017

a newer library targeted at deep learning: https://github.com/HuwCampbell/grenade

amelius · on May 24, 2017

What are the advantages of Haskell at deep learning, given that

1. Graph structures are notoriously difficult to model in functional languages.

2. The software-engineering side of deep learning is not all that difficult (e.g. using Keras is quite simple).

amelius · on May 24, 2017

Also, correct me if I'm wrong, but the state of the art seems to be happening in Tensorflow/Keras; so committing to a different platform could mean you are systematically lagging behind in this new field.

platz · on May 24, 2017

Then don't use Haskell! I'm having a hard time understanding if your question is actually genuine or if you're just dropping by the Haskell thread to explain why you're not into using Haskell.

madsbuch · on May 24, 2017

We have to remember that machine learning is much more than just ANN, or, for that sake, matrix manipulation:

http://probabilistic-programming.org/wiki/Home

agrafix · on May 24, 2017

You can for example prevent "runtime" bugs due to combining layers/vectors/matrixes of the wrong dimensions.

posterboy · on May 24, 2017

regarding compiling

>Grenade layers are normal haskell data types which are an instance of Layer, so it's easy to build one's own downstream code. We do however provide a decent set of layers, including convolution, deconvolution, pooling, pad, crop, logit, relu, elu, tanh, and fully connected.

it's called a README for a reason.