Hacker News

> We present a Newton-type method that converges fast from any initialization and for arbitrary convex objectives with Lipschitz Hessians. We achieve this by merging the ideas of cubic regularization with a certain adaptive Levenberg--Marquardt penalty.

I feel insignificant just by reading the first phrase.




Newton's method: start from a guess and repeatedly correct the error using a local approximation

Convex objective: a bowl-like function, we want inputs that put us at the bottom of it

Lipschitz: a space that isn’t too stretchy

Hessians: matrices of second partial derivatives of the objective with respect to the inputs we're optimizing

Regularization: make the “fix by approx” procedure a bit more well behaved

Levenberg–Marquardt: another adaptation to make the fix procedure better
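To make that concrete, here's a toy sketch (mine, not the paper's method) of how those pieces fit together: a Newton step with a Levenberg–Marquardt-style penalty, applied to a bowl-shaped objective.

```python
import numpy as np

def damped_newton(grad, hess, x0, lam=1e-3, tol=1e-10, max_iter=100):
    """Newton step with a Levenberg-Marquardt style penalty:
    solve (H + lam*I) dx = -g instead of H dx = -g, so the step
    stays well defined even when H is nearly singular."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        H = hess(x)
        dx = np.linalg.solve(H + lam * np.eye(len(x)), -g)
        x = x + dx
    return x

# Toy convex objective: f(x) = x0^2 + 2*x1^2 (bowl-shaped, minimum at the origin)
grad = lambda x: np.array([2 * x[0], 4 * x[1]])
hess = lambda x: np.array([[2.0, 0.0], [0.0, 4.0]])
x_min = damped_newton(grad, hess, [3.0, -1.5])
# x_min is close to [0, 0]
```

The paper's contribution is choosing the penalty adaptively using the cubic-regularization analysis; here `lam` is just a fixed constant for illustration.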


It's funny how many complex ideas are "just" simpler ideas mapped to the right sequence of steps. The jargon and naming conventions can obscure the idea from outsiders, but they allow for brief and accurate communication. If they'd explained the idea Barney-style, they'd have needed twice as many pages.

Thanks for the breakdown, it makes the paper more accessible.


Math is about formalizing "simple ideas" so that they can be reasoned about exactly.


Every idea in math is like that, because the rules of math are to do everything in little steps.


Well, a simpler idea is in a way like the initial guess for Newton-Raphson. Sometimes you get convergence and sometimes you don't.


Would be great to have a resource written like this that covers the rest of maths (or, in fact, jargon in general).


The problem would be the resource would be too sloppy to be useful. For example, the phrase "Lipschitz: a space that isn’t too stretchy" doesn't tell you about the precise definition of Lipschitz, which is absolutely essential, and there are tons of names in math that would all be sloppified to "a space that isn’t too stretchy".

Knowing precisely which definition you need for an application, or which one you need for a proof to hold, is essential for progress.

Math doesn't make up "jargon" for no reason. The names signal to professionals the minimal ingredients needed to make something true.


I always thought biology would be easier to learn if, instead of entity names based on old languages, they used verbose descriptions

e.g. Mitochondria->PowerHouseOfTheCell


Wow, it'd make the textbooks look enterprise-y. On the other hand, if you didn't capitalise every word in the compound it'd look like German.


I teach a class on the subject: https://www.youtube.com/playlist?list=PLdkTDauaUnQpzuOCZyUUZ...

Understanding optimization at this level and being able to formulate your own problems in one of the canonical forms gives you super powers.

You don’t even have to build your own solvers. The commercial/open source ones are typically good enough.


>> You don’t even have to build your own solvers. The commercial/open source ones are typically good enough.

But as a maintainer of solvespace (Open Source CAD software with geometric/algebraic constraint system) I am left wondering if this can be applied to our constraint solver, which I didn't write. It solves systems of algebraic constraints using some form of Newtons method, partial derivatives, jacobian... All stuff I'm familiar with, but not in great detail. Figuring out how best (and weather) to apply this might be a major digression ;-)


After following the course, you should be able to grok the solver code in /src/system.cpp. That looks fairly standard.

The interesting part (to me, as that is not my specialty) lies in the translation of the 3D constraints (including rotation, etc.) into a single objective function for solving Newton-style.


>> The interesting part (to me, as that is not my specialty) lies in the translation of the 3D constraints (including rotation, etc.) into a single objective function for solving Newton-style.

That's funny - I like the geometry stuff and have a good grasp of how to create useful constraint equations. I just don't know too much about the code for solving systems of those equations ;-)


There are probably many ways to formulate the same geometric constraint. For example, what parameterization of rotations do you use? I think the choice has implications for how easy it is to solve the equations.


>> For example, what parameterization of rotations do you use?

For orientation we use quaternions. For other things it's an axis-angle representation. For the equal angle constraint I'm not sure how that's implemented. There was a recent addition of length-ratio between lines and arcs. That implementation was surprising to me.


These lectures look fantastic — thank you for making and sharing them! I've been looking for a good foundational treatment of optimization to digest over the holidays.


Are the exercises that go with the lectures available? Thank you!


> I feel insignificant just by reading the first phrase.

I often wonder about the intentions of those who post these types of comments. Being charitable, one might suppose the parent is offering a backward sort of compliment to the authors. Something like "Great job, beyond my capabilities, I'm glad someone is working on these things".

There are a lot of less charitable formulations; why should other readers care about the parent commenter's insecurity? Does the parent commenter not appreciate the work that goes into studying these topics? Was the parent commenter told they were "smart" too often as a child, and never learned to exert themselves academically? Etc.

Although this post is mostly in jest, I really am curious what prompts these comments; they appear quite reliably.


> There are a lot of less charitable formulations; why should other readers care about the parent commenter's insecurity? Does the parent commenter not appreciate the work that goes into studying these topics? Was the parent commenter told they were "smart" too often as a child, and never learned to exert themselves academically? Etc.

You're missing the obvious interpretation -- jargonization of the sciences keeps people out and keeps the masses stupid. Math is particularly guilty of this -- more charitable fields use jargon that is at least semi-self-explanatory, words and phrases that, while jargon, do make sense if you deconstruct them. In Math, people are so arrogant that they discover something and immediately name it after themselves, so if you haven't had a topology class, you won't know that X person's name maps to Y concept. Good jargon can be pieced together with pure reason alone without knowing the names of the people who invented what. This isn't true across the board of course -- "gradient descent" is a particularly well-named piece of jargon, for example.
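To the commenter's point about "gradient descent" being well-named jargon, the name really does describe the algorithm. A minimal sketch:

```python
def gradient_descent(grad, x, lr=0.1, steps=200):
    """The name says it all: repeatedly step *down* the *gradient*."""
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# Minimize f(x) = (x - 3)^2; its gradient is 2 * (x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3.0), 0.0)
# x_min is close to 3
```

Compare that with "Lipschitz" or "Hessian", where the name gives a newcomer no foothold at all without looking it up.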


People generally don’t name it after themselves; other people name it after them later.

Not that many people wouldn’t like for the thing they introduce to eventually be named after them. But, I think it would generally be considered presumptuous to directly name it after oneself.

I remember one story where one mathematician upon hearing something referred to as a [some term named after them], asked what the term referred to. (I want to say it was Hilbert asking what a “Hilbert space” was, but I’m not sure.)


The story is told by Saunders Mac Lane, quoted in the book Mathematical Apocrypha Redux:

> J. von Neumann in 1927 introduced the axiomatic description of a Hilbert space, and used it in his work on quantum mechanics. There is a story of the time he came to Göttingen in 1929 to lecture on these ideas. The lecture started "A Hilbert space is a linear vector space over the complex numbers, complete in the convergence defined by an inner product (a product <a,b> of two vectors a, b) and separable". At the end of the lecture, David Hilbert (by custom sitting in the first row of the Mathematische Gesellschaft), who was then evidently thinking about his definition and not about the axiomatic description, is said to have asked, "Dr. von Neumann, ich möchte gern wissen, was ist dann eigentlich ein Hilbertscher Raum?" [Freely translated this is "Dr. von Neuman, I would very much like to know, what after all is a Hilbert space?"]

https://books.google.com/books?id=8mBdvAjk_gQC&newbks=1&newb...


Math at least has *nowhere* near as much jargon as something like biology or the life sciences.

I think a larger problem in reading math might be polysemy. Symbols have multiple meanings, so context has to be used to infer what meaning is intended by authors.


It's also very difficult to google symbols, even in this post-UTF-8 world.


If it makes you feel any better, I knew nothing about the field of optimization (convex or otherwise) 3 months ago. But after going through an optimization course in my CS master's program (MCSO at UT Austin) this semester, I can parse this part of the abstract fine, minus the Levenberg–Marquardt part, which I had to look up. That is with not much of a prior mathematics background. You could probably learn enough in a semester of part-time study to grok this paper, as long as you're comfortable with differential calculus.


Longitudinal Analysis of an Amygdalocentric Structural Network using Mixed Effects Multivariate Distance Matrix Regression with Multiple Fractionated Polynomials

A working title for a manuscript I am shopping out to the various neuroscience journals right now. I feel like I'm finally the esoteric person I've always wanted to be.


Lolol why? Even if you're a mathematician in a different area you might have no clue what that sentence means.


That popular "Mathematics for Machine Learning" book will teach you almost everything in that sentence; the authors even had/have a MOOC on Coursera.



