More

donovanr · 2024-04-07T01:09:47 1712452187

it's uncreative/tired but at least to the point. Too many papers are confused / opaque agglomerations of a year's worth of research shoehorned into a paper. At least with these you can fairly easily assess whether the claim is supported or not.

ravelfan · 2024-04-07T10:43:47 1712486627

Obviously, more people are going to read it.

It is like putting a stupid face on your youtube video to show how shocked and amazed you are at the content.

donovanr · on Aug 4, 2020

This rings _so_ true.

While I was in grad school I got bored sitting behind a computer all day, and my wife and I decided to build a tiny house on a trailer as a way of venting our pent-up DIY urges. We'd just build it in our spare time. LOL.

We started in the late summer of 2013, with a trailer and no plans and a stack of construction books from the library.

Cut to spring 2016, having spent every single weekend and most evenings since (in zero degree winters and brutal Pittsburgh summers) sweating and swearing, really pushing the "divorce cabin" line, and having legitimate discussions late at night about the benefits of burning it all to the ground, where my wife, eight month pregnant, is trying to finish the trim work before I submit my dissertation and we tow it across the country.

The way the article captures the "not knowing what we were getting into" / tiny things that delay you to death / stressed out / losing friends / doing absolutely nothing else with your life / so over budget it hurts / final elation at success is absolutely perfect.

We only made it across the finish line because living in Pittsburgh on a grad student stipend is actually, well, livable, and I could do that while my wife worked pretty much full time on our housing boondoggle.

The main learning experience coming out of it was that you should absolutely pay how ever many thousands of dollars it costs for a good set of plans from someone who's done this before. Learning smaller tasks like framing and roofing etc is easy. Stitching it all together into a plan that you're arguing about because neither of you have any idea what you're doing, all while you're wasting precious daylight is _hard_. We would have finished at least a year sooner if we just had plans to follow.

All that said, building a place to live in was super rewarding (as others have said) type II fun.

We still have it, it's beautiful, and I have not yet burned it down.

donovanr · on April 1, 2020

Allen Institute for Cell Science | ONSITE Seattle | Full-time | software engineer / ML / computer vision | https://alleninstitute.org/

The modeling team at the Allen institute for Cell Science is hiring for two software engineering positions -- a data generalist and a ML / computer vision specialist:

https://alleninstitute.org/what-we-do/cell-science/careers/j...

The Allen Institute for Cell Science aims to impact the entire cell science community. Our goal is to advance understanding of cell behavior in its normal, pathological, and regenerative contexts. Our multidisciplinary team will generate novel cellular reagents, data, models and databases that are informed by and open to scientists around the world. We will produce unique dynamic, visual databases and cellular models that integrate information and data across cellular and molecular sciences.

https://alleninstitute.org/what-we-do/cell-science/careers/

donovanr · on March 24, 2020

Allen Institute for Cell Science | software engineer / ML / computer vision | ONSITE Seattle | Full-time

The modeling team at the Allen institute for Cell Science is hiring for two software engineering positions -- a data generalist and a ML / computer vision specialist:

https://alleninstitutecellscience.hrmdirect.com/employment/j... https://alleninstitutecellscience.hrmdirect.com/employment/j...

ONSITE, Seattle

The Allen Institute for Cell Science aims to impact the entire cell science community. Our goal is to advance understanding of cell behavior in its normal, pathological, and regenerative contexts. Our multidisciplinary team will generate novel cellular reagents, data, models and databases that are informed by and open to scientists around the world. We will produce unique dynamic, visual databases and cellular models that integrate information and data across cellular and molecular sciences.

https://alleninstitute.org/what-we-do/cell-science/careers/

donovanr · on Aug 31, 2018

Very nice. It might be worth worth weighting each data point in proportion to the population it represents.

donovanr · on May 11, 2018

Characterizing cellular variation is exactly what we're interested in, e.g when and why are the mitts clustered around the nuclear vs not. Lots of images in our data have them packed around the nucleus -- you can look at the localizations from our microscopes here (select the Tom20 tag): http://www.allencell.org/3d-cell-viewer.html here. You can also grab the raw data (including bright field images e.g. what you would "see" in the microscope) here http://www.allencell.org/data-downloading.html#DownloadImage...

donovanr · on May 10, 2018

just saw this; I'm part of the computational modeling team that worked on this -- can try to field any questions or find more qualified people to do so.

m1el · on May 11, 2018

This is probably going to be a lot, but...

1. Would you advertise this tool as a visualization to help with future research and understanding cells OR a possible diagnostic aid?

2. Is there any project that aims to apply these tools to find changes in cells of an aging organism? Do you think that would be useful?

3. Is it possible to figure out for any given class of cell how much of its volume is understood? e.g. "there's this little part and we have no idea what's going on there" or "this protein is everywhere and we can't figure out what it does".

4. How can you evaluate the correctness of your probabilistic model? Neural nets and auto-encoders are known to produce bad results. as an exaggeration, you wouldn't want to have this as your model of human face: https://zo7.github.io/img/2016-09-25-generating-faces/random...

And thanks for publishing the source code for training!

donovanr · on May 11, 2018

1. All of the above. The label free tool in particular gives you such a big free lunch at the microscope that the combination of it and good visualization has the potential to massively impact research workflows.

2. We are very interested in how cells change as they divide, differentiate, age, are perturbed by their environment, etc. We study cells in culture right now -- getting good images of in vitro cells from multicellular organisms is way harder. So yes it would absolutely be useful. I don't know if we're going to tackle it ourselves, but one of our core missions is to lay the groundwork for the community to take our tools and run with them -- it's a big win for us if we can bring previously unfeasible research within the realm of the possible.

3. I am a Bayesian at heart, so modeling uncertainty is something that I'm always thinking about. It's high on my list of priorities to do something along these lines.

4. Image similarity is a hard problem. At the end of the day, metrics only get you so far and the proof is in the pudding. Unfortunately there is no ground-truth data to test against -- the probabilistic model was constructed exactly because we can't measure where everything is all at the same time. Some things we do to convince ourselves that we're on track is to see that the variation in the imputed predictions and the actual data are statistically similar, and to see if experts are confounded in differentiating the outputs of our models from actual data. You can read more here https://www.biorxiv.org/content/early/2017/12/21/238378

scentoni · on May 10, 2018

This seems like a great application for VR, and possibly haptic feedback. Are you porting this to VR platforms?

donovanr · on May 10, 2018

not sure about specific plans but it's definitely something we're interested in

macawfish · on May 11, 2018

Are these surfaces basically just level sets of some volumetric density data?

donovanr · on May 11, 2018

That's at least close to true. It might be better to think of it as a smoothed "segmentation" of a 3D image, i.e. some algorithm decided what pixels are officially part of e.g. the mitochondria set, and outlined them. That could be based on level sets or seeded watershed or whatever else works well.

There are some alternate visualizations here http://www.allencell.org/3d-cell-viewer.html of data that came off of our microscopes that we also use to visualize our models in house but wasn't;t included in the video. It's hard to visualize varying density 3D data in 2D -- there's no one good way to do it, especially on the fly over the web -- but if you have any feedback about what would be more informative / easier to understand, let us know.

donovanr · on April 19, 2018

Some feedback on the quiz:

- a few of the questions were very good, and either spoke to key high level concepts, or were specific while being language agnostic. (e.g which one of these layers wouldn't you need, why wouldn't this type of classifier work on this data).

- too many of the questions were hyper-focused on the minutiae of word embeddings, tensor flow syntax, SQL queries, and recommender schemes.

- many of the questions were constructed vaguely enough that "I don't know" would be the technically correct answer even though I don't think that was what you were going for.

metadata: recent PhD with serious grad courses in ML and working in DL/CV for the past year using a non-tensorflow framework (PyTorch).

edouard-harris · on April 19, 2018

This is great feedback. Thanks!

We're constantly iterating on the quiz and it would be great to get more detailed thoughts on it.

If you'd like to do that, please get in touch! (Email in my profile)

donovanr · on April 19, 2018

would be happy to, but I don't see your email there -- mine's in my profile (I think!) if you'd like to get in touch

edouard-harris · on April 19, 2018

Sorry, realized it wasn't public. Just updated, should be there now!

ellisv · on April 19, 2018

Took the quiz and completely agree. Most questions were either overly concerned with detail or too vague.

High-level I don't think a quiz is necessarily the right tool either. Reminds me too much of taking the SAT or GRE.

edouard-harris · on April 19, 2018

Yeah we definitely aren't convinced that a quiz is the optimal format for this evaluation.

Statistically, it does an OK job at being an initial filter. My biggest concern at the moment is that it's too coarse of a tool and it might be mistakenly turning away competent people.

Definitely a work in progress. If you have ideas on alternative formats or better questions, please email me. (Email in my profile.)

donovanr · on Aug 16, 2017

specious extrapolations aside, the plot itself is deceptive -- exponential growth in a log-linear plot should be a straight line, not one accelerating upward super-exponentially.

donovanr · on Nov 29, 2016

Sedgewick's 1978 paper[0] on implementing quicksort has some interesting hand optimizations of the assembly code -- loop rotating, unrolling, etc. I wonder if modern compilers do the same?

[0] http://penguin.ewu.edu/cscd300/Topic/AdvSorting/Sedgewick.pd...

pertymcpert · on Nov 29, 2016

Yep, loop rotation and unrolling are done very commonly.