Hacker News | arhyth's comments

> "Databricks is often managing a million-ish Spark-sub tasks for various users. They couldn't do that using traditional operating system scheduling techniques: they needed something that could scale. The obvious answer was to put all scheduling information into a database. That's exactly what the Databricks guys did: they put it all in a PostgreSQL database, and then started whining about Postgres performance," says Stonebraker.

anyone know of a talk/paper discussing this scheduling "hack" in more detail?


https://arxiv.org/abs/2007.11112

The link is referenced in the article.


and are these data structures themselves able to span hundreds/thousands of disks? e.g. what happens when a customer has so many buckets/objects that the skip list "indexing" them cannot fit on a single hard drive?


data structures that span physical media got solved decades ago. the disks or SSDs get abstracted into virtual storage. See:

https://en.wikipedia.org/wiki/Logical_disk

This kind of logical media mapped to physical media goes back to magnetic tape, predating disks and solid state storage.

Database systems solve this same problem. A database table may include millions of rows and indexes, too big to fit into RAM all at once. The index trees, for example, get loaded in pages. There’s no need to have the entire index in memory, only the relevant parts of it.
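To make the paging idea concrete, here is a minimal sketch in Go. It is not a real B-tree and not how any particular database does it: the `pagedIndex` type, the `load` callback, and `newDemoIndex` are all hypothetical names invented for illustration. Only a small list of "fence" keys (the first key of each page) stays in memory, and a lookup reads exactly one page.

```go
package main

import (
	"fmt"
	"sort"
)

// pagedIndex is a hypothetical, simplified stand-in for an on-disk index.
// Only the fence keys stay resident; pages are fetched on demand.
type pagedIndex struct {
	fences []int                // first key of each page, kept in memory
	load   func(page int) []int // simulates reading one sorted page from storage
	loads  int                  // number of page reads performed so far
}

// contains reports whether key is in the index, reading at most one page.
func (ix *pagedIndex) contains(key int) bool {
	// Find the last page whose first key is <= key.
	p := sort.SearchInts(ix.fences, key+1) - 1
	if p < 0 {
		return false // key is smaller than every indexed key
	}
	ix.loads++
	page := ix.load(p)
	i := sort.SearchInts(page, key)
	return i < len(page) && page[i] == key
}

// newDemoIndex builds an index over the keys 0..numPages*pageSize-1.
// Pages are generated on the fly here instead of being read from disk.
func newDemoIndex(numPages, pageSize int) *pagedIndex {
	fences := make([]int, numPages)
	for p := range fences {
		fences[p] = p * pageSize
	}
	load := func(page int) []int {
		keys := make([]int, pageSize)
		for i := range keys {
			keys[i] = page*pageSize + i
		}
		return keys
	}
	return &pagedIndex{fences: fences, load: load}
}

func main() {
	ix := newDemoIndex(1000, 100) // 100,000 keys spread over 1,000 pages
	fmt.Println(ix.contains(4242), ix.loads)
}
```

A real index would add more levels and a page cache, but the principle is the same: the lookup touches only the pages on its path, never the whole structure.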


kind of agree but also disagree. the runway for Go to go (pun unintended) from easy to difficult to use, because it consists of so few parts, is far longer than the runway for a developer using a feature-rich language to eventually shoot themselves in the foot. let me throw another language into the mix to describe what i personally think is _almost_ the best tradeoff between this difficult and easy simplicity: Elixir. it has almost the same number of keywords (as a proxy measure of API surface) as Go, but it also exposes metaprogramming, so if you really want to, you can easily shoot yourself in the foot. but in both languages, as I said for Go, the runway for language usability to go from easy to difficult due to their limited features is very, very long.


just new to Golang (and to programming in general, actually) and this bugs me. in the slide, it's indicated they are semantically the same. but how?

(1) var thing Thing; json.Unmarshal(reader, &thing)

(2) var thing *Thing = new(Thing); json.Unmarshal(reader, thing)

in (1) thing is a variable for a value of type Thing, while in (2) thing is a variable for a Thing pointer

shouldn't thing (2) need to be dereferenced first?


You are right that (2) needs to be dereferenced first; however, in Go a pointer is automatically dereferenced when accessing its properties or "methods". What I mean by this is that if you want to access a property on thing, it would look the same.

In C++ it would look like this:

  (1) x = thing.Property;
  (2) x = thing->Property; (or (*thing).Property)
but in Go, both would look the same, like this:

  (1) x = thing.Property
  (2) x = thing.Property
So there is a difference syntactically, and the compiler will handle them differently, but whether the difference is semantically relevant is debatable.
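A small sketch of that automatic dereference, assuming a made-up `Thing` struct with a single `Property` field (the slide's actual type is not shown in the thread):

```go
package main

import "fmt"

// Thing is a hypothetical struct used only for illustration.
type Thing struct {
	Property string
}

// viaValue sets and reads the field through a plain Thing value.
func viaValue() string {
	var thing Thing
	thing.Property = "hello"
	return thing.Property
}

// viaPointer sets and reads the field through a *Thing.
// Go treats thing.Property as shorthand for (*thing).Property.
func viaPointer() string {
	thing := new(Thing)
	thing.Property = "hello"
	return thing.Property
}

func main() {
	fmt.Println(viaValue() == viaPointer())
}
```

The selector syntax is identical in both functions; only the declared type of `thing` differs.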


That's one of the things that I hate about Go. That and how they force me to write an 'else' on the same line as the closing brace of the 'if'.


Rust does this too, I think.


(Yes, it does.)


json.Unmarshal takes a pointer as its second parameter. In (1), a pointer to thing is created using &. In (2), thing is already a pointer and can be passed as-is.

They're semantically the same. The type of thing isn't, but that's syntax.


Well, the second arg of Unmarshal is expected to be some kind of pointer.

> Unmarshal parses the JSON-encoded data and stores the result in the value pointed to by v. If v is nil or not a pointer, Unmarshal returns an InvalidUnmarshalError.[1]

In (1) the '&' in '&thing' means roughly "the address of", so you are passing "the address of thing" i.e. a Thing pointer that points to the thing variable.

1: https://golang.org/src/encoding/json/decode.go
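For anyone who wants to try both forms, here is a rough sketch. The `Thing` struct and its `name` field are assumptions made up for the example, and the input is a `[]byte` because that is what `json.Unmarshal` accepts. The last helper passes a nil pointer to trigger the `InvalidUnmarshalError` mentioned in the quoted docs.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Thing is a hypothetical type; the field is assumed for illustration.
type Thing struct {
	Name string `json:"name"`
}

// decodeValue is form (1): declare a value, pass its address.
func decodeValue(data []byte) (Thing, error) {
	var thing Thing
	err := json.Unmarshal(data, &thing)
	return thing, err
}

// decodePointer is form (2): allocate with new, pass the pointer as-is.
func decodePointer(data []byte) (Thing, error) {
	thing := new(Thing)
	err := json.Unmarshal(data, thing)
	return *thing, err
}

// decodeNilPointer passes a nil *Thing and reports whether
// Unmarshal rejected it with an InvalidUnmarshalError.
func decodeNilPointer(data []byte) bool {
	var thing *Thing // nil: there is nothing to store the result in
	err := json.Unmarshal(data, thing)
	var invalid *json.InvalidUnmarshalError
	return errors.As(err, &invalid)
}

func main() {
	data := []byte(`{"name":"gopher"}`)
	a, _ := decodeValue(data)
	b, _ := decodePointer(data)
	fmt.Println(a == b, decodeNilPointer(data))
}
```

Either way Unmarshal receives a non-nil `*Thing` in forms (1) and (2), which is why the two behave the same.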


thanks for the outline. article was behind a paywall from my end.


can someone please give this outsider-noob an eli5 explanation of why golang does not adopt the same policy/attitude/implementation towards package management?


Go is being developed by Google and as such is designed to meet their needs. Google's view of package management is to bring every package in-house and manage it themselves. So, if a Go program needs package "A" then Google will directly import that package into their workflow.

This is actually one of the reasons that I don't think Go is a very good language for most people/organizations. It was conceived with a specific set of guidelines that Google needed: easy to learn, performant, etc. Go is also designed to be used by teams of thousands, so it's much easier to adopt miscellaneous packages into the fold and maintain them than it is to manage links to outside requirements.


i love elementary OS. but gahd Calibre is just ugly and for some (probably noob) reason, can't get it to work on my machine. i'll try this one out.


You could probably also give BookFusion a try. You will be able to read on iOS, Android and Web (Linux, Windows, OSX). All your eBooks, along with your comments, highlights and reading progress, will be synced across all devices.

More at https://www.bookfusion.com/reading/cloud-library . We also plan to release a native app later this year for desktops.

PS: I am the founder of BookFusion


use an AppImage for Calibre.


+1. even for "non-believers", there are practical gems that will benefit anyone who cares to really read it.

