> Benchmarking cutting-edge graph-processing algorithms running on 128-core clusters against a single-threaded 2014 Macbook Pro. The laptop consistently wins, sometimes by an order of magnitude.
LOL, this hits close to home. My company had a modeling-specific VM set up to run our predictive modeling pipelines. A typical pipeline is about 50,000 to 5 million rows of training data. At best, using an expensive VM, we managed to get 2x training speed from lightgbm on the VM vs my personal work laptop. We tried GPU boxes, hyper-threaded machines, you name it. At the end of the day, we decided to let our data scientists just run models locally.
Haha! Back in ~2014 or so my company was spending nearly $30,000/month on an EC2 "compute-optimized" cluster to transcode live video streams to multiple renditions. One of our engineers said hey, why don't we try to colo some real hardware? We did a test with a single bare-metal 8-core Xeon server and it completely destroyed the performance of the EC2 "compute-optimized" cluster!
After that we colo'd 4 big Xeon servers for about $1,600/month total. Looking back on it now it's just so insane...$30,000? no way.
I don't use AWS for a damn thing because of exorbitant costs. I just don't get why people think that it's necessary other than that they're the types to get drawn into marketing hype.
There's just so many better things for your company to be spending the money on.
I agree that cloud compute is usually (too) expensive, but it is sometimes useful and cost-effective. Not long ago, I needed to very quickly run a one-off analysis which required over 200 GB of RAM. It was much cheaper and faster to spin up one VM on GCP for a day than having to order parts, etc.
Similar - wanted to see how much RAM gron would need for a particular file, and needed much more than the 16GB or 32GB available at home. Quick and easy to spin up a 192GB EC2 instance to verify it needed ~95GB all told. No way I could have done that without something like AWS.
Depends... set the pagefile to something large and run the program. If it doesn't thrash too hard you can watch it grow to its peak usage. You'd probably want an SSD. Maybe that would work. Maybe.
The argument is - engineers are expensive so why pay for the expertise to setup and run machines? There's just so many better things for your company to be spending the money on.
We don't let pencil pushers with MBAs anywhere near what we're doing, and it's going great.
I know this isn't the most usual configuration, but if undervaluing my skills and trying to bottom-dollar me is going to be their rule, then I'm just going to do my own thing, and they're just going to have to scrape the bottom of the barrel for talent.
I hope the zeitgeist changes sometime soon. God knows how many unicorns have been sacrificed to that kind of paradigm which could be successful companies by now.
I don't know how that changes the equation. No one is undervaluing your skills; the question is whether you'd rather spend your time driving to a colo center to replace a RAID array or working on $product.
With AWS you are outsourcing an IT team, not just processors and how you approach pricing should reflect that.
I hear this bad argument often (“replacing hard drives”) and I don’t understand why. It’s as if we’re mentally stuck in a bad hacking movie from 1999.
If you’re doing colocation to save money, you’ve also figured out that going to the datacenter sucks and it’s a terrible place to do work.
You’re not building your own servers from scratch, you’re generally purchasing them from a vendor who offers a warranty and optional on-site service.
Or you’re leasing them from a hosting company who will take care of those pesky RAID alarms for you.
You (or your hosting providers) have likely outfitted your server with remote out-of-band access to allow you to get into BIOS or the RAID controller without physically being in front of the server.
And finally, you have remote access to power cycle the server (or a batphone at your hosting provider to do it on your behalf).
I want to say that these datacenter-visit-prevention techniques have been near standard practice for a decade-and-a-half.
Nope, this seems to be the norm. I've worked on a couple colo servers that nobody at the company had ever actually seen in person. They figured out colo in Germany was the best deal, so they had some servers delivered straight to the DC and the staff there installed them and plugged them into an IP KVM. Not sure if this is a standard service most providers offer, but I'm sure a big enough cheque would convince most - and considering the cost of transporting both the hardware and engineer to install it, that cheque can be quite large.
So you've just explained why 'the cloud' is better than DIY.
Take all those things you just talked about, and expand them horizontally and vertically up the stack, and you have 'AWS'.
So not just 'a guy to replace the hardware' - but now it's software configurable, has all sorts of other, fancy things.
Time is money, and it's expensive to pay people to mess with things if they don't have to.
It's like this:
If your company needs 3 cars, you rent/lease them. You do not hire your own mechanics, even if technically speaking "we could change the oil for so much less!"
If your company is in the business of transportation, and you have thousands of trucks, you may want your own repair/maintenance team etc. instead of paying some service company a fat margin to change the oil.
The original discussion was around the price-performance of physical servers vs cloud VMs. That being the case, it's not as clean a distinction as you describe. It would be more along the lines of buying a few trucks and taking them to the garage when needed (which is rare in small numbers) vs renting many more vans, at a higher margin, just to avoid the garage.
Maybe you need an MBA to help you understand that in many cases, it's incredibly more cost effective to use the cloud, because the marginal savings that could be achieved with on prem hardware are dwarfed by the cost of labour, and especially lost opportunity cost.
For most things, 'on prem' is an optimization that usually needs some degree of scale to justify, or a peculiar setup, i.e. a couple of well-versed hardware and networking guys who have no problem with a bit of physical setup. Which can be a bonus.
"I hope the zeitgeist changes any time soon."
No, it won't; it's going in the 'other direction' forever, because of the 'economies of scale' at Amazon. It's incredibly difficult for individual engineers to compete with those efficiencies.
Just the opposite of 'being a problem for startups': the 'cloud' has basically made entire swaths of startup types possible where they would not otherwise have been.
Like everything, you have to think about it a bit, but their costs are really, really transparent (imagine Oracle trying to do it ...).
I love that you say "demand" like somehow engineers are forcing companies at gunpoint to pay their salaries. No, stop. It's the result of market pressure and actual engineering degrees + P.Eng certifications being hard to acquire and desirable.
Never understood that argument. How exactly is expertise in setting up and running (your cloud provider's) instances cheaper than expertise in setting up a physical machine?
So you saved $28,400 a month. Did that make a material difference to the company? I often tell people above me I could save us $10k a month at AWS, and generally the response is, "Yeaaaaaaah that's great, could you do XYZ instead to help us land an additional $1M in ARR?"
The people above you suck. That's a terrible attitude. Signalling that saving money or just generally improving systems doesn't matter will not build a culture of technical excellence and ambition.
Also, it's not peanuts. How many extra developers would $28,400 per month pay for?
No, actually they don't. In the time I've been there they've 10x'd the size of the company. Maintained majority control through multiple rounds of funding. Significantly increased salaries. Provided a great work-life balance. Etc.
Why are developers obsessed with how many additional developers they could hire with hypothetical savings?
No idea of the $, but say they'd get €5,000 a month (quite OK for most places in Europe); the company would need to pay out a bit more than that (for tax/social-benefit things), so €7,000 would be pretty realistic, meaning 4 developers could be hired.
This then means that while the OP wouldn't immediately add $1M ARR, the original dev and the new devs could soon add $5M ARR (stupid extrapolation, it won't be as much in practice, at least normally).
Please note that in many places €5k/month is a huge sum of money for a salary. Even in western Europe, and in France more specifically, many devs I know are paid 1-2x minimum wage (I don't know a single IT person earning €5k/month, although I'm aware they exist). In France, minimum wage is about €1,500/month (before taxes), to which you add about as much again in professional taxes and contributions from the employer.
According to my napkin calculus, you could get about 4.5-9 developers (for 1-2x minimum wage) onboard here in France for 28k€/month. I'm betting in other countries with low salaries and a vast talent pool, 28k€/month would get you even more employees.
I'm working currently in Vienna, and 5k is a wonderful salary for Vienna living standards, don't get me wrong, but it's also really not unheard of if you're good and/or working in a senior role.
In Austria, the "IT Kollektivvertrag" [0][1] (think: a collective contract for the whole IT sector/unions) demands a minimum of €2,503 brutto (before taxes) per month for developers (who normally fall under the ST1 category), paid 14 times a year, and that's for entry level (as in, not your first job, but starting at a company). Note the 14 times a year: the two extra salaries (Christmas pay in December and vacation pay in June) are taxed much less (and we can get bonuses on top of that too).
Note for the above PDF (it's only the short money table; the full one can easily be found by searching "Austria IT Kollektivvertrag 2022"):
- The ST1 is for devs, and LT1 for leadership roles.
- "Einstiegsstufe" is Junior, "Regelstufe" is "normal" and "Erfahrungsstufe" is Senior
So if you get hired as senior in a leadership role you'd be entitled to €5521 Brutto salary, 14 times a year, or more depending on your experience/knowledge and your negotiation skills.
I think you underestimate European taxes as an employer. In general it costs 2x to employ people in my experience. Taxes and social security contributions are massive
No I don't. I'm employed in Europe in a leadership role and thus talk with the CEO about salaries and general money flow, and I'm also able to read my pay slip, which is quite detailed and also lists all "Lohnnebenkosten" (the side/extra costs the employer pays on top of my brutto pay), in other words what the employer really pays.
$340,800/year. If that doesn't make a material difference to your company or department you've never worked where cash is tight. More bluntly: you've been spoilt with excess resources. That's a lot of cash to waste.
I can't speak for him, so for myself: some years ago I had to almost beg for a new disk for a server. A disk. I didn't even bother to ask if we could have a bigger server, no point trying.
For some use cases though, the cloud is just amazingly cheap and fast. We are currently scaling our (computationally cheap) batch re-training jobs on AWS Lambda and it‘s quite incredible that you can train thousands of models in parallel with TBs of RAM. There is no on-premise alternative.
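For anyone curious what that fan-out looks like in practice, here's a minimal sketch using the AWS SDK for JavaScript v3. The function name and payload are made up, the actual training code would live in the worker Lambda, and error handling/throttling are omitted:

    import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda";

    const lambda = new LambdaClient({});

    // Fire one async ("Event") invocation per model; each worker trains a
    // single model, so thousands can run side by side.
    async function fanOut(modelIds: string[]): Promise<void> {
      await Promise.all(
        modelIds.map((id) =>
          lambda.send(
            new InvokeCommand({
              FunctionName: "retrain-model",   // hypothetical worker function
              InvocationType: "Event",         // async, fire-and-forget
              Payload: Buffer.from(JSON.stringify({ modelId: id })),
            })
          )
        )
      );
    }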
There appear to be slightly weird commercial reasons behind this, because gaming GPUs have great CUDA performance but NVIDIA won’t let you put them in a datacentre. So buying your data scientists gaming laptops (RGB and all) generally works out faster for any reasonable price point. That said, a dedicated server with a decent Xeon and MKL set up correctly generally outperforms CPU-bound stuff.
I think it really depends on your data size. All the benchmarks I can find are on massive datasets, with tens of millions of rows or thousands of columns. I’m sure there are significant performance gains in these situations. Our data just wasn’t big enough.
With larger data it really depends on the algorithm. If you must iterate over more than a few GB at a time, GPU memory capacity and bus speeds become prohibitive, while a dead-simple implementation on a single CPU with 100+ cores and TBs of RAM goes brrr.
That would only be an advantage if you had to do multiple passes over the data, otherwise the data would still go through the CPU RAM before getting loaded onto the GPU, no?
When the models get sufficiently big, even a 40GB A100 is not sufficient. Unless you can feed the cores quickly enough, your performance drops considerably.
GPUs are like heavy flywheels. Getting them up to speed takes some time (copy data, compile and copy the kernels, kickstart everything, etc.), so you need to start them once to get the performance benefits. Otherwise CPU is much more nimble since they're closer to RAM and made to juggle things around.
> The updated end-user license agreement (EULA) states: “No Datacenter Deployment. The software is not licensed for datacenter deployment, except that blockchain processing in a datacenter is permitted.” [0]
I guess it's time to invent a blockchain that trains ML models as PoW :)
As a sidenote, I have rented servers with GeForce cards from multiple providers in multiple countries, so this rule doesn't seem to be respected very much. And since it's part of the driver EULA, nvidia can't legally go after server providers, since they don't install any drivers, just build and rent out the hardware. For all they know, all their customers are running nouveau.
My rule-of-thumb is that if you have less than a terabyte of data, you're better off processing it locally, and even that is pretty conservative. Big data is for when you have problem sets that simply will not fit on a single machine. With 4TB hard drives going for about $60, a lot of problems are better solved by simple algorithms in efficient programming languages on a single box.
There are some data sets where you really do need big-data tools, but it's for when you have petabyte-scale data, not megabyte/gigabyte-scale data.
Also depends on the complexity of the algorithm (specifically thinking of large neural networks). We have a model that requires 8 A100s for training due to the size of the activations. No way to replicate that on a local machine and have it train successfully in any reasonable time frame.
However the complexity of the algorithm many times scales with the size of the dataset, either the full corpus or the size of individual examples.
It gets you ~10x speedups for batch predictions, more if your model is big. It's not complicated; it ended up being <1K lines of Python code. I heard a couple of stories like yours, where people had multi-node Spark clusters running LightGBM, and it always amused me because if you compiled the trees instead you could get rid of the whole cluster.
Wow, very interesting, thanks for this. Daily batch predictions is all we do. I’m the maintainer of miceforest[1], do you think this would integrate well into the package at a brief glance? I’m always looking for ways to make this package faster.
I had a brief look at your package, and my impression was that it's only changing model training. If this is correct then the format of the model.txt (calling `lgbm.save(model, "model.txt")`) is the same as regular lightgbm. This would mean you can use my library for inference.
I found the same thing when doing video transcoding. The VPSs were all woefully underpowered. Netcup bare metal (root servers) ended up getting pretty close and were by far the best bang for the buck of anything I found.
Curious what the setup of the VPSs was and why you would expect better than real hardware; video transcoding is quite a beast from what I remember, and I just can't imagine there's a VPS solution that can keep up.
The Intel Xeon processors that cloud providers typically use don't have the Intel Quick Sync core that provides hardware A/V encoding/decoding on typical desktop/laptop CPU SKUs. So the software has to fall back to CPU-based codecs, which are much slower.
AWS EC2 has a VT1 instance family that enables high-speed A/V encoding via a Xilinx media accelerator card.
Oh yeah I love simply avoiding memory allocation at all costs and keeping things to the processor cache and streaming APIs. awk/sed is fantastic for this if you're just working with CSV data, but I've done it in my own custom code processing hundreds of gigabytes of JSON in seconds as well.
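A rough sketch of that streaming style in Node/TypeScript (the field name and predicate are invented); the point is that memory stays flat no matter how big the file is:

    import { createReadStream } from "node:fs";
    import { createInterface } from "node:readline";

    // Process a huge newline-delimited JSON file one line at a time.
    async function countErrors(path: string): Promise<number> {
      const lines = createInterface({ input: createReadStream(path) });
      let count = 0;
      for await (const line of lines) {
        const record = JSON.parse(line);          // one small allocation per line
        if (record.status === "error") count++;   // "status" is a made-up field
      }
      return count;
    }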
I think data scientists just aren't really hugely concerned with programming optimizations or bottlenecks or whatever. Most of them are just intermediate-level python programmers, and that's completely fine until they think they need a hadoop cluster for whatever they're doing and the costs start piling up.
At least to me, the big advantage of static typing is not that it (allegedly) reduces bugs, but that it aids my understanding and helps in navigating the program. It's a tool for thinking and communicating.
I'm not against studies and research - I have a computer science degree myself - but I'm a little tired of being told my personal anecdotal evidence is not sufficient to conclude that water is wet.
As a professional software developer of 30+ years, the doubts on static types puzzle me.
8 years ago, I started dabbling more in javascript, for one of my continuous pet projects. I had it running and grew it to a considerable size, but after a year or two, I lost patience with debugging runtime issues, hunting for where I had forgot to update or initialize or remove stuff, during refactoring. I swore an oath never to use raw javascript again, and rewrote it from scratch in typescript. I am still working on it to this day, and I don't remember being angry at typescript a single day in the intervening 7 years.
My working day jobs have been mostly C++, and these days C#.
Periodically, I will temporarily inherit some of my younger colleagues' projects, if they move on to greener pastures in different companies, with the charter of "can you do something about the long-running issues this software has been having?"
My go-to solution is to go through their typescript and add return types to their functions, and replace their anys with interfaces.
After having done that, I fix the bugs that this revealed, and then I'm usually done.
Recently when I did that, I came across a central class/data structure, which turned out to exist in no less than 5 slightly different variants. i.e. different parts of their code adhered to 5 different assumptions about what fields would exist and be populated (but all expressed on the blank canvas of 'any').
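To make that concrete, a minimal sketch of the before/after (Order and OrderItem are invented names, not the actual codebase):

    // Before: everything is 'any', so five slightly different assumptions
    // about the same structure can coexist without complaint.
    function totalPrice(order: any): any {
      return order.items.reduce((sum: number, i: any) => sum + i.price, 0);
    }

    // After: one interface and an explicit return type; any call site whose
    // idea of "order" differs now fails to compile.
    interface OrderItem { price: number; }
    interface Order { items: OrderItem[]; }

    function totalPriceTyped(order: Order): number {
      return order.items.reduce((sum, i) => sum + i.price, 0);
    }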
I think we need to be careful with trusting even our own anecdotal evidence because it's simply riddled with biases and bugs. 30+ years of experience is certainly impressive but I'd say you probably never worked in a senior team on a larger Clojure codebase for example which would give you quite the opposite impression. You should read the anecdotal evidence in that community, it's very different.
> [...] I'm a little tired of being told my personal anecdotal evidence is not sufficient to conclude that water is wet.
The problem is, other people with just as many credentials as you have the opposite experience. From an outsider's perspective, two people with equal authority say opposite things, what can they possibly do except an independent study?
Also, note that there's a reason anecdotal evidence is not always reliable. E.g. the famous story about fighter pilots and the "regression to the mean" hypothesis.
I suppose what they can do is write some code and figure out where their specific situation lands them on the Static Typing is good/bad spectrum.
In this scenario, I honestly don't think it matters who's objectively right. Software is not a clean, normalized and organized set of use cases after all; maybe static typing works for person X and doesn't for person Y because of their background, or preferences, or codebase requirements, and so on.
Maybe one day we can conclusively prove that on aggregate static-typing/{insertThingHere} is overall less buggy, but even if we did, it'll still change depending on circumstances.
I have far less experience than you and I've definitely experienced drowning in runtime issues when working with a large project in a dynamic language.
On the other hand, though: have you worked with large, thoroughly tested projects in a dynamic language? Personally, I find that good tests catch 99% of the bugs that static types do, plus quite a lot of other bugs as well. Arguably, you ought to write tests anyway to find those other bugs. Since they also find your typos etc, you get to enjoy the ergonomics boost of dynamic typing almost for free.
That's my (also anecdotal) argument for doubting static types.
JavaScript is a fantastic language once you understand how it really works. If you do truly understand the language then maybe you should use something that compiles to JS.
It’s basically self-evident that static analysis reduces bugs. It’s trivial to construct an example of where type information would catch a bug. Unless there is some reason that including type information increases bugs, the existence of a single example where type information catches a bug would prove that overall type information reduces total bug count.
This reminds me of the studies done related to traffic lights and stop signs.
Removing traffic lights and stop signs actually reduces accidents because drivers are more careful when driving through intersections which reduces speeds and drivers become more alert.
Developers will adapt to their toolset. If you have a statically typed language, you trust it will deal with type related issues and you become more lax with testing things related to types. When you develop in non-typed languages like Ruby, you tend to write more tests and not trust your compiler (because you don't have one). This is why you will find most Ruby developers are really good at writing tests and embracing TDD.
Your point is valid, but you very quickly move past just how slowly drivers have to go when there aren't traffic lights. As with everything, they're a helpful tool for efficient traffic, just like static compilation.
I can't speak for all Ruby developers, but I found that I could read a pull request from just about anyone I worked with and find a spot where they hadn't covered a possible nil with a test. And yes, we had coverage checks.
A type system can keep you from having to write those tests.
Those languages don't have null safety, but plenty of languages do. Rust, Kotlin, Swift, Haskell, etc.
The claim is true: a type system _can_ prevent null-related issues and eliminate the need to account for them in tests. That's not the same as saying every type system does.
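For example, in TypeScript with strictNullChecks on (names here are made up), forgetting the nil check becomes a compile error rather than a missing test:

    function findUser(id: string): { name: string } | null {
      return id === "42" ? { name: "Alice" } : null;   // hypothetical lookup
    }

    const user = findUser("7");
    // console.log(user.name);   // error: 'user' is possibly 'null'
    if (user !== null) {
      console.log(user.name);    // fine: narrowed to non-null here
    }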
That's a good analogy, because just like when an intersection gets enough throughput, relying on drivers to navigate their way through becomes unrealistic. Once a codebase reaches a certain size or complexity, it starts becoming really time consuming to follow untyped logic all over the place and you run the risk of a rockstar developer putting a scooter object into the side door of your minivan object.
Static typing gives you assurances and tools with which to test your assumptions in the code, for those times when reading the whole stack is cumbersome, and you need to defend against less careful developers. It also transfers a bit of knowledge between developers in a trivial way that would otherwise be a pain to communicate.
I think this analogy is close to the dynamic vs static debate. However, there are probably more factors to consider, such as the competence of the driver (will the driver even care to slow down?), the location of the intersection (an intersection just around a shallow corner) and the value of the driver's car (does the driver care about a little damage?).
In my experience similar arguments hold for software developers. Especially caring can be a big factor; i.e. the "move fast, break things" mentality.
I've been back and forth between typed and untyped languages (somewhere in the range of haskell and tcl) and personally prefer less typing when hacking things together and more typing for high quality software. I'm currently working an infra job where we use both ansible and terraform. They're not direct competitors, but I tend to prefer terraform over ansible when possible, as terraform gives me more "static" guarantees, which translates to more confidence when we apply our code.
One could argue that dynamically typed code is often shorter, and therefore both easier to reason about, and possessed of fewer bugs on a bugs-per-line basis. Not really keen to push that line of reasoning myself, just helping picture one possible argument.
This is true in a local context, but entirely breaks down when a codebase becomes larger than a single person can fit into their brain-RAM. Not arguing or saying you're wrong - just presenting the very quickly reached boundary where the argument breaks down.
It's not just local context. Reading a dense book is still more difficult than reading a less dense book, given a fairly similar amount of information and style in conveying that information. Larger codebases suffer the same problem you mention in a different way, and cargocults in most static languages tend to advocate very verbose writing styles.
Where this falls apart is that the more verbose writing style hasn't been proven to convey more information, or to convey it better. That's an assumption still tossed around.
I would even argue that shorter can do the opposite. You can squeeze an awful lot of information into a tight space in dynamically typed languages that allow functional programming and especially with terse syntax for often used constructs.
This can make it much harder to actually reason about the code, while making it seem easier to reason about. Most people would agree w/ your reasoning on a short piece of logic, which then at runtime spectacularly fails because the inputs don't adhere to the types you expected. In a statically typed language you would not even have gotten it to compile and while it might not feel like a bug is being prevented and actually feel tedious, every time your IDE (or compiler) tells you that the type on something is wrong, you've prevented a potential bug.
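Take a tiny, plain-JS example, roughly:

    // untyped: nothing says what 'param' is or what it carries
    function myFunc(param) {
      doSomethingWith(param.property);
    }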
Easy, right? Well, does param actually have `property`? No idea. What type is `property`? Does the function `doSomethingWith` take that kind of input? No idea. Now I have to check that function, which might be coming from I don't know where, I might not even have an IDE that can reliably determine where `doSomethingWith` is coming from exactly. Even if I can navigate there now I have to check that piece of code and any other code it calls with `property`. Maybe `property` itself is an object and `doSomethingWith` assumes it has yet another property. This can easily go quite deep and I will not be able to easily reason about this at all. You can't tell me that someone can have all possible runtime combinations of this in his head for any reasonably sized program.
Now let's take something that is almost equal but slightly longer to read and write, same thing in Typescript. I've had to define the types of these things somewhere once. Big deal.
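Something like this, with SomeType being whatever interface describes the input:

    function myFunc(param: SomeType) {
      doSomethingWith(param.property);
    }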
Notice how this is really not much of a difference. Just a type declaration and it gives me a lot of safety. Let's assume SomeType defined `property` as non-null, so no `?` needed, I know my inputs have already been checked. `doSomethingWith` also defines its parameter type correctly and we know what `property` is or isn't. No need to know anything from the top of my head or spend time digging through code myself. The compiler knows that I am passing the correct type of object along and I won't get a runtime error (well, OK, it's Typescript, so let's also assume I'm not in a mixed TS/JS code base where I might very easily get `any` kind of object).
Now syntax will be a little bit different, but I would argue the exact same thing in say Java or Kotlin is equivalently short and readable (yes even in Java!) while benefiting from even more type safety:
public void myFunc(SomeType param) {
doSomethingWith(param.getProperty());
}
Didn't really hurt much, did it?
But these are super simple examples. It can get arbitrarily complex.
As a Haskell programmer, this argument does not resonate with me. I find most dynamically typed languages (e.g., JavaScript) verbose compared to what I'm used to. Of course, plenty of statically typed languages are verbose too. But static typing is not a sufficient condition for a language to be verbose.
I associate verbosity with object-oriented programming, whether statically typed or not.
As a clojure programmer, I'd say the same of Haskell. Oop is less expressive than FP, and static typing is less expressive than dynamic typing. These are usually just tradeoffs people choose for their problem domain
> static typing is less expressive than dynamic typing
Here's something I can express with static typing that I can't express with dynamic typing: "this function returns a function which returns an integer for every input". There's no test you could write to verify this property. So I'm inclined to say that static typing is more expressive, since it gives me a way to express and verify properties like this.
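For instance, in TypeScript-ish notation (using number where a true integer type would be stricter):

    // "returns a function which returns an integer for every input",
    // written down as a checkable type:
    function f(): (input: unknown) => number {
      return (_input) => 42;
    }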
Shorter how? The typing can often be implicit in many languages like Scala which makes it pretty short compared to something like Java. While there is a bit of explicit typing, I think it’s well into diminishing returns to force even shorter code.
Because the studies are constructed by equally fallible humans, and almost always badly.
Cold shower attempted, but the plumbing was busted?
Such studies invariably wholly miss the point: when you have a language with powerful type support, error checking is the least valuable work you get out of them. Types do serious heavy lifting expressing semantics.
As a general rule for dev work, trying to make evidence based decisions is fairly difficult. There's just not that much evidence around yet that can make it obvious as to if in your particular situation what the best choice might be.
And at the end of the day you have to contend with being in a work environment where politics and personalities rule, not science (or engineering).
That said, I do wish more devs would take an interest in the available quality literature. Unfortunately I'm far more likely at work to run into an Uncle Bob recommendation than a recommendation from the ACM Digital Library.
It is not evident to me. Having used both statically typed and dynamically typed languages my experience is that I can't remember ever seeing a bug in our fairly large rails app that a type system would catch. Nobody's passing strings where hashes are expected, or Widget instances where User instances are expected. The thing to pass to the function is nearly always self evident. If you did it would immediately be caught when a test runs anyway.
However, refactoring code in C# is much easier than refactoring Ruby because you can lean on the type system. However, writing new code in C# is often much harder because of the constraints of the type system. So really, it ends up being a wash for me.
Even trivial things like a typo in the method name in a method call are not detected at compile time by languages like JavaScript or Ruby (since their "method calls" are in fact just lookups in a runtime hash table...).
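A two-line illustration (made-up object, deliberately misspelled call):

    const user = { save() { /* persist somewhere */ } };
    // user.svae();   // dynamic language: blows up only when this line runs;
    //                // static checker: "Property 'svae' does not exist"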
If you have not seen them, the reason is probably that the code was tested well enough before you looked for the bugs.
> If you did it would immediately be caught when a test runs anyway.
That's the point though. With dynamic typing you would only (hopefully) catch this with manually written tests. With static typing you get that feedback for free at build time.
Not true because anyone can implement just part of an interface and throw "method undefined" for the methods they can't figure out how to implement. This happens all the time.
> Not true because anyone can implement just part of an interface and throw "method undefined" for the methods they can't figure out how to implement. This happens all the time.
How would that pass any code review, regardless of static or dynamic typing?
The only time I have ever seen something like this is using `todo!()` while initially writing code. I have never seen someone check in code like this.
What kind of clown show of a programming org are you working at?
This is morally equivalent to "There's no point to having a safety on a gun, because the safety won't stop you from bashing someone in the face with the gun." If you really want to, you can throw exceptions or crash the process or call exit() or call system("shutdown -h now") anywhere in your codebase. That has nothing to do with a type system.
>Nobody's passing strings where hashes are expected
See, when I'm throwing together apps to clean up configurations, I am often Pythonifying XML. And when handling different return values, reshaping them into the useful components I need and trying to analyze data (and dealing with different return formats depending on the number of results, aka a dict if there is one value, or a list(dict) if there are more), I have to constantly remember if I am going to be getting a list(dict(dict(dict(str)))) or just a dict(dict(str)), and so on. But that's me cobbling together scripts and not understanding the API by heart well enough.
Yes - in every case it is calling a method on a null reference. And no commonly used statically typed language helps here because they all allow null references. And languages that disallow nulls, if you are one of the 10 programmers on earth working in one of those languages, don't help you because you are dealing with real world data where inputs to your system can be null or not so you end up using some type system escape hatch anyway.
> no commonly used statically typed language helps here because they all allow null references
This is false.
TypeScript, Swift, and Rust are commonly used and support non-nullable references.
> those languages, don't help you because you are dealing with real world data where inputs to your system can be null or not so you end up using some type system escape hatch anyway.
You don't need escape hatches to deal with "real world data" that may be missing some values. This blog post is my favorite detailed comparison of handling "real world data" in static vs dynamic languages: https://lexi-lambda.github.io/blog/2020/01/19/no-dynamic-typ...
The only requirement for working with "real world data" in a static non-nullable language is to choose whatever kind of behaviour you want when working with the data. Everything you can do with null references, you can do better with option types; there is nothing that null references uniquely permit.
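A small sketch of that in TypeScript terms (Customer and its fields are invented): absence is part of the type, so it has to be handled before use, with no escape hatch required.

    interface Customer {
      name: string;
      email?: string;                 // may legitimately be missing in the input
    }

    function contactLine(c: Customer): string {
      // returning c.email directly would not compile: string | undefined ≠ string
      return c.email !== undefined ? c.email : "no email on file";
    }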
Dunno what to tell you dude; my imagination stretches there just fine. Maybe consider going to imaginary yoga class.
There have been people writing at least two of those languages everywhere I've worked for a while. Most of my professional colleagues can write at least one of these comfortably. I'm extremely confident in being able to hire programmers for all of these. They're all in use at every major tech company.
If you really want to stick your head in the sand and cry about how nothing can be better until they're literally top of the charts, I can't stop you, but they're certainly not rare. There's good stuff out there. Lots of people are using it. You can too.
If you'd rather trade links to charts, I trust Stack Overflow's developer survey's methodology a lot more than TIOBE's. 30% of respondents said they've worked with TypeScript, and that jumps to 36% in the professional developer subset. Rust is 7%/6%. That's a hell of a lot more than 10 developers.
My country has about 15% black people and about 7% asian people. My country has about 4% LGBT people, and my city has about 15% LGBT. It would be really weird to hear someone say that black, asian, and LGBT people are not common, especially after knowing and working with plenty of them.
> don't help you because you are dealing with real world data where inputs to your system can be null or not so you end up using some type system escape hatch anyway.
You clearly have little experience with such languages, then.
Python (with mypy) has strict optional on as default and is the most widely used language, according to the latest stackoverflow ranking. Assuming you use mypy of course, which probably takes the usage rate down a factor of 100x but still.. ;)
It makes you actually have to consider scenarios where a variable can be none or not and try to push the validation up closer to where it entered the system.
Static typing doesn't mean type information is explicitly written out everywhere. Most statically typed languages allow some version of `let x = 5`. Similarly, static typing doesn't mean unsafe casts are never performed.
Also, in the opposite direction, many dynamically typed languages allow specifying types if you want to, including Python.
x still has a static type; the compiler just infers it from the assignment, so the type information is still there. Agreed that implicit/unsafe casting is still an issue in some languages, though.
Allegedly? Have you ever written code in a dynamically typed language? I'm forever fixing TypeErrors and AttributeErrors and the like.
I suppose it's not even necessary to argue about experience fixing them or not, just the fact that those are runtime errors rather than compile-time (and so we presume not shipped) shows it reduces bugs doesn't it?
I always notate my functions with JSDocs and my DTOs as jsdoc types which in any modern IDE gives you the same advantages that you would get out of the explicit typescript interface/type.
And Unlike typescript my code doesn't need to be transpiled at all since it is already vanilla JS.
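For anyone who hasn't seen the style, a minimal sketch (Invoice is a made-up DTO); modern editors feed these comments through the TypeScript checker in JS mode and give much the same hover/autocomplete experience:

    /**
     * A DTO described entirely in JSDoc.
     * @typedef {Object} Invoice
     * @property {string} id
     * @property {number} total
     */

    /**
     * @param {Invoice} invoice
     * @returns {string}
     */
    function describeInvoice(invoice) {
      return invoice.id + ": " + invoice.total;
    }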
The presence or absence of a compilation stage has nothing to do with static typing. Flow.js is static typing. MyPy is static typing. It sounds like your JSDoc comments are static typing, if your IDE ends up passing them through the TypeScript type checker in JS mode.
It may not be very comprehensive static typing, but it is static typing nonetheless.
Reduces bugs in size, or only leaves smaller bugs behind?
These are two very different outcomes.
If the remaining bugs are unrelated to the class of bugs that were eliminated entirely, then the difficulty in finding them has little bearing on the outcome, since we’re now talking about an entirely different class of bugs.
This is conjecture, but in my experience you'd get the following:
- in JS, your code will run with the bug then do something catastrophic during runtime that you can then notice and trace to the core issue
- in Java it won't compile, so you fix it so it compiles and runs, then it'll hit you in like 2 hours of runtime with a NPE or something and you'll have no idea what caused it
Maybe Kotlin, Rust, and the like solve that sort of thing better but I've yet to be convinced.
But is there any evidence that the NPE would have blown up catastrophically without type checking? Is the NPE even related to type checking?
It seems you're describing an orthogonal issue, and it's unclear why type checking is a Bad Thing or even related to the NPE at all.
Let's say I work on an assembly line, and must place physical parts into a machine that assembles a larger part. There are many ways this machine can break down - I could put the wrong parts in, leading to a complete failure, or some part of the machine could malfunction independently.
- We could implement part validation on the assembly machine to make sure it's impossible to insert the wrong parts. This eliminates failures related to incorrect part insertion.
- Unrelated to this, a drive belt starts to wear out and slips every so often, leading to a slight slowdown in a conveyor belt, which ultimately leads to a botched item.
The way I read your argument, you would say that part validation is bad, because it's easier to diagnose a meltdown when incorrect parts are inserted by the operator than it is to determine that the drive belt is slipping.
Except the drive belt slipping is not related to operator error, and would have happened whether part validation was happening or not.
This is hopefully obviously nonsensical - better to reduce the overall error rate by implementing part validation than to leave two avenues for error. Before part validation, the machine could fail because of operator error (common) or drive belt failure (uncommon). After part validation, only the uncommon error occurs.
This is better than no validation at all, even if drive belt failure is harder to identify than the machine screeching to a halt when the wrong parts are inserted.
Static typing has an undisputed benefit: performance. If I need to add dynamic thing a and thing b, I'll always have the overhead of first figuring out what add means in this context, an overhead I don't have when asked to add some ints.
All the other claims from readability to understandability to refactoring to less bugs, all come with an “it depends” caveat. Sometimes the claims are true, sometimes they’re not. It’s also not possible to say “but in most cases claim X holds”.
The thing I’ve never understood yet in this debate is in my experience, the people who have argued about correctness have universally been below par at getting to the bottom of requirements. Which leads to “great, you correctly built the wrong thing. And you took forever to do it.” Which isn’t doing our profession any good in the eyes of other professions who depend on us.
Not necessarily. This increased ease of understanding could also simply result in faster development speed - so same number of bugs, but more features in less time.
I agree. At least for JavaScript I would always use TypeScript now. The main reason is understanding of code as well as tooling, which means communication in the end.
I remember working 2012 on a SaaS app, and I wasn't the only guy anymore doing frontend stuff with JS. I knew my objects, but my colleagues did not. How to you document object APIs? TypeScript really shines in large projects with lots of devs.
What is an item in the second declaration? That it has type "Item" doesn't help you unless you have contextual information. And if you have contextual information you can probably figure out what an item is in the first declaration too.
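For reference, the two declarations under discussion were presumably along these lines (first untyped, then typed):

    function add_item_to_cart(item) { /* ... */ }

    function add_item_to_cart(item: IItem) { /* ... */ }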
An item is an IItem, as it says in the definition. You can always ‘figure out what an item is’ in a dynamic typing system, that’s not the problem. There’s an incredible amount of mental overhead in any non-trivial project which employs dynamic typing. Engineers who can work around this have my respect, but I find static typing to be the easiest solution to this problem by far.
That fact is useless without any context. An "item" could have been an "Ifdkjsj" and you'd be none the wiser. "incredible amount of mental overhead" needs a citation and, as shown in TFA, no citation exists.
Maybe I’m not understanding where you’re coming from, because as far as I can tell a ‘lfdkjsj’ and a ‘skfjwb’ which are both an IItem, or both an IWhcjwp, are easier to work with than the dynamic alternative. Regardless of how poorly named a variable is, in a static typing system what you see is what you get, whereas in a dynamic typing system what you see could be anything at runtime.
The only citation I have is the tenuous grip I have on my own sanity - I could have more correctly talked about the incredible amount of mental overhead this has _for me_, but read the rest of the thread and you’ll see that this isn’t an uncommon experience. As I said, if you can work around this then you have my respect.
> in a static typing system what you see is what you get, whereas in a dynamic typing system what you see could be anything at runtime.
What you see in a statically typed language: IBlaha blah. What could blah be? An IBlaha. What could IBlaha be? Anything! The type has not gained you anything.
> The only citation I have is the tenuous grip I have on my own sanity
That's an argument from authority where you are the authority. It doesn't work on HN since we are all skilled developers. I've also been a software developer for decades and I can count on one hand the times static types have provided a tangible benefit.
Now we are just talking about using a sane naming convention for your types. Depending on the paradigm you are working in your instance of IBlaha has a consistent definition - in a trait or a class or whatever else you like. It also likely makes simple work for your IDE during refactoring as others have observed. You don’t get this from dynamic typing.
Of course you could say the same thing about sane naming conventions for your declarations in a dynamically typed language - and you wouldn’t be wrong, but a compiler won’t help you in the case of human error. All I’m interested in is offloading as much complexity as possible onto the tools at my disposal, so I can focus on what’s important.
On my citation … that was tongue in cheek and I thought it was obvious. I don’t have a citation, this is all my own experience. For the third time, if you can work your way around this you have my respect.
Sure, the type IBlaha is defined somewhere, just as the object(s) passed to add_item_to_cart are also defined somewhere. Again: "That it has type 'Item' doesn't help you unless you have contextual information." It has nothing to do with naming conventions. Whatever simplistic tooling does is irrelevant, since this sub-thread was about the meaning of two declarations in an HN comment.
Dynamically typed languages are very popular so it seems that many developers can work their way around dynamic typing.
> Sure, the type IBlaha is defined somewhere, just as the object(s) passed to add_item_to_cart are also defined somewhere.
That definition is rarely as accessible as an explicit type though. For example take an API response or any third party library. Determining the data type isn't as quick as simply scanning a function for the object definition.
> Dynamically typed languages are very popular so it seems that many developers can work their way around dynamic typing.
As someone who has spent a fairly even mix of their career using typed/untyped languages, I think this is due to a few reasons:
- Lower initial learning curve.
- Lower barrier to entry.
Those are real benefits, but I would argue most projects quickly hit a point where they benefit from static analysis.
Having worked with 100s of devs at this point, I'm yet to meet one that after learning a typed language and using it for a sufficient period of time (more than a few months) wants to use an untyped language for anything outside of small scripts.
>What could IBlaha be? Anything! The type has not gained you anything.
You're confusing Java-type extreme (and also mostly strawmanned) application of OOP with static typing.
Not every type in your program has to be AbstractFactoryProxyBeanInterface, and if you don't write code like that it's either obvious or some kind of extension interface for non-core code.
If it were an item id the argument would be item_id. It is an object. What type of object? The type that can be added to a cart. You don't just drop a programmer into the code and have them call a function in a vacuum. Nobody just throws random objects at a function. They are familiar with the code in general and they know what to do.
> If it were an item id the argument would be item_id. It is an object.
You've never worked with vaguely named variables? What you are suggesting is guessing the data type based off the name.
> What type of object? The type that can be added to a cart.
Okay sure, but what precisely is that?
> Nobody just throws random objects at a function.
I couldn't agree more - so the follow-up question is: what is the fastest way to get familiar with what type of input this function takes and what output it returns?
> They are familiar with the code in general and they know what to do.
For very small projects with very small teams after some onboarding time perhaps, but outside of this I would disagree.
Code changes over time; parts that you used to know intimately get changed subtly, and that knowledge erodes away. Having types in place highlights these changes if your assumptions are incorrect.
If you're working with badly named variables you have more problems than a type system can help with.
It doesn't matter "what precisely" is the thing that you are adding to the cart, and a static type system won't tell you that either. There could be any of 1000 things that implement IItem. And probably half those things just throw exceptions for methods they aren't actually able to implement.
If you only use trivial examples, types seem silly. But in real examples they become more useful. In this case, looking at the function signature gives you immediate information about the implementation that is missing from the untyped version.
EDIT: please excuse formatting, I'm on mobile and cannot get it to add spaces before the last code block
Depends on the specifics, but I'm betting if `IItem` is close by I know how to interact with it. I have no clue what the hell fields or methods may or may not be on `item`. Nor will I, ever. At best I have to enforce methods/fields myself, at worst I subscribe entirely to duck typing and let the gods sort it out.
> I don't need tests to check what methods or fields are on my types though.
You do though, because invariably people violate the LSP and just "throw Unimplemented" in the methods required by the interface they can't figure out how to implement. In other words all system are duck typed in reality.
I don't, though. If it compiles, it exists. Saying that a method might actually be a nuke is a bit beside the point. It could also contain a virus.
Not sure what typing system you're referring to, but it sounds very half-baked at best. I'm using Rust fwiw.
I cannot access any field or method that does not exist. Even dynamic traits are compile-time enforced, but I think we can largely have this discussion around static dispatch.
But suppose you were unfamiliar with the code: the 2nd tells you what fields/methods are available for "item", and furthermore most IDEs will use that info to populate autocomplete suggestions and such.
How would a "language engine" know what you can do with `item` if it has no type information?
You can do that with Python (sometimes) because many libraries have type hints today, so even if you don't use types yourself, the type checker can infer them in your code and help you out.
I still don't see how you can determine an object's attributes without type information. If you're inside the function, all you know is there's a parameter named item. How can you provide autocomplete there?
All of the examples you gave are from static languages, where the information is known at compile time (except for valgrind, which requires a runtime). The parent to my original post was claiming that you can have the same tooling for Ruby.
Also, you're wrong about Rust. Lifetimes are part of the type system.
What about return types? Do you generally deal with voids in your line of work? Having code that could return different types depending on branching is pretty self-evidently worth preventing.
Static typing does not imply type annotation. For example in C++:
void add_item_to_cart(auto item)
will still statically verify that item supports the required operations; C++ is rather weak on this front, only doing the verification at template instantiation time. Languages with more sophisticated type systems can infer the correct type from the add_item_to_cart definition alone.
And when we understand this, we can weigh it up with alternative tools for thinking and communicating!
Would this 10 line shell script be better in a statically typed language? Well maybe not because I can hold all of 10 lines in my head, there's nothing else to communicate.
Would this CRUD app using Django/Rails be better with static types? Well the framework has defined a structure that communicates properties of the code to me, I don't need types written down because I already know them.
Would this complex parsing process of untrusted data into a trusted and verified format benefit from static types? Yeah probably, testing will be tricky and code review for security is hard, types will help reason about the possible states of the system.
There are lots of alternatives to static types: documentation, testing, frameworks, design patterns, code review, pair programming, error messages, and so much more. I'm generally a fan of static types and find them very useful in a lot of development, but they are a tool in a big toolbox.
This is the major one. I work on a 10 year old rails app and it has got to the point where we are terrified of making any change that has the potential to affect areas outside of the visible git diff. It's easy enough to manually verify regular changes by looking at the code. But something like a library update is impossible to verify and we constantly end up with production issues because of something changing in a library that wasn't mentioned in the upgrade guide and would be impossible to have considered beforehand. But that a type system would catch.
Sounds exactly like every dynamically typed code base I’ve ever worked on. Even if it’s not big. If you can’t grok all the parts that matter (which is a lot of the time), you are screwed.
I find that is a far less common problem than the documentation being wrong. Even if someone doesn't add documentation for some library, static types provide a lot more insight into how it works than dynamic languages (Racket style contracts are even better since they can check way more than static types while still working in a first class way with docs).
They can be consumed by static analysis tooling, which assuming properly configured etc. makes it sort of 'dynamically typed language with the guarantees of a statically typed one', at least so far as the hints are complete.
Most statically typed languages compile down to object code which runs in one of the most dynamic runtime environments imaginable. What are the types in the source code but “comments”?
It’s only hype because it’s imprecisely stated. Static type systems make entire classes of bugs impossible at runtime. The stronger (read less permissive) the type system, the more classes of bugs cannot occur.
On the other hand it increases development time and makes modifications and new features much harder to implement. If you took it to the extreme, you could also mathematically prove your code is correct for every input variable.
Everything's a trade-off, the question is which approach is best for your application. Your average website doesn't warrant as much rigor as a Mars rover.
Are you sure all statically typed languages are slower to develop in than comparable dynamic ones? I used to be thoroughly convinced this was true, but 3 things are now making me doubt it:
1) Statically typed languages with inference don’t require time spent writing signatures.
2) I know I’ve spent time chasing down bugs in dynamically typed software that would have been caught by a type checker.
3) I also know I’ve spent time writing tests for conditions in dynamically typed code that wouldn’t pass a type checker.
> bugs that would have been caught by a type checker
Type checking also introduces its own set of additional bugs by virtue of object incompatibility, bugs that do not exist at all in dynamically typed languages (or are handled correctly every time by the compiler/interpreter automatically).
Take as an example exchanging objects over sockets, rest, files, etc. Whenever the object definition changes in another piece of the software stack the statically typed parts will crash upon receiving the updated objects, even if it's just one new param added that would've been fine otherwise if dynamically typed (or say change from a float to a double which can be irrelevant). A nightmare in systems with lots of moving parts.
One might say that's working as intended, and it of course is, but it also forces you to fix and recompile all of that for no real net gain. Hence the longer dev time I mentioned.
I've spent years working with statically typed languages, and I honestly don't think I'll ever go back to them in any professional capacity.
> exchanging objects over sockets, rest, files, etc. Whenever the object definition changes in another piece of the software stack the statically typed parts will crash upon receiving the updated objects, even if it's just one new param added that would've been fine otherwise if dynamically typed
Anecdotally, this is not true for C++ using JSON or msgpack, since those are self-describing formats where extra fields are safe to ignore.
And it's not even true for Rust using serde, which writes the serializing / de-serializing code for you. serde_json will also ignore unknown fields when parsing, and you can preserve the original object as a `serde_json::Value` in case you want to pass unknown fields downstream as opaques.
Protocol buffers and Flat Buffers also have solutions to this, and all 4 of these formats are pretty popular in both static and dynamic languages.
Even if you write a custom TLV format, this is not that hard to deal with.
Was this common in the static code you worked with? You weren't just casting objects to `char *` and doing a `memcpy`, I hope?
The protocol buffers approach is quite interesting. With proto3, there's basically no such thing as a "required" field anymore. All fields are now optional.
Now, being able to assert that a field is present in an object is a basic and valuable use-case for static typing. However, the developers felt that even this basic level of static type checking added too much friction whenever they had to update systems.
You can't typecheck across system boundaries - that's the OP's point.
(Well you could, but then you'll be introducing a whole set of problems if the other system has a static type system that behaves differently from your application's static type system)
What are some strong examples of this? Haskell does an amazing job with it. Java technically supports some amount of inference, but it doesn’t reduce verbosity by all that much. Apart from those I haven’t run into it.
Dynamic typing only increases development speed for the first few thousands lines of a solo programmer project. After that it is, in my personal experience, a significant drag on development speed.
Furthermore, dynamic typing makes modifications and new features significantly harder to write. Turning compile-time bugs into runtime bugs is a catastrophic decrease in development speed.
taps temple That's why you keep code encapsulated as much as possible in separate files/classes up to like a thousand lines. Any more will be unreadable anyway.
> dynamic typing makes modifications and new features significantly harder to write
Depends on what you're doing I suppose.
Adding a parameter to an object? With dynamic typing you just add it to the object in literally any location, no issues. With static typing you might just need to refactor half your codebase if you have lots of interfaces. Have fun resolving merge conflicts with your team.
Not only that, but (in Java as an example) the serial version IDs of objects will change unless you planned for that previously (you didn't), making old objects impossible to load. A completely new bug that's created solely by static types. And it's not the only one.
Maybe it's an issue with Java. In Rust if I add a new field, I just wrap it in an Option.
When de-serialized, old objects have a None, new objects with the field have a Some. When serializing, it's just `Some(5)` instead of `5`, if it's an int.
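For comparison, a rough TypeScript analogue of the same idea (the type and field names here are made up for illustration):

```typescript
// Hypothetical type: `difficulty` was added after old objects were already saved.
interface SavedGame {
  score: number;
  difficulty?: number; // optional, so old serialized objects still type-check
}

// Note: JSON.parse returns `any`, so a real system would also validate this shape.
const oldSave: SavedGame = JSON.parse('{"score": 100}');
const newSave: SavedGame = JSON.parse('{"score": 250, "difficulty": 3}');

const difficulty = oldSave.difficulty ?? 1; // default for objects written before the change
```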
My personal hell is having some nested objects/dictionaries and wanting to either add or remove a layer. Good luck and God speed in your attempt to simply find every single line of code that needs to be updated.
> On the other hand it increases development time and makes modifications and new features much harder to implement
I am not sure that that is correct. Type inference significantly decreases development time imho, and access to compiler errors means significantly less testing time because a certain class of errors is avoided.
Personally, I find that my productivity w/ Haskell was significantly higher than with python exactly because of the type system even though I have 3x the experience in years w/ Python.
I have inherited (hah) a project at work and had to introduce all sorts of sanity checks (mypy, pylint, etc.) as pre-commit hooks to make my life easier wrt bug hunting.
Ahaha I've had the exact same issue recently, fortunately it was only in a few spots. Fair enough.
Interestingly enough, py2 was smart enough to automatically load things in the correct format. Loading a binary file? Get bytes. Loading a text file? Get a string. Py3 has worse functionality now, all for the sake of consistency.
The articles from the review (as far as I've scanned them) seem to take a common view of static typing, but the studies themselves are obviously weak. There's almost no way to test such statements, because how are you going to generalize that? Some studies use restricted tasks, which would provide some indication of the usefulness of static typing for e.g. beginners, but then the reviewer starts mixing up all kinds of problems and concepts. E.g., who cares that there's more variation between programmers than between languages?
To use that review as an authoritative statement is ingenuous, to say the least.
> Hype: "Identifiers should be self-documenting! Use full names, not abbreviations."
> Shower: Researchers had programmers fix bugs in a codebase, either where all of the identifiers were abbreviated, or where all of the identifiers were full words. They found no difference in time taken or quality of debugging.
That's a very weird take on the statement. The downside of using abbreviations is probably dominated by the difficulty of fixing the bug. The problem I have with abbreviations is that they just become another thing that you have to spend your mental resources on. “What is this variable exactly? Oh, I see.” is just wasted effort.
Maybe I could get it done in the same time, but I'd be really annoyed, and less certain what things are intended for. An IDE can really carry the burden of figuring out what things are, but you're still going to lack some context.
Usually it doesn't matter WHAT things are; I need to know WHY you have a variable, and what its intended use is. That can be explained in the variable name a lot of the time.
I have the same argument against the no-comments evangelists, and wanting to squash all commits into one when they're merging. Yes you can read the code diff, but that only tells you what, not why, something was changed. Why did the API endpoint change? Why do we have to call the payment processor before this event rather than after all of a sudden? Why did the add to cart button move up one div? All very useful information when you have to come back to fix things.
For the no-comments evangelists, I understand the idea is to make the code as self-documenting as possible, and that's awesome, but you can still miss out on the why, and sometimes the Why is entrenched in external business requirements that aren't in the code.
This stuff all feels better suited for a commit message. Conversely if I found all this in code I'd delete it.
I'm an almost never comment person (I use them when things are weird or inconsistent, but otherwise view them as noise) but I've been persuaded by John Ousterhout that code cannot adequately describe abstractions, and you need comments to fill in the gap.
Similarly with variable names, if you don't know i is index, please don't try working on this code. I get some code is a soup of inscrutable variable names and flow control to the degree it looks decompiled, but that's a higher level composition problem, not a "use more nouns" problem.
I also almost never comment code architecture and logic, but code is usually solving some business problem and documenting that in code is just not always reasonable or even possible. Also bugfixes and compat related naming in live software gets you things like fixForObserverOverrideBugIn364
For me it's more about not wasting people's time. Names are opportunities to divulge minor contextual information that you spent time figuring out, and it would be going backwards to then compress that into shortened names.
i for index is fine, but if you have nested loops or multiple loops it gets annoying.
Oh yeah totally get it. I feel like our area of agreement is that whenever you're writing code, you should be writing it to be read and understood, not simply executed. Whether that's through naming, comments, docs, structure, etc.
An example of a comment I'd value is something like a "why bother" at the top of a file doing a lot of in-depth algorithmic work, like a note accompanying a Chesterton's Fence. It makes some sense (though I quibble with this all the time) to design things so as to be as understandable as possible, and then only start optimizing when necessary, and when that happens it's useful to put a "this is pretty complicated, but it cuts CPU time down by 30%. [here](link) are the tests, and [here](link) is the naive equivalent implementation".
People nearly always gravitate towards abbreviations in natural language. The more a word is used, the more likely it gets shortened. LA for Los Angeles, Frisco for San Francisco, Vicki for Victoria, Jay for Jason, Dub for George W Bush, Doozy for Duesenberg, and on and on. Why should programming be different?
Because abbreviations are nice when there's a shared context that is perpetually "in memory" - I don't have to think to know what LA stands for.
However, that is not the case when debugging. It is almost certainly the case for the person who wrote the code originally, but it is certainly not the case for the next person to come read it afterwards. IMO code is meant to be read, so I try to never use abbreviations (other than e.g. _i_ as a for loop index, which, given its pervasive use, falls under the category of "shared context").
I read somewhere, "The length of a variable name should be proportional to the size of its scope".
Usually I don't have long loop bodies, so if the loop body fits in 24 lines, `i` is perfect for the index.
If I'm locking a mutex, doing something, and quickly unlocking it, `l` for the lock guard is fine.
But if it won't fit on screen, it needs a longer name.
And if there's something like a top-level `App` struct, I just call it `a` because its scope is really just `main`, even though its _lifetime_ may be the entire process.
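A tiny TypeScript sketch of that rule of thumb (names made up for illustration):

```typescript
const monthlyTotals = [120, 90, 310];

// Tiny scope: a short name is plenty.
for (let i = 0; i < monthlyTotals.length; i++) {
  console.log(i, monthlyTotals[i]);
}

// Wider scope: the name carries more context because it lives longer.
let runningYearToDateTotal = 0;
for (const total of monthlyTotals) {
  runningYearToDateTotal += total;
}
console.log(runningYearToDateTotal); // 520
```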
Shouldn't you have some shared context when working on code together, even at different times? Maybe they're terms of art, maybe it's a style guide, but there oughta be something.
Like, isn't it fine to change:
var basicAuthenticationController = new BasicAuthenticationController()
to:
var ctrl = new BasicAuthenticationController()
I find that if code authors do work upfront to contextualize things for me that it helps tremendously. Like do I need to know at every point that this is a BasicAuthenticationController? Or is this the basic auth module and we're always dealing w that controller? I prefer it when engineers set the table for me like that, it helps me narrow my focus to the purpose of the code.
Abbreviations that are widely understood are fine. sql_string is superior to structured_query_language_string. On the other hand, I would prefer customer_id to cid in most cases.
Kinda going off on a tangent here, but it was a mild "ohhh" moment for me when I first realized that the company Cisco is actually named after San Francisco, followed by a second such moment when I realized that Cisco's logo represents the Golden Gate bridge.
When I worked as an analyst in DoD, it was widely understood that acronyms would appear everywhere in everything we did, but also that it can confound a reader who bumps into an unfamiliar acronym if it appears out of nowhere in a report. The caution and best practice was to always give the full name the first time it is used, together with its acronym in parentheses. Then after that you are free to use the acronym as much as you want.
I doubt it would be controversial if programmers adopted a similar convention.
The problem is everyone thinks they're good at naming when they're the one writing the names, but it turns out everyone is terrible at naming when you're the one reading the names.
Names evoke ideas, and this is very subjective. So "self-documenting names" can turn into "misleading names" in a hurry. The name "x" never misleads because it evokes nothing.
How new devs modify the code depends on its names too. So does how we further consume code modules. Architecture and system complexity are virtually emergent from names. So I agree. Kind of a nonsense one.
This matches my experience. People who say otherwise might need to put more effort into writing better tests.
That said, static typing usually catches minor problems like typos faster, whereas with dynamic typing they get caught by tests a bit later. But tests usually run faster with dynamic languages, so it's a tradeoff.
My favorite approach is the mixed approach like TypeScript:
1. Faster feedback loops: Usually statically typed languages compile slower, and thus have slower feedback loops. But languages like TypeScript can skip the type check and just emit the JavaScript, making the test watcher very fast: as soon as I hit Cmd-S I can see the result.
2. Optional typing: Sometimes a function signature is going to be 10x larger than the body, which is really a hassle. Sometimes I just skip them, or skip typing module-private functions as long as they're properly tested (see the sketch below).
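A minimal sketch of what I mean by (2), with made-up names: the public surface gets an explicit signature, the internals lean on inference.

```typescript
interface Row {
  name: string;
  total: number;
}

// Explicit signature on the exported function only.
export function formatReport(rows: Row[]): string {
  // `r` is contextually inferred as Row and `lines` as string[]; no annotations needed here.
  const lines = rows.map(r => `${r.name}: ${r.total.toFixed(2)}`);
  return lines.join("\n");
}
```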
I like types, but I like types even more when they are at the edge of two systems and both systems understand them. For instance, in the past we ran Rails APIs and a lot of mobile apps consumed them, and there was a lot of math involved. We put Google's protocol buffers between them. The data's type info is thus shared. I liked this pattern a lot, and it did reduce A LOT of bugs (for instance ints vs floats).
Another thing: people working with dynamic typing often forget that often times they are interacting with a system with extreme opinions on types, namely their relational database, which is often times literally the heart of the business. If your language of choice has typing, you can sync that type info (smart ORMs, codegen), and thus reduce friction. With that said, I develop in Ruby and my understanding of where there might be "type friction and potential for bugs" has improved a lot over the years, so I don't mind the lack of types until we get optional typing sorted out.
Most times I can't bring myself to get into a cold shower. If I start warm I can cool down from there. Is there a way to start cold and go even colder?
The way to start cold is to stick your head/face directly in to start the shower. That gives you the strongest mammalian dive response. Not sure if that offsets any of the cold shower benefit though.
Edit: This is also the time of year to do this, while the tap water is probably warmer and your house is warmer. I wouldn't try to acclimate in winter.
It's pretty easy to start cold if you've just done a huge amount of cardio, otherwise there's nothing wrong with just turning off the hot water completely during a shower if you want a brief cold period.
For these people who claim to take cold showers, I want the water temp. Out of the tap, in summer, my water is 53 degrees (F), upper 40s in winter. I'm not sure I could get through a whole shower sans hypothermia.
Edit: looked it up. 30min-2hr for hypothermia at that range. Typically I take a 15-20 minute shower, so only flirting with hypothermia.
I love a hot, hot shower. Best advancement in technology in the last thousand years
My showers are about five minutes, a habit from growing up on tank water during a drought. Water on, jump in, wet your hair, soap, shampoo, wait a minute, rinse, conditioner, wait a minute, rinse, jump out. Cold showers are great motivation to be even faster.
Interestingly, one of the theories behind why cold showers might be good for you is that they induce a heat shock response, same as a very hot shower.
I don’t think there’s anything wrong with starting warm and cooling down. The only thing that matters is that you can go cold enough for a decent amount of time.
For me, there is some value in facing the cold water head-on in the morning ("Wow, this is going to be cold, but here goes!"). Great way to start the day. I have to admit though, the cold water is not super cold here this time of year.
For me the biggest advantage of static typing is that it allows you to safely refactor code. Without it, even with extensive unit test coverage, refactoring is often just not an option.
So many people in the thread saying this. But in my experience refactoring large code bases in both Java and Javascript, it's roughly the same. Ultimately the real answer will have to come from a peer reviewed study, because as the post suggests, these sorts of things are not as intuitive as people think
I have done refactoring multiple times on both untyped JS and JS fully typed with Flow. With the latter, once the Flow compiler stopped complaining, one could expect things to work. With the former, despite extensive tests, it sometimes took weeks to fix bugs caused by the refactoring.
Granted, this is personal experience and I may simply not be careful enough to do untyped refactoring, but the few people I talked to about it shared the experience.
Anecdotes don't really count for much, of course, but the balance is against you, here. It is clearly and evidently true that when you refactor, static typing systems find a lot of errors that you would discover at runtime in dynamically-typed systems. It's true almost by definition.
It can still be fine if you're being meticulous and know the codebase well enough, but otherwise, you'll miss corner cases in parts of the code with less test coverage or typical interaction. When you accidentally occasionally have a string in place of a number deep in some data structure, you might not notice for a long time.
How are you defining strongly typed here? Wikipedia says the definition is loose [1] but also says that Java is usually considered stronger than many other languages
Java's type system is only passively useful. There is no way to put it to work on behalf of designers to deliver more powerful libraries, unlike C++, Haskell, and, to a lesser but still significant degree, Rust.
Hard to measure; the JavaScript I worked with was mostly front-end, which is harder to write automated tests for. Not impossible, but the higher friction naturally leads to fewer tests, especially given how frequently the front-end changes. "Level of safety" would also need better quantification. Something that a peer-reviewed study would be better suited for.
IMO a lot of truths in programming are not amenable to "scientific studies".
Static typing is objectively better than dynamic typing for the vast majority of cases, but you can't capture this in a scientific experiment or study.
The only thing you can do is find people who are similarly experienced, break them up into groups (n=1 is also ok) and ask them to complete a specific project, and see how much time it takes. But even then there are so many caveats. The experiment must be set up such that the presence of extensive library support for a particular task is not a confounding factor. There's also the open question: how do you measure these people's skills before the experiment begins?
Your feelings are often a good heuristic. If something makes you feel depressed, it's because your mind has gathered enough information from past experience to know the situation is hopeless. It's your mind discouraging you from wasting your energy on a pointless activity.
This is actually the case, lol. Whether it is the extra cognitive load of not always knowing the types, or whether it actually makes things go faster - IDK, but I definitely want the types.
This one definitely hits close to home. By empirical measure, static typing reduces bugs, but maybe it's just me. I definitely like the comfort of trusting my types.
My only experience with dynamic typing is really Node, which improves a bit with TypeScript. But I'm still extremely wary of it because it doesn't give you runtime guarantees. It irks me a lot that you can just do JSON.parse, declare any type on it and call it a day. I firmly believe that static typing increases code readability and catches low-hanging fruit, but I'm going to have to sift through that research, it seems.
Well, how is this measured? My experience working over 3 decades with large teams also says this is so (static typing does prevent bugs), but that is not science. How do they measure this 'when you go and check'? I am really curious, as I don't even understand how people make large systems without static typing, and all massively complex systems that I personally have worked with that run for decades transacting billions$ etc without flaws are all statically typed.
We don't seem to be able to measure it, despite people trying several wacky approaches.
Either that means it's hard to measure, or it means the effect is not actually there.
Either way, this means that we cannot say that static typing reduces bugs.
> all massively complex systems that I personally have worked with that run for decades transacting billions$ etc without flaws are all statically typed
Did you try building the same systems without static typing as well and see what the difference was?
Or are you just saying that you built systems with static typing and they were successful? That tells you nothing about static typing except that it doesn't prevent successful programs, which is a very weak thing to be able to claim!
Lots of people, like you, think static typing is very important for developing software, but when someone challenges this and says 'can you actually show that?' they have never been able to. At some point you need to reconsider if it's actually the case.
I do reconsider it all the time; we only have short lives and I cannot build a lot of large systems in my life. That is why I asked how it is measured. I see that my teams are more effective with static typing, but that may also be down to my management, as, like I said, I cannot really imagine writing large systems in not statically typed systems. I write a lot of Scheme and k and bash, and wrote a lot of Tcl and Perl, but never could get to the scale I can with statically compiled languages.
I lean more toward dependently typed languages than dynamic ones, as I have simply seen only misery, but, again, this is not saying anything; it is just my experience. I am a bit afraid, though, because there is no clear measurement; it is just anyone's experience.
Edit: which actually would be fine; if it works for you and your team, company and business goals…
> I see that my teams are more effective with static typing
If you think you really can see evidence for it then I'd encourage you to write up a paper and submit it for peer review. I guess when you sat down to write it you'd suddenly realise that you don't actually have any hard evidence.
> cannot really imagine writing large systems in not statically typed systems
Ok but lack of imagination is not science. Some people cannot imagine a spherical Earth, but that doesn't make it untrue.
For example I work on a system in a dynamically typed language (Ruby) that successfully handles tens of billions a year, so we know that it is possible. (We are adding optional static typing to it, but it was written without it.)
Sure, but that's what I said in the first place. It seems there is no evidence, not even empirical, either way. I was looking for any, if there was. MS or whatnot must have something, no?
Maintainability, tooling, and some people would argue for reducing bugs - but I'd challenge them in the same way I challenge you - can they prove it? And I would guess that they cannot.
The person I was replying to was saying that they could not imagine a large system without types, and well, they don't have to, since I gave them a real example.
Something else I find rather likely is that different people work most effectively with different methodologies (a belief grounded in the repeated experience of being shocked at how other people program and are still effective). So it is entirely plausible that there is a self-selection bias: people who work best with strong types don't work on projects with weak types, and vice versa. I guess it's really hard to control for that effect when you want to look at big projects, since people need to be willing to work on them for a long time.
I agree it seems hard to prove; disappointed not more tried. Especially large companies that have skin in the game (MS with ts and c#/f#, Google with go and dart, oracle with Java, Mozilla with rust). Guess it’ll be a religious argument for some time to come and I will steer clear.
Perhaps it's like the LHC. Static typing reduces bugs above a certain project size, but we haven't been able to perform a study at large enough project sizes. Now if only we could get a bunch of governments all over the world on the case...
A lot of the studies have been problematic because they look at toy problems (like leetcode problems). Static typing really starts to show its value in large projects. One huge project I'm aware of that uses a lot of python is the Sims 4 which uses it for a lot of the game engine and mods.
Your argument is analogous to the one used by homeopaths and those believing in telekinesis. "There is an effect but it disappears in a laboratory setting, but it's still there!"
I don't think a scientific study is needed. There exist different classes of bugs. The stronger static typing, the more classes of bugs become impossible to make. You will not see a NullPointerException in Haskell.
The only way in which strong static and dynamic typing could produce the same number of bugs would be if strong static typing resulted in introducing other bugs, ones which wouldn't be introduced in a dynamically typed language. Proving that would probably require a scientific study ;)
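A TypeScript analogue of the same point (assuming strictNullChecks is on): the compiler simply refuses the code path that would blow up at runtime.

```typescript
function greet(name: string | null): string {
  // return "Hello, " + name.toUpperCase(); // compile error: 'name' is possibly 'null'
  if (name === null) {
    return "Hello, stranger";
  }
  return "Hello, " + name.toUpperCase(); // narrowed to string here
}
```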
Considering how many null pointer bugs I've seen, I think that this is a case where absence of evidence is not evidence of absence (at least for null safety).
Almost all of the null pointer errors I've seen came from not fully thinking things through. I can see the same mistakes happening with non-nullable types because someone got lazy and passed a default value or something. You get the benefit of the compiler shouting at you for not initializing a variable, but that won't necessarily protect you from not thinking every situation through / lazily passing default values / making poor assumptions.
I'm with GP, I'd really like evidence of this before people continue shouting things which aren't proven.
I’m not really sure what you’re arguing. You seem to be saying that since we can’t prove that one approach is better than the other, we should just assume they’re all equal.
We do have proofs that certain classes of bugs are impossible given a certain type system.
Does that plan include the client changing their mind 10 times before delivering? Or finding out that some approach doesn't work and that pivoting is needed halfway through?
Likely depends on how smart the planning is.
> Does estimating lead to faster software development?
Unlikely, because engineers are then busy ass-pulling useless time figures over and over instead of actually working on the project.
All of these points have high random factors attached to them regardless, so you'd need a pretty big sample to say what generally works best.
I am curious why people say this, because in my experience, when writing a function in a dynamic language that takes a variable as a parameter:
The function will generally only work on a subset of types for given variable.
If I don't check the type of the variable in the function, the function will not behave as you might expect, e.g. silently fail or crash.
If I do check for every possible type for a given variable:
I may not have a good way of handling certain types being passed in; I may be forced to either log something, create a runtime crash, or have the function silently fail. All 3 are bad runtime behaviours.
If I am checking for every type in the function, then using static typing would cause the failure at compile time, so the bugs could never exist, while also being significantly less verbose than the dynamic language equivalent.
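A minimal TypeScript sketch of that contrast (function names made up):

```typescript
// Dynamic style: every unexpected type needs a runtime decision.
function areaDynamic(side: any): number {
  if (typeof side !== "number") {
    throw new TypeError("side must be a number"); // or log, or silently fail
  }
  return side * side;
}

// Static style: the signature is the check, and a bad call never compiles.
function area(side: number): number {
  return side * side;
}

// area("5");         // compile error instead of a runtime failure
console.log(area(5)); // 25
```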
> The function will generally only work on a subset of types for given variable.
This is true for many functions defined in statically typed languages too. Just because your function says it works with an integer, doesn't mean it's necessarily going to work with _any_ integer (think `1/n`). Very often the type used isn't narrow enough.
Modern dynamic languages understand this, see for example Clojure's spec[1].
I like libraries and languages that let you create a bunch of types which are essentially just renamings of existing types. So you'll have a type PostID which is just an int, but the language won't let you give a PostID to a function that takes UserID, even though they're both just integers.
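In TypeScript this is often done with "branded" types; a minimal sketch (names are hypothetical):

```typescript
// Both are just numbers at runtime, but the brands keep them apart at compile time.
type UserID = number & { readonly __brand: "UserID" };
type PostID = number & { readonly __brand: "PostID" };

const userID = (n: number) => n as UserID;
const postID = (n: number) => n as PostID;

function loadPost(id: PostID): void {
  console.log(`loading post ${id}`);
}

loadPost(postID(7));    // fine
// loadPost(userID(7)); // compile error: UserID is not assignable to PostID
```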
I'm saying it because it's a fact. I'm not giving you an opinion - it's a falsifiable fact that you can verify for yourself - we have as an industry not been able to give any good evidence for static typing reducing bugs that has stood up to peer review.
You're presenting arguments for why you think there should be evidence... but when people look there isn't actually any evidence. Maybe your arguments are not sound for some reason that we don't understand, or maybe we are unable to measure the effect.
It does seem to me that you are conflating lack of evidence with non-existence though.
We haven't measured an effect, but that doesn't mean we can't reason about this in other ways. The lived experience of people who do this for a living can be an important resource. You could argue that we are getting into soft social science here, but is there value in what the collective of tradespeople think about their tools?
We didn't know the science behind steel for a long time, but we still figured out how to make it and that it holds an edge really well.
Empirical measurement is not the only way to discover value.
Fwiw a similar pattern is seen in cooking, where lots of chefs have reasoned ideas of how things work based on tradition and lived hands on experience, yet are scientifically disproven.
Lived experience works to some extent, eliminating failures and making progress, like with your steel example, but that doesn't mean it can pick the optimal option from a list of options that all work to some reasonable extent; aka dynamic typing does also build software.
> It irks me a lot that you can just do JSON.parse, declare any type on it and call it a day
You should look at zod [0], which validates data with inferred types based on the validation having passed, so you do have the guarantee of the type being correct at runtime.
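For anyone who hasn't seen it, a minimal sketch of the zod pattern being described (schema and field names made up):

```typescript
import { z } from "zod";

// The schema is the single source of truth; the static type is inferred from it.
const User = z.object({
  id: z.number(),
  name: z.string(),
});
type User = z.infer<typeof User>;

const raw: unknown = JSON.parse('{"id": 1, "name": "Ada"}');
const user: User = User.parse(raw); // throws at runtime if the shape doesn't match
console.log(user.name);
```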
Thorough understanding of the problem and thorough testing prevent bugs. Static typing allows you to more succinctly express your thorough understanding of a problem, but it does not fix the root problem.
Quality of life benefits are where static typing (and in some cases performance) really shines.
Static typing makes refactoring easier. Your compiler can instantly tell you what you've broken.
Bugs are reduced by monitoring, testing, and reducing how much code there is so there's less surface area (which also makes monitoring and testing easier).
You reduce code by refactoring.
> Bugs are reduced by monitoring, testing, and reducing how much code there is so there's less surface area (which also makes monitoring and testing easier).
>
> You reduce code by refactoring.
This is a great point.
I wonder if there's a study on bugs-per-line-of-code, how this changes across project sizes, and whether refactoring changes bugs-per-line-of-code.
> Your compiler can instantly tell you what you've broken.
For some definition of "instant".
I can often run my Python or JavaScript test suite faster than Haskell/GHC or Rust/rustc can type-check a module.
And it can take a while to understand Haskell, Rust, or C++ error reports.
(Of course, Python and JavaScript startup time and execution speed can hurt refactoring too. And AttributeError-s and random undefined-s aren't always easy to debug.)
Really, like everything, there’s nuance. E.g. not all type systems are equal, not all programmers are equally effective at taking advantage of the type system, etc.
You can pry typescript from my cold dead hands. All these “inconclusive” studies but has anyone ever tried a trial where 20 nearly identical teams tried implementing the same spec using a typed or untyped language? It’s inconclusive because a post hoc review of projects is going to be spurious at best.
Same here. Clojure offers a bunch of different invariants than ‘this field is a string’. Things like ‘nobody is modifying this complex data structure out from underneath you’ or ‘this complex value is trivially printable, readable, inspectable, comparable, diffable, etc’. And if you want to know for sure that field will be a string, you can spec it and enforce it where it matters. And ignore it where it doesn’t.
I occasionally missed types even in Clojure. “What keys are supposed to be in this map again? Hmm, turns out some callers are providing foo and some are providing bar…”
That said, I certainly agree it comes up less than it does in, say, JavaScript
I agree. In another comment I mentioned how I think Elixir's pattern matching and annotations (type specs) would take Clojure to that next level and strike a perfect balance between ergonomics and correctness.
Clojure has a few options here. Typed Clojure is a set of macros for enforcing typing, conceptually similar to Typescript (and like Typescript, I found it hard to incrementally retrofit in a large existing project), and core.spec, which is a way of declaring DSTs validations, which I found useful without cramping the Clojureness too much. But it’s been a few years
We’ve been moving steadily to malli over the last 6 months. I have to say, programmatically composing malli schemas as values (rather than macros) is really powerful. I like being able to define a base set of fields on a map, and then merge in other schemas and different sets of fields that are required in different cases. Nice for modeling apis with reused structures but with varying field requirements.
Clojure is the only dynamic language I’ve ever loved. It is in its own ballpark, really, though maybe Erlang or Elixir would win me over if I spent enough time with them.
I’ve actually been spending the last year with elixir professionally. I’m embracing some of it… I would love to see its pattern matching and annotations come over to clojure, and would make my clojure code even more “correct.” I don’t super fully “get” all of its message passing plumbing, and I don’t like the quality of its library ecosystem.
But clojure’s REPL, babashka, and clojure.spec are sorely missed.
I’m not smart enough to program quickly and correctly in Clojure. I think it follows the same ethos Rich Hickey has about unit testing being “guard rail driven development”. I watched him talk about this a decade ago at Strange Loop.
To understand your whole program all the time at scale is probably something Rich Hickey is capable of but I definitely am not.
Working in a huge Ruby codebase (I know, not the same, but still super, super dynamic), my approach is:
1. If I don't understand what code does, just delete/noop it and see what tests fail.
2. Write my assumptions into new tests that fail very loudly if someone wants to break them. Easier said than done with frameworks/callback patterns that want to call my code from god knows where, but really powerful when you're pulling a value.
3. In the case of spooky callback action-at-a-distance stuff, be strict about what you accept and fail loudly.
Where I work, static analysis has found quite a few bugs. It's primarily through CodeQL and I think its greatest strength is how flexible it is. Our code base is weird (it's C++ and has its own memory allocators and schedulers).
> Hype: "Identifiers should be self-documenting! Use full names, not abbreviations."
Most names aren't particularly good - especially when someone tries to make them sound like a full sentence. My experience is that at around four words they start getting less accurate. At seven I would probably read more from an interpretive dance about the function in question.
> Hype: "We need big data systems to handle big data."
I do not get this one. It's incorrect by definition: "Big data refers to data sets that are too large or complex to be dealt with by traditional data-processing application software. "
If you do not need a big data system, then it's not big data.
That definition assumes the code is reasonably performant or that they didn’t just begin on the cloud for whatever (often not well thought out) reason.
> Shower: A review of all the available literature (up to 2014), showing that the solid research is inconclusive, while the conclusive research had methodological issues.
Which research is supposedly solid?
Skimming through the article it seems to me that proper controlled study would be prohibitively expensive and that no controlled study comes close to real world conditions of multiple people developing and maintaining a larger codebase. Conditions, where supposedly (according to anecdotal evidence) static typing would make a notable difference.
Was hoping to see, and would like to see, one of these on Functional Programming.
It’s been an immensely enjoyable experience getting into it, but with a non trivial startup cost.
My team has generally adopted it (at least to some degree, and in a language which half supports these patterns), but I’m sure this coding style erodes in favor of something that feels more imperative when we move on.
> Compared to other languages, Go's concurrency system of goroutines and channels is easier to understand, easier to use ...
True and not a hype: Compared to other languages, Go has a runtime and built-in concurrency primitives that makes it easier to write concurrent code.
> and is less prone to bugs and memory leaks.
Who is actually claiming this? I've got a collection of good Go resources and nowhere is something like this stated. In contrast, some actually make it very clear that you should be very careful when sharing memory, or better, not share memory at all.
Concurrency is conceptually hard in Java, C#, Javascript, Python, C, C++ ... there's no reason to bully Go.
Rust makes the claim of "fearless concurrency", and the compiler indeed saves us from data races, but those are not the only concurrency-related bugs; e.g. deadlocks are very well possible in Rust. Therefore Rust would be a proper candidate for this "Cold Showers" list.
Overall a questionable list of unproven claims. E.g.:
> Hype: "Static Typing reduces bugs."
Just because the author has not found proper research proving it doesn't mean it's false.
A great type system with sum types prevents tons of bugs. I wonder how anyone can question this.
> Hype: "Identifiers should be self-documenting! Use full names, not abbreviations."
Same as above. To fix bugs you need to read and understand the code. Not having to map abbreviations to your mental model reduces overhead. I've seen code bases where u was used as an abbreviation for user and users in different methods. That was not a fun code base.
Some developers and teams don't need the crutches. Type systems are also an encumbrance that comes with a non-zero amount of issues. They slow development velocity down in exchange for a theoretical boon.
I've seen a type system take down a production system where a simple coercion would have functioned just fine. Literally the only thing wrong was the type defined and the code refused to run.
I would argue that static typing reduces bug complexity/creep. Mainly from working in environments without proper testing during the js days, TS was rough at first, but did help.
I don’t know about static types reducing complexity. I’d argue the opposite. Static types allow you to do things that you wouldn’t want to attempt with dynamic types due to the difficulty of reasoning. At least for me, I try to keep my dynamic systems simple because I don’t have a compiler watching my back.
Complexity in relation to tracking down the bug of course*
Interesting though, I would say that dynamic typing allows you to shoot yourself in the foot a bit more, especially over teams who might need to interact with source later.
I agree with you to keep the dynamic ones (that are inevitable) simple though.
I just watched the whole video and only agree with about 20% of the arguments. It feels a bit like Bertrand Meyer is missing the point of Agile. Yes, a bit of upfront understanding of the problem domain certainly is a good idea, but IMHO he is missing the point, that user stories are an instrument to support communication.
He treats them like an incomplete requirements document, but instead they should just contain the bullet points so that the people who talked about the issue still remember what they talked about. In a typical requirements document the communication form is written, whereas in case of user stories the communication form is verbal and the document is just there to help people to remember.
Shower: Unfortunately, there is no conclusive peer-reviewed evidence that it is, in fact, a good morning. A randomized trial found that many mornings are bad.
Caveats: Only applies to mornings. No rigorous paper exists for evenings, so this remains an unknown.
Caveat: It's unspecified whether the friend wishes reader a good morning, or means that it is a good morning whether reader wants it or not; or that they feel good this morning; or that it is a morning to be good on?
That's not what this is, and if you'd bothered to look for more than a second, you'd have seen that too. All the claims listed are very concrete claims that are possible to falsify, unlike "Thing good!", and the research (which, admittedly, isn't directly linked, so it might be hard to find what research they're referring to) shows that this is either not true, or that the research is inconclusive. It is important not to take these statements for granted if they cannot be shown to be true; that's simply cargo-culting.
Formal methods has a lot of practical application but the tools and techniques are very inaccessible to the average SME. We need better tools. For example, here is a demo of how to use an SMT solver to write better system requirements: http://slra1.baselines.cloud/