Microsoft warns thousands of cloud customers of exposed databases (reuters.com)
306 points by wsostt on Aug 26, 2021 | hide | past | favorite | 149 comments



This is bonkers.

- Microsoft introduced Jupyter notebook support for Cosmos (who even asked for this?) and it was turned on for all customers automatically in February 2021

- Security researchers find a way to break out of the container running the notebook and get long lived credentials to other people’s databases

- The credentials can be used off network. I assume with the public Azure API.

Don’t let your intern projects go into production without thorough security review.


Agreed, and that is why building a secure Cloud Landing Zone for enterprises goes hand in hand with giving developers autonomy within guardrails.

This is my early and potentially slightly erroneous understanding of the situation:

- Jupyter Notebook support was enabled if you used the SQL API, and not the Gremlin or MongoDB APIs. What is unclear is whether breaking out of the Notebook container gave you access to keys for just instances using the SQL API or any API.

- CosmosDB is very weak in enforcing identity perimeters, so keys are a weak point. Enforcing something like hourly key rotation is left up to the customer to build.

- Azure is pretty much "it's a public IP" unless you explicitly make it private. And even when they add controls like private endpoints, they have weird routing mechanisms that can result in traffic bypassing controls like hub firewalls.

- Using things like CosmosDB firewalls, private endpoints, automatic regular key rotation, and only enabling the features you need should have helped if this was a hostile breach.

Take everything I said with a pinch of salt :)
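
The "rotation is left to the customer" point can be sketched. This is a minimal, hypothetical sketch of the dual-key rotation pattern (Cosmos-style accounts expose a primary and a secondary key), not the real Azure SDK; `Account` and `regenerate_key` are stand-ins for the management API:

```python
import secrets

class Account:
    def __init__(self):
        # Two independent keys, as Cosmos-style accounts expose.
        self.keys = {"primary": secrets.token_hex(16),
                     "secondary": secrets.token_hex(16)}
        self.active = "primary"  # the key clients currently present

    def regenerate_key(self, kind):
        # Stub for the real management-plane regenerate call.
        self.keys[kind] = secrets.token_hex(16)

    def rotate(self):
        # Switch clients to the other key first, then regenerate the old
        # one, so there is never a moment without a valid key in use.
        old = self.active
        self.active = "secondary" if old == "primary" else "primary"
        self.regenerate_key(old)

acct = Account()
old_primary = acct.keys["primary"]
acct.rotate()  # clients now authenticate with the secondary key
```

Schedule that `rotate()` call hourly and any leaked key has a bounded lifetime, which is exactly the mitigation the platform leaves to you.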


"notebook support for Cosmos (who even asked for this?) "

Lots of users. The functionality itself is great. CosmosDB is a nice OLTP datastore with a flexible schema, and the ability to run OLAP queries over that data with SQL makes it a powerful tool.


From the researchers:

> So you can imagine our surprise when we were able to gain complete unrestricted access to the accounts and databases of several thousand Microsoft Azure customers, including many Fortune 500 companies.

This is basically worst-case scenario for a database provider.

https://www.wiz.io/blog/chaosdb-how-we-hacked-thousands-of-a...


It’s worst-case for their customers. They’re the ones who need to figure out what data to consider compromised.

Microsoft gets a temporary mark on their reputation until their pr and marketing departments make us forget about it. That’s it. Nobody is even going to lose their bonus over this.


Also, it's CosmosDB - it has a certain reputation internally in the Azure org...


Same as Azure Functions. Internally at Microsoft everyone knows that it’s completely broken and they don’t understand why customers use it either. Apparently it was due to an extremely dysfunctional team and certain team members.


Why are Azure Functions completely broken?


Have a play with AWS Lambda or Google Cloud Functions, then compare that with Azure Functions and you might wonder if what Azure has done is even remotely “serverless” FaaS or just a huge messy clusterfuck of an App Service which is billed per second and overly complicated to manage and observe.


All 3 clouds have the same fundamental architecture. There's no magic "serverless", everything is just compiled down to a container or some executable package, placed on a server, executed as requests come in, and then paused with varying control and scheduling parameters.

Azure is a little messier than the rest but the end result is the same.


Get something-out-the-door to compete with Lambda and GCF by hacking together an abstraction over Azure App Services - then improve things by version 2?


Please elaborate


Do you love MongoDB, but wish it was slower, more expensive, and even less reliable? Then CosmosDB may be the database for you.


We use Cosmos and we used to use Mongo. It's a lot less hands-on than Mongo.


Frankly, I do not know why people don't build things on Kusto.


I hadn't even heard of Kusto until now...


It sucks, is expensive, and was designed by someone who was fired.


Do you have some sources? What were they fired for? I thought they got Leslie Lamport to design it.


Let me take a wild guess. I bet they work for GitHub now.


Why would he still work for Microsoft if he was fired?


He is probably jokingly referring to the Nokia acquisition, which at the time was run by an ex-Microsoft executive who rejoined Microsoft after the acquisition.

https://www.google.com/amp/s/www.nytimes.com/2013/09/04/tech...


Sorry but small correction:

s/ran by/ran into the ground by/g


Oh yes, I forgot about that important detail. MS was able to buy at a greatly discounted price due to that.


This sounds like no basic pen testing was performed - quite surprised Microsoft isn't doing that.

Seems like their resources (billions in cash) aren't allocated correctly.


I'm sure it was pentested, but the problem is pentesting quality can vary wildly. Hard to know if you have a great tester or a great dev team.


How are you sure, when Windows is tested by its users and they scaled down their testing teams? Genuinely curious.


Late to reply but the answer is services are different than desktop software. Azure services fall under a rigorous compliance regime and pentesting (3rd party even) is part of that. Just goes to show compliance does not always mean secure.


You should use private endpoints and private link services and turn off public access to every resource except components that need public Internet access. Sure, it costs some $, but why would a Fortune 500 expose their DB endpoint publicly?


> why would a fortune 500 expose their DB endpoint publicly?

Incompetence. It’s not the exclusive domain of small companies. I’ve dealt with some very incompetent people at Fortune 500 companies. As a matter of fact, the very people responsible for the mishap we are discussing now work at a Fortune 500 company.


I really hate "incompetence" as an explanation for anything, but especially something as complex as software and security in particular. Obviously, it's easy to blame a single person as incompetent, but the reality is that their work was certainly part of a team effort. Are all those people incompetent? They were using Azure, which presumably didn't warn them well enough, so are the Azure team incompetent for deploying a dangerous application? Clearly the technical authors who wrote and reviewed and published the Azure docs did a bad job of explaining CosmosDB's security model. If these Fortune500 companies employed a security team to test and audit their cloud apps then they missed the problem too. Or maybe they didn't, but some manager or product owner didn't read the report properly, so that security team were incompetent for failing to report the problem well.

This is how everything works in pretty much every industry. There are layers upon layers of complexity in everything, and no one has enough oversight to take full responsibility for some mistake that occurred somewhere down the rabbit hole.

Somewhere, somehow you could probably call someone's mistake incompetence, but doing that relieves everyone else of their little part in developing a chain of tools and applications that enabled that incompetence. If we heap all the blame on a single individual then no one else has any reason to improve it. We can just say "OMG, Chris was terrible! I'm so glad he's been fired now everything will be perfect!" until the whole sorry mess happens again.

Instead if we accept that mistakes are inevitable, and we accept that anyone can make one, then we're driven to build systems and processes that include guards against mistakes. Applications that check and validate things automatically, even if it's hard and expensive. That's how you get to robust software that doesn't fail like the thing in the article. Blaming individuals will never get you to that point.


My guess would be, based on The Gervais Principle and on my prejudice that dissenting opinions aren't viewed favourably and may be detrimental to your career, that we have a "The Emperor's New Clothes" situation here. The people authorized to make decisions may be clueless (GP) and the people pointing out possible problems have been gotten rid of. There is a Turkish proverb: "The fish starts stinking at the head". The Peter Principle (from 1969!) might apply too.

https://en.wikipedia.org/wiki/Peter_principle


> Applications that check and validate things automatically, even if it's hard and expensive.

Then you still need to be competent enough to assess that this is the case. You can't judge that code is doing something correctly automatically if you don't even know what that thing is and can't do it yourself.


Strong disagree. Incompetence is the reason. It starts with incompetent management who push unrealistic deadlines and hire people who promise the world to pad their resumes and leave in-flight. Incompetence is the reason why the messengers of realistic/bad news are punished and why snake oil salesmen are promoted.


This is exactly the sort of problem though - it's very rare to be able to attribute a mistake to any one person like that. Was it the manager's error for giving an unrealistic deadline, or the team lead for not pushing back on the deadline hard enough, or the developer for not working hard enough to meet the deadline, or the QA for not picking up the error, or devops for not checking what they're deploying, and so on maybe with dozens of people and teams involved.


Not speaking about this particular security problem, but instead, more in general:

Such management and snake oil people need not be incompetent -- they might just have different goals, for example, their personal career and making lots of money -- and building very secure software can be off topic?

Someone might seem incompetent, when the underlying problem is different goals? The principal–agent problem


It's not just incompetence. Apart from costs, there are some strange limitations and very specific features missing if you use Azure PaaS DBs with private endpoints enabled.

It's like private access has been added as an afterthought and lags behind the "normal" (public) access.


I do apologize; I didn't mean to imply incompetence at small companies. I can understand smaller companies not using private endpoints/private link services for cost reasons. I can't fathom why Fortune 500s would not use them to secure their resources.


No apology needed. I’m the one who brought up small companies.

I just wanted to point out that big businesses don’t necessarily know what they are doing. People who don’t work for one assume they are well oiled machines, and people who do work for a big business assume every other big business is a well oiled machine.


Especially these days as OPSec is seen more as a cost-center than a lawsuit buffer.

For some reason they will keep millions on the books specifically for lawsuits, but shelling out $ for proper security is unfathomable (see T-Mobile).


I was in on discussions for a former employer migrating some of their services and data to Azure. Since it was a health insurance company, there was a great deal of care taken about securing the data. I think a big part of that was that a few years back, there was a well-publicized data breach that impacted the company and few things can generate caution better than a really expensive mistake.


The article doesn't make it clear, but likely it would still be affected. This is really a container breakout issue between customers' Jupyter notebooks.


I agree with you but you can't just throw around the phrase "sure it costs some $" as if that isn't a driving force in everything a big company does. My company is going hard into Azure and I'm making the same recommendations as you suggest, but it is a hard sell precisely because of the cost.


https://www.wiz.io/blog/chaosdb-how-we-hacked-thousands-of-a... is from the researchers and has more details.


>Microsoft agreed to pay Wiz $40,000 for finding the flaw and reporting it,

A vulnerability on this scale... and they pay out only $40K? I don't understand; this is peanuts compared to the damage it could cause, no? Why do they not pay out respectable amounts?


This logic could apply to any profession. You just happen to be on the expert side for this one.

Should a surgeon ask you for millions? After all he saved your life.

What about the mechanic that finds an issue with your brakes?

If you paid someone not for the work done, but for the resulting loss if criminals exploited something, a lot of professions would ask for crazy prices.


You're thinking about this in moralistic terms of deservedness. The shock I felt at the $40K was that it isn't enough to incentivize future security researchers to find bugs. This just seems like a bad business decision.


I think people underestimate how much work it is to find and report security issues... Including the unpaid labor of training...

Anyway, seems like an NFT with the instructions would've sold for far more than 40k...


Aren’t the contents of NFTs public before the sale? I agree with your general point that there were buyers willing to pay more, but NFT might not be the right vehicle for it.


The NFT is just a dumbass complication and doesn't add anything. People have been selling exploits forever. It's like saying I'll paint the instructions onto a canvas and sell it.


It could be an NFT of the digitally signed exploit. Or something, NFTs confuse me


The way I see it, they're like a certificate of authenticity. It just proves that the accompanying asset (which can be freely copied) is the original work, by way of the known creator of the original work signing the NFT, and the NFT itself not being forgeable.


If I don't have millions I can't give it to you.

If there are 50 others who can fix my brakes, one of them will likely try to undercut your price so as to get my business.

In this case, M$ could have paid ten times that amount and it would still be pocket money, both in terms of their ledger AND versus the damage bad PR alone could cause.

M$ does have black market competitors that would have paid far more.

Doing the right thing morally is good. But people see time and again that simply cashing out once could set you up for life (maybe not in this case, but in others). That is an awfully tempting risk to take and I wouldn't blame them.

We're literally saying it doesn't pay to be moral.


If MS forked out $400k for a security flaw, I'm not convinced it would be a great thing for PR.


Yes, and, hmm, maybe if $360k of that was in MS stock it'd have seemed more OK? Hmm

At the same time, these discussions about them paying too little aren't good PR either.


The person who finds the exploit can give it to some shady group and make a lot more money, and then if the exploit becomes public it damages MS. In the examples you mentioned, can you do the same?


If I found your wallet on the street with $1k in it I could just keep it, but that would be wrong, so I return it to you. Maybe you give me $50 for my trouble.


If I am in the business of protecting other people's wallets, then losing my wallet is a huge reputation risk, so I give significantly more than $50, and I make it an official policy so there is no maybe.

There was a story on HN a while back where some guy got $100K for an Apple account takeover exploit. That is a huge exploit; paying $100K is peanuts for a $2 trillion company.


Or maybe a biomedical researcher discovers you have a special mutation that makes your kidney worth millions on the black market and informs you instead of the guys with a tub full of ice ready to go


They pay out the amount that researchers find worthwhile for the time they spent. The vast majority of people do not want to commit a serious crime in order to make a little bit more money so Microsoft does not have to pay the same rates.


"They pay out the amount that researchers find worthwhile for the time they spent."

How have you confirmed that this amount is accurate? Have there been no zero-days on the black market, with all hackers instead going to Microsoft?

Would the number of hacks on the black market change if the amount paid out increased 10x? It seems you have no basis for claiming that the amount paid is optimal.


I actually didn't claim it was optimal. I simply disputed the claim that they should pay equal to the damages that could have been caused.


"... this is peanuts compared to the damages this could cause no?"

But if no one can successfully sue for those damages, then what is it really worth?

The idea that Microsoft with its gargantuan resources cannot discover these mistakes and has to wait for "security researchers" to discover them and then pay them a paltry sum for their "valuable work" makes little sense; a more sensible interpretation is that the company has little incentive to find such mistakes because it has little exposure to potential liability arising from them. It needs to stage "security theater" to avoid reputational damage but does not need to achieve real results. Mistake after mistake, the stock price is not affected. Regarding an ongoing lapse in quality control that spans four decades, it has no skin in the game.


Wasn’t Wiz founded by an ex Microsoft Cloud security engineer?


Yes, second paragraph in the article “ Wiz Chief Technology Officer Ami Luttwak is a former chief technology officer at Microsoft's Cloud Security Group.”


Yes, very interesting strat. Start another company to warn your prior employer who wasn’t listening… and get paid more.


You just reminded me of something I heard on a recent Darknet Diaries episode:

> Some of the best hackers within the NSA turned into independent contractors so they could work faster and make more money, but were on the outside? This is one of those things that someone like Microsoft is afraid of, too. If they pay too much for bugs, then some of their internal bug hunters might decide to quit but keep doing the same thing; just make more money on the outside.

He even used Microsoft as an example. I wonder if they paid less because the researcher was a turncoat in their mind.


They don’t need to pay out anything.


You have a point there, and the researcher also has no obligation to disclose it to them.


But they do have obligations that make it otherwise worthless to them.


They could contact a three letter agency, that's totally legal.

Or they could release the information publicly, without informing M$.


Apparently it would also be legal to sell it to authoritarian regimes a la NSO


Sell it to the FBI/CIA, and they use it on Azure China, run by 21Vianet?


Seriously, if they started paying out something comparable to small ransoms, some folks might be swayed back from the dark side.


multi-millions?


Maybe it's an imperfect market. Is there an auction site for zero days where Microsoft could bid on their own exploits?

I'd be in favor of something like this. Why should the criminal justice system and every tax payer be on the hook for protecting these big companies from the consequences of their bug-ridden software?


Zerodium does pay out for zero-days -- but I don't think this would qualify. Scroll down to see the list of prices.

https://zerodium.com/program.html


Because the rule of law?


Do you think laws are sacrosanct and can't be examined or changed? Do you think they don't need to be justified?

There's many examples where the law doesn't stop companies from ripping off individuals, obfuscating terms and conditions to their advantage, hiding prices, etc. Recent example that bothers me is the paperwork at a doctors office, expecting the patient to be responsible for all bills insurance doesn't pay, without letting the patient know the costs involved, basically signing a blank check, even on a signature pad where the contract can't be seen, being told "this is just consent for the exam". The justification that these are standard forms is not a justification.

Too often big companies get subsidized at public expense. That money goes right from taxpayers to shareholders. Microsoft should deal with the fallout from its less secure software, not taxpayers. Instead these big companies race ahead to make features and sales, playing games with jurisdictions to avoid paying taxes, etc.

Doesn't seem fair, and just saying "because rule of law" doesn't seem like a good explanation of why we should defend this kind of thing.

Maybe a bug exploit should be considered protected free speech. How about that law?


I think payouts are defined in bug bounty programs upfront, so there are naturally upper bounds.


That was my first thought...


I'll just print this off and add it to my pile labeled "reasons we operate exclusively on-prem"...

It's been a long time since someone forced me to break out these articles. Very comforting to see the justification continue with such frequency.

Don't get me wrong - I absolutely love parts of what Microsoft does. Just not this whole cloud thing.

I get, on average, 3 Microsoft major change notifications every day. Can Microsoft even keep up with their own mess internally? I totally stopped trying from the outside. I just keep an eye on .NET and visual studio stack these days. Everything else winds up in my spam folder.


> We rarely see security teams move so fast! They disabled the vulnerable notebook feature within 48 hours after we reported it. It’s still turned off for all customers pending a security redesign.

if this is amazing I'd hate to see bad


When I worked for a shop running Windows-powered CNC machines, an update broke a bunch of the more up-to-date machines at one of our partner shops; dozens of shops were affected by this.

It took a few hours for Microsoft to fix the problem; they responded within half an hour or so of the first complaint.

Two days seems like a long time for Microsoft to respond when commercial customers' businesses are at stake. Say what you want about Microsoft (I don't really like them), but their enterprise/commercial support is honestly pretty top notch. I'm really surprised it took them that long to get on top of this, considering the calibre of customers involved.


That’s what I was thinking. I was once responsible for a RCE vulnerability that was reported to hackerone on Christmas afternoon. Even then I took less than 1/48th the time to disable the feature and had a patch ready in half the time it took them to just disable jupyter notebooks.

I’m not claiming to be better at my job than their response team. I’m almost certainly not. The difference is I didn’t have to have meetings with 20 different internal stakeholders before acting.


Yeah, 48 hours doesn't seem too fast, but it's better than the multiple months it took with Exchange.


> Wiz Chief Technology Officer Ami Luttwak is a former chief technology officer at Microsoft's Cloud Security Group

Now, I bet you anything I could leave my current employer, get a job with a security firm, and then find a major hole in my old company's service. Perhaps one that my team warned the CosmosDB team about internally. Or am I just being cynical?


> Perhaps one that my team warned the CosmosDB team about internally.

If they still have that hole 18 months later, they kind of deserve it...


It’s quite amazing what poor practices arise in shared environments.

About 20 years ago I was working with a large company that had dedicated servers in a huge east coast data center. The servers were firewalled, but they also were attached to a secondary network for scheduled backups.

I decided I should explore that further, and sure enough I was able to connect via SSH to other customers’ servers.

I alerted the hosting company, but they didn’t seem to take it very seriously.


Visited a BIG datacenter: top-grade security, visitor's passes, and all the trimmings. The buddy giving me the tour shows me a couple of the things they show to VIPs; it's early Sunday morning and they still caught me trying to take a couple of pictures. Very high-grade and attentive folks.

Then we go back to his office and load up the 1,000 lb of computer gear he's donating and I'm transporting. We wheel 3 cartloads of stuff down the back elevator, through the secured DC floor we saw through glass earlier (the one I couldn't take a picture of), and out an open back door right near where I parked my truck. No further interaction with security, no badge waving, no questions asked.


Reminds me of Michael Keaton’s line in The Paper:

> A clipboard and a confident wave will get you into any building in the world!


We should never believe a hosting or cloud provider when they say we have a segregated instance. They may even believe it themselves, but there is always some service connecting them all.


From the title, I thought Microsoft was scanning its cloud for unsecured public facing DBs and alerting customers. So nice of them.

Then I read the article...


How can CosmosDB's architecture and design make it possible to have a read-write key that works on all customers' databases? Yikes!


It looks more like they had the Jupyter notebooks for all customers sitting side by side, and each notebook had a master key for its related DB.

...but due to a poor deployment strategy, the instances were not sandboxed from each other.

(and yes, you would be correct in thinking that was a catastrophic failure; it's like having a folder with a directory per customer and a master access key in each folder, and WOOOPS, didn't realize people might look at ../Amazon/master_key.txt

It's pretty hard to brush this one off; it's a colossal screw up)
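
The folder analogy can be made concrete. This is a hypothetical sketch (directory names and the `get_key_*` helpers are purely illustrative, not the actual deployment), showing how a missing sandbox check turns a per-customer layout into a shared key store:

```python
import os
import tempfile

# One directory per customer, each holding that customer's master key.
base = tempfile.mkdtemp()
for customer in ("contoso", "fabrikam"):
    os.makedirs(os.path.join(base, customer))
    with open(os.path.join(base, customer, "master_key.txt"), "w") as f:
        f.write(f"key-for-{customer}")

def get_key_unsafe(customer, name):
    # BUG: trusts the caller-supplied file name, so "../fabrikam/..."
    # walks right out of the customer's own directory.
    with open(os.path.join(base, customer, name)) as f:
        return f.read()

def get_key_safe(customer, name):
    # Fix: resolve the path and refuse anything outside the customer folder.
    root = os.path.realpath(os.path.join(base, customer))
    path = os.path.realpath(os.path.join(root, name))
    if not path.startswith(root + os.sep):
        raise PermissionError("path escapes customer sandbox")
    with open(path) as f:
        return f.read()

leaked = get_key_unsafe("contoso", "../fabrikam/master_key.txt")
```

The real breakout was between containers rather than folders, but the shape of the failure (neighbouring tenants' keys reachable from your own sandbox) is the same.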


Several people had to have thought it was a good idea to add the Jupyter notebook service, with primary keys, to every CosmosDB instance, even though only a tiny portion of customers would ever even look at it. Yet it was still considered fine to add that extra attack surface.

I’ve seen some azure documentation that stresses you should never ever use your primary keys in a deployed service. Yet they do it willy nilly.


Insecure by default with opt-in security; a "should never" already guarantees that it will happen, IMHO.


They seemed to be sandboxed to some extent. The researchers mention a privesc, which could either be user -> root followed by accessing other users' accounts, or a sandbox escape through the hypervisor to the host and a pivot from there. Regardless, it shouldn't be possible.

You see the former with a lot of “online Python interpreter” sites.


It read somewhat like there was a separate DB with everyone's credentials in it and some API call to get your own credentials... with an unintended way to change the parameters from cred_get("me") to cred_get("arbitrary_account").
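
That failure mode (an IDOR, insecure direct object reference) can be sketched in a few lines. The `cred_get_*` functions and account names are hypothetical, purely to illustrate the parameter-trusting bug described above:

```python
# Hypothetical backing store of per-account credentials.
CREDS = {"alice": "alice-rw-key", "bob": "bob-rw-key"}

def cred_get_unsafe(session_user, requested_account):
    # BUG: ignores who is authenticated and returns whatever was asked for.
    return CREDS[requested_account]

def cred_get_safe(session_user, requested_account):
    # Fix: the requested account must match the authenticated identity.
    if requested_account != session_user:
        raise PermissionError("cannot read another account's credentials")
    return CREDS[requested_account]
```

With the unsafe version, swapping "me" for any other account id hands you their read/write key, which matches the behaviour the researchers describe.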


You got it :)


When I was working with Azure Queue Service years ago, the way for my service to get delegated read-only access to a queue was:

1. My service's users would have to grant my service permission to read all access keys for the queue

2. My service would look through the access keys and try to find one matching some conditions (specific provided name and read-only)

3. If it couldn't find a limited one (which it would not in pretty much every real case due to the defaults) it would grab any read/write key and use that

(Later, we used Managed Identity when that became available, which avoids the song-and-dance but also has performance issues with very large scale sets. Out of the frying pan...)

I'm wondering if Jupyter had similar ability to read the primary r/w key for every database that had it enabled.
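
The fallback in step 3 is the dangerous part. Here is a minimal sketch of that selection logic, with illustrative key shapes rather than the real Azure Queue API:

```python
def pick_key(keys, wanted_name):
    # Step 2: prefer a limited key with the expected name.
    for k in keys:
        if k["name"] == wanted_name and k["permissions"] == "read":
            return k
    # Step 3: the problematic fallback - grab any read/write key.
    for k in keys:
        if "write" in k["permissions"]:
            return k
    raise LookupError("no usable key")

# The common case: only the default full-access key exists, so a service
# that needs read-only access silently ends up holding a read/write key.
keys = [{"name": "full-access", "permissions": "read,write"}]
chosen = pick_key(keys, "reader")
```

The fallback never fails loudly, so nobody notices the over-privileged key until something like a notebook breakout makes it matter.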


I think each database has its own key, but all keys were accessible via Jupyter Notebook.


Because there isn't.

There's a vulnerability in some Jupyter notebook implementation allowing access to unrelated notebooks, which hold read/write keys to the databases they're connected to.


Interesting what the weak point was here.

> The flaw was in a visualization tool called Jupyter Notebook, which has been available for years but was enabled by default in Cosmos beginning in February.


While this is bad, it only affects those who intentionally left their firewall open to the outside world. Any half-decent software engineer knows to firewall the database server, no matter the vendor.


I think that's a misunderstanding -- this affects all Cosmos DB users who had Jupyter enabled.

The way to avoid this, wouldn't have been firewalls, but using another database


I used to work on a massively distributed system called Cosmos in Microsoft (there are some public slides and blog posts about it), so I was surprised to hear this, but apparently "Cosmos DB" has nothing to do with Cosmos, it's a completely different system.

Azure actually, in my opinion, made the original Cosmos worse via "synergy"; but apparently they ALSO used the name for something unrelated :) WTF? Now I cannot even tell people anymore because the name is forever associated with this debacle.


Speaking of CosmosDB -- other than this security issue -- has anyone here ever used it for anything practical?

To me it seemed vaguely interesting but a little bit overpriced for real-world usage outside of fortune 500s or big government, where budgets aren't always a concern.


I recently set it up as the ingestion point for an IoT pipeline that currently runs steadily at high volume, and found that Cosmos would charge me $15k/month in RUs to achieve stable ingestion (i.e., no 429s on intermittent spikes). This was fed by Stream Analytics, fed by an eventhub.

For comparison, I had the same eventhub feed Kafka Connect to dump it into Mongo 5.0's new timestream collections on a $200/month VM, and after running both outputs at full volume for a couple days, Azure offered me a helpful suggestion to downsize the under-utilized VM and save money.

I haven't quite figured out if the RU rate is just too high to be economical and counts on large orgs with large data expecting a large bill, or if the RU model simply works badly for high-velocity data. I've verified that the pricing model now scales down properly for small/medium loads, at which point the various high availability/geographical distribution features really add value. I'm trying to conceptualize how the RU cost, amortized across longer-term storage and less demanding ingress/egress schedules, can work out. For our scenario, Cosmos functionally offers exactly the horizontal scalability we need.

As a different sanity check, I set up a SQL Server account as the ingestion database and saw comparable costs, so it wasn't that Cosmos itself has an unworkable pricing model.
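
For rough intuition, the RU math can be sketched. The rate of $0.008 per 100 RU/s per hour is an assumption based on Cosmos DB's standard single-region provisioned pricing at the time, so treat the numbers as ballpark only:

```python
# Assumed standard provisioned-throughput rate (check current pricing).
RATE_PER_100_RUS_PER_HOUR = 0.008
HOURS_PER_MONTH = 730

def monthly_cost(rus):
    # Provisioned RU/s are billed per hour whether or not they are used.
    return rus / 100 * RATE_PER_100_RUS_PER_HOUR * HOURS_PER_MONTH

def rus_for_budget(dollars_per_month):
    # Invert the formula: how much steady throughput a budget buys.
    return dollars_per_month / (RATE_PER_100_RUS_PER_HOUR * HOURS_PER_MONTH) * 100

# A $15k/month bill corresponds to roughly 257k RU/s of steady
# provisioning, which is where the $200/month VM comparison starts to sting.
steady_rus = rus_for_budget(15_000)
```

The "billed for the peak you provision" shape is exactly why intermittent spikes (and the 429s you get when avoiding over-provisioning) dominate the cost for high-velocity ingestion.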


The CosmosDB serverless pricing model seems very good value for small to medium usage. CosmosDB is also unique in that it has geospatial type/query support, which DynamoDB etc. don't. I believe the serverless pricing is fairly recent, so not sure if you checked the pricing before it was released; also, on the pricing page you need to specifically click the "Serverless" tab, which isn't too obvious as a clickable tab in the UI.

I've recently tested it, as I needed geospatial query support for a hobby project, and the cost is cents per month with the serverless pricing model. I'm also accessing CosmosDB from outside of Azure to use low-cost VPS hosting, and latency is pretty acceptable for web-type stuff.

If you are DB read/write heavy, pricing is obviously going to be higher since you pay for what you use; pre-provisioned pricing will become the better-value option at that point.


This.

I couldn't justify it prior to the serverless introduction; but I'm doing 1m+ ops/mo now for <1usd for some high-concurrency state tracking ops.


Sure, as an entity store for run of the mill CRUD apps.

It has a MongoDB API Layer and a Cassandra API Layer (as well as SQL which is preferred).

Works just fine, apparently cost can blow out though with high usage, I haven't got there yet.

Does have a free tier.


I tried Cosmos via the Mongo compatibility layer and performance was abysmal, even when cranking the provisioned performance way up. I ended up getting significantly better performance from a reasonably sized MongoDB container running on AKS.


Was this taking an existing MongoDB app and pointing it at CosmosDB, or was this developing an app around CosmosDB's partitioning etc. but using the MongoDB API for read/write?

The former, lift-and-shift scenario I can understand, as you need to structure your data to work best with CosmosDB/DynamoDB etc. If it's the latter, that is useful to know, as I'm currently building a POC on Azure and should watch out for it.


Yes, so not the ideal use case. The compatibility Cosmos offered was nice and didn't present any issues, it's just the performance that suffered.

This was all part of a benchmarking exercise to figure out how to scale an app with minimal changes so we expected some performance penalty, just not as steep as what we observed.


I think that makes sense. Using the technology as designed, without the "compatibility" layer, is likely a better option. Thanks for sharing, as I think many wonder about these MongoDB imitators like DocumentDB and CosmosDB.

However, CosmosDB is a managed service, easier than setting up your own database. Did you exclude Atlas from your conclusion for some reason? Ie did AKS do things that Atlas could not?


At the time the client wanted to stay inside of the Azure ecosystem so Atlas and the like weren't seriously considered.


As I understand it (I do not know that much about it), Cosmos is a multitool with features to fit many, many use cases.

You must be diligent to develop your application to use it efficiently from both a cost and performance/latency perspective.

Those that put in the work may not love it, but they are very successful at operating at scale. Massive, insane scale.


We use it to log audit data for all platform events and changes to data. It's one thing we found the schemaless nature to be valuable for. The automatic indexing has worked well for reports.


Have used the SQL and Gremlin(Graph) varieties. Heavy load on the SQL, not so much on the graph. Works well. The main advantage is extremely low "devops". There is also the added benefit of setting up a geo-redundant cluster while keeping the extremely low "devops" theme which I intend to explore.

We use it with a private endpoint and turn off public access. I've been generally happy with it.


Yes. A large 4 TB oltp database as a document store.

As long as you have a good partitioning strategy it seems to work without much effort put into it.
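One cheap way to sanity-check a candidate partition key before committing a large dataset to it is to hash sample documents into logical partitions and look for skew. A minimal sketch; the document shape and `tenantId` key choice are hypothetical:

```python
# Sanity-check a candidate partition key for skew: hash each document's key
# into N logical partitions and compare the hottest partition to the average.
import hashlib
from collections import Counter

def partition_of(key: str, partitions: int = 16) -> int:
    """Deterministically map a partition-key value to a logical partition."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % partitions

# Hypothetical sample: 10k docs spread over 100 tenants.
docs = [{"tenantId": f"tenant-{i % 100}"} for i in range(10_000)]
load = Counter(partition_of(d["tenantId"]) for d in docs)

skew = max(load.values()) / (sum(load.values()) / len(load))
print(f"hottest partition carries {skew:.2f}x the average load")
```

A skew near 1.0 means throughput spreads evenly; a large skew means one physical partition will eat most of your provisioned RUs while the rest sit idle.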


Its graph DB (Gremlin compatible) is pretty great.


My (SQL) database in Azure is protected by a firewall with an IP white list by default. Is it different with Cosmos DB? Or does the Jupyter notebook thing go around the firewall?


You should be OK, CosmosDB is not the same as MySQL offering.


I am not worried about my DB but just curious of whether they were exposing CosmosDB to anyone on the WAN, or whether the Jupyter notebook circumvented the firewall.


Default Cosmos DB is public IP, and Jupyter is preloaded with your DB endpoint & key, so breaking notebook isolation escalates to DB compromise.

The blessing/curse of Jupyter being a university OSS project, with private companies largely not contributing back their multitenant notebook stuff (as hosting is the $ maker), is this kind of scenario. The project provides some multitenant support, but I'm guessing not what MS uses.

Data science envs generally need wide read access across many data sources, yet are mostly used only by a few trusted-yet-security-agnostic power users in an org, which leads to relatively low sec eng infra investment. So my guess is Jupyter security flaws (ex: extension vulnerabilities) are increasingly ripe targets for big escalations.
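To make concrete why a leaked primary key is game over: Cosmos DB's REST API authenticates each request with an HMAC token derived from the master key, so anyone holding the key can sign valid requests from any network. This sketch roughly follows the documented master-key scheme; the key and resource names below are fake:

```python
# Why leaked Cosmos DB keys work off-network: auth is just an HMAC-SHA256
# over a canonical request string, signed with the (base64) master key.
# Key material and resource paths here are made up for illustration.
import base64
import hashlib
import hmac
import urllib.parse

def auth_token(verb: str, resource_type: str, resource_link: str,
               date: str, master_key_b64: str) -> str:
    key = base64.b64decode(master_key_b64)
    # Canonical payload: lowercased verb/resource-type/date, two trailing newlines.
    payload = f"{verb.lower()}\n{resource_type.lower()}\n{resource_link}\n{date.lower()}\n\n"
    sig = base64.b64encode(
        hmac.new(key, payload.encode(), hashlib.sha256).digest()
    ).decode()
    return urllib.parse.quote(f"type=master&ver=1.0&sig={sig}")

token = auth_token("GET", "docs", "dbs/db1/colls/c1",
                   "Tue, 01 Jun 2021 00:00:00 GMT",
                   base64.b64encode(b"not-a-real-key").decode())
print(token[:40])
```

Nothing in that scheme binds the token to a caller's identity or network, which is why the recommended mitigations are IP firewalls, private endpoints, and aggressive key rotation rather than anything at the auth layer.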


> Luttwak's team found the problem, dubbed ChaosDB, on Aug. 9 and notified Microsoft Aug. 12, Luttwak said.

Then mailing customers on the 26th.

Depending on whether or not this can be classified as a 'personal data breach' (IMO you should treat it as such, since it has such widespread implications) that may be a GDPR violation. The GDPR requires 72h notice to the data protection authorities and notice without 'undue delay' to the controller.


It doesn't sound like a breach (in the legal sense) occurred, at least with the knowledge at hand. Microsoft has bug bounty programs, so such hacking is legally covered, and processes seem to have been followed (e.g. the researcher didn't unnecessarily access data they didn't own, etc.).

Will be interesting to learn more info down the road


Is this why I got a weird notification on my iPhone last night that a bunch of my passwords were exposed in a data breach? amipwned didn't have anything new, and my iPhone didn't appear to provide any more details about where and when the breach occurred.


why are the default config / settings so lax ?


Convenience über alles; user-friendliness gone wrong would be my guess.


exactly my thoughts on this


Cosmos DB can't be used for anything real, can it?


Honestly? I don't know for sure.

On one hand, my understanding is that they used TLA+ to validate the model[0]. One would assume that if they went to that trouble there is at least a specification for how it should work.

OTOH, specifications can be flawed. This security issue appears unrelated to the design of Cosmos DB itself, though.

[0] - https://en.wikipedia.org/wiki/TLA%2B#Industry_use


For formal verification to catch the bug, the bug would have to be at the design level and the model would have to have enough detail to include the design bug.


TLA+ is used to design the database itself, including the storage layer, distributed operations, and consensus/consistency models.

This bug is a security issue in an entirely separate component. No formal logic is going to warn you about exposed read/write keys.


In principle it sounds good, but from what I've heard it is over-priced and far too slow compared to typical IaaS-hosted solutions.

Notably, some of the customers mentioned in the security researcher's blog are the type that have exploding wallets.

As in: "My wallet is bursting open because there's too much money in it! Mr Cloud Sales Guy, can you help me with this problem?"


“ A federally contracted research lab tracks all known security flaws in software and rates them by severity.” …where can one see this list?


Probably the MITRE CVE database



I think they should add a few zeros onto that reward. These bounty programs should be dropping 10x dollars on these bugs.


20 billion over 5 years seems a bit much, no?


I tried using Azure once and couldn’t get it to work for me. No, it’s not me, it’s Microsoft.


Odd, I've used it many times and everything works.


Good for you!


So you couldn't work out how to use something that thousands of other people can use just fine and you conclude it is the maker of that thing, which is the problem.

I'm starting to understand why you found using Azure such a challenge.


I know I didn't make a mistake, and it didn't work. If I remember correctly, something in the UI of allocating or starting VMs crashed. I am not some naive computer user assuming the mistake is with me. I know it's with the software. And I don't have time to work with buggy software. There is better software out there, for example GCP, which has sane APIs and actually works.

To be honest, I don't think you understand anything.


Even accepting all your statements of fact, you are still exhibiting some poor reasoning skills. From your story, it sounds like you briefly tried Azure and something failed and from that you inferred that Azure must fail constantly.

In reality, all cloud providers have hiccups now and then and it seems like you may have caught Azure on a bad day.

Imagine, if the first day you tried GCP you stumbled into one of its many incidents[1] and gave up on it entirely. You'd have been deprived of a lifetime of joy with its mentally balanced API.

[1] https://status.cloud.google.com/summary


Defaults matter.


[flagged]


Azure services... Like a Jupyter notebook? :0



