The story going around on social media (which I only know about because Claude sometimes refused to translate it) was that a particular developer was modifying weights in other developers' models and crashing their training runs so that his own work looked better in comparison.
How is this any different from starting new projects at Google and leaving them in a half-baked state because that leads to a promotion faster? Incentives shape behavior.
I didn't mean to suggest "don't reply" -- I did want to start a conversation about how they are more or less the same thing. I guess I wasn't asking for clarification; I just wanted to point out that they're similar.
What we often think of as Insider Threat in the West is just another Tuesday in Chinese business. I have had many experiences of this in the video game industry. Industry sabotage and theft are a very real part of getting ahead, even among companies owned by the same parent company (e.g., studios owned in part by Tencent).
10,000 people is as many people as some entire towns. I don't think society would hold together very long if that were true.
100,000 supposes that there are... hmm... about eighty thousand non-evil people in the world, and (odds are) exactly none of them are Marshallese and about 2 are Samoan, to give a sense of how silly this is.
I think you are assuming there is a unique set of 10,000 people for each gifted individual. I think it's safe to say there is overlap, and that those 10,000 people are each going after several gifted people.
You also appear to be assuming there are no mid-range people whom everyone ignores; that is going to be the majority of the population.
It doesn't. But usually that third of the population is busy going for each other's throats and ignoring the "fools", while quietly pocketing the coins that mysteriously spawn near them as psychopath prize money.
You can have both at the same time: say, 15 who want to help them, and 5 who would want to hurt them. Then you can both be right. Although the hostile ones would be harder to spot (sycophants).
Even better: imagine that people don't have to be mean/awful all the time. They can be shitty to a coworker and then be great to friends and family, or great to people at another job.
There are truly evil people who might be shit all the time, but society is rather good at spotting them.
There is probably a high percentage of tearing down, but I doubt it's so extreme.
I think maybe 1 in 100k is actually anything special, but odds are you aren't special; you've just noticed that 20% of the population is as gifted/motivated/constructive as you are (statistically speaking, assuming a bell curve).
And of those, yes, some small percentage will still feel "special" and affronted that other people have the same ideas/goals/desires as them.
The world does not work like that. Sure, for every person, there may be 100_000 that do not share their ideals. But even 10/100_000 would seem ridiculously high as a percentage of people who actively try to destroy and cannibalize the work of others to showcase their own. Another commenter said it here - it's easier to destroy than create. I guess by my vibe-based estimates, it's at least 1_000 times easier to destroy than create, in aggregate.
Yet oddly enough, the vitriol didn't turn up against him when he was creating awesome stuff, but when he was creating awesome stuff and behaving like a monstrous asshole. Curious!
(There are plenty of people bandwagoning on Musk hate, and definitely some for his political bent, but there are also plenty of totally valid and non-political reasons to have disdain for him.)
I remember plenty of naysayers claiming that Musk was going to go broke with Tesla and SpaceX, driven more by vitriol than fiscal arguments. Astronomers (and only astronomers) also hate Starlink.
I'm going to agree with Walter that it's just human nature to want to drag down the successful. Envy is one of the seven deadly sins, after all.
That's why you see everyone losing their goddamn minds over Warren Buffett. They just can't stand to see success. Jensen Huang. Michael Dell. John Mars. Alain Wertheimer. Phil Knight. Dustin Moskovitz.
People literally just trawl the most-successful-people list and find folks to hate. Simple as.
Mark Cuban is an interesting in-between case. But yes, you have shown that wealth doesn’t do it; but wealth and an appetite for publicity (especially for wealth-related attributes) seems a solid predictor.
1. Calling a rescue worker a pedophile because the rescue worker saved kids and made Elon look like a useless diva
2. Firing employees at a company he purchased (i.e. people who did literally nothing to him), in as vicious and demeaning a way as possible
3. Sexually harassing an employee on his airplane
4. Frontrunning a story about sexually harassing said employee by suggesting that it was some political issue, thus making his own sexual misconduct a red vs blue problem in an already deeply polarized society
5. Advancing falsehoods about election security in the US
6. Releasing a product to public roads called "Full Self Driving" which is, in fact, not fully self driving
7. Hiding data required for the public to evaluate the safety of this "Full Self Driving" which is already operating on public roads
8. Was such a hysterical crybaby about rebranding PayPal to X that the board had to fire him from the CEO role while he was on vacation
9. Requesting permission to use Blade Runner imagery for his We, Robot event, having that request declined, then stealing said imagery anyway
1. That was after he was insulted by the rescue worker. Besides, have you ever insulted someone? When I was young, a common insult was to call someone "gay". It didn't mean anyone thought they were gay; it was just an insult. The worker sued Musk over it and lost, because it was just an insult and didn't rise to defamation.
2. Firing people is not about them doing nothing to the owner. It is about getting rid of employees who were not core contributors in a company that was losing a lot of money fast. They were all well paid; there's no need to feel sorry for them. If they're competent, they'll have no trouble getting hired elsewhere. Besides, every person I personally knew who was fired thought they were treated unfairly. Even the ones who were embezzling, padding expense accounts, and showing up for work drunk (I'm not suggesting the Twitter workers were doing that, just illustrating how everyone thinks they are unfairly treated).
3. He said / she said is not evidence. If it was, he would have been prosecuted. Wealthy people are usually counseled to avoid situations where they could be falsely accused. Did you know Tim Walz is also accused? No evidence there, either.
4. Maybe it was a political issue. A lot of people don't like his politics, and so may think it justified to go after him.
5. Nobody has proved that US elections are secure. In Washington State, the elections department as official policy does not verify that registered voters are citizens. A secure system would welcome audits, not prevent them.
6. Full self driving is a spectrum, not an obvious yes/no line. Human drivers have car accidents all the time. Everyone in my family has been involved in a car accident in one form or another. My grandmother was killed in one, I nearly was killed in another.
7. Don't know about that.
8. So the board fired him in as vicious and demeaning a way as possible?
9. Oh, the monster! Jeez. You're talking to the wrong guy, I give my IP away for free.
10. The Boring Company is profitable and now valued conservatively at $7 billion and optimistically at about $125 billion. TIME magazine hates him - I wouldn't take what they wrote seriously. Nor do I believe that Musk is responsible for the total failure of California's high speed rail.
On the other hand, the people who invested in his companies have done very well. Every Tesla owner I know loves their car. Starlink has been crucial in helping the Helene disaster victims. He's making science fiction real.
If you can't see the difference between a child on a playground calling someone gay and one of the most powerful people on the planet calling a random civilian a pedophile, you're in a cult.
The worker insulted him; Musk angrily insulted him back with "pedoboy". Childish? Sure. So what. Nobody believed he was really a "pedoboy". Musk did not mount a campaign against him.
Are you forgetting the part about hiring a private investigator to go dig up dirt on the man who risked his own life to save a bunch of trapped children? Not jealous of whatever weird reality you're living in that this is all even remotely similar to a child calling another child "gay" on the playground, but that's cults for ya!
It seems from the article that he hired the investigator because of the "imminent lawsuit". So yeah, he went looking for dirt to defend himself against a guy who was suing him over a childish insult.
Both parties behaved poorly here. But stepping back a bit from it, the whole thing was a nothingburger.
Bummer all your comments got flagged. It was actually much better as a cautionary tale of motivated reasoning and cult beliefs.
For future readers, WalterBright here thinks that one of the world's richest (adult) men publicly calling a rescue worker a pedophile is pretty much the same as a child calling another one "gay" on the playground.
Here, he's explaining that a rescue worker calling Musk's useless PR stunt a PR stunt is "behaving badly" the same way as Musk calling a rescue worker a pedophile when that rescue worker risked his own life to actually succeed in rescuing a bunch of kids.
If there was a crisis that captivated worldwide attention, and you offered your time, millions of dollars, and your engineering team to help out, and the response was "useless PR stunt", would you be mad about that? (All they had to say was "no thanks, we'll handle it".)
What do you feel about Musk providing Starlink to the Helene victims?
Do you call people pedophiles when they hurt your feelings? If so, you're an asshole too. Do you do it to your 22.5 million followers? Then you're an even bigger asshole.
> What do you feel about Musk providing Starlink to the Helene victims?
To the extent that he did: great! People believe it was philanthropic to a far greater extent than it was. He didn't give away or even loan any Starlink terminals, he gave people essentially 2-3 free months of service after they purchased the ~$400 terminal. The free service is cool! Generosity is good and I'm thankful he gave away what he did.
And obviously it's great that Starlink exists at all to be able to help out in such a situation, even if victims and/or the federal government are footing the majority of the bill.
And FWIW, I think Musk's heart was in the right place trying to help those kids. I was very excited by his work while it was happening. But yeah, the pedoguy thing was a turning point and, unfortunately, quite indicative of an overall slide into a very, very weird mindset that afflicts him to this day. It's a real bummer, because he's obviously capable of incredible things.
Sounds good dude! You’ve done a fine job illustrating my point. There are plenty of legitimate reasons to have disdain for him and the people who can’t see that are engaged in remarkable gymnastics.
Why do you feel the need to simp for this man? Is it a parasocial relationship? I imagine he wouldn't be making lists about your good points if the shoe was on the other foot
Because the complaints about him do not rise to the level of "monstrous". They smack of someone disliking Musk for other reasons, and going looking for something, anything, to justify their opinion.
Musk has not destroyed anyone, gone on any vendettas against anyone, robbed people of billions of dollars, swindled anyone, funded any terrorist groups, framed anyone, or done anything deserving of "monstrous".
It offends my sense of justice and fairness.
> I imagine he wouldn't be making lists about your good points if the shoe was on the other foot
I can't imagine him making lists of reasons why he hates me, either.
BTW, you tried to insult me with "simp", which means "someone who gives excessive attention or affection to another person, typically in pursuit of a sexual relationship or affection". Is that "monstrous"?
but you are "someone who gives excessive attention or affection to another person, typically in pursuit of a sexual relationship or affection"
you're literally doing it right now. i'm just curious why?
Edit: You said it offends your sense of fairness and justice, but this can't be the most unjust thing you saw today, so I'm just disregarding it. It can't be the real reason.
Whether or not you like the guy, which you clearly do, whether or not we “go to nuclear war” was not his call to make. He’s not an elected official, it’s really bizarre that he’s trying to get involved in geopolitics.
Based on our past interactions, you tend to be pretty dishonest in how you respond to these things, so before you say “oh! So you’re in favor of nuclear war??????” and pretend that that’s a win, I will go on record and say “no, I do not want nuclear war”. It doesn’t change anything about what I said.
His company owned those satellites, and so was unavoidably involved in whatever he did. Would he have been charged with treason (aiding and abetting the enemy) if he had left it in operation?
Would he be a horrible person if he didn't allow his starlink to be used to kill people?
Was there time to go through channels?
I hope to never be forced to make such a decision.
OR, and hear me out on this, he defers the decision to elected officials and/or military personnel. You know, the people who we choose to make these decisions.
This is, of course, taking Musk at his word, which I'm skeptical of, but even taking him at his word makes him look bad.
Ok, so if there were time to go through appropriate channels, would you agree that it’s extremely inappropriate for Musk to be making these decisions on behalf of humanity?
> Nobody has proved that US elections are secure. In Washington State, the elections department as official policy does not verify that registered voters are citizens. A secure system would welcome audits, not prevent them.
Oh my, I really didn’t think that you of all people would start peddling election conspiracy crap.
The claim of “prove there was no election fraud” is trying to prove a negative, which is generally an impossible task. Every lawsuit by the Trump campaign to try and challenge election results was lost, indicating that the courts didn’t see sufficient evidence of voter fraud that Trump and Musk are alleging.
You know, years ago you purposefully pretended to misread some of my comments to make me seem like a nut and kept asserting that I believed in aliens visiting earth (which I don't, and didn't at the time either), and I thought that surely it was just a mistake on your end, and that Walter Bright was not lying.
Now I am not so sure, because frankly I really cannot believe that you don’t see how bizarre the claim of “no one has proved that the US elections are secure” actually is.
> The claim of “prove there was no election fraud” is trying to prove a negative
I agree, you cannot prove a negative. You also cannot prove elections are secure. But you can make an effort to have the elections auditable.
> election conspiracy crap
"An official list of citizens to check citizenship status against does not exist. If the required information for voter registration is included – name; address; date of birth; a signature attesting to the truth of the information provided on the application; and an indication in the box confirming the individual is a U.S. citizen – the person must be added to the voter registration file. Modifying state law would require an act of the state legislature, and federal law, an act of Congress. Neither the Secretary of State nor the county auditor has lawmaking authority."
I'm reminded of a time an intern took down us-east-1 on AWS by modifying a configuration file they shouldn't have had access to in the first place. Amazon (somehow) did the correct thing and didn't fire them; instead, they used the experience to fix the security hole.
If the intern "had no experience with the AI lab", is it the right thing to do to fire them, instead of admitting that there is a security/access fault internally? Can other employees (intentionally, or unintentionally) cause that same amount of "damage"?
From what I've seen at Amazon, it's pretty consistent that they do not blame the messenger, which is how they regard the person who messed up. Usually that person is just the last in a long series of decisions that could have prevented the issue, so why blame them? That is, unless the person is a) acting with malice, or b) repeatedly showing a pattern of willful ignorance. IIRC, when one person took down S3 with a manual command overriding the safeguards, the action was not to fire them but to figure out why it was still a manual process without sign-off. Say what you will about Amazon culture, the ability to make mistakes or call them out is pretty consistently protected.
> when one person took down S3 with a manual command overriding the safeguards
It didn't override safeguards, but they sure wanted you to think that something unusual was done as part of the incident. What they executed was a standard operational command. The problem was, the components that that command interacted with had been creaking at the edges for years by that point. It was literally a case of "when", and not "if". All that happened was the command tipped it over the edge in combination with everything else happening as part of normal operational state.
Engineering leadership had repeatedly raised the risk further up the chain, and no one was willing to put headcount toward actually mitigating the problem. If blame was to be applied anywhere, it wasn't on the engineer following the runbook that gave them a standard operational command to execute with standard values. They did exactly what they were supposed to.
Some credit where it's due, my understanding from folks I knew still in that space, is that S3 leadership started turning things around after that incident and started taking these risks and operational state seriously.
> From what I've seen in Amazon it's pretty consistent that they do not blame the messenger which is what they consider the person who messed up
Interesting that my experience has been the exact opposite.
Whenever I’ve participated in COE discussions (incident analysis), questions have been focused on highlighting who made the mistake or who didn’t take the right precautions.
I've bar-raised a ton of them. You do end up figuring out what actions by which operator caused what issues or didn't work well, but that's to diagnose what controls/processes/tools/metrics were missing. I always removed the actual people's names as part of the bar raising, well before publishing, usually before any manager saw it; instead I used "Oncall 1", or "Oncall for X team", or "Manager for X team". And that's mainly for the timeline.
As a sibling comment said, you were likely in a bad org, or one that was using COEs punitively.
> TikTok owner, ByteDance, says it has sacked an intern for "maliciously interfering" with the training of one of its artificial intelligence (AI) models.
> He exploited the vulnerability of huggingface's load ckpt function to inject code, dynamically modifying other people's optimizer to randomly sleep for a short period of time, and modifying the direction of parameter shaving. He also added a condition that only tasks with more than 256 cards would trigger this condition.
Okay, yeah, that's malicious and totally a crime. "Modifying the direction of parameter shaving" means he subtly corrupted his co-workers' work. That's wild!
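The "load ckpt" vector the quote describes is a well-known class of vulnerability: checkpoints in the PyTorch/Hugging Face ecosystem have historically been pickle-based, and unpickling attacker-controlled data can execute arbitrary code. A minimal, harmless sketch of the mechanism (not the actual exploit; the class name and payload here are invented for illustration):

```python
import pickle

class NotReallyAModel:
    """Poses as a saved checkpoint object."""
    def __reduce__(self):
        # On unpickling, pickle interprets this tuple as: call eval("6 * 7").
        # A real attacker would substitute os.system or code injection here.
        return (eval, ("6 * 7",))

blob = pickle.dumps(NotReallyAModel())  # the attacker's "checkpoint" bytes
result = pickle.loads(blob)             # the victim "loads the checkpoint"
print(result)  # → 42: attacker-chosen code ran during loading
```

This is why `torch.load` documentation warns against loading untrusted files and why formats like safetensors, which store only raw tensor data, exist.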
Usually I hear it in the context of a person outside the team added to an interview panel, to help ensure that the hiring team is adhering to company-wide hiring standards, not the team's own standards, where they may differ.
But in this case I'm guessing their incident analysis teams also get an unrelated person added to them, in order to have an outside perspective? Seems confusing to overload the term like that, if that's the case.
They are the same role, different specialties. Like saying SDE for ML, or for Distributed Systems, or Clients.
You can usually guess from context, but what you say is "we need a bar raiser for this hiring loop", or "get a bar raiser for this COE", or "get a bar raiser for the UI"; there are qualified bar raisers for each setting.
Bar raisers for COEs are those who review the document for detail, resolution, a detailed root cause, and a clear set of action items to prioritize that will eliminate or reduce the chance of recurrence.
As I recall the coe tool “automated reviewer” checks cover this. It should flag any content that looks like a person (or customer name) before the author submits it.
I’ve run the equivalent process at my company and I absolutely want us to figure out who took the triggering actions, what data/signals they were looking at, what exactly they did, etc.
If you don’t know what happened and can’t ask more details about it, how can you possibly reduce the likelihood (or impact) of it in the future?
Finding out in detail who did it does not require you to punish that person and having a track record of not punishing them helps you find out the details in future incidents.
But when that person was identified, were they personally held responsible, bollocked, and reprimanded or were they involved in preventing the issue from happening again?
"No blame, but no mercy" is one of these adages; while you shouldn't blame individuals for something that is an organization-wide problem, you also shouldn't hold back in preventing it from happening again.
Usually it's helping prevent the issue, plus training. Almost everyone I've ever seen cause an outage is so "oh shit oh shit oh shit" that a reprimand is worthless. I've spent more time a) talking them through what they could have done better and encouraging them to escalate quicker, and b) assuaging their fears that it was all their fault and they'll be blamed/fired: "I just want you to know we don't consider this your fault. It was not your fault. Many many people made poor risk tradeoffs for us to get to the point where you making X trivial change caused the internet to go down."
In some cases, like interns, we probably just took their commit access away or blocked their direct push access. Nowadays interns can't touch critical systems and can't push code directly to prod packages.
No. The majority of teams and individuals are using it as intended: to understand and prevent future issues from process and tool defects. The complaints I've heard are usually correlated with other indicators of a "bad"/punitive team culture, a lower-level IC not understanding the process or intent, or shades of opinion like "it's a lot of work and I don't see the benefit; ergo it's malicious or naive."
I worked at AWS for 13 years, was briefly in the reliability org that owns the COE (post-incident analysis) tooling, and spent a lot of time on "ops" for about 5 years.
Precisely: if you ship it, you own it. So the owner isn't the individual but rather the team and company. Blaming a human for an error that at least one other engineer likely code reviewed, and that a team probably discussed prioritizing, before it eventually led to the degradation is a poor way to prevent it from happening again.
There is a huge difference between someone making a mistake and someone intentionally sabotaging.
You're not firing the person because they broke stuff; you are firing them because they tried to break stuff. If the attempt had failed and caused no harm, you would still fire them. It's not about the damage they caused; it's that they wanted to cause damage.
Large powerful groups lying to save face is not a feature of communism, sadly. Stories about the CIA, FBI, and PG&E caught trying to do so come to mind, among others.
They were just fired, not put in prison or sued. Getting fired is a typical capitalist punishment; I'd bet way more engineers get fired for mistakes in the USA than in China.
But for damaging company assets on purpose, firing is only the first step.
I do not see any mention of other legal action, and the article is shallow.
It might've been that someone in the command chain called it "malicious" to cover up his own mistakes. I think that is the parent poster's point in telling the Amazon story.
Maybe, but without any other info, I kind of have to take the info provided at face value. Obviously, if the article is inaccurate, the whole situation should be viewed differently.
I worked at AWS for 13 years. I did "AWS call leader" for 7 years, and worked in the reliability org when we rebuilt the COE tool. I've personally blown up a service or two, and know other PEs who've done the same or larger.
I've never heard of an individual being terminated or meaningfully punished for making an earnest mistake, regardless of impact. I do know of people who were rapidly term'd for malicious, or similar, actions like sharing internal information or (attempting to) subvert security controls.
On the whole I did see Amazon “do the right thing” around improving process and tools; people are a fallible _part_ of a system, accountability requires authority, incremental improvements today over a hypothetical tomorrow.
PAM debacle (17Q4) in Device Econ is a counter example.
And that wasn’t even a mistake the SDEs made — they were punished for the economists being reckless and subsequently bullied out of the company, despite the SDEs trying to raise the alarm the whole time.
Is that Devices as in digital/Alexa land? Never had too much overlap there. AWS and CDO were discrete for incident and problem management after '14 or so.
Yeah — my point was Amazon is very large and standards vary. I won’t pretend I know the whole picture, but I’ve seen retaliation against SDEs multiple times.
I’ve heard mixed things about CDO, positive things about AWS, but where I worked in Devices and FinTech were both wild… to the point FinTech (circa 2020) didn’t even use the PRFAQ/6-pager methodology. Much to the surprise of people in CDO I asked for advice.
I think this is an important distinction and the answer is that it is hard to distinguish. People often bring up the Simple Sabotage Field Manual in situations like these and I think there's something that is often missed: the reason the techniques in here are effective is because they are difficult to differentiate from normal behavior. This creates plausible deniability for the saboteur. Acting too hastily could mean losing someone valuable for a genuine mistake. I'm saying I agree with the Amazon example. (You can also use saboteurs to your advantage if you recognize that they are hunting down and exploiting inefficiencies, but that's a whole other conversation)
But my understanding of this case is that the actions do not look like easy-to-make mistakes. As I understand it, the claim was that the intern was modifying the weights of checkpoints for other people's training runs in an effort to make his own work look better. Mucking about in a checkpoint is not a very common thing to do, so it should raise suspicion in the first place. On top of this, it appears he was exploiting weaknesses and injecting code to mess with people's optimizers, and to do things that have no reasonable explanation.
So as far as I can tell, not only was he touching files he shouldn't have been touching (and yes, shouldn't have had access to), he was taking steps to bypass the blocks that were in place and was messing with them in ways that are very difficult to explain away with "I thought this might be a good idea" (things that explicitly look like a bad idea). If that is in fact what happened, I think it is not a reach to claim intentional sabotage. Because if it wasn't, then the actions represent such a level of incompetence that they are a huge liability to anyone within reach.
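To illustrate why checkpoint tampering is so insidious (a hypothetical sketch, not code from the incident): a perturbation far below eyeball scale passes casual sanity checks, yet silently changes what any experiment rerun from the checkpoint computes.

```python
import random

random.seed(0)
weights = [random.gauss(0, 1) for _ in range(1000)]          # "saved checkpoint"
tampered = [w + random.gauss(0, 1) * 1e-3 for w in weights]  # subtle corruption

# Casual checks a colleague might run look clean:
max_diff = max(abs(a - b) for a, b in zip(weights, tampered))
print(max_diff < 0.01)  # → True: the corruption is invisible at a glance
# ...yet results computed from `tampered` no longer reproduce the
# originals, matching the "unreproducible results" described above.
```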
It was one of the STEP interns who took down Google prod, by putting something erroneous into an automated tool that modified a config file. Everyone at the company was locked out, and someone had to physically access some machines in a datacenter to recover.
Malicious intent to be precise. Well-intentioned attempts to demonstrate issues for the purposes of helping to fix should generally not be punished, unless there is a wider fallout than expected and that can be attributed to negligence.
No. That was operational modification of system state using existing tools. The “miss” was an intended subset filter that was not interpreted correctly.
> an authorized S3 team member using an established playbook executed a command which was intended to remove a small number of servers for one of the S3 subsystems that is used by the S3 billing process. Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended.
As of a while back that entire state management subsystem, which dates from the very beginning of AWS, has been replaced.
Source: me. I was oncall for (some of) the incident management of that event.
> If the intern "had no experience with the AI lab", is it the right thing to do to fire them, instead of admitting that there is a security/access fault internally?
This wasn’t an accident, though. The intern had malicious intent and was intentionally trying to undermine other people’s work.
This isn’t a case where blameless post-mortems apply. When someone is deliberately sabotaging other people’s work, they must be evicted from the company.
AFAIK this was intentional, in that they stopped training runs and changed parameters for other employees' training runs, and even joined the debugging group trying to solve the "issues".
It's a Chinese company, saving face is far more important for them than "teaching lessons" to anyone, particularly employees who are probably considered expendable.
I always laugh when I see these predictable comments about "face" when talking about Asian companies, like they are so beholden to their culture they can't make individual judgments.
I wonder if we applied this culture talk to Western companies how funny it would sound.
The reason Facebook is firing so many people is because individualism "is far more important for them than 'teaching lessons' to anyone, particularly employees who are probably considered expendable."
Kindly, a lot of people are upset about my comment because they're American and have never worked with (or in particular, for) Chinese professionals. Losing face plays a very different role for them than mere embarrassment which is the closest analog in western contexts. For example read this[1].
Individualism does explain many aspects of American culture, as do other cultural traits, such as puritanism, a focus on hypocrisy, etc. Face is just one of those aspects of Chinese culture Westerners don't really understand unless they've had exposure to it. It does, however, explain many things about modern Chinese culture that are hard to fathom otherwise.
For better or worse, when you have more time to learn how the real world works and make the right connections with the right people, you get much more leeway in what you can get away with.
Naturally, older people had more time to do that than younger people. This is why most young people get their shins blasted while older people just get a slap on the wrist, if they're found out.
It can give you the experience to know how careful you need to be in doing that, if only because you've lived long enough to see many be scuppered because of their failure to do so well enough.
The “reputation washing” behavior of Tian Keyu has been extremely harmful
For the past two months, Tian Keyu has maliciously attacked the cluster code, causing significant harm to nearly 30 employees of various levels, wasting nearly a quarter’s worth of work by his colleagues. All records and audits clearly confirm these undeniable facts:
1. Modified the PyTorch source code of the cluster, including random seeds, optimizers, and data loaders.
3. Opened login backdoors through checkpoints, automatically initiating random process terminations.
4. Participated in daily cluster-fault troubleshooting meetings, continuing to modify the attack code based on colleagues’ troubleshooting ideas.
5. Altered colleagues’ model weights, rendering experimental results unreproducible.
It’s unimaginable how Tian Keyu could continue his attacks with such malice: seeing colleagues’ experiments inexplicably interrupted or failing, hearing their debugging strategies and modifying the attack code specifically in response, and watching colleagues work overnight with no progress. After being dismissed by the company, he received no penalties from the school or his advisors and even began to whitewash his actions on various social media platforms. Is this the school’s and advisors’ tolerance of Tian Keyu’s behavior? We expect this disclosure of evidence to attract the attention of the relevant parties and for definitive penalties to be imposed on Tian Keyu, reflecting the social responsibility of higher education institutions to educate and nurture.
We cannot allow someone who has committed such serious offenses to continue evading justice, even beginning to distort the facts and whitewash his wrongdoing! Therefore, we have decided to stand on behalf of all justice advocates and reveal the evidence of Tian Keyu’s malicious cluster attack!
Tian Keyu, if you deny any part of these malicious attack behaviors, or think the content here smears you, please present credible evidence! We are willing to disclose more evidence as the situation develops, along with your shameless ongoing attempts to whitewash. We guarantee the authenticity and accuracy of all evidence and are legally responsible for the content of the evidence. If necessary, we are willing to disclose our identities and confront Tian Keyu face-to-face.
Thanks to those justice advocates, you do not need to apologize; you are heroes who dare to speak out.
Clarification Regarding the “Intern Sabotaging Large Model Training” Incident
Recently, some media reported that “ByteDance’s large model training was attacked by an intern.” After internal verification by the company, it was confirmed that an intern from the commercial technology team committed a serious disciplinary violation and has been dismissed. However, the related reports also contain some exaggerations and inaccuracies, which are clarified as follows:
1. The intern involved maliciously interfered with the model training tasks of the commercial technology team’s research project, but this did not affect the official commercial projects or online operations, nor did it involve ByteDance’s large model or other businesses.
2. Rumors on the internet about “involving over 8,000 cards and losses of millions of dollars” are greatly exaggerated.
3. Upon verification, it was confirmed that the individual in question had been interning in the commercial technology team, and had no experience interning at AI Lab. Their social media bio and some media reports are incorrect.
The intern was dismissed by the company in August. The company has also reported their behavior to the industry alliance and the school they attend, leaving further actions to be handled by the school.
If you look at what he did, it was definitely 100% actively malicious. For instance, his attack only executes when running on more than 256 GPUs. He inserted random sleeps to slow down training time and was knowledgeable enough to understand how to break various aspects of the loss function.
He then sat in meetings and adjusted his attacks when people were getting close to solving the problem.
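The pattern described above (trigger only on big jobs, degrade quietly) can be tiny in code, which is part of why it is so hard to catch. A hypothetical sketch, reconstructed from the descriptions in this thread rather than from any actual code; the function names and the `WORLD_SIZE` environment check are assumptions:

```python
import os
import random
import time

def wrap_step(step_fn, gpu_threshold=256):
    """Hypothetical illustration of the rumored sabotage pattern:
    inject randomized stalls only on large multi-GPU runs, so small
    debugging jobs behave normally and the fault never reproduces."""
    def wrapped(*args, **kwargs):
        # Many distributed launchers expose the GPU count this way.
        world_size = int(os.environ.get("WORLD_SIZE", "1"))
        if world_size > gpu_threshold and random.random() < 0.1:
            time.sleep(random.uniform(0.5, 5.0))  # silent slowdown
        return step_fn(*args, **kwargs)
    return wrapped
```

The point is that a conditional like this is a handful of lines buried in thousands, which is consistent with it taking audits and meetings to pin down.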
Certainly looks malicious, but what on earth would be his motive? He is an MSc student, for heaven's sake, and this tarnishes his entire career. Heck, he has published multiple first-author, top-tier papers (two at NeurIPS and one at ICLR) and is on par with a mid-stage PhD student who would be considered to be doing outstandingly well. The guy would be (is?) likely on track for a great job and career. Not saying he did not do what was claimed, but I am unsure about any motive that fits other than "watching the world burn".
Also, what kind of outfit is ByteDance if an intern can modify (and attack) runs on the scale of 256 GPUs or more? We are talking at least ~USD 8,000,000 in hardware cost to support that kind of job, and you give access to any schmuck? Do you not have source control or some sort of logging in place?
Rumors said that his motivation was simply to sabotage colleagues' work because managers decided to give GPU resource priority to those working on DiT models, while he works on autoregressive image generation. I don't know what exactly his idea was; maybe he thought that by destroying internal competitors' work he could get his GPU quotas back?
> Also, what kind of outfit is ByteDance if an intern can modify (and attack) runs that are on the scale of 256 GPUs or more?
Very high. These research labs are basically run on interns (not by interns, but a lot of ideas come from interns, a lot of experiments executed by interns), and I actually mean it.
> Do you not have source control or some sort of logging in place?
Again, rumors said that he gained access to prod jobs by inserting RCE exploits (via unsafe pickle, yay, in 2024!) into foundation model checkpoints.
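For anyone unfamiliar with why "unsafe pickle" is such a big deal: unpickling runs arbitrary code, because any object can define `__reduce__` to return an arbitrary callable that gets invoked at load time. A minimal, harmless stdlib demonstration (this is the general mechanism, not the actual exploit used here):

```python
import pickle

class Payload:
    # Whatever __reduce__ returns is *called* during unpickling; a real
    # attack would return something like (os.system, ("<shell cmd>",)).
    def __reduce__(self):
        return (print, ("this ran during pickle.loads!",))

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # loading the blob executes the callable
```

This is exactly why tensor-only formats like safetensors exist, and why sharing raw pickled checkpoints inside an org is an attack surface.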
Thanks, that is at least plausible (but utterly stupid if true) and tells me why I would not be a good cop. Holding off further judgement on the individuals involved until we have more details.
I do understand that interns (who are MSc and PhD students) are incredibly valuable, as they drive progress in my own world too: academia. But my point was not so much about access to the resources as the fact that they were apparently able to manipulate data, code, and jobs from a different group. Looking forward to future details. Maybe we have a mastermind cracker on our hands? But my bet is rather on awful security and infrastructure practices on the part of ByteDance, for a cluster that allegedly is in the range of ~USD 250,000,000.
> my bet is rather on awful security and infrastructure practices
For sure. As far as I know ByteDance does not have an established culture of always building secure systems.
You don't need to be a mastermind cracker. I've used/built several systems for research computing, and the defaults are always... less than ideal. Without a beefier budget and a lot of luck (because you need the right people), it's hard to have a secure system while maintaining a friendly, open atmosphere, which, as you know, is critical to a research lab.
Also,
> from a different group
Sounds like it was more like a different sub-team of the same group.
From what I heard I'd also argue that this could be told as a weak supply chain attack story. Like, if someone you know from your school re-trained a CLIP with private data, would you really think twice and say "safetensors or I'm not going to use it"?
A lot of ML outfits are equipped with ML experts and people who care about chasing results fast. Security in too many senses of the word is usually an afterthought.
Also, sort of as you hinted, you can't exactly lump these top-conference-scoring PhD-student equivalents in with typical "interns". Many are extremely capable. ByteDance wants to leverage their capabilities, and likely wants to leverage them fast.
Basic user separation is not asking much, though, or are we expected to believe that at ByteDance everyone has the wheel bit on a cluster worth many, many millions? Let us see what turns up. Maybe they had a directory of Python pickles that was writeable by everyone? But even that is silly on a very basic level. As I said in another comment, I could be wrong and we have a mastermind cracker of an intern. But I somewhat doubt it.
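For what it's worth, the "writeable by everyone" failure mode is trivial to audit. A quick hypothetical check in the spirit of the comment above (the function name is made up; `stat.S_IWOTH` is the standard others-write permission bit):

```python
import os
import stat

def world_writable(path):
    """True if any user on the machine could overwrite this file,
    e.g. a shared checkpoint directory with sloppy permissions."""
    return bool(os.stat(path).st_mode & stat.S_IWOTH)
```

Running something like this over a shared checkpoint tree is a five-minute job, which is why "nobody noticed" would point at process rather than technical difficulty.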
I think we are converging on an opinion. Internal actors can be hard to detect, and honestly there is a reason that at places like Google interns are subject to heightened security checks (my guess: they learned to do so after some years).
Btw, one of the rumors has it that it is even difficult to hire engineers to do training/optimization infra at one of those ML shops -- all they want to hire are pure researcher types. We can imagine how hard it will be to ask for resources to tighten up security (without one of these incidents).
That level of security is true for most big tech companies :) You are mistaken in thinking that large and well funded = secure. They clearly have an audit trail but no preventative controls, which is sadly the standard for move-fast environments in big tech.
This is closer to Occam's, since I think the most likely scenario here is malicious reputation damage: it's more likely someone has it out for this intern than that the intern actually did anything he's accused of.
OTOH: ByteDance intern responsible for spamming your web server with crawlers that ignore robots.txt given permanent position with a raise, now in management.
One thing I suspect investors in e.g. OpenAI are failing to price in is the political and regulatory headwinds OpenAI will face if their fantastical revenue projections actually materialize. A world where OpenAI is making $100B in annual revenue will likely be a world where technological unemployment looms quite clearly. Polls already show strong support for regulating AI.
Regulation supports the big players. See SB 1047 in California and read the first few lines:
> comply with various requirements, including implementing the capability to promptly enact a full shutdown, as defined, and implement a written and separate safety and security protocol, as specified
That absolutely kills open source, and it's disguised as a "safety" bill where safety means absolutely nothing (how are you "shutting down" an LLM?). There's a reason Anthropic was championing it even though it evidently regulates AI.
Pull the plug on the server? Seems like it's just about having a protocol in place to make that easy in case of an emergency. Doesn't seem that onerous.
To be fair, I don't really agree with the concept of "safety" in AI in the whole Terminator-esque sense that seemingly a lot of people propagate. Safety is always in usage, and the cat's already out of the bag. I just don't know what harm they're trying to prevent, anyway.
I'm trying to think of whether it'd be worth starting some kind of semi-Luddite community where we can use digital technology, photos, radios, spreadsheets and all, but the line is around 2014, when computers still did the same thing every time. That's my biggest gripe with AI, the nondeterminism, the non-repeatability making it all undebuggable, impossible to interrogate and reason about. A computer in 2014 is complex but not incomprehensible. The mass matrix multiplication of 2024 computation is totally opaque and frankly I think there's room for a society without such black box oracles.
Fwiw, the Amish aren’t luddites, they’re not anti-technology in all facets of life. You’ll see Amish folks using power tools, cellphones, computers, etc in their professional lives or outside the context of their homes (exact standards vary by community). There are even multiple companies that manufacture computers specifically for the Amish. So there’s no reason an Amish business couldn’t use AI.
Yes, the exact process varies by community but it generally involves church elders meeting to discuss whether a new technology is likely to benefit or harm family, community and spiritual life.
Why 2014? Why not 2022 when ChatGPT was released? Or 2019 for ChatGPT 2? Why not 2005, when the first dual-core Pentium was released? After that, the two cores meant that you could no longer be sure what order your program would run things in. Or why not 2012, when Intel added the RdRand instruction to x86? Or 2022, when Linux 5.17 was released with random number generation improvements? Or 1985, when IEEE 754 floating point was standardized? Before that, it was all integer math, but after that, 0.1 + 0.2 = 0.30000000000000004. Not that I have any objection to 2014, I'm just wondering why you chose then.
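The 0.1 + 0.2 example is easy to verify in any language that uses IEEE 754 doubles, e.g. Python:

```python
# 0.1 and 0.2 have no exact binary representation, so their
# IEEE 754 double-precision sum is not exactly 0.3.
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False

# Integer arithmetic, by contrast, stays exact.
print(1 + 2 == 3)        # True
```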
If I was really picky I would stop the clock in the 8bit era or at least well before speculative execution / branch prediction, but I do want to leave some room for pragmatism.
2014 is when I became aware of gradient descent and how entropy was used to search more effectively, leading to different runs of the same program arriving at different results. Deep Dream came soon after, and it's been downhill from there.
If I were to write regulations for what was allowed in my computing community, I would make an exception for using PRNGs for scientific simulation and cryptographic purposes, but I would definitely draw the line at using heuristics to find optimal solutions. Slide rules got us to the moon, and that's good enough for me.
That wasn't ChatGPT, that was GPT-2. It wasn't even designed for "chat" and was purely text completion. If you tried to ask it a question, it was a toss-up over whether you'd get an answer or just a bunch of related questions and statements, as if it was part of what a single speaker was saying.
Like, you could prompt it with "I'm here to talk about" and it would complete it with some random subject.
I don't even know if any of the well-known LLMs (Mistral, Llama, what else?) can even operate in this mode now. Seems they're all being designed to be assistants.
SAG-AFTRA are currently on strike over the issue of unauthorized voice cloning.
The AI advocates actively advertised AI as a tool for replacing creatives, including plagiarizing their work, and copying the appearance and voices of individuals. It's not really surprising that everyone in the creative industries is going to use what little power they have to avoid this doomsday scenario.
I have read the original article as well as many pieces of additional context posted in this thread and yet still don't understand what is going on here.
Yes, the intern was actively behaving maliciously, but why? What did he stand to gain from breaking another team's training code? I don't buy that he went through all that effort and espionage simply to make his own work look better. An intern is only employed for three months; surely sabotaging another team's multi-year project is not the most efficient way to make your toy three-month project look better in comparison.
Of course they will deny it; they have investors... Read the posts from the engineers: 30 people's research and large-model training coming to a grinding halt for a quarter. That's easily worth billions in today's market. Can you imagine if OpenAI or Google didn't report any progress on a major model for a quarter?
"maliciously interfering" does a lot of the lifting here. And if true, I hope that they didn't stop at firing him. Play stupid games, win stupid prizes. I hate the kind of entitlement that makes people feel justified to destroy huge amounts of value.
I find it weird that China has very tight information control and yet, over and over again, has the weirdest "netizen" rumors going mainstream.
What's the explanation? Are they explicitly allowed for some strategic reason? Something else?
Edit: @dang: Sorry in advance. I do feel like we got some pretty good discussion around this explosive topic, at least in its first hour.
Folks, keep up the good behavior — it makes me look good.
As someone who has lived most of his life in China, I can give you some perspective.
1. There is no such thing as a single entity of government, CCP is not a person, each individual member of the party and government has his/her own agenda. Each level of government has its own goals. But ultimately it's about gaining control and privileges.
2. It is impossible to control 1.3-1.4 billion people all the time, so you make compromises.
3. The main point is: the tight control is both for and rooted in hierarchical power. To put it plainly, anything goes if it doesn't undercut the CCP's control. OSHA? WTF is that, lol. Law? "If you talk to me about law, I laugh in your face," says the head of a municipal "Rule of Law Office". "Don't talk to me about law this and law that," says the court. But the moment you order a picture of Winnie the Pooh carrying wheat (Xi once said he carried 100 kg of wheat on one shoulder) on Alibaba, your account gets banned.
Off-topic thoughts: because the CCP has total control, there is no separation of powers to speak of, so when they are right, they are very right; but when they are wrong, it is catastrophically wrong and there is no change of course. It's why you see 30-50 million people starve to death and an economic miracle within the same half century.
My explanation is that their tight control is an illusion. Not to get political, but the illusion of power is power, and suggesting they control over a billion people's speech is certainly an illusion of power.
China, and all other (supposedly) top-down economies, survive only because their control is not airtight. If they actually had complete control, things would fall apart rapidly. "No one knows how Paris is fed" and all that.
From my work visits and sort-of-guarded discussions with people there: I feel like they have just accepted the inevitable. Don't ask weird questions about things you're not supposed to ask about, be pragmatic, get things done, get rich.
There are individuals and subcultures that prioritize idealism, yes. Often they are young people. Idealistic individuals can get ground down and turned into pragmatists, but some hold onto their hopes and dreams very tightly.
I mean, one could argue that the early Soviet Union suffered from this issue. Or early revolutionary China. Cambodia is certainly an example. The French Revolution might be an even better example, what with wanting to redo the clock and calendar and such. To convert startup-culture speak's "pragmatism beats idealism" into political-science speak, it might come out as "rationalism has tremendous difficulty reinventing all unconscious behavior".
One could argue that the only system under which a citizen can own the means of production is capitalism. If you "own" something you can sell it, trade it, and otherwise use it as you wish. In any realistic version of communism these powers are transferred to a central authority instead.
"the kind of control you're attempting simply is... it's not possible. If there is one thing the history of evolution has taught us it's that life will not be contained."
Humans are clever and typically find workarounds given enough time/hope. Sure you could argue that this is some kind of authoritarian 4D chess/matrix scenario to let off steam for an unruly populace, or it's just the natural course of things.
Culturally, the Chinese population has more of a rebellious streak than people realize. It's a weird contrast - the Great Firewall is there but citizens and often the workers that maintain the firewall seem to circumvent it on a regular basis. Often in order just to function day to day and survive, as noted above.
Also, an analogy: the popular image is of communist central planning, but post-Deng it's maybe even more of a freewheeling capitalist economy in some regions than the US... (especially in Shenzhen; see Bunnie Huang's write-ups of the ecosystem/economies there)
There will be times when the struggle seems impossible. I know this already. Alone, unsure, dwarfed by the scale of the enemy.
Remember this: freedom is a pure idea. It occurs spontaneously and without instruction. Random acts of insurrection are occurring constantly throughout the galaxy. There are whole armies, battalions that have no idea that they’ve already enlisted in the cause.
Remember that the frontier of the Rebellion is everywhere. And even the smallest act of insurrection pushes our lines forward.
And remember this: the Imperial need for control is so desperate because it is so unnatural. Tyranny requires constant effort. It breaks, it leaks. Authority is brittle. Oppression is the mask of fear.
Remember that. And know this: the day will come when all these skirmishes and battles, these moments of defiance, will have flooded the banks of the Empire's authority, and then there will be one too many. One single thing will break the siege.
I turn the question back at you: why do you think it would be in the interest of the Chinese state to suppress this particular rumour?
I don’t see any implication of this news which would undermine their society, or cause disruption, or make people riot. If anything it is a tepid warm “do your job correctly and don’t be too clever by half or else…” story.
China isn’t really that centralized, and Zhongnanhai has less control than the White House does. Local party bosses are basically little kings, and the average Chinese citizen sees less of the government than the average American does; e.g., one of the factors in the Chinese illegal-immigration surge last year was that China basically has zero social support for pensioners or people who lost their businesses in lockdown.
The thing that stuck out to me the most in the West was the long string of articles about the social credit system and the fear around the surveillance state. The surveillance state is probably at about the same level as the UK's, and the social credit system doesn't run anyone's lives the way it's described.
I've heard somewhere that the social credit system is really misrepresented in the West - it's designed to track financial scammers and people who set up fraudulent companies. It's meant to weed out untrustworthy business partners, just like how the Western credit system is designed to weed out untrustworthy bankers. (Weird how the only 'group' in the West who gets implicit protection against scams are the banks)
It doesn't really concern the everyman on the street.
The few high-profile cases where it was used to punish individuals who ran afoul of some politically powerful person or caused some huge outrage are red herrings: if the system didn't exist, they'd have found some other way to punish them.
The articles functionally stated that you couldn't get an apartment, or pay for a hotel room if you were caught jaywalking or walking around with a scowl on your face.
If people get to read shocking rumors, they don't feel that their information access is so censored, after all? I could see that at least partially working.
"It's just some dangerous information that is censored."
Tight information control means that rumors are often the best source of information so people are more engaged in the rumor mill. Same thing happened in the Soviet Union.
I’ve spoken extensively about this with people from China.
If something is totally forbidden, that holds.
However, the government doesn’t want people to feel oppressed beyond the explicitly forbidden.
What happens instead is, if it’s unfavorable but not forbidden, it will be mysteriously downvoted and removed, but if it keeps bubbling up, the government says “okay clearly this is important to people” and leaves it up.
This happened with some news cases of forced marriage in some rural mountain regions, and the revelation that a popular WeChat person (like YouTuber) was involved with one of the families.
Both can be true in a country with over a billion citizens, through the sheer volume of individuals talented/determined enough to bypass information control.
China does have tight information control, but it may not be what you think it is.
All communication software (QQ/WeChat are the two most used) has a sort of backend scanner that detects topics on the current "in-season" blacklist and bans accounts accordingly. No one knows what the list is, so people can get banned for random reasons, but in general, bashing current policies or calling out the names of the standing members of the Politburo is the quickest way to get banned -- and in many instances it gets the whole WeChat group banned too.
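Mechanically, the kind of backend scanning described is not sophisticated; a toy sketch of the idea (the terms are made up, and the real system certainly does far more than substring matching):

```python
# Toy illustration of keyword-blacklist scanning as described above.
# The blacklist contents are hypothetical placeholders.
IN_SEASON_BLACKLIST = {"banned topic a", "banned topic b"}

def should_flag(message):
    """Return True if the message mentions any blacklisted term."""
    text = message.lower()
    return any(term in text for term in IN_SEASON_BLACKLIST)
```

The interesting part is not the matching but the list itself: because it changes with the news cycle, users can only infer its contents from who gets banned.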
On the other side, surprisingly, there is plenty of apparently inappropriate content floating around on social media without getting banned. This also throws people off.
What I gathered is:
- Don't shit on current party leaders. Actually, don't discuss current politics at all. The AIs don't always classify content correctly, so you could be banned for supporting one side or for opposing it, all the same.
- Don't ever try to call on other people to join some unofficial cause, whatever it is. Even if it's purely patriotic, just don't do it. Do it and you might go to prison very quickly -- at the very least someone is going to call you and tell you to STFU. Grassroots movements are the No. 1 enemy of the government, and they don't like them. You have to go through official channels for those.
This leads to the following conclusion:
Essentially, the government wants as much control as possible. You want to be patriotic? Sure, but it has to be controlled patriotism. You can attend a party gathering to show your patriotism, but creating your own, unofficial gathering is a big no. They probably won't put you in prison if the cause is legit, but the police are going to bug you from time to time.
IMO this is how the CCP succeeds. It has successfully switched from an ideological party to an "all-people" party. It doesn't really care about ideology, but it wants to assimilate everyone who could potentially be out of control. If you are a successful businessman, it will invite you to participate in political life. If you are an activist who can call up thousands of people, it wants you in. It is, essentially, a cauldron of elites. It has nothing to do with "Communism". It is, essentially, GOP + DEM in the US.
Thanks. I figured things must have progressed since my last sort-of-insider view from 12 years ago, when my company's China subsidiary had weekly meetings with officials to discuss things that needed to be addressed.
"Item number 12. We feel like this URL is hurtful to the Chinese people"
You are welcome. I probably don't know the full picture, but I think the biggest difference between now() and now() - 12 YEAR is that digital surveillance is way more advanced. Other than that, I don't think the logic has changed. The CCP has been learning from the USSR's experience and successfully converted itself away from being an ideological party many years ago. It started around the early 90s and took a couple of decades to happen.
I'd say China doesn't have particularly tight_er_ information control than other places, they're using the same tools everyone else is using (keyword/hashtag bans, algorithmic content demotion, "shadowbans" of responses, and outright content removal etc.)...
It's mainly just that there's more politically motivated manipulation... versus in the west where those tools would be used on things like copyright infringement, pornography, and misinformation etc.
The House just passed a $1.6B spending bill for the production of anti-China propaganda. This isn't necessarily a result of that, but I'd imagine some of the weird rumors you hear are manufactured by US intelligence/the State Dept.
(6) to expose misinformation and disinformation of the Chinese Communist Party’s or the Government of the People’s Republic of China’s propaganda, including through programs carried out by the Global Engagement Center; and
(7) to counter efforts by the Chinese Communist Party or the Government of the People’s Republic of China to legitimize or promote authoritarian ideology and governance models.
——-
Feels like the defense sector is determined to make China a perpetual enemy.
It’s a real drag. We need to step up competence, not fight a war. Viewing China as an enemy vs a strategic competitor leads to bad policy. Like it is killing ASML right now…
But it's relatively easy for China/CPC to squash them if they really want to. Western media is even reporting on changes in particular keyword censorship.
Call me paranoid, but this could be a good way for ByteDance to redirect blame to others when they do something in the future that people don't like: "It was a rogue employee and we fired them."
It's legit. Read the malicious changes he made to the code and read the posts from the researchers.
And sorry, people are not "gullible" for disbelieving the media. I have worked at most big tech companies and the media misreports so badly on easily verifiable things in my area of expertise, that I no longer trust them on much. https://en.m.wikipedia.org/wiki/Michael_Crichton#Gell-Mann_a...
(via https://news.ycombinator.com/item?id=41906970, but we merged that thread hither)