Hacker News | sheepscreek's comments

> The most promising idea is to use reasoning models to generate data, and then train our non-reasoning models with the reasoning-embedded data.

DeepSeek did precisely this with their Llama fine-tunes. You can try the 70B one here (might have to sign up): https://groq.com/groqcloud-makes-deepseek-r1-distill-llama-7...


Yes, but I meant it slightly differently than the distills.

The idea is to create the next-gen SOTA non-reasoning model with synthetic reasoning training data.


Update: Scratch that. Two of them together would only be able to run something half as big (~400B parameters) and cost as much as this rig. Maybe the next gen of DIGITS could do it. Keeping that in mind, this rig is pretty darn impressive for $6k!

This is going to drop by half when Nvidia starts shipping DIGITS. I think we’re all going to want one. It’ll probably have a much bigger impact than the Apple Vision Pro, which costs the same.

I can already think of using it as a much more intelligent local Siri/Alexa to control devices. It’s something that can actually keep the kids engaged with useful trivia/knowledge (better than watching mindless trash on YT) or it can just humour me whenever I want - all without needing to worry about privacy.


There’s the whole capital expenditure vs operating expenses angle too, and depending on a company’s particular situation, one might look better on paper than the other. Without going into too much detail, contractors will be hired typically to contribute to capital expenditure and employees to the latter.

This distinction is even more relevant for earnings. So companies will optimize this for taxation and accounting to win shareholder brownie points.


I am wondering whether a company "optimizing for shareholder brownie points" is a good signal either to look for employment elsewhere or, as an investor, to start investing elsewhere. It seems like a company that prioritizes this either has reached its potential (which might be fine) or is just not able to innovate anymore.

A simple question to ask an employer during an interview is whether the company is profitable or not. If so, for how long?

> A simple question to ask an employer during an interview is whether the company is profitable or not. If so, for how long?

This is great advice.

For instance, I was once in an interview where they were grilling me. I was reluctant to do the interview in the first place, because they'd gone bankrupt TWICE in the past five years.

At the end of the interview, it seemed fairly clear that my odds of getting the job were about 50/50. The interviewers were smart and they were asking hard questions.

But when I asked them to comment on their two recent bankruptcies, it changed the mood entirely. At that point, the entire "vibe" of the interview shifted. It became CLEAR that they'd been losing employees at a furious pace, because of their financial struggles.

Once we talked about "the elephant in the room," the entire interview tone changed, and they made me an offer in less than twelve hours.

My "hunch" is that they'd been grilling interviewees (because they were smart folks) but had been scaring interviewees off because they were in such terrible financial shape.

Basically, potential hires were ghosting them because of their financial problems, while they were simultaneously discussing technical issues when the real issue was financial.

I accepted the offer, and the company is still around. I had a similar interview experience at FTD in San Diego (the florist), and they are kaput:

https://www.cnbc.com/2019/06/03/flower-delivery-company-ftd-...


I'm in a VC-owned business with a 50% EBITDA margin. But a common trick is to just load it with debt. The VC firm pays out all profits as dividends, and all investment in restructuring, M&A and new technology is paid for by high-interest loans from the shareholder. What's left is a company that barely generates cash flow, as all profit goes towards paying interest to the VC firm.

The appointed management team has to operate within that scope (i.e. no real budget to work with, despite the 50% margin), and they squeeze a bit more each year, meaning it's an uphill battle each year to get a raise or promotion. On top of that it's a cash cow in an otherwise dying and slowly shrinking business sector.

In other words a terrible place for general salary growth.

So I'd add two points to your list. Look for (1) profitable companies, (2) in expanding markets, (3) that aren't owned by VC.

Startups have their own set of rules where (3) doesn't really apply as much.


Most VC-backed private companies aren’t profitable. If it is a public company, the information is readily available.

Sure, and then there's all the private companies backed by non-venture capital, and the profitable ones running on revenue.

You don’t find too many profitable “lifestyle companies” in tech.

There are plenty of mid-size tech companies that are both not-public and not-lifestyle.

My employer is one of them. Several thousand employees, global reach, and owned by PE (Blackstone and Vista).


You stated that there aren't many profitable lifestyle companies. And the insinuation put forth is that they are very rare to the point of almost nonexistent.

This comes off as rather reductionist and absolute to me; tech is a massive industry, do you know every sector within and adjacent to tech to have reached this conclusion?


No. But I do know statistics. The largest employers in tech are the public companies that we have all heard of. The next largest segment is VC-funded companies, with the smallest segment by far being the “lifestyle companies”.

Do an exercise, go to any job board and put in filters to match the types of jobs you are qualified for. How many of those do you think are going to be profitable, private, lifestyle companies?


I would put money on all of big tech and all public companies combined not employing more than 30% of professional programmers. At least in the US only 15% work at a large company (500+).

I didn't say anything about '"lifestyle companies" [sic]'. I don't even think we were talking just about 'tech'?

Those are the companies he meant by "public companies", ie publicly traded not government owned.

I don't think by 'public companies' they meant 'private companies', no.

There's still a question of what you consider profitable.

A company may make more in revenue than strictly expenses but stock-based compensation is often not considered an expense so if you add those into the expense side it could change profitability.


Stock-based compensation is absolutely considered an expense under US GAAP.

Which is why companies report non-GAAP numbers.

https://abc.xyz/assets/71/a5/78197a7540c987f13d247728a371/20...

> We provide non-GAAP free cash flow because it is a liquidity measure that provides useful information to management and investors about the amount of cash generated by the business that can be used for strategic opportunities, including investing in our business and acquisitions, and to strengthen our balance sheet.


But honestly, profitability doesn’t matter. All of the major tech companies were profitable and still had tens of thousands of layoffs between them.

Layoffs in big tech are mostly to put workers in their place and shake the market; they've definitely been able to drive down salaries these past two years.

Yes - I think layoffs are also backlash against WFH.

Employees were getting a bit too uppity.


That would be a red flag to me.

Companies that make a shit ton of money generally don't like changes.

They're just looking for the next fool to squeeze.


Large US companies that I’ve worked with or for do this as a SOP. It’s not a calculation being done at the hiring manager level as much as a path of least resistance because that’s the way it’s been done for so long.

> contractors will be hired typically to contribute to capital expenditure

You know, operational expenses are the ones that get an immediate tax break, and capital expenditure the ones with a depreciation period.

Changing the expenses that way can only increase the company's tax payments. The only reason one could possibly want to make that change is if they want to fraudulently show the money paid for the contractors as earnings.


This is exactly what has changed [1]: R&D costs had been an immediate tax break, but since 2022 became an expenditure requiring a 5-year amortization period.

That change had been planned to be canceled before coming into force, but it was not canceled on time.

Hence the wave of layoffs in 2022, as companies were urgently trying to improve their balance sheets, as investors and Wall Street requested, AFAICT.

[1]: https://www.corumgroup.com/insights/major-tax-changes-us-sof...


> to fraudulently show the money paid for the contractors as earnings

Bingo. That's the main reason to shift opex to capex.


If you have time, how can capex, an expense, appear as earnings? (I'm pretty clueless about these things)

Aha, it's that: "Opex is subtracted from earnings before public reporting and before taxes" (I see in other comments here),

but capex is not subtracted, so then it looks as if the company is doing better, on paper, although it's not. And this works only for a while, maybe some years? Which might be long enough for the current management, if they leave before things get too bad?


"Capex expenses" are investments. And the investment is one of the two things a company may do with profits, the other is paying dividends.

Just by classifying something as capex, it's automatically classified as profit already.


Thanks!

Most of the people in charge of making these kinds of decisions are not that smart.

> Without going into too much detail, contractors will be hired typically to contribute to capital expenditure and employees to the latter.

That doesn't make any sense. In any situation in which a contractor expense would be capitalized, an employee's salary would also be capitalized. Labor costs are labor costs; whether someone is a contractor or an employee is a labor law issue, not a tax issue. (Internal R&D was the big exception to the capitalization rule, but that loophole was closed, which is what prompted a lot of tech and videogame layoffs over the past 2 years.)


Can you explain more how paying double for a contractor for tax reasons saves the company money? Or is this all some nonsense setup by the company to shuffle the numbers to look superficially better for a specific metric?

To my understanding, it's the latter.

"We spent 1B in one-off costs for increased future growth" is a much happier story to investors than "we have recurring costs of 1B", put simply, even if the actual recurring cost number is worse.

(There's also some complexities in some industries around money from, say, grants, which you can only spend on certain types of expenditures...)


It’s all about accounting for the spend. Wall Street often looks at Capital Expenditures as a sign of growth or at least net neutral, but they view Operating Expenses as negative. If you can reduce your operating expenses by 200k, but increase your capital expenditure by 400k, you’ve reduced overall profit in order to increase growth potential because you’re investing 400k into new stuff that will bring in more revenue.

This strategy cannot work long term unless there is growth happening elsewhere in the company to make up for the excess money burned on contractors and reduced number of employees. But it can definitely work short term if the growth numbers for the quarter are going to look bad, and it has the benefit of giving management someone else to blame when the project work doesn’t get done.

If your company starts replacing employees with contractors, that’s a bad sign.


That might be it, this company was obsessed with CAPEX vs. OPEX. Everything was always put into the context of CAPEX or OPEX. OPEX being bad and CAPEX good.

Wait, when did that change? I thought the prevailing wisdom in our industry is that CAPEX sucks, OPEX rules. I understood that's what's driving SaaSification of everything - replacing some internal tool and labor with a SaaS is literally turning CAPEX into OPEX, and it was supposedly what the investors liked.

The only real difference is tax treatment. Opex is subtracted from earnings before public reporting and before taxes. So opex are more tax-efficient, but they lower your reported earnings.
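A rough sketch of the reported-earnings side of this, with made-up numbers. It assumes straight-line depreciation over five years and ignores taxes entirely; the function name and figures are hypothetical, and real accounting rules are more involved.

```python
def first_year_earnings(revenue, spend, capitalized, useful_life_years=5):
    """Reported first-year pre-tax earnings (illustrative only).

    Opex hits earnings in full in the year it is spent. Capex is
    spread over its useful life (straight-line is assumed here).
    """
    if capitalized:
        expense_this_year = spend / useful_life_years
    else:
        expense_this_year = spend
    return revenue - expense_this_year


revenue = 10_000_000  # hypothetical annual revenue
spend = 1_000_000     # hypothetical contractor bill

as_opex = first_year_earnings(revenue, spend, capitalized=False)
as_capex = first_year_earnings(revenue, spend, capitalized=True)

print(f"expensed:    {as_opex:,.0f}")
print(f"capitalized: {as_capex:,.0f}")
```

The same cash leaves the company either way. Only the timing of when it shows up in reported earnings differs, and the remainder arrives as depreciation in later years.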

>Can you explain more how paying double for a contractor for tax reasons saves the company money?

This may vary by region. For example, in the U.S., where you can fire people quickly, the contractor benefit is less apparent, but in the EU, where after a short period it may take a long time to fire someone, it may be beneficial to hire a contractor rather than going through a lengthy hiring process only to find out you want to fire them.

Contractors in such an environment are often a reasonable investment for a project that has a particular dedicated timeline. Say we expect a project to take 1 year. We hire for 1 year, with the option to extend by 3 months twice in case it goes badly.

Otherwise you have to hire for the project and then do these layoffs everybody here is complaining about.

Furthermore, in the EU, if you are paying 10000 for an employee, you probably have extra fees on top of that, so you are really paying 14000 (estimation). For a contractor you are then not paying 28000, but 20000. The pricing is not great, but there are lots of factors that can make it more attractive than it might appear on its face.

Finally, contractors tend not to do any of this quiet quitting or whatever, probably because for them it is more of a business. They are also earning significantly more, which makes it an interesting business to be in and to maintain.
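To put the figures from this comment side by side (10000 salary, roughly 14000 fully loaded, 20000 for a contractor, all of them the estimates given above rather than real data), a quick sketch:

```python
salary = 10_000           # employee gross pay, from the comment
employer_cost = 14_000    # salary plus employer-side fees (the comment's estimate)
contractor_cost = 20_000  # contractor invoice (the comment's estimate)

# Comparing the contractor against the salary alone overstates the premium.
naive_ratio = contractor_cost / salary
# Comparing against what the employee actually costs the employer is fairer.
loaded_ratio = contractor_cost / employer_cost

print(f"vs. salary:        {naive_ratio:.2f}x")
print(f"vs. employer cost: {loaded_ratio:.2f}x")
```

On these numbers the "paying double" premium shrinks to roughly 1.4x once employer-side costs are counted.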


In my experience long time contractors will absolutely "quiet quit" if put into the same catch-22 situations that push employees to do this.

The main difference at least in my region is that if you're a contractor then it's much quicker for you to quit and find a better job so the incentive to stay isn't as strong. In other words, tech workers who become contractors here usually are better contributors and have an easier time finding good offers.


You can give workers temporary contracts and extend them as you see fit. None of what you are saying makes any sense to me.

Also, I will repeat this as many times as possible: you can fire employees in Germany exactly the same way you can fire employees in the US. You just need to follow the damn law. You need to give your employee a WRITTEN letter of termination, to make the termination legally binding. Then all you have to do is give them notice (or pay the salary out immediately if you want to get rid of them immediately).

Paying double so you can fire contractors is illogical. The maximum amount of notice you can be legally entitled to is 7 months, after working 20 damn years at a single company, which means at worst the company would have to pay half your salary out a single time to get rid of you immediately. None of this 2x every year multi-year bullshit.

The reason why you hire contractors is that you do not need the full output of an employee. You might only need three months or maybe just a week. It's the same reason companies rent equipment instead of buying.


> you can fire employees in Germany exactly the same way you can fire employees in the US. You just need to follow the damn law.

That is overly simplified. First, you have to commit to one of three types of layoffs, only one of which usually is applicable (betriebsbedingte Kündigung). But if you do that you have to consider the social circumstances of the employee and also other comparable employees. Which absolutely can result in not being able to fire the employee you would like to fire without also firing a number of other employees first. That could be really disruptive, so it is not quite so easy for German employers.


> The maximum amount of notice you can be legally entitled to is 7 months

I believe the maximum amount of notice you can be legally entitled to as a contractor is whatever your contract says


- an employee is an "expense" that bogs down your money-machine.

- a contractor provides a "service" that improves your money-machine output.

(or so it's said).


Then simply fire all the employees and hire contractors!

Or as wind-up to a merger /acquisition.

Readability? It is crucial for indicating which code belongs to specific blocks to help the compiler/interpreter and humans alike. There are primarily two methods for structuring code to achieve this: rigid indentation rules, as seen in Python, or the use of brackets. Elixir and Ruby use end blocks, which essentially serve as a substitute for curly braces.

Similarly, when considering JSON versus YAML or TOML for human readability, the same principles apply. Personally, I find indented code more readable, but this can be easily attained in most languages with universal code formatting rules and intelligent code editors that highlight different elements to improve readability.


It’s crazy that we have totally opposite views here.

I find code blocks that are delineated by brackets so much more readable than ones delineated by invisible characters. Especially at a glance.

I don’t love do..end either, but still prefer them to arguing with Python about the level of indenting not being correct or that spaces were used somewhere somehow.

> this can be easily attained in most languages with universal code formatting rules and intelligent code editors that highlight different elements to improve readability.

Ofc, none of this is a practical issue, but that’s what makes it a fun topic of discussion ;)


My biggest issue with tabs over curly braces is that it makes copy / pasting code insanely difficult. That alone makes tabs a bad choice.

Why not both? Approximately no one uses braces without indentation, and that's trivially enforceable with a code formatter. But I personally find that the redundant signal of the braces makes things much more readable than in the languages that lack that redundancy!

Every human communication mechanism includes deliberate redundancy to ensure that the meaning gets through even in suboptimal situations. Programming languages are somewhat unique in having a strong movement towards eliminating all redundancy.


How would you compare it to Ubuntu or Fedora on WSL?


It's closer to Red Hat Enterprise Linux - polished, stable and boring. I can't comment on UI, since I use it only via WSL.


I get much better output from o1* models when I dump a lot of context + leave a detailed but tightly scoped prompt with minimal ambiguity. Sometimes I even add - don’t assume, ask me if you are unsure. What I get back is usually very very high quality. To the point that I feel my 95th percentile coding skills have diminishing returns. I find that I am more productive researching and thinking about the what and leaving the how (implementation details) to the model - nudging it along.

One last thing, anecdotally - I find that it’s often better to start a new chat after implementing a chunky bit/functionality.


Yes, I've tried out both: ordering it to ask me questions upfront, and sometimes restarting with an edited 'report' and a prototype implementation for a 'clean start'. It feels like it sometimes helps... but I have no numbers or rigorous evidence on that.

Spot on. The whole world is in a constant state of SNAFU. That is the sole truth I’ve understood from many years of pondering and soul searching. To make sense of this world (and the universe) beyond our limited means of comprehension is a futile attempt. Everything is relative - compassion, love, hurt, morality. One being’s loss is another’s blessing (food).

I’m also not so hopeful about humans. A simple example: the way we mass produce and slaughter animals for food is beyond cruel. It’s impossible to not be on the wrong side of some argument if you look hard enough. No matter how ethical you are. I am honestly convinced the world just is. We give meaning to things to live as a civilization (and avoid conflict) but the universe/nature/world probably doesn’t give a shit.


In case I wasn’t clear, I asked ChatGPT to explain what I wrote above. I think it did a pretty good job:

This reflection expresses a deeply existential and relativistic view of the world. Here’s a breakdown:

1. SNAFU as the Status Quo: The term SNAFU (Situation Normal, All Fouled Up) suggests that chaos, dysfunction, and disorder are inherent to the world. It implies acceptance that imperfection and unpredictability are the norm, not the exception.

2. Limits of Comprehension: The belief that trying to make sense of the world or universe is futile arises from recognizing the limitations of human cognition and perspective. This acknowledges that meaning, morality, and truth are often constructs rather than absolutes.

3. Relativity of Values: Concepts like compassion, love, morality, and even suffering are seen as subjective. They are shaped by context and perspective—what benefits one being might harm another, exemplified by the food chain.

4. Critique of Human Ethics: The example of industrial animal farming underscores a broader disillusionment with humanity’s moral contradictions. Even well-intentioned actions can have unintended, unethical consequences, leading to the view that absolute moral “rightness” is unattainable.

5. Indifference of the Universe: The idea that the universe “doesn’t give a shit” challenges the anthropocentric tendency to project meaning onto existence. This highlights the randomness and indifference of nature, leaving humans to assign purpose to their actions as a way of coping.

6. Civilization as a Construct: Meaning and morality are framed as tools for societal cohesion and conflict avoidance, rather than intrinsic truths. This echoes a nihilistic perspective where life “just is,” devoid of inherent purpose or universal justice.

It reflects a worldview shaped by realism, skepticism, and existential questioning, wrestling with the contradictions of human morality and the indifferent nature of existence.


I actually love doing this, and then having ChatGPT alternatively criticize and support it preferentially leaning towards established fields. I.e. "This thing you've said is a deeply researched and discussed topic named X". Thanks for sharing.


It’s puzzling. Just the other day, it was reported that their power surplus was so significant that rates were even negative. So, why are end-users still paying high prices? Is it because of the distribution costs?


It is because of merit order. The most expensive generator of energy sets the price for all the energy.

https://en.wikipedia.org/wiki/Merit_order
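A minimal sketch of merit-order pricing as described here: plants are dispatched from cheapest to most expensive until demand is covered, and the last plant needed sets the price paid to all of them. The plant names, costs and capacities below are invented for illustration.

```python
def clearing_price(offers, demand_mw):
    """Return (price, dispatched) for a uniform-price market.

    offers: list of (name, marginal_cost_eur_per_mwh, capacity_mw)
    """
    dispatched = []
    remaining = demand_mw
    price = 0.0
    # Walk the offers from cheapest to most expensive.
    for name, cost, capacity in sorted(offers, key=lambda o: o[1]):
        if remaining <= 0:
            break
        used = min(capacity, remaining)
        dispatched.append((name, used))
        remaining -= used
        price = cost  # the most expensive plant used so far sets the price
    if remaining > 0:
        raise ValueError("not enough capacity to meet demand")
    return price, dispatched


# Hypothetical fleet: (name, marginal cost in EUR/MWh, capacity in MW)
offers = [
    ("gas", 120, 400),
    ("wind", 0, 300),
    ("coal", 60, 400),
    ("solar", 0, 200),
]

print(clearing_price(offers, 400))   # renewables alone cover demand
print(clearing_price(offers, 1000))  # gas is needed, so gas sets the price
```

With low demand the price is set by the zero-cost renewables. Once a little gas is needed, every dispatched plant is paid the gas price.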


If rates go negative this isn't relevant.

You can get dynamic contracts based on the day ahead price in Germany. You pay a negative price for the electricity there. You still have to pay a transmission fee and taxes, so the actual end price is almost never negative though.

Of course sometimes you also pay much more than normal contracts.

Also without merit order, you'd still pay close to the most expensive source on the spot market. It's a consequence of the open grid where demand and supply has to match and price is not fixed.
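As a rough illustration of why a negative spot price rarely means a negative bill: the end price on a dynamic contract is the spot price plus per-kWh fees, with VAT on top. The 19% VAT rate is the standard German rate; the grid fee and levy values below are placeholders, not real tariff data, and the function name is hypothetical.

```python
def end_user_price(spot_ct_per_kwh, grid_fee_ct=10.0, levies_ct=5.0, vat_rate=0.19):
    """End-user price in ct/kWh on a dynamic contract (simplified).

    grid_fee_ct and levies_ct are placeholder values for illustration.
    """
    net = spot_ct_per_kwh + grid_fee_ct + levies_ct
    return net * (1 + vat_rate)


for spot in (8.0, 0.0, -5.0, -20.0):
    print(f"spot {spot:6.1f} ct/kWh -> end price {end_user_price(spot):6.2f} ct/kWh")
```

Under these assumed fees, the spot price has to fall below -15 ct/kWh before the consumer is actually paid to use electricity.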


Distribution is one of the costs. For discussion let's call it 10% of your power bill (someone can find real numbers, but that will work if like me you are too lazy to look it up). However there are other costs - you pay to maintain the backup systems that are not in use. I know of a generator installed in 1920 connected to a coal boiler built in 1880 - this horribly inefficient system (by modern standards) takes about 12 hours to turn on when needed - but gets regular maintenance including pressure testing just to be sure it works for the one time every 5 years when all the other backup systems are not enough - that is not free and you pay for the costs even though you rarely get power from it. (Well, you probably live in a different part of the world and so could not get power from it - but there is something like it that you are paying for.)


There's a big demand-supply mismatch throughout both the 24hr and the 365d cycle, while most consumer prices are flat throughout the day and the year. Solar panel owners are basically using the grid as a big battery, and miffed that electricity storage doesn't come cheap. (They'll discover the same thing if they buy home batteries to store their self-generated power.)


Actually, this is not the case in Germany, as we do not have net metering (feeding into the grid does not give you as much money as consuming from the grid costs).


The high prices people pay is the cost of having power when it’s not windy/sunny


The prices were already high before wind and solar power came to be (and for quite many years grid prices were among the lowest in the EU, while consumer prices were among the highest). The high consumer prices for electricity in Germany are a result of grid fees, taxes, taxes on grid fees, taxes on taxes, and a lack of political will to lower those.


I live in the UK, where people point at the rest of Europe and say "why are our prices so high here when we have 50+% of our power coming from wind".

> The high consumer prices for electricity in Germany are a result of grid fees

Grid fees are a catch-all for "making sure the whole thing runs", which in my mind covers "making sure there's gas (at least that's what we use in the UK) to cover the shortfall when it's not windy". The gas power stations need to be able to spin up to handle the demand, so it's an arbitrage where you really want to maximise the amount of wind you use, while paying as little overhead as possible to keep the gas ready to go. It's like being on-call, and having your on-call pay amortized into your regular pay.


It's a give and take between profits and taxes, a wonderful symbiosis of neoliberal politics and the oligopoly.


Prices fluctuate a lot based on weather and demand. Seeing order of magnitude fluctuations day to day is not uncommon. Prices can be close to zero one day and hit record highs 24 hours later.


The developers they are referring to aren’t just enthusiasts; they are also developers who were purchasing SuperMicro and Lambda PCs to develop models for their employers. Many enterprises will buy these for local development because it frees up the highly expensive enterprise-level chip for commercial use.

This is a genius move. I am more baffled by the insane form factor that can pack this much power inside a Mac Mini-esque body. For just $6000, two of these can run 400B+ models locally. That is absolutely bonkers. Imagine running ChatGPT on your desktop. You couldn’t dream about this stuff even 1 year ago. What a time to be alive!


The 1 PetaFLOP spec and 200GB model capacity specs are for FP4 (4-bit floating point), which means inference, not training/development. It'd still be a decent personal development machine, but not for that size of model.
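A back-of-the-envelope sketch of why precision matters so much here. It counts only the bytes needed to hold the weights, ignoring KV cache, activations and runtime overhead, so real requirements are higher. The 405B figure is the model size quoted elsewhere in this thread; the function name is hypothetical.

```python
def weights_size_gb(num_params, bits_per_param):
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


params = 405e9  # 405-billion-parameter model

for bits in (4, 8, 16):
    print(f"{bits:2d}-bit weights: {weights_size_gb(params, bits):7.1f} GB")
```

At 4 bits the weights alone come to about 200 GB, while 16-bit weights of the same model are four times that, which is why the headline capacity only applies to FP4 inference.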


This looks like a bigger brother of Orin AGX, which has 64GB of RAM and runs smaller LLMs. The question will be power and performance vs 5090. We know price is 1.5x


How does it run 400B models across two? I didn’t see that in the article


> Nvidia says that two Project Digits machines can be linked together to run up to 405-billion-parameter models, if a job calls for it. Project Digits can deliver a standalone experience, as alluded to earlier, or connect to a primary Windows or Mac PC.


Point to point ConnectX connection (RDMA with GPUDirect)


Not sure exactly, but they mentioned linking two together with ConnectX, which could be Ethernet or IB. No idea on the speed though.


How does it matter if it’s going to be diluted anyway? Investors will get their post-money 30% equity, or whatever it is, and rest of the shareholders will keep their proportion of the 70% (and ~50% pre-money gets diluted to ~35% post-money).
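The dilution arithmetic in this comment can be checked with a couple of lines. This is a simplified sketch that assumes a single round with no option pool changes or special share classes; the function name is hypothetical.

```python
def post_money_stake(pre_money_stake, new_investor_fraction):
    """Existing holder's stake after new investors take a post-money fraction."""
    return pre_money_stake * (1 - new_investor_fraction)


founder_before = 0.50   # ~50% pre-money, from the comment
investor_share = 0.30   # investors' post-money equity, from the comment

founder_after = post_money_stake(founder_before, investor_share)
print(f"{founder_before:.0%} pre-money -> {founder_after:.0%} post-money")
```

Every existing holder is scaled by the same factor (here 0.7), so their stakes relative to one another do not change.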

