Hacker News | enragedcacti's comments

Predictable does not necessarily follow from deterministic. Hash algorithms, for instance, are valuable specifically because they are both deterministic and unpredictable.
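A quick way to see the distinction, as a minimal sketch using Python's standard `hashlib` (the input strings and helper names are arbitrary examples, not anything from the thread):

```python
import hashlib

def sha256_bits(text: str) -> int:
    """Return the SHA-256 digest of text as a 256-bit integer."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")

def differing_bits(a: str, b: str) -> int:
    """Count the digest bits that differ between two inputs."""
    return bin(sha256_bits(a) ^ sha256_bits(b)).count("1")

# Deterministic: the same input always yields the same digest.
assert sha256_bits("same prompt") == sha256_bits("same prompt")

# Unpredictable: a one-character change typically flips about half of the 256 bits.
print(differing_bits("same prompt", "same prompt!"))
```

Repeating the call never changes the answer, yet nothing about the first digest helps you guess the second.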

Relying on model, seed, and hardware to get "repeatable" prompts essentially reduces an LLM to a very lossy natural language decompression algorithm. What other reason would someone have for asking the same question over and over and over again with the same input? If that's a problem you need to solve then you need a database, not a deterministic LLM.


It depends. There are a whole bunch of weird complex financial interactions between the mfg, the dealer, and the loan provider (who is often also an arm of the manufacturer). There can definitely be situations where the dealer makes off better by getting you into a loan even though the loan provider is almost sure to lose money on it.


The Corolla hybrid is only $1500 more than the base model and gets 50MPG combined vs 35MPG. The break even for 15k miles/year is 2.5-4.3 years given the highest and lowest US prices as of today (California@$4.59, Texas@$2.70).
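For anyone who wants to plug in their own numbers, here is a rough sketch of that arithmetic. It only counts fuel savings and ignores financing, maintenance, and resale value; the function name is just for illustration.

```python
def break_even_years(price_premium, miles_per_year, mpg_base, mpg_hybrid, gas_price):
    """Years until fuel savings repay the hybrid's price premium."""
    # Gallons saved per year by driving the same distance at the higher MPG.
    gallons_saved = miles_per_year / mpg_base - miles_per_year / mpg_hybrid
    return price_premium / (gallons_saved * gas_price)

for state, price in [("California", 4.59), ("Texas", 2.70)]:
    years = break_even_years(1500, 15_000, 35, 50, price)
    print(f"{state}: {years:.1f} years")
# California: 2.5 years
# Texas: 4.3 years
```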


Not really. Toyota dealers are filth. The ones around here were trying to sell my niece one for $5000 over sticker as a “market adjustment”.

It’s better in other regions, but you couldn’t pay me to buy a Toyota.


You and Retric are using different definitions of "series". A "series hybrid" is a specific term describing a design that uses an ICE to generate electricity that powers an electric motor. This design replaces the transmission completely because the ICE RPM doesn't need to be matched to the wheel speed and the electric motor has a much wider RPM range.

Many series hybrids do have a way to power the wheels directly with the engine at highway speeds but it's generally much simpler than a full transmission. Most Honda hybrids for instance have a single clutch that connects the ICE to a "6th" gear.

> You keep bringing up transmissions when the main point is related to the ICE.

less parts -> more reliability


Ah, ok. I didn’t realize “series” is a specific term of art in the EV space. Thanks for clarifying. I was using it in the pure reliability domain sense (similar to the use in electrical circuits)

>less parts -> more reliability

This is the general heuristic but only true if the components in each system are equally reliable (and specific to the original claim about cost of ownership, equal in cost). I don’t think that’s true, and am asking for a nuanced breakdown.

For example, the hybrid ICE may be more reliable for good reasons (eg consistent RPM). Or the traditional battery may have half the reliability, but 1/50th the cost. All of that factors into cost of ownership.


> I don’t think that’s true, and am asking for a nuanced breakdown.

In my experience this kind of nuanced info is unfortunately pretty hard to come by. MFGs know it but have no interest in sharing it. Same for taxi operators (though the number of hybrids in taxi fleets is pretty staggering). Fleet operators usually only look at the first 5 years so longer term maintenance and repairs aren't studied all that rigorously. That said, here's a 5-year fleet TCO analysis where HEVs on average were 6k cheaper than ICE: https://www.afla.org/news/692431/The-Hybrid-Value-Propositio...

Also, here is an analysis from 2016 showing that the 2005 Prius had the lowest 10 year maintenance cost of any model. Toyota had only been making hybrids for 7 years at that point. That level of reliability for a new technology is pretty impressive: https://www.greencarreports.com/news/1104478_toyota-prius-hy...

> (and specific to the original claim about cost of ownership, equal in cost).

Speaking to this piece, it can be hard to gauge because it's not all that common for companies to sell very similar trims in hybrid and non-hybrid. The two PSD hybrid examples off the dome are the Corolla, which is +$1500 for the hybrid, and the first-gen Maverick, which was -$1100 for the hybrid (before Ford knew the hybrid would sell like hotcakes, then they cranked the price up).

Perhaps Ford just wanted to burn cash but imo PSD hybrids are likely very competitive in terms of per unit cost, which would hopefully translate into lower repair costs. Toyota has also just switched to hybrid only for the Rav4, which is one of the best selling models on the planet. That would be a pretty bold move if they weren't very confident about the reliability and TCO (basically their entire brand value) or their ability to make money selling them (cost vs consumer value prop).


The design of a powersplit hybrid (like a Prius) allows for consolidation and elimination of a number of common failure items on a traditional ICE vehicle.

- Pure ICE needs mechanical gears or a belt-style CVT. An HV power source and 2 electric motors enable the use of a dead simple planetary gear set to change the ratio between ICE and the wheels.

- ICE needs a starter and an alternator. PSD hybrids use the existing electric motors and a DC-DC converter to do those jobs.

- belt powered components (e.g. A/C, power steering) are replaced by more reliable electric versions powered by the high voltage battery

- ICE needs small displacement, high compression, turbo'd engines to meet power and efficiency targets. Hybrids can get away with wheezy but efficient and reliable low-compression engines because the electric motors make performance acceptable

- ICE cars need to run their engine anytime they are moving. Hybrids will have 20+% lower runtime and that runtime will be spent at optimal RPMs and with minimal stress as bursts in acceleration are assisted by the electric motor.
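To make the planetary gear point concrete, here is a rough sketch of the standard planetary gear speed relation, sun_teeth * sun_rpm + ring_teeth * ring_rpm = (sun_teeth + ring_teeth) * carrier_rpm. It assumes the usual power-split layout (engine on the carrier, one motor-generator on the sun, wheels driven from the ring). The 30/78 tooth counts and the function name are illustrative assumptions, not a spec for any particular model.

```python
def ring_rpm(engine_rpm, mg1_rpm, sun_teeth=30, ring_teeth=78):
    """Ring (wheel-side) speed for a given engine and motor-generator speed.

    Assumed layout: engine -> carrier, motor-generator -> sun, wheels <- ring.
    Tooth counts are illustrative defaults.
    """
    carrier_term = (sun_teeth + ring_teeth) * engine_rpm
    return (carrier_term - sun_teeth * mg1_rpm) / ring_teeth

# Same engine speed, three different output speeds, and no gears shifted:
for mg1 in (0, 2000, 7200):
    print(mg1, round(ring_rpm(2000, mg1), 1))
```

Varying the motor-generator speed changes the effective ratio continuously, which is why no clutch packs, shift gears, or belt CVT are needed.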


It's actually YouTube that adds those as a feature for Premium subscribers. It infers the locations automatically using viewing data.


What is interesting about reducing the problem to counting? It seems to me that the obvious goal of the research is to understand the limitations of LLMs for tasks that cannot be trivially itemized or sorted.


The more specific the instructions are, the better they perform. There is a huge difference between trying to find omitted text, omitted words, or omitted sentences.

If omitted words are to be found, put each word on its own line and number it. The same with sentences.

If you are trying to find omitted words and sentences, make one pass with only words, and another one with only sentences. Then combine the results.
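A minimal sketch of that preprocessing, assuming plain whitespace-separated words and a naive split on sentence-ending punctuation (`number_units` is a made-up helper name):

```python
import re

def number_units(text: str, unit: str = "word") -> str:
    """Put each word or sentence on its own numbered line."""
    if unit == "word":
        parts = text.split()
    elif unit == "sentence":
        # Naive splitter: break after ., ! or ? followed by whitespace.
        parts = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    else:
        raise ValueError("unit must be 'word' or 'sentence'")
    return "\n".join(f"{i}. {part}" for i, part in enumerate(parts, start=1))

print(number_units("The cat sat. The dog ran.", unit="sentence"))
# 1. The cat sat.
# 2. The dog ran.
```

Run it once with `unit="word"` and once with `unit="sentence"` to get the two separate passes.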


To what end? You have to segment and order the document (i.e. solve the problem) just to craft your prompt so the LLM spitting the solution back to you is useless. The experiment uses these tasks because test cases can be algorithmically generated and scored, but it's not very interesting that one can structure the input to solve this specific, useless task with LLMs. It is interesting, though, that this limitation could carry over into tasks where traditional algorithms fail. LLMs improving at this would be legitimately useful which is why a benchmark makes sense, but cheating the benchmarks by augmenting the input doesn't.
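To illustrate what "algorithmically generated and scored" can look like, here is a rough sketch, not the benchmark's actual harness. It assumes the document is a list of unique lines, and both function names are hypothetical.

```python
import random

def make_omission_case(lines, n_omit, seed=0):
    """Remove n_omit random lines; return (shortened document, omitted lines)."""
    rng = random.Random(seed)
    omitted_idx = set(rng.sample(range(len(lines)), n_omit))
    kept = [line for i, line in enumerate(lines) if i not in omitted_idx]
    omitted = [lines[i] for i in sorted(omitted_idx)]
    return kept, omitted

def score(predicted, omitted):
    """Return (precision, recall) of predicted omissions against the truth."""
    pred, truth = set(predicted), set(omitted)
    hits = len(pred & truth)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall

lines = [f"line {i}" for i in range(10)]
kept, omitted = make_omission_case(lines, n_omit=3)
print(score(omitted, omitted))  # a perfect answer scores (1.0, 1.0)
```

Generating and grading is trivial here precisely because the harness already knows the segmentation, which is the information the model is supposed to work out on its own.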


> You have to segment and order the document (i.e. solve the problem)

Well, let's say that if this benchmark targets AGI, then no help should be given, no segmentation or structuring of information in any way, and it should be able to figure it out by itself.

If this benchmark targets LLMs trained on internet data, statistical engines that is, not AGI, these engines have a preference for structuring of information in order to solve a problem.

Segmenting the problem into smaller parts, usually with numbers, though dashes are acceptable as well, is what they have seen countless times in textbook examples. When the input doesn't match prior input they have seen, their performance easily degrades from superhuman to utter confusion. Superhuman for small problems, anyway.

This problem of omitted information is interesting to me. Many times I want to interpolate some paragraphs into stories I write, to fill in some plot holes. I used the word "interpolate" with unstructured text, and the results were underwhelming, pretty bad most of the time. From now on, I will number each paragraph and ask it to find omitted text in there.


There was no "They". The state legislature passed it, NY/NYC/MTA designed it, and the Biden admin approved it to go into effect before the election. Kathy Hochul delayed it until after the election on extremely spurious grounds, despite the law being on the books and NYCers supporting it.


Some important counter-points:

- FSD has been failing this test publicly for almost three years, including in a Super Bowl commercial. It strains credulity to imagine that they have a robust solution that they haven't bothered to use to shut up their loudest critic.

- The Robotaxi version of FSD is reportedly optimized for a small area of Austin, and is going to extensively use tele-operators as safety drivers. There is no evidence that Robotaxi FSD isn't "supposed" to be used with human supervision; its supervision will just be subject to latency and limited spatial awareness.

- The Dawn Project's position is that FSD should be completely banned because Tesla is negligent with regard to safety. Having a test coincide with the Robotaxi launch is good for publicity but the distinction isn't really relevant because the fundamental flaw is with the company's approach to safety regardless of FSD version.

- Tesla doesn't have an inalienable right to test 2-ton autonomous machines on public roads. If they wanted to demonstrate the safety of the robotaxi version they could publish the reams of tests they've surely conducted and begin reporting industry standard metrics like miles per critical disengagement.


It absolutely does take away from other models. Most automotive brands have a 50-60% loyalty rate and all sorts of general brand appeal/features that span models. Someone who wants an HR-V but can't get one is massively more likely to buy a Civic or CR-V than the average person. If the Model 3 didn't exist, most of those buyers would get a Y because they want the software, the supercharging, the range, the brand name, etc.

Having a large lineup is good for customer satisfaction and for attracting customers on the fence, but it definitely hurts you on this one very specific, mostly meaningless metric.


I generally agree with you.

I do think regardless that even if you do have a smaller line up of cars, it is still an impressive metric that it is the most sold car in the world. That does mean that whilst Tesla has a smaller line up, that they hit the mark with meeting what people are looking for.

It is still a very in demand car for a reason and quite honestly I almost don't believe the metric.

Whilst I agree having a larger lineup of cars would dilute your model sales, it is still impressive. After all, people wouldn't buy that model or a Tesla if they didn't like their cars.

I agree that if they did this by brand, Tesla would be much further down the list.

