How excellent for a quantized 27GB model (the Q6_K_L GGUF quantization type uses 8 bits per weight in the embedding and output layers, since they're sensitive to quantization)
> If I know the exact title of the video, it finds it. Everytime.
This is unfortunately not true. I have a small channel, and there have been times when searching for the exact title of one of my videos did not return it in the results at all (with or without quotes). I can't reproduce it now because the search algorithm has started liking me.
I agree that the booking experience for the DB-SNCF cooperation trains sucks from the DB end, but the underlying blame arguably lies with SNCF, which insists on compulsory reservations, something that goes against the philosophy of train travel in Germany. On the other hand, in my experience DB offers cheaper tickets for these cooperation trains most of the time.
But these trains are a special case; in other cases DB is clearly far more pleasant.
This is not primarily a problem with the cooperation trains; I have the same situation with trains within Germany. DB-Navigator only tells you whether a train is bookable at the very end, right before payment. Before that, it might show "there is high demand", but this is rather useless, especially when you have a school kid and want to book a train at the beginning or end of the school holidays, when every train is in high demand. Your only option with DB-Navigator is to play whack-a-mole: run through all the booking steps to the very last one, over and over, until you find a train you can actually book.
In the SNCF app I have this information right away; that is what makes the difference for me.
I think it's partially excusable. Most LP solvers target large-scale instances, but ones that still fit in RAM: think single-digit millions of variables and constraints, maybe a billion nonzeros at most. PDLP is not designed for this type of instance and gets trounced by the best solvers at this game [1]: more than 15x slower (shifted geometric mean) while being 100x less accurate (1e-4 tolerances where other solvers work with 1e-6).
PDLP is targeted at instances for which factorizations won't fit in memory. I think their idea for now is to give acceptable solutions for gigantic instances when other solvers crash.
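To illustrate the factorization-free point: PDLP is built on (restarted) PDHG, which only touches the constraint matrix through matrix-vector products, so nothing ever needs to be factorized or even stored beyond the matrix itself. A toy sketch of plain PDHG on a tiny equality-form LP, leaving out everything that makes the real solver practical (restarts, scaling, adaptive step sizes, proper stopping criteria):

```python
import numpy as np

# Toy LP: min c^T x  s.t.  Ax = b, x >= 0.
# The optimum is x = (1, 0) with objective value 1.
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

# PDHG step sizes must satisfy tau * sigma * ||A||^2 < 1.
op_norm = np.linalg.norm(A, 2)
tau = sigma = 0.9 / op_norm

x = np.zeros(2)
y = np.zeros(1)
for _ in range(20000):
    # Primal step: gradient step on the Lagrangian, then project onto x >= 0.
    x_new = np.maximum(0.0, x - tau * (c - A.T @ y))
    # Dual step with the usual extrapolation 2*x_new - x.
    y = y + sigma * (b - A @ (2 * x_new - x))
    x = x_new

print(x)  # close to [1, 0]
```

Note that the only operations on A are `A @ v` and `A.T @ y`, which is exactly why this family of methods scales to instances where a Cholesky or LU factorization would blow up memory.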
Indeed, those are the "big four" solver businesses in the West, and probably also the most reliably good solvers. But by the time Gurobi withdrew from the benchmarks (a few weeks ago), COpt was handily beating them in the LP benchmarks and closing in on them in the MIP benchmarks. Solver devs like to accuse each other of gaming benchmarks, but I'm not convinced anyone is outright cheating right now. Plus, all solver companies have poached quite a bit from each other since CPLEX lost all its devs, which probably levels the playing field. So overall, I think the Mittelmann benchmarks still provide a good rough estimate of where the SOTA is.
Their numerical results on GPUs, compared to Gurobi, are quite impressive [1]. In my opinion (unless I'm missing something), the key benefits of their algorithm are the ability to leverage GPUs and the fact that there's no need to store a factorization in memory. However, if the goal is to solve a small problem on a CPU, one that fits comfortably in memory, there may be no need for this approach.
I agree that their results are impressive. Just to be clear, however:
1. They compare their solver with a 1e-4 error tolerance to Gurobi with 1e-6. This may seem like a detail, but in the context of how typical LPs are formulated, this is a big difference. They have to do things this way because their solver simply isn't able to reach better accuracy (meanwhile, you can ask Gurobi for 1e-9, and it will happily comply in most cases).
2. They disable presolve, which is 100% reasonable in a scientific paper (makes things more reproducible, gives a better idea of what the solver actually does). If you look at their results to evaluate which solver you should use, though, the results will be misleading, because presolve is a huge part of what makes SOTA solvers fast.
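To make point 1 concrete, here is a back-of-the-envelope illustration (the scale is made up, and the exact residual definition varies by solver): with a *relative* feasibility tolerance eps, a constraint system whose right-hand side has magnitude ||b|| can be violated by roughly eps * (1 + ||b||) in absolute terms.

```python
# Hypothetical scale: a production LP with demands in the 100k range.
b_norm = 1e5
for eps in (1e-4, 1e-6):
    # Absolute constraint violation a relative tolerance eps can hide.
    print(f"eps={eps:.0e}: absolute slack up to ~{eps * (1 + b_norm):.1f}")
```

So 1e-4 can leave a demand constraint off by ~10 units where 1e-6 leaves it off by ~0.1, a 100x difference in how much "infeasibility" the reported solution may contain.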
Hmm... I am reading [1] right now. Looking at Table 7 and Table 11 in [1], they report comparison results with Gurobi presolve enabled and 1e-8 error. Am I missing anything?
Their performance isn't quite as good as Gurobi's barrier method, but it's still within a reasonable factor, which is impressive.
Regarding presolve: When they test their solver "with presolve", they use Gurobi's presolve as a preprocessing step, then run their solver on the output. To be clear, this is perfectly fair, but from the perspective of "can I switch over from the solver I'm currently using", this is a big caveat.
They indeed report being 5x slower than Gurobi at 1e-8 precision on Mittelmann instances, which is great. Then again, Mittelmann himself reports them as 15x off COpt, even when allowed to do 1e-4. This is perfectly explainable (COpt is great at benchmarks; there is the presolve issue above; the Mittelmann instance set is a moving target), but I would regard the latter number as more useful from a practitioner's perspective.
This is not to diminish PDLP's usefulness. If you have a huge instance, it may be your only option!
The three linked papers seem to be old, but the broader impact section mentioned cupdlp, which is more recent and has interesting numerical comparisons with commercial solvers: https://arxiv.org/abs/2311.12180, https://arxiv.org/pdf/2312.14832. It is CPU vs GPU, though, not sure how fair it is.
I'm generally strongly in favor of technical systems for making speeding impossible -- for example, I don't think it should be legal to sell cars that can go 100 km/h on city streets. But this implementation seems designed to elicit backlash. Beeping is bad, and setting the tolerance at zero is not friendly given typical driver behavior. I'd find it much more sensible to have "smooth" systems, such as a constant "buzzing" sound whose volume increases with the square of the speed excess, or translating pedal pressure into speed in such a way that convexly increasing amounts of pressure are required to go further above the limit.
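The quadratic-volume idea above can be sketched in a few lines; the scale constant (full volume at 20 km/h over) is entirely made up for illustration:

```python
def warning_volume(speed_kmh: float, limit_kmh: float,
                   full_volume_excess: float = 20.0) -> float:
    """Warning volume in [0, 1]: silent at or below the limit, growing
    with the square of the excess, saturating at `full_volume_excess`
    km/h over. All constants are illustrative, not from any real system."""
    excess = max(0.0, speed_kmh - limit_kmh)
    return min(1.0, (excess / full_volume_excess) ** 2)

print(warning_volume(50, 50))  # 0.0: silent at the limit
print(warning_volume(55, 50))  # 0.0625: barely audible at 5 over
print(warning_volume(60, 50))  # 0.25
print(warning_volume(75, 50))  # 1.0 (capped)
```

The point of the quadratic shape is exactly what the comment argues: small excesses produce feedback too gentle to be annoying, while large ones quickly become impossible to ignore.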
As far as I understand, because employment is at-will and firing is trivial, most employees in the U.S. do not even have an employment contract! This was mind-blowing to me when I took my first U.S. job, where there was no contract (!!), only a 500-word "offer letter". I guess the reasoning is that if there were ever any conflict between employee and employer, it would be settled by ending the employment relationship. So there is no point in the employer promising anything (e.g. a number of vacation days), since the employer can costlessly renege on any such promise.