amrrs · 2025-01-26T20:28:56 1737923336

This has been the problem with a lot of long context use cases. It's not just the model's support but also sufficient compute and inference time. This is exactly why I was excited for Mamba and now possibly Lightning attention.

That said, the new DCA, on which these models base their long-context support, could be an interesting area to watch.

amrrs · 2025-01-23T21:54:37 1737669277

He is already running a hacker house in Bangalore with cohorts of researchers and practitioners getting together - https://turingsdream.co/

so probably it's picking one of those ideas or investing in it!

amrrs · 2025-01-23T18:34:01 1737657241

For those who don't know, he is the gg of `gguf`. Thank you for all your contributions! Literally the core of Ollama, LM Studio, Jan and multiple other apps!

kennethologist · 2025-01-24T02:45:47 1737686747

A. Legend. Thanks for having DeepSeek available so quickly in LM Studio.

sergiotapia · 2025-01-23T19:18:29 1737659909

well hot damn! killing it!

halyconWays · 2025-01-23T19:47:56 1737661676

[flagged]

kamranjon · 2025-01-23T22:28:50 1737671330

They collaborate! Her name is Justine Tunney - she took her “execute everywhere” work with Cosmopolitan to make Llamafile using the llama.cpp work that Georgi has done.

madeforhnyo · 2025-01-23T22:22:12 1737670932

Someone did? Could you pls share a link?

amrrs · 2025-01-15T10:15:44 1736936144

Kokoro explicitly mentions that they used only permissively licensed voice data.

amrrs · 2025-01-13T19:12:15 1736795535

https://www.reddit.com/answers

amrrs · 2025-01-08T22:11:56 1736374316

For Context:

SWE-bench (+ Verified) is the benchmark (of resolving GitHub issues) that companies working on coding are chasing - Devin, Claude, OpenAI - all of these!

A new leader #1 - CodeStory Midwit Agent + swe-search - has been crowned with a score of 62% on SWE-bench Verified (without even using any reasoning models like OpenAI o1 or o3)

More details on their approach - https://aide.dev/blog/sota-bitter-lesson

alach11 · 2025-01-08T22:15:31 1736374531

This is a very impressive result. OpenAI was able to achieve 72% with o3, but that's at a very high compute cost at inference-time.

I'd be interested for Aide to release more metrics on token counts, total expenditure, etc. to better understand exactly how much test-time compute is involved here. They allude to it being a lot, but it would be nice to compare with OpenAI's o3.

skp1995 · 2025-01-08T22:26:39 1736375199

Hey! One of the creators of Aide here.

ngl the total expenditure was around $10k. In terms of test-time compute, we ran up to 20X agents on the same problem, first to understand whether the bitter lesson paradigm of "scale is the answer" really holds true.

The final submission ran 5X agents, and the decider was based on the mean score of the rewards. The cost was around $20 per problem.
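
A minimal sketch of what a "run several agents, pick by mean reward" decider could look like. This is an illustration only, not Aide's actual code: the function name `pick_best_candidate`, the agent ids, and the reward values are all hypothetical, and it assumes each agent's attempt has already been scored several times by some reward model.

```python
from statistics import mean


def pick_best_candidate(rewards_by_agent):
    """Return the id of the agent whose reward scores have the highest mean.

    rewards_by_agent: dict mapping a (hypothetical) agent id to a non-empty
    list of numeric reward scores for that agent's attempt at one problem.
    """
    if not rewards_by_agent:
        raise ValueError("need at least one candidate")
    # Compare candidates by the arithmetic mean of their reward scores.
    return max(rewards_by_agent, key=lambda agent: mean(rewards_by_agent[agent]))


# Made-up reward scores for 5 agents working on the same problem.
rewards = {
    "agent_1": [0.2, 0.4, 0.3],
    "agent_2": [0.9, 0.7, 0.8],
    "agent_3": [0.5, 0.5, 0.5],
    "agent_4": [0.1, 0.9, 0.2],
    "agent_5": [0.6, 0.6, 0.7],
}

best = pick_best_candidate(rewards)
print(best)
```

With these made-up scores the decider picks `agent_2`, whose patch would then be the one submitted for that problem.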

We are going to push this scaling paradigm a bit more. My honest gut feeling is that SWE-bench as a benchmark is ripe for saturation real soon:

1. These problem statements are in the training data for the LLMs

2. Brute-forcing the answer the way we are doing works and we just proved it, so someone is going to take a better stab at it real soon

amrrs · 2025-01-08T22:17:33 1736374653

tbh there have been some issues with their previous reporting

https://x.com/Alex_Cuadron/status/1876017241042587964

dang · 2025-01-08T22:26:05 1736375165

Thanks! It feels like we should switch the top link to that URL since it's a deeper dive into the new bit that's interesting here.

Edit: I've done that now. Submitted URL was https://www.swebench.com/ and submitted title was "SWE Bench just got updated – new #1s".

amrrs · 2025-01-03T22:08:31 1735942111

fwiw - Aravind Srinivas (Perplexity cofounder) spoke about Ads - https://www.youtube.com/watch?v=FWPmu_rKxJo

sabareesh · 2025-01-03T22:14:19 1735942459

He wanted to expose ads to AI, which felt weird for some reason.

amrrs · 2024-12-29T23:03:36 1735513416

How is it different from LM Studio or GPT4All?

amrrs · 2024-12-19T17:58:08 1734631088

I remember back in the day there was an Ernie model

axpy906 · 2024-12-19T18:30:21 1734633021

Don’t forget ELMo. The bi-LSTM.

amrrs · 2024-12-12T15:35:54 1734017754

It was quite sad to see Ding lose at the end. But it's been a very tough year and a half or so for him, precisely since he won the championship.

I was quite sad at the way some very top players spoke of him.

But the way he came back and almost took the match to tiebreaks was unbelievable as a Ding fan.

At the end of the day, it's a generational shift that chess is witnessing.

It feels almost written in destiny: it all started at the Candidates with how Alireza played against Gukesh, and look where it is now!