Hacker News | blackcat201's comments

Shameless plug: for anyone interested in "self-improving" agents, check out StreamBench[1], where we benchmark and test what's essential for improvement in online settings. Basically, we find the feedback signal is vital: the stronger the signal, the more improvement you can get, provided you can feed it back to the agent as weights (LoRA) or as in-context examples.

[1] https://arxiv.org/abs/2406.08747
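A toy sketch of the kind of online loop this is about (names here are illustrative, not StreamBench's actual API): the agent answers a stream of tasks, and pairs that receive positive feedback are kept as in-context examples for later tasks.

```python
from collections import deque

def run_stream(tasks, answer_fn, check_fn, k=4):
    """Toy online loop: keep the k most recent *verified-correct*
    (question, answer) pairs and hand them to the agent as in-context
    examples for subsequent tasks."""
    memory = deque(maxlen=k)   # pairs that got positive feedback
    correct = 0
    for question, gold in tasks:
        answer = answer_fn(question, list(memory))
        if check_fn(answer, gold):   # the feedback signal
            memory.append((question, answer))
            correct += 1
    return correct / len(tasks)
```

The key design point is that only verified pairs enter the memory: a weak or noisy feedback signal pollutes the examples, which is why the strength of the signal matters so much.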


Do beware: on some reasoning tasks, our recent work[0] found that structured output can cause performance degradation, and that forcing answers into JSON may weaken reasoning. I really hope they fixed this in the latest GPT-4o version.

[0] https://arxiv.org/abs/2408.02442
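One common workaround (my own sketch here, not something prescribed by the paper) is to split reasoning and formatting into two calls, so the format restriction never constrains the chain of thought; `llm` is a hypothetical `prompt -> text` callable standing in for whatever client you use:

```python
import json

def solve_then_format(question, llm):
    """Two-call workaround: let the model reason in free text first,
    then ask a second call only to format the finished answer as JSON."""
    reasoning = llm("Think step by step and answer:\n" + question)
    formatted = llm(
        'Convert the final answer below into JSON of the form {"answer": ...}. '
        "Output JSON only.\n" + reasoning
    )
    return json.loads(formatted)
```

The second call does no reasoning, so constraining its output format is harmless; the cost is an extra API call per query.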


Thank you! This confirms my intuition!

Structured generation seems counter to every other signal we have that chain-of-thought etc. improves performance.


The standard procedure is to stop and check whether any machine is out of calibration. So yes.


I run my own LLM not because I need it now, but to have the luxury of a fallback if OpenAI runs out of money.


I followed the vector database trend back in 2020 and ended up with this conclusion: vector search is a nice-to-have feature that adds more value on top of an existing database (Postgres) or text search service (Elasticsearch) than an entirely new framework full of hidden bugs does. You can get a far bigger speedup from choosing the right embedding model and encoding scheme than from the vector database with the best underlying optimizations. The bonus is that you're running a battle-tested stack (Postgres, Elasticsearch) instead of the new kids (Pinecone, Milvus ...).
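For scale context: brute-force search with plain NumPy (a sketch, assuming the embeddings fit in memory) is often fast enough for corpora up to a few hundred thousand rows, which is a useful baseline before reaching for any dedicated vector database.

```python
import numpy as np

def top_k(query, corpus_emb, k=5):
    """Brute-force cosine-similarity search over a (n, d) embedding
    matrix; returns the indices of the k best matches, best first."""
    q = query / np.linalg.norm(query)
    c = corpus_emb / np.linalg.norm(corpus_emb, axis=1, keepdims=True)
    scores = c @ q
    idx = np.argpartition(-scores, k)[:k]      # unordered top-k, O(n)
    return idx[np.argsort(-scores[idx])]       # sort just those k
```

If this baseline is already fast enough, the remaining wins come from the embedding model, not the index.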


https://theblackcat102.github.io/

Recently I've been ranting about AI trends and writing short summaries of things I read.


This looks pretty interesting! But the landing page has only one sentence, "Understand and implement research papers faster", followed by a get-in-touch button. Care to elaborate? The blog button doesn't work either.


Thanks for checking it out. We are trying to build a system for engineers, researchers, academics: basically anyone who has to turn cutting-edge research into code for their product. The process of finding the right paper, understanding it, and then implementing it in code is usually very time consuming, and our hypothesis is that we can reduce that time drastically with large language models. We are still very early and exploring the right problem-solution fit, so feedback from early testers like you is extremely valuable. Thanks for reaching out, and join our Discord! :)


Note that Stability has been funding freelance researchers by providing compute resources, e.g. to RWKV[1], Open Assistant, and some works by LAION[2] and lucidrains[3].

[1] https://github.com/BlinkDL/RWKV-LM

[2] https://huggingface.co/laion/CLIP-ViT-L-14-laion2B-s32B-b82K

[3] https://github.com/lucidrains/gigagan-pytorch#appreciation


On the other hand, Meta AI research should be renamed OpenAI. They are one of the few big institutes that open almost every model they train (Galactica, OPT, M2M, wav2vec ...).


I recently migrated a project away from Poetry to the traditional setuptools method. Poetry works great for a simple package, but once you start adding complexity, it falls apart because everything is abstracted away and simplified into config files and command-line invocations.


I had the same experience. I think setuptools nowadays is quite good, esp. in combination with setuptools_scm.
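A minimal sketch of that combination (the package name is illustrative): setuptools as the build backend, with the version derived from git tags by setuptools_scm instead of being hard-coded.

```toml
# pyproject.toml — setuptools + setuptools_scm sketch
[build-system]
requires = ["setuptools>=64", "setuptools_scm>=8"]
build-backend = "setuptools.build_meta"

[project]
name = "my-package"      # illustrative name
dynamic = ["version"]    # version comes from git tags via setuptools_scm

[tool.setuptools_scm]
```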

