Hi all, at work I get a lot of questions about the state of the art in open source language models, and how to build chatbots on top of your own data.
I made a 100% open source knowledge-grounded chatbot that lets you ask questions and chat with the Transformers docs. Powered by Flan-UL2 (which I've anecdotally found to be the most performant commercially licensed open source instruction-tuned LLM), LangChain, Instructor Embeddings (state of the art in vector embeddings), and FAISS.
You can clone the space and play around with your own data, clone the repo locally, and take every line of code for your own projects.
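If you're curious how the pieces fit together, here's a minimal sketch of the knowledge-grounded flow. The real space uses Instructor embeddings, FAISS, and Flan-UL2; below, a toy bag-of-words embedding and brute-force cosine search stand in for all three (every name here is illustrative, not the actual repo code), so the retrieve-then-prompt logic is visible end to end:

```python
# Toy retrieval-augmented QA: embed doc chunks, find the nearest chunk
# to the query, and build a grounded prompt for the LLM.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stands in for Instructor)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Brute-force nearest-neighbor search (stands in for FAISS)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    """Ground the LLM by restricting it to the retrieved context."""
    context = "\n".join(retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Pipelines wrap a tokenizer and a model for easy inference.",
    "Trainer handles the training loop, logging, and checkpoints.",
]
prompt = build_prompt("How do I run inference with a pipeline?", docs)
```

In the actual space, `embed` is an Instructor model, `retrieve` is a FAISS index lookup, and the prompt is sent to Flan-UL2 — but the grounding pattern is exactly this.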
Hugging Face internally has a lot of very strong believers (including the exec team) in an open source + decentralized approach to AI as the only safe and productive way forward. HF is arguably the only place to find resources for open source DL that are both deep and broad, and transformers/diffusers are the libraries du jour of open DL.
Compute is really important for AI, and having a cloud provider align itself with an organization that is genuinely trying to be "Open" AI is, I think, a positive step forward.
Not disagreeing with your first point, but as I see it - this only enables HF to continue to spend money on making good open source tooling + doing research on open alternatives to closed source LLMs (something sorely needed), while opening them up to more enterprise customers who primarily use AWS.
FWIW I am an ML engineer there (maybe should have disclosed earlier) and I feel pretty optimistic about the opportunities this will enable for the open source community. Maybe with the visibility into the closed door discussions I have a more positive attitude, or maybe I'm being naive.
>Not disagreeing with your first point, but as I see it - this only enables HF to continue to spend money on making good open source tooling + doing research on open alternatives to closed source LLMs (something sorely needed), while opening them up to more enterprise customers who primarily use AWS.
This is the response/justification every time. The issue is that it never goes that way.
>Time will tell!
It always does. What it's shown me is that one step onto the slippery slope is enough to abandon hope for the project. Every good and useful open source project I've seen go this way inevitably turns its back on its customers and outlives its usefulness.
My name is Eno - today I'm launching Pet Portrait AI. We generate 40 custom pet portraits using deep learning (Stable Diffusion + Dreambooth) in a variety of styles. The pictures come in standard (1024x1024) and high resolution (2048x2048). The photos are great for social media, posters, custom gifts, etc.
In the backend, when you upload your photos we fine-tune a custom model based on Stable Diffusion (currently the RunwayML 1.5 weights) using the Dreambooth technique. We then generate over 100 candidate images, which we filter down to 40 quality images. We do this filtering by hand for now to ensure the quality of each order, but in the future we'd like to build a custom classifier that captures our "eye for quality" and automatically selects the best generations.
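The planned automatic filter boils down to a score-and-keep-top-k step. Here's a minimal sketch of that selection stage — `score_image` is a hypothetical placeholder for the future quality classifier (nothing here is the actual service code):

```python
# Sketch of the "eye for quality" filter: score each candidate generation
# with some quality model, then keep the top k for the customer.
from typing import Callable

def filter_generations(images: list[str],
                       score_image: Callable[[str], float],
                       keep: int = 40) -> list[str]:
    """Rank candidate generations by predicted quality, keep the best `keep`."""
    ranked = sorted(images, key=score_image, reverse=True)
    return ranked[:keep]

# Usage with a dummy scorer standing in for the trained classifier:
scores = {"img_a": 0.9, "img_b": 0.2, "img_c": 0.7}
best = filter_generations(list(scores), scores.get, keep=2)
```

Swapping the dummy scorer for a small classifier trained on our hand-picked accepts/rejects is the whole idea — the pipeline around it stays the same.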
This was a really fun service to build out, all feedback welcome!
I am a research affiliate with the Galileo Project, and I just want to suggest to anyone who is skeptical about our goals to visit the website (https://projects.iq.harvard.edu/galileo/home), in particular the ground rules and FAQ sections, to see how we are attempting to establish a methodology for rigorously addressing the question of ETC technology within our solar system. This is a question that can be approached from many directions, and because there is little public data available we have no priors suggesting that any one direction is "more likely" than the others. Thus, to be as rigorous as possible, we are assessing as many possibilities as we can within budgetary constraints and standard scientific practices.
As for the notion of UFO/UAP flying around: for over 70 years in the United States there have been reports of unidentified aerial phenomena, of varying degrees of quality and provenance. In the 1940s there was general public acceptance that UFOs represented physical objects, but confidence and reporting fell off quickly after that. I will not get into the nuances of the public discourse on UFOs in America, but it is safe to say that it is one of the more interesting histories of science. In the last 5 years there has been an absolute sea change in government and academic interest in this topic, mainly fueled by recent admissions by the Department of Defense of the reality of UAP confirmed by multiple sensor systems. Within the project, we do not have definitive beliefs about the nature of UAP and instead simply seek to corroborate the data.
The team is a wonderful array of multi-disciplinary scientists from all walks of life, with credentials akin to those of any major scientific endeavor. I urge you to investigate why so many people are interested in this question, and to dispel any preconceived notions of what is "possible" within the context of science. Truth is objective, and so is data. Only time will tell whether this whole thing was a misdirection or a dead end, but we should appreciate that it is still possible to ask hard questions about the world we live in today and to receive funding to answer them.
This is interesting and if the experimental evidence confirms this hypothesis, it bodes well for our future. A universe where we can interact with spacetime via engineering is one that allows for a lot of creative freedom. They also have another interesting article claiming that the imaginary structure of QM is the result of stochastic optimization on spacetimes: https://www.nature.com/articles/s41598-019-56357-3
Maybe the UAPs really are just secret warp drive tech we made 20 or 30 years ago.
> This is just a teaser. We will be able to generate images, sound, anything at will, with natural language. The holodeck is about to become real in our lifetimes.
Does anyone have any similar resources for other forms of media generated via natural language inputs?