Hacker News new | past | comments | ask | show | jobs | submit

I was laid off in October, started playing with AI stuff around then (SD in August, ChatGPT/GPT APIs around December or whenever it was released)

Got hired over the last week as Founding Prompt Engineer at an AI company! May 1st start date. Extremely excited to be playing with LLMs for money!!!




What exactly does a founding prompt engineer do?


I'm going to guess it involves engineering prompts.

Which requires a surprising amount of skill and experience!

I still haven't found a 100% reliable way of getting an LLM to always produce results in JSON, for example. See https://twitter.com/genmon/status/1646194992761782278

    State of the art techniques to get GPT to return JSON

    - logic: Responses must match this JSON schema
    - demonstration: For example…
    - appeal to identity: You are a chatbot that speaks perfect JSON
    - cajoling: Remember always return JSON!
    - threat: if you don't I SWEAR I'm gonna–
I wrote more about why I think prompt engineering deserves more respect than it gets here: https://simonwillison.net/2023/Feb/21/in-defense-of-prompt-e...
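Those tactics can be stacked into a single prompt. Here's a minimal sketch in Python — the schema, example, and wording are illustrative assumptions, not something from the tweet or article:

```python
import json

# Illustrative schema for the "logic" tactic; a real one would come from your app.
SCHEMA = {
    "type": "object",
    "properties": {"title": {"type": "string"}, "year": {"type": "integer"}},
    "required": ["title", "year"],
}

def build_json_prompt(task: str) -> str:
    """Stack the logic / demonstration / identity / cajoling tactics into one prompt."""
    return "\n".join([
        "You are a chatbot that speaks perfect JSON.",                   # identity
        f"Responses must match this JSON schema: {json.dumps(SCHEMA)}",  # logic
        'For example: {"title": "Dune", "year": 1965}',                  # demonstration
        "Remember: always return JSON, with no extra text!",             # cajoling
        task,
    ])

prompt = build_json_prompt("Name a classic science fiction novel.")
```

None of these tactics guarantees compliance on its own, which is why they get combined in the first place.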


Hey Simon! I've been digging your writings on LLMs lately.

I've been having decent luck with the approaches I've discussed in the following articles and projects:

From Prompt Alchemy to Prompt Engineering: An Introduction to Analytic Augmentation: https://github.com/williamcotton/empirical-philosophy/blob/m...

Writing Web Applications with LLMs: https://www.williamcotton.com/articles/writing-web-applicati...

https://github.com/williamcotton/transynthetical-engine

I'd love to hear your thoughts on the matter!

One of the techniques that I've found for reliably returning JSON is... ask for multiple responses and then use one of the responses that successfully parses!
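A minimal sketch of that sample-until-it-parses approach — `complete` is a stand-in for whatever (non-deterministic) model call you're using:

```python
import json
from typing import Callable, Optional

def first_valid_json(complete: Callable[[str], str], prompt: str, n: int = 5) -> Optional[dict]:
    """Sample up to n completions and return the first one that parses as JSON."""
    for _ in range(n):
        candidate = complete(prompt)
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    return None  # caller decides how to handle total failure

# Stand-in for a real model call; the first sample is chatty, the second parses.
fake_outputs = iter(["Sure! Here you go:", '{"ok": true}'])
result = first_valid_json(lambda p: next(fake_outputs), "Return JSON.")
# result == {"ok": True}
```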


A glaring problem is the non-determinism of LLMs: the same prompt can produce different answers. I appreciate your blogging and analysis in this space, so I'm interested in your responses. The non-determinism implies that prompt engineering is brittle, difficult, and lacks formal evaluation techniques for correctness.


Set the temperature to zero to make it more deterministic.


I agree with everything you've said there.


It is certainly an interesting phenomenon, and I wonder whether techniques from neuroscience for brain mapping could be applied to model "brain mapping" (latent space mapping), which could make prompt engineering more of a science.


You can use vicuna-7b-1.1. No need for chat prompts. Just slam in your data and end the prompt like so:

Generate a JSON with this and that {"this": "

Lower the temperature to minimum for deterministic results, and fine-tune the other parameters if needed. And set a stop token on the JSON closing brace, }.

That usually works perfectly fine for me in most scenarios. Best of all, it runs on an RTX 3080 at 15 tokens/s (quite fast!). Also, vicuna-7b is pretty much as good as gpt-3 was when it came out.
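A sketch of that trick: prime the prompt with the opening of the JSON object, stop generation at `}`, then stitch the prefix and stop token back on before parsing. `generate` here is a hypothetical stand-in for the actual local-model call, with a canned completion:

```python
import json

PREFIX = '{"answer": "'  # the model continues from inside the JSON object

def generate(prompt: str, stop: str) -> str:
    """Hypothetical stand-in for a local model (e.g. vicuna-7b at minimum
    temperature). Returns the completion up to, not including, the stop token."""
    return 'blue"'  # canned completion for this sketch

def json_completion(task: str) -> dict:
    prompt = f"{task}\n{PREFIX}"
    completion = generate(prompt, stop="}")
    # Re-attach the primed prefix and the stop token before parsing.
    return json.loads(PREFIX + completion + "}")

result = json_completion("What colour is the sky? Respond with JSON.")
# result == {"answer": "blue"}
```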


Vicuna-7b is much better than Gpt4All, but still struggles with math. I can't wait until my new work computer comes in; then I'll try running the new StableLM models.


> I still haven't found a 100% reliable way of getting a LLM to always produce results in JSON for example.

Have you tried Guardrails?

https://github.com/ShreyaR/guardrails


Interesting project…thanks for sharing.


Hi Simon, your blog posts have been invaluable in my ongoing process of refining a document that covers major concepts in prompt engineering and LLM fine-tuning and I'd love to pick your brain over email or a call if you have any bandwidth!


why not just adjust the decoder / beam search to not emit any tokens that aren't semantically valid JSON?

ie. instead of using temperature to sample something from the top k most likely tokens, first exclude all the tokens that cause the output to be malformed. the model can only emit {, ", [, or a number for the first token, for example.

if someone would like a fun project to try this right away, one place to start would be to modify llama.cpp's chat example just before the line that samples tokens [1], going through `lctx.logits` to zero out invalid tokens (or these are logits, so i guess set them to -INFINITY). As a smoke test, fix the first token of the model's output to "{" without any other changes and I bet you'd get something approaching JSON out.

[1]: here's the line to change: https://github.com/ggerganov/llama.cpp/blob/c4fe84fb0d28851a... see the bit on line 317-319 about how it ignores the end-of-sequence token by zeroing out the probability of sampling it? just like that!

i mean, the most principled approach probably requires some theoretic CS knowledge about regular expression derivatives or parsing machine derivatives, but i'm surprised it isn't more common to just hook into the decoder design a little, given how much we want structured data out of these models
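A toy character-level illustration of that masking idea in Python. The viability check is deliberately crude (it just tries a handful of closing suffixes); a real implementation would use an incremental parser or the regular-expression derivatives mentioned above:

```python
import json

def is_viable_json_prefix(s: str) -> bool:
    """Crude check: can s be extended into valid JSON? Tries a handful of
    closing suffixes instead of a proper incremental parser."""
    for suffix in ("", "}", "]", '"', '": 0}', ': 0}', "0}", "0]"):
        try:
            json.loads(s + suffix)
            return True
        except json.JSONDecodeError:
            pass
    return False

def mask_logits(logits: dict, prefix: str) -> dict:
    """Set the logit of every token that would break JSON viability to -inf,
    analogous to how the llama.cpp example zeroes out the EOS token."""
    return {
        tok: (score if is_viable_json_prefix(prefix + tok) else float("-inf"))
        for tok, score in logits.items()
    }

# Toy character-level vocabulary with uniform logits:
vocab = {ch: 0.0 for ch in 'ab{}[]":0'}
first = mask_logits(vocab, "")  # which tokens may start the output?
allowed = sorted(t for t, s in first.items() if s > float("-inf"))
# Only {, [, " or a number can open valid JSON, as the comment predicts.
```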

i wish i knew how to voice my ignorant skepticism in a less disparaging way, sorry.... but i feel like a lot of this "legitimization of prompt engineering as a useful trade/practice" thinking assumes that we're trapped in the "magic circle" where the only input we have to the model is picking the prompt and the only possible output is the most likely token. but these are generative models! conditioned on their output, we have our choice about which token to accept, so why not just condition on the distribution of possible JSON output instead of the distribution of possible prose?

i suspect very quickly the most competitive prompt engineers will combine their solid understanding of theoretic machine learning and statistics with a solid understanding of computer science, perhaps even combined with a dash of persuasion / neurolinguistic programming experience. kinda worries me but it's how it is


You're basically describing https://github.com/newhouseb/clownfish/, except there it tries to validate the output against JSON schema on the fly.


Ask ChatGPT


What was your general background prior to this position?


Check out my resume on my LinkedIn: www.linkedin.com/in/arthurcolle (pinned document) if you are interested


Anthropic?! Regardless of what it is, congrats!


brainchain.ai

General idea is gaining insights into supply chains using LLMs and other machine learning models for specific applications.



