Thanks. Very cool. Have you ever tried to implement a transformer from scratch? Like in the "Attention Is All You Need" paper? Can a first- or second-year college student do it?
I haven't tried it yet, but I intend to. I think the code for LLM inference is quite straightforward; the complexity lies in collecting the training corpus and doing good RLHF. That's just my intuition.
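To illustrate why the inference side feels approachable: the core operation from the paper, scaled dot-product attention, is only a few lines. Here's a minimal NumPy sketch (function name and toy shapes are my own choices, not from any particular codebase):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, per Vaswani et al. (2017)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (seq_q, seq_k) similarity scores
    # numerically stable softmax over the key dimension
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V  # weighted sum of value vectors

# toy example: 3 query positions, 4 key/value positions, d_k = 8
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 8)
```

A full transformer adds multi-head projections, positional encodings, feed-forward layers, and layer norm around this, but each piece is similarly compact, so it seems like a reasonable project for an early undergrad with some linear algebra.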