Hacker News
Transformer models: an introduction and catalog (arxiv.org)
188 points by mariuz on Feb 16, 2023 | 25 comments



Thanks for the feedback everyone. Here are the sources in case anyone wants to contribute (or fork): https://github.com/xamat/TransformerCatalog


I have recently written a paper on understanding transformer learning through the lens of coinduction and Hopf algebras.

https://arxiv.org/abs/2302.01834v1

The learning mechanism of transformer models has been poorly understood; it turns out that a transformer is like a circuit with feedback.

I argue that autodiff can be replaced with what I call, in the paper, Hopf coherence.

Furthermore, if we view transformers as Hopf algebras, one can bring convolutional models, diffusion models and transformers under a single umbrella.

I'm working on a next-gen, Hopf-algebra-based machine learning framework.

Join my discord if you want to discuss this further https://discord.gg/mr9TAhpyBW


Is there any hope of understanding this with just calc and linalg knowledge?


I think so. The main idea is Hopf coherence: the transformer, viewed as a Hopf algebra, updates its internal state in order to enforce the Hopf coherence formula (you can find it in the paper).

The idea of streams (as in infinite lists) is related to this via coalgebras.
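To unpack the streams/coalgebras remark: a stream over A is the final coalgebra of the functor F(X) = A x X, i.e. a state together with a map state -> (observation, next state), and unfolding that map yields an infinite list. A minimal sketch in Python (the `unfold` helper and the successor coalgebra are illustrative, not from the paper):

```python
from itertools import islice

# A coalgebra for the stream functor F(X) = A x X is just a function
# state -> (observation, next_state). Unfolding it yields an infinite stream.
def unfold(step, state):
    while True:
        out, state = step(state)
        yield out

# Example coalgebra: observe the current number, step to its successor.
# Unfolding from 0 gives the stream of natural numbers.
nats = unfold(lambda n: (n, n + 1), 0)
print(list(islice(nats, 5)))  # [0, 1, 2, 3, 4]
```

The point of the coalgebraic view is that the infinite object is never built; only the one-step transition map is, and observations are produced on demand.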


> Furthermore, if we view transformers as Hopf algebras, one can bring convolutional models, diffusion models and transformers under a single umbrella.

Have you written any more about this?


Look into the connection between diffusion and Hopf algebras.


Apologies if I'm missing something completely obvious but I'm having some difficulties dredging up material that connects Hopf algebras and diffusion. Are there any resources that you would recommend in addition to your paper?


Hi Adam, can you update your Discord invite? It's now invalid.


I was so certain this was discussing Transformers like, the action figures, and have never been so confused looking at both a link and the comments section on HN before. Especially considering: https://github.com/xamat/TransformerCatalog/blob/main/02-01.... I'm just going to keep scrolling now :'D


> 2.5.9 ChatGPT is also more than a model since it includes extensions for Memory Store and retrieval similar to BlenderBot3

I don't think this claim is accurate. There are people who have played with this idea, but it is not part of ChatGPT.


> 2.5.5 BERT

> Extension:It can be seen as a generalization of BERT and GPT in that it combines ideas from both in the encoder and decoder

I believe this is an error? The text is from BART. And a space is missing.


I have a hunch they used an LLM to compile the list.


It is a shame that figures 5, 6, 7, and 8 break up the content of 2.5 Catalog just to fit the figures onto pages.

Are pages even needed anymore?


Good timing, I've been trying to compile a list like this myself to keep track of everything released.


Where's Cliffjumper and Ironside?


I'm glad I'm not the only one looking for a taxonomy of refugees from the great Cybertron wars


When I was younger I would often encounter mentions of electrical transformers, and be quite disappointed when it wasn't related to the toys or the series. Even in my 40s I still have a bit of disappointment about it...


Don't get me started on when we learned about the Terminator on the moon.


"The goal of this paper is to offer a somewhat comprehensive but simple catalog and classification of the most popular Transformer models."

Yet of the six comments here, two are complaining about missing models and three more are arguing about the typesetting of the figures.


Per the paper, here is the link to the data in fig 5: https://docs.google.com/spreadsheets/d/1ltyrAB6BL29cOv2fSpNQ...


figure 5 on page 10 is a ridiculously small font and unreadable. i wish there was a better way to display this kind of info on PDFs


> i wish there was a better way to display this kind of info on PDFs

...Portable Document Format was /born/ to display vector graphics (i.e. you can just zoom in)... The error in this paper was embedding a raster image of the text!


Another tip: arXiv lets you download the original tar file, with all the original images etc. in it, from the "Other Sources" section.

https://arxiv.org/format/2302.07730


From the bottom of the page in question: "Figure 5: You can access the original table at https://docs.google.com/spreadsheets/d/1ltyrAB6BL29cOv2fSpNQnnq2vbX8UrHl47d7FkIf6t4 for easier browsing across the different model features."


Does not even list Lundahl transformers...



