Hacker News
Transformer models: an introduction and catalog (arxiv.org)
188 points by mariuz on Feb 16, 2023 | 25 comments



Thanks for the feedback everyone. Here are the sources in case anyone wants to contribute (or fork): https://github.com/xamat/TransformerCatalog


I have recently written a paper on understanding transformer learning through the lens of coinduction and Hopf algebras.

https://arxiv.org/abs/2302.01834v1

The learning mechanism of transformer models has been poorly understood; it turns out that a transformer is like a circuit with feedback.

I argue that autodiff can be replaced with what I call, in the paper, Hopf coherence.

Furthermore, if we view transformers as Hopf algebras, one can bring convolutional models, diffusion models and transformers under a single umbrella.

I'm working on a next-gen, Hopf-algebra-based machine learning framework.

Join my discord if you want to discuss this further https://discord.gg/mr9TAhpyBW


Is there any hope of understanding this with just calc and linalg knowledge?


I think so. The main idea is Hopf coherence: the transformer, viewed as a Hopf algebra, updates its internal state in order to enforce the Hopf coherence formula (you can find it in the paper).

The idea of streams (as in infinite lists) is related to this via coalgebras.
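To unpack the streams/coalgebras remark: a stream over A is the final coalgebra of the functor F(X) = A x X, i.e. a state together with a map state -> (observation, next state), and unfolding that map yields an infinite list. A minimal sketch in Python (the `unfold` helper and the successor coalgebra are illustrative, not from the paper):

```python
from itertools import islice

# A coalgebra for the stream functor F(X) = A x X is just a function
# state -> (observation, next_state). Unfolding it yields an infinite stream.
def unfold(step, state):
    while True:
        out, state = step(state)
        yield out

# Example coalgebra: observe the current number, step to its successor.
# Unfolding from 0 gives the stream of natural numbers.
nats = unfold(lambda n: (n, n + 1), 0)
print(list(islice(nats, 5)))  # [0, 1, 2, 3, 4]
```

The point of the coalgebraic view is that the infinite object is never built; only the one-step transition map is, and observations are produced on demand.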


> Furthermore, if we view transformers as Hopf algebras, one can bring convolutional models, diffusion models and transformers under a single umbrella.

Have you written any more about this?


Look into the connection between diffusion and Hopf algebras.


Apologies if I'm missing something completely obvious but I'm having some difficulties dredging up material that connects Hopf algebras and diffusion. Are there any resources that you would recommend in addition to your paper?


Hi Adam, can you update your Discord invite? It's now invalid.


I was so certain this was discussing Transformers like, the action figures, and have never been so confused looking at both a link and the comments section on HN before. Especially considering: https://github.com/xamat/TransformerCatalog/blob/main/02-01.... I'm just going to keep scrolling now :'D


> 2.5.9 ChatGPT is also more than a model since it includes extensions for Memory Store and retrieval similar to BlenderBot3

I don't think this claim is accurate. There are people who have played with this idea, but it is not part of ChatGPT.


> 2.5.5 BERT

> Extension:It can be seen as a generalization of BERT and GPT in that it combines ideas from both in the encoder and decoder

I believe this is an error? The text is from BART. And a space is missing.


I have a hunch they used an LLM to compile the list.


It is a shame that figures 5, 6, 7, and 8 break up the content of 2.5 Catalog just to fit the figures onto pages.

Are pages even needed anymore?


Good timing, I've been trying to compile a list like this myself to keep track of everything released.


Where's Cliffjumper and Ironside?


I'm glad I'm not the only one looking for a taxonomy of refugees from the great Cybertron wars


When I was younger I would often encounter mentions of electrical transformers, and be quite disappointed when it wasn't related to the toys or the series. Even in my 40s I still have a bit of disappointment about it...


Don't get me started on when we learned about the Terminator on the moon.


"The goal of this paper is to offer a somewhat comprehensive but simple catalog and classification of the most popular Transformer models."

Yet of the six comments here, two are complaining about missing models and three more are arguing about the typesetting of the figures.


Per the paper, here is the link to the data in fig 5: https://docs.google.com/spreadsheets/d/1ltyrAB6BL29cOv2fSpNQ...


figure 5 on page 10 is a ridiculously small font and unreadable. i wish there was a better way to display this kind of info on PDFs


> i wish there was a better way to display this kind of info on PDFs

...Portable Document Format was /born/ to display vector graphics (i.e. you can just zoom in)... The error in this paper was embedding a raster image of the text!


Another tip: arXiv lets you download the original tar file, with all the original images etc. in it, from the "Other Sources" section.

https://arxiv.org/format/2302.07730


From the bottom of the page in question: "Figure 5: You can access the original table at https://docs.google.com/spreadsheets/d/1ltyrAB6BL29cOv2fSpNQnnq2vbX8UrHl47d7FkIf6t4 for easier browsing across the different model features."


Does not even list Lundahl transformers...



