How does a mobile GPU work? [pdf] (windows.net)
172 points by goranmoomin on Aug 7, 2023 | 25 comments



Did "Manga Guide" make a mobile GPU book?

I can't say I'm personally too interested in mobile GPUs. I know they're a very different architecture from desktop GPUs and have some important optimizations with regard to memory bandwidth. (Desktop GPUs can just run higher-power RAM for more bandwidth; IIRC mobile GPUs need to rely on optimizations that desktop GPUs simply don't care about yet.) EDIT: Oh wow, now that I'm done reading the .pdf, I think they did a good job explaining this in comic-book form.

I can assume, though, that video game programmers who optimize specifically for mobile GPUs can get a graphical advantage over their competitors. So even if it's not my cup of tea, it's probably very useful to others.

----------

On the "Manga Guide" perspective: these books are a lot of fun and I feel like they're a good introduction. As a comic-book format, they are very quick to read.

It's hit-and-miss though. "Manga Guide to Linear Algebra" is awful and uninspired. Skip it; it's just not worth your time.

But "Manga Guide to SQL" is great. I think the story about the Princess trying to organize Apples in her kingdom (and thinking about customers, businesses, etc. etc.) very naturally teaches SQL. So much of SQL and Databases are about learning a hypothetical business, that a grand story about a Fairy, a princess, a kingdom losing money that needs more organization (etc. etc.) really helps out.

Seeing how a customer's address can conflict with other customers' data in 1st or 2nd normal form, and how 3rd normal form fixes the possible conflict (etc.), solidifies SQL ideas and theory with a story.

I think that's the key. Some subjects (and authors) are able to properly weave a story into the explanation. A subject like linear algebra probably can't have any story woven in, so they kind of failed at that. SQL does, and I think a "mobile GPU as a factory / tour of the factory" comic-book style could make sense.
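To make the customer-address point concrete, here's a minimal sketch of my own using Python's sqlite3 (the table and column names are invented for illustration, not anything from the book). In a single flat table the region for a postal code is repeated on every customer row and can drift out of sync; the 3NF-ish version stores it exactly once:

    # A minimal, made-up sketch of the customer-address example: in one flat
    # table the region for a postal code is repeated on every customer row and
    # can drift out of sync; 3NF stores it exactly once in its own table.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Denormalized: two rows could disagree about the region of '1000'.
    cur.execute("""CREATE TABLE customers_flat (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT,
        postal_code TEXT,
        region      TEXT)""")

    # 3NF-ish: region depends only on postal_code, so it lives in its own table.
    cur.execute("""CREATE TABLE postal_codes (
        postal_code TEXT PRIMARY KEY,
        region      TEXT)""")
    cur.execute("""CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT,
        postal_code TEXT REFERENCES postal_codes(postal_code))""")

    cur.execute("INSERT INTO postal_codes VALUES ('1000', 'Capital')")
    cur.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                    [(1, "Alice", "1000"), (2, "Bob", "1000")])

    # The region is now stated once; every customer query joins to it.
    for row in cur.execute("""SELECT c.name, p.region
                              FROM customers c
                              JOIN postal_codes p USING (postal_code)"""):
        print(row)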


Desktop GPUs are borrowing ideas from mobile GPU designs though - most modern desktop GPUs have tiling support (kind of) just like mobile GPUs: https://www.realworldtech.com/tile-based-rasterization-nvidi...

Even though it's not strictly a mobile GPU design, the hybrid architecture is definitely a shift.


Indeed. Nvidia is tile based:

https://www.realworldtech.com/tile-based-rasterization-nvidi...

AMD is tile based:

https://pcper.com/2017/01/amd-vega-gpu-architecture-preview-...

Intel is tile based (section 5.2):

https://www.intel.com/content/dam/develop/external/us/en/doc...

The technology originated with PowerVR in 1996 (a company known today as Imagination Technologies), which designed the ancestor architecture of Apple's current GPUs.


Imagination released the driver and simulator sources for PowerVR Series 1 last year [0], which is a fun resource for seeing how tiling GPUs started. Series 1 is particularly weird because it doesn't use triangles; it uses infinite planes, which may be clipped by other planes - meaning that convex polygons with more sides are faster than the equivalent set of triangles. Sadly there's no RTL, but I'm guessing the simulator is based on it, given some of the weird coding patterns.

[0] - https://github.com/powervr-graphics/PowerVR-Series1
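To get a feel for the infinite-plane idea, here's a rough sketch of my own (not code from the repo): a convex polygon can be described as an intersection of edge half-planes, so pixel coverage is just "am I on the inside of every plane", and each extra side costs one more plane test rather than a whole extra triangle:

    # Rough conceptual sketch (not from the PowerVR repo): represent a convex
    # polygon as an intersection of edge half-planes and test pixel coverage
    # against every plane. Adding a side adds one plane test, whereas splitting
    # the polygon into triangles would repeat shared-edge tests.

    def edge_planes(vertices):
        """Turn a convex polygon (CCW vertex list) into (a, b, c) half-planes
        where a*x + b*y + c >= 0 means 'inside'."""
        planes = []
        n = len(vertices)
        for i in range(n):
            (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
            # Inward-facing normal of the edge (x0,y0) -> (x1,y1) for CCW winding.
            a, b = y0 - y1, x1 - x0
            c = -(a * x0 + b * y0)
            planes.append((a, b, c))
        return planes

    def covered(planes, x, y):
        """Pixel (x, y) is inside the polygon iff it's inside every half-plane."""
        return all(a * x + b * y + c >= 0 for a, b, c in planes)

    # A CCW square; more sides would simply mean more entries in `planes`.
    square = [(1, 1), (5, 1), (5, 5), (1, 5)]
    planes = edge_planes(square)
    print(covered(planes, 3, 3))   # True: inside
    print(covered(planes, 6, 3))   # False: outside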


Thanks so much for sharing that, I had no idea something this cool had been made public. Though initially when you wrote "simulator" I thought the Verilog/VHDL source code and the simulator for the HDL design would be there, but that's not it.

Someone correct me if I'm wrong, but I assume it's a simulator for the driver, probably to simulate that there's a PowerVR GPU in the system.

Just looking through the driver source code and seeing all those #ifdef blocks for all possible combinations of HW, SW, APIs, etc. scattered everywhere is making my head spin.

Hopefully they had some IDE that would collapse all the unused blocks, as it reminds me of one of my gigs in embedded where the IDE didn't have such a feature and it was brutal following the code.


I haven’t tried to run the simulator, but I believe you should be able to run the HW driver with it - and get the frame buffer output. I imagine it’d take some work to get it working on a modern OS.

It’s the simulator that I suspect is derived from the RTL - hwsabsim.c is a good example…


Right: While an entertaining presentation is helpful, the most important factor is having a good metaphor that bridges the underlying concepts to something that can be pictured and casually reasoned about.


This one isn't a "Manga Guide" though. I don't think the source is even Japanese, because it reads from left to right (kinda jarring, reading a manga-looking comic from left to right). I think this one just comes straight from Arm.


Arm calls it "The Arm Manga Guide to the Mali GPU".


Hmm, fair enough. I guess I should have clarified that I was talking about the official "Manga Guide" series.


Pretty much everyone I taught SQL, I mapped the queries to natural questions anyone would have about the business they were working at right then... because that's basically what it was invented for, it's very good at it, and most people don't really care about or need advanced features anyway.



There are more pages if you go to the original source. https://interactive.arm.com/story/the-arm-manga-guide-to-the...


There are actually three volumes: https://developer.arm.com/en/dev2/Gaming%20Graphics%20and%20...

(Why are there several websites…)


Both desktop and mobile GPUs are tile based now. This is outdated.


They're still a very different breed of "tile based", though.


That's not what I was expecting but I'm not complaining.


Interestingly, this PDF causes iOS 17 beta Safari to crash


This is what happens when mobile GPUs do not work


What's the "conveyor belt" between vertex and fragment shader? Doesn't communication between these two shaders happen via GPU memory in discrete GPUs as well as in integrated/mobile ones?

Or is that something that only happened after the switch to the unified shader model?


It depends on the GPU; in some cases there is a direct pipeline from vertex shading to fragment shading. But typically (even on mobile GPUs) there is some sort of cache to exploit the fact that the same vertex often gets shaded multiple times. In tile-based rendering, the shaded vertex data is typically all partitioned into "bins" that are used to do things like depth sorting, hidden surface removal, etc.

There's often a memory hierarchy as well - for example, the "tile" memory might be special dedicated high-speed memory, as it was on the Xbox 360. So you could end up with one type of memory being used for the input vertex/index data and a different kind used for the "bins" containing shaded vertex data.

(Despite being a high-power, desktop-grade GPU part, the Xbox 360 had a "predicated tiling" mode that would partition the screen into big tiles, much as described in Arm's PDF. They did this to support multisampling at high resolutions.)

Also, unless things have changed, modern NVIDIA GPUs are sort of tile based, albeit not in the same way as an ARM mobile GPU. See https://www.youtube.com/watch?v=Nc6R1hwXhL8 for a demo of this. This type of rasterization implies buffering the shaded vertex data in order to be able to rasterize multiple vertices in a single "tile" at once.
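For intuition, here's a toy sketch of my own (not how any particular GPU actually does it) of the binning step described above: after vertex shading, each triangle's screen-space bounding box decides which tile bins it lands in, and the fragment stage can then walk one tile's bin at a time:

    # Toy sketch of tile binning (illustrative only, not any real GPU's scheme):
    # after vertex shading, each triangle is dropped into every screen tile its
    # bounding box overlaps; the fragment stage can then process one tile at a
    # time out of fast tile memory.
    from collections import defaultdict

    TILE = 16  # tile size in pixels (real hardware varies, e.g. 16x16 on Mali)

    def bin_triangles(triangles, width, height):
        """triangles: list of three (x, y) screen-space vertices each.
        Returns {(tile_x, tile_y): [triangle_index, ...]}."""
        bins = defaultdict(list)
        for idx, tri in enumerate(triangles):
            xs = [v[0] for v in tri]
            ys = [v[1] for v in tri]
            # Conservative: use the bounding box, clamped to the screen.
            x0 = max(int(min(xs)) // TILE, 0)
            x1 = min(int(max(xs)) // TILE, (width - 1) // TILE)
            y0 = max(int(min(ys)) // TILE, 0)
            y1 = min(int(max(ys)) // TILE, (height - 1) // TILE)
            for ty in range(y0, y1 + 1):
                for tx in range(x0, x1 + 1):
                    bins[(tx, ty)].append(idx)
        return bins

    tris = [[(2, 2), (30, 4), (10, 28)],          # spans several tiles near the origin
            [(100, 100), (110, 100), (105, 111)]]  # fits inside a single tile
    for tile, indices in sorted(bin_triangles(tris, 256, 256).items()):
        print(tile, indices)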


I kind of got a reality check reading a Quora answer back in the day when I was hunting for GTA V on Android. There were a lot of scammy terrible apps (and still are).


You can also ask your GPU how it works and to generate the explanation in manga style.


Obligatory blog entry by Alyssa from when she was dissecting the M1 GPU hardware: https://rosenzweig.io/blog/asahi-gpu-part-1.html


Awesome!



