Assembly Language for Beginners [pdf] (yurichev.com)
672 points by dennis714 on July 17, 2018 | 94 comments



I actually got paid a salary for learning & programming in IBM mainframe assembler (BAL or Basic Assembly Language) in 1970, for an insurance company. The CPU memory was so small (32K, yes 32,768 bytes) that the only way we could squeeze enough functionality was to write in assembler, with overlays (no virtual memory). Debugging consisted of manually toggling hex data and instructions at the control panel. What a blast!

It was a lot of fun, but terrible for programmer productivity. I would not want to go back :o) Dereferencing registers prepared me for C pointers later.
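
If that last sentence sounds odd: a base-register load in BAL is essentially what C spells as a pointer dereference. A tiny illustration (the BAL is only in a comment, and from 50-year-old memory, so treat the mnemonic as approximate):

    #include <stdio.h>

    int main(void) {
        int word = 42;
        int *p = &word;   /* like loading an address into R3            */
        int  x = *p;      /* like  L 2,0(,3): fetch the word R3 points at */
        printf("%d\n", x);
        return 0;
    }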


Ooo! I have a question! Where do I learn more about overlays? The BSD 2.11 code I've read has comments about overlays, but I have no idea where to learn more about how to understand the topic. I came across it while I was seeing if I could get newlib to compile for the PDP-11.


You could look at some old Turbo Pascal articles (though I don't know how applicable they are to other systems): http://www.boyet.com/articles/publishedarticles/theslithytov...


Game consoles supported the concept a whole lot longer than other domains. ROM bank switching can be thought of as overlay loading, and beyond that most consoles supported overlays up until 360/PS3. The Nintendo DS cartridge format natively supported overlays for instance.


AFAIK, early MS-DOS made heavy use of the technique, probably(?) before EMS/XMS became available/common.

"OVL" popped up in my head for some reason, and https://www.google.com/search?q=dos+ovl seems to return interesting results (showing a few real-world examples).



Dammit. Wrong link.

Here is the correct one:

https://www.elsevier.com/books/linkers-and-loaders/levine/97...


Was wondering about that initial link. Thank you for returning with the correction.


Glad it got seen.


Wow, 32K! I'm working on an Arduino Uno project where I need to squeeze some complicated logic into 2K RAM and 32K FLASH :). I haven't resorted to assembly, yet.


32KB was a ton of memory in the 70s.


Especially in 1970. That was a lot even for 1980.


Forgive me, but I'm struggling to understand how much useful work one could extract from a computer with only 32k of RAM. A microcontroller for an appliance, sure, but a mainframe? Could you tell us more about the work you were doing?


In 1970, a company named Telemed built an Electrocardiograph analysis service where hospitals would call in from all around the country and send ECGs over the telephone line with analog FM signals, three channels at a time. The computer located near Chicago would accept the incoming call, digitize the three analog signals every 2ms to 10 bits, write them to disk, and decode the touch-tone patient ID.

When the call was completed, the data from disk would be assembled into a full ECG record and written to tape, and simultaneously passed to the diagnostic program written in Fortran. The system would then initiate a phone call to the hospital's printer and print out an English-language diagnostic. The result was then available to the staff at the hospital within ten minutes.

The front end and back end were all Sigma 5 (first SDS, then Xerox, midrange real-time computer) assembler in an interrupt-rich process--one interrupt every 2ms for the analog, one interrupt for the disk write complete, interrupts for the tape record writing, interrupts for the outgoing phone call progress. This included a cost optimization process that would choose which line to use (this was in the days of WATS lines) based on desired response time. The middle was the Fortran program that would analyze the waveforms, identifying all the ECG waveforms--P-wave, QRS, and T-wave--their height, duration, sometimes slope.
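
For a sense of the shape of that 2ms path, here's a toy sketch in modern C with invented names (the "interrupt" is just simulated by a loop; the real thing was Sigma 5 assembler):

    #include <stdint.h>
    #include <stdio.h>

    /* 10-bit samples, 3 channels, every 2ms: a tiny ring buffer
       filled by the "interrupt" and drained by the main loop.   */
    #define RING 256
    static uint16_t ring[RING][3];
    static volatile unsigned head, tail;

    static void adc_isr(uint16_t ch0, uint16_t ch1, uint16_t ch2) {
        ring[head % RING][0] = ch0 & 0x3FF;   /* keep 10 bits */
        ring[head % RING][1] = ch1 & 0x3FF;
        ring[head % RING][2] = ch2 & 0x3FF;
        head++;
    }

    int main(void) {
        for (int t = 0; t < 200; t++)         /* 0.4s of fake samples  */
            adc_isr(t & 0x3FF, (t * 3) & 0x3FF, (t * 7) & 0x3FF);
        while (tail != head) {                /* stand-in for "write to disk" */
            printf("%u %u %u\n", ring[tail % RING][0],
                   ring[tail % RING][1], ring[tail % RING][2]);
            tail++;
        }
        return 0;
    }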

This all took place in a machine with 32k words (four bytes per word). There were two computers, one nominally used for development, but could be hot-switched if the other failed. I think downtime was on the order of an hour per year. This would have been called an Expert System, but I don't think the term was in common use as yet.

So the answer to your question is: "A considerable amount". Today we are all spoiled by environments with more memory on one machine than existed in the entire world at that time.


Thank you, that's brilliant! It is unfortunate that the source code is lost to humanity, as an example of what is possible.


Ah, I could replicate it with a bunch of time and likely emulators. There wasn't anything secret about it, just good engineering with an innovation or two. The coolest one was to use coroutines to kind of invert the inner loop. In the middle of the loop was a macro call that said "Get interrupt from the OS", which simply translated to a WAIT that would fire when the particular interrupt came in. We wrapped a coroutine around this to properly save and restore state.

By the way, this was significantly easier than what folks have to go through with C or using that dratted async/await pattern.
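
In modern terms the trick looks something like this (a toy C sketch using POSIX ucontext, which is deprecated but still works on Linux; the names are invented, and of course the real thing was Sigma 5 assembler macros):

    #include <stdio.h>
    #include <ucontext.h>

    /* The protocol code is a coroutine; wherever the original had its
       WAIT macro, it swaps back to the scheduler and is later resumed
       in place, state intact. */
    static ucontext_t sched_ctx, call_ctx;
    static char stack[64 * 1024];

    static void wait_interrupt(const char *kind) {
        printf("WAIT for %s\n", kind);
        swapcontext(&call_ctx, &sched_ctx);   /* "block" here */
    }

    static void outbound_call(void) {
        wait_interrupt("dial tone");
        wait_interrupt("line answered");
        printf("call connected, start sending\n");
    }

    int main(void) {
        getcontext(&call_ctx);
        call_ctx.uc_stack.ss_sp   = stack;
        call_ctx.uc_stack.ss_size = sizeof stack;
        call_ctx.uc_link          = &sched_ctx;
        makecontext(&call_ctx, outbound_call, 0);
        /* each "interrupt" resumes the coroutine where it waited */
        for (int i = 0; i < 3; i++)
            swapcontext(&sched_ctx, &call_ctx);
        return 0;
    }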

This particular code that used the coroutine was the outbound call processing low-level stuff. I was second fiddle on that one, and the lead was a fellow who is quoted in TAOCP. We had zero single-thread errors and one multi-thread error when we stood it up. Keep in mind that this was in the days of no debuggers other than console switches.


If you want examples of how far you can go packing way too many things in too little code, you may be interested in the demoscene!


I would not have guessed that there was automatic ECG analysis in 1970! Was it good? Are today's methods any better?


We were the first commercial offering. There were one or two systems working at universities.

It was quite good.

I would imagine that they are--I haven't kept track.


But to further comment --

The landscape is vastly different these days. You can't even stand up something in a medical environment that measures heart rate, much less waveforms, without significant clinical trials. Apparently the FDA is all over this one.


Awesome... Can you please share more stories of that time with us? Maybe write them down somewhere?


I should write up a blog post. When I do, I'll submit it here.


You really really should! That was a fascinating read, thanks for sharing.


A key thing to remember is that 32k of RAM wasn't the only memory available; often there was all sorts of longer-term storage, and in many cases the data for a job was separate from the machine running the program, which could feed that data in using a number of methods. Today's programs load entire files into RAM because it's available; back then you might load a single record from the file at a time, and there were routines to seek through the offline data (subject to optimization) and manage the working memory more effectively.
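
In code terms, instead of slurping the whole file you'd do something like this (a sketch with a made-up record layout and file name):

    #include <stdio.h>

    struct record { char name[32]; int balance; };  /* made-up layout */

    int main(void) {
        FILE *f = fopen("accounts.dat", "rb");      /* hypothetical file */
        if (!f) return 1;
        struct record r;
        /* one record in working memory at a time -- the 32K way */
        fseek(f, 7L * sizeof r, SEEK_SET);          /* seek to record #7 */
        if (fread(&r, sizeof r, 1, f) == 1)
            printf("%.32s: %d\n", r.name, r.balance);
        fclose(f);
        return 0;
    }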

It's also worth remembering that, at the time, a machine with 32k of RAM was one of the most powerful on the market, was still considerably expensive, and the alternative was paying (a team of) humans to do the work by hand. For all its shortcomings and the insane complexity required to get the machines to work properly, they were generally much faster than humans performing the same task and generally (assuming they were programmed correctly) could be relied on to make fewer mistakes. Their utility was remarkable, especially their ability to perform arithmetic very quickly, which was (and still is) quite tedious to perform by hand.


Oblig NASA factoid: The guidance computer on Apollo that got us to the moon had 4096 bytes of RAM and about 72K of ROM.


Whew, excellent point. That raises another question, though—how much computational work did that computer have to do? The real heavy lifting was performed by big NASA mainframes on Earth, right?


The Apollo computer did things like control rocket burns, provide navigational information, etc. More information:

http://nassp.sourceforge.net/wiki/Guidance_and_Control_Syste...

The source is on Github, too, for example:

https://github.com/chrislgarry/Apollo-11/blob/master/Luminar...


The original Game Boy has 8K of RAM, and the first- and second-gen Pokemon games run on it. Most of the hard work is done by the graphics and sound hardware. Fairly large games, though.


The original NES had 2K of RAM (but cartridges could expand that).


You could land a spaceship on the moon with a computer with less RAM.

https://en.wikipedia.org/wiki/Apollo_Guidance_Computer


> with a computer with less RAM

And a team of scientists who had done all the difficult calculations beforehand...


In 1969, a computer with "2048 words of RAM" and "36,864 words of ROM" managed a landing on the Moon and subsequent return[1].

People do amazing things with primitive tools.

[1] https://en.wikipedia.org/wiki/Apollo_Guidance_Computer


(An aside: do terms like "mainframe" and "microcomputer" have any meaning any more, when a Raspberry Pi Zero has orders of magnitude greater RAM and power than a '70s piece of "big iron"?)


Yes. Mainframes still exist. IBM still sells them, and you can do things like hot-swap CPUs and RAM, or set up a geographically distributed HA cluster that can swap which mainframe the VMs or databases are "running" on without dropping connections, requests, or other interruptions.

https://en.wikipedia.org/wiki/IBM_Z


One of the four large banks in my country, where I worked only a few years ago, still managed all of their internal change management process on a fairly old IBM mainframe.

You had to connect via an arcane telnet client (tn3270 protocol perhaps?) and input the change details. No fancy web forms. Perhaps it was a limitation of the application, but you couldn't mix uppercase and lowercase in the one form.


The PDP (11? I think so, but maybe 7? My home PDP for running the internet in my corner of the universe was an 11) at JPL that processed the Voyager flyby snapshots of Saturn into colorized planet+ring images had 64KB. Instead of viewport-centric geometry scans doing texture lookup, they scanned in texture space (less swapping).


Thanks for sharing


[flagged]


No matter how you feel, posting like this will get your account banned.

https://news.ycombinator.com/newsguidelines.html


Maybe too much content? 1000+ pages, many architectures... probably too much for a beginner book?

Btw, a great book imho is "Assembly Language Step By Step - Programming with Linux - 3rd ed" (https://musho.tk/l/d2d56a34).

The great thing is that it is an easy read and really starts from the basics: it explains how the i386 architecture works, and then how to program it using assembly.

The sad thing is that afaik the author is quite old and probably is not going to release a 4th edition, meaning that the book will stay on Intel i386.


I have the first or second edition of "Assembly Language Step By Step", and it's the best intro I know of.

It must be difficult to write a good assembly book. On one hand there are a lot of basics to cover, like memory addressing, segment registers, etc. On the other hand, the main use case for it today is hand-optimized functions for when the compiler can't optimize enough, which is inherently an advanced topic.
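
For a taste of what that hand-tuning looks like, people often start with intrinsics before dropping to raw assembly. A generic SSE sketch (my example, not from either book):

    #include <immintrin.h>
    #include <stdio.h>

    /* Sum 8 floats four lanes at a time with SSE -- the kind of
       kernel people hand-tune when the autovectorizer falls short. */
    int main(void) {
        float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
        __m128 acc = _mm_setzero_ps();
        for (int i = 0; i < 8; i += 4)
            acc = _mm_add_ps(acc, _mm_loadu_ps(a + i));
        float out[4];
        _mm_storeu_ps(out, acc);
        printf("%f\n", out[0] + out[1] + out[2] + out[3]);
        return 0;
    }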


There is another use: Understanding compiler behavior and reverse engineering. Those mostly need reading skills.


I read the 2nd edition of Jeff Duntemann's book (DOS & Linux) in little more than a couple of sittings. Incredibly readable! Best introduction on the background of how CPUs work and what machine language is about.


> Assembly Language Step By Step - Programming with Linux - 3rd ed

+1. I have read the 2nd edition. Great book!


The fact that the document targets multiple architectures is great: most ASM tutorials online target solely x86, and as such give a very partial view of how assembly is written and how CPUs work. And on top of that, x86 ASM is really quite a mess.

I've skimmed the document and it seems rather thorough; it doesn't shy away from all the nasty details (which is a good thing for an ASM tutorial). The only criticism I have so far is that the assembly listings are a bit hard to read: additional whitespace and basic syntax highlighting (at least to isolate the comments) would make them a bit easier on the eyes, for instance: https://svkt.org/~simias/up/20180717-151848_asm-listing.png


Yep, I was very happy to see the ARM and MIPS output. Syntax highlighting is a great idea. It didn't even bother me, but this is coming from a guy who had a hard-copy printout of the 386 programmer's reference.


Would have been over-the-top cool if RISC-V had been included as well.


This seems to be the same text as the author's Reverse Engineering for Beginners (https://beginners.re/)

https://news.ycombinator.com/from?site=beginners.re


Yeah, I'm a little confused about this "build" of the pdf. It clearly states: "The latest version (and Russian edition) of this text is accessible at beginners.re." - which points to: https://github.com/DennisYurichev/RE-for-beginners

But as far as I can tell there's no branch with a different name - maybe this was just a working title for the English version at some point?

Anyway, this new submission with a new title made me take a look, so I'm happy :)

Now, I just hope someone takes a crack at forcing an epub build for better reflow/resize on small screens...

There's a (dormant?) issue: https://github.com/DennisYurichev/RE-for-beginners/issues/37...


The title pages are different. The Reverse Engineering book's title text is hex digits, and is dated July 9, 2018. The Assembly Language book's title page is English text, and is dated today, July 17, 2018. And there is no mention of the Assembly Language book on his home page, yurichev.com.



Fun fact: both books have precisely 1082 pages.


Fun fact: both books have precisely the same content


Fun fact: both sources have precisely the same favicon as well.


At one point when I was considerably younger I started learning 32-bit x86 assembly, as my very naive career goal was to become the next Linus Torvalds. I managed to construct a multiboot-compliant 32-bit kernel that could load tasks from an ext2 ramdisk image passed to the bootloader and multitask up to around 64 processes. I figured out how to use the CR3 register and craft page tables, enter kernel syscalls using software interrupts, handle CPU faults, page faults, double faults, etc. It was quite the learning experience until it eventually crumbled under my lack of skill and foresight. In short, I got more or less about as far as most amateur, self-taught, lone OS developers get before losing interest and giving up.

Fast forward a couple of decades, and I found myself reverse engineering CP/M for the Z80 processor in order to create a virtual Z80-based system that ran inside Unreal Engine. I started with Udo Munk's wonderful Z80pack system, ported a public-domain Z80 CPU emulator from C to C++, and did a minimal reimplementation of the Z80pack terminal and disk I/O devices. Since the systems were implemented as "actors" in UE4, it's possible to spawn and run quite a few concurrently as long as you limit the CPU speed of each instance somewhat.

The resulting virtual system in UE4 is able to run original CP/M ports of Rogue and Zork (https://i.imgur.com/gnOCp3e.png), various Z80 instruction exercisers (https://i.imgur.com/kwNuq5X.png), a Z80 C compiler, and even WordStar 4 (https://i.imgur.com/Q6307w3.jpg) and Microsoft BASIC.

Learning assembly can be a lot of fun - it can really teach you quite a bit about systems architecture that you otherwise might not get if you're always programming in high-level languages only.


Have you published a plugin or the source? Sounds very interesting for using in computers inside the game (as in Fallout and the like).


I'd like to but it would still take quite a bit of work to make it production quality code, and I don't really have the time right now. One day I hope.


My god, 1000+ pages. What a great submission though. I wonder what kind of things keep such people motivated to write one fuckin' thousand and eighty two pages about assembly!? This is nuts.


A couple of points come to mind:

1. Assembly language needs more lines of code to achieve the same task than higher-level languages do, by its very nature.

2. What I call Pascal's Amendment :) - very loosely like claiming the Fifth Amendment (to the US Constitution):

https://en.wikipedia.org/wiki/Fifth_Amendment_to_the_United_...

https://www.brainyquote.com/quotes/blaise_pascal_386732

"I have made this letter longer than usual, only because I have not had the time to make it shorter."

- Blaise Pascal

As a writer, I can corroborate that. In fact, if he "had the time to make it shorter", it implies that he spent even more time writing those 1000+ pages than it would seem at first glance. And even more than for the same 1000+ pages in a higher-level language, since assembly is a lot more error-prone.


I don’t think it’s nuts to share such depth of knowledge. I greatly welcome the knowledge, too. I legitimately am wondering what sort of project I could take up that would be “small enough”, only do-able in asm/c, and interesting.


Think "nuts" was used as an endearing compliment there. Like "insane" in the sense of "insanely good".


Wow - impressive. In the intro of the book (yes, it's a book) there is a call for proofreaders (English and Russian) and translators; he will accept work no matter how small, and credit it. Now that's cool. When I get a few minutes between meetings I'm going to see if I can find anything to contribute and submit. I absolutely love the tone of the book! What a cool guy.


"Chapter 1" is over 400 pages. :)


Old school, but I've always enjoyed this gentle introduction:

https://chortle.ccsu.edu/AssemblyTutorial/index.html


That looks nice. People learning hardware, too, might follow-it up with a study of Plasma MIPS:

https://opencores.org/project/plasma

Then, they'll understand at least one architecture inside and out. Plus be able to customize it to their liking. :)


Amazing. One of my fondest memories is when, at age 10-11, I "played" Core War [0] with an older friend of mine (he was 20-21 back then, and a CS student at one of the best CS universities in Italy, Pisa). I loved to learn how to program in a game-like environment. I still remember several things, ~30 years later.

[0]: https://en.wikipedia.org/wiki/Core_War


DiffPDF 2.1.3 finds only two tiny (if weird) differences in the appearance of the two versions of the books (https://yurichev.com/writings/AL4B-EN.pdf and https://beginners.re/RE4B-EN.pdf) besides the title, author and build date: two page references at pages 1075 and 1078.

Since both sites appear to be owned by the book's author this is most likely just a change that has not yet been pushed to github (or mentioned in the author's sites), but it would be better if the author clarified it (would that be you, dennis714?)



Godbolt is my favorite goto for "What would this look like in assembly" answer websites.
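
For instance, feed it something like this; the comment shows roughly what gcc -O2 emits for x86-64 (exact output varies by compiler and version):

    int square(int x) { return x * x; }
    /* gcc -O2, x86-64, roughly:
         square:
             imul  edi, edi     ; x * x (first int arg arrives in edi)
             mov   eax, edi     ; return value goes in eax
             ret
    */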


That will only tell you what a compiler will generate, though, which sometimes isn't enough.


I got interested in assembly fairly early in my programming career by playing wargames (io.smashthestack.org, anyone?). Writing exploit payloads was very fun. After that, the only assembly I've written in my career was to vectorize some code using NEON.

I think the best reason to learn assembly is not to write it, but rather to be able to read compiler output.
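
If you're curious, the NEON work is mostly intrinsics in C rather than raw assembly. A generic sketch, runnable on an ARM box (not my actual code):

    #include <arm_neon.h>
    #include <stdio.h>

    /* Add two float vectors four lanes at a time with NEON. */
    int main(void) {
        float a[4] = {1, 2, 3, 4}, b[4] = {10, 20, 30, 40}, c[4];
        float32x4_t va = vld1q_f32(a);
        float32x4_t vb = vld1q_f32(b);
        vst1q_f32(c, vaddq_f32(va, vb));
        printf("%f %f %f %f\n", c[0], c[1], c[2], c[3]);
        return 0;
    }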


This is a great resource. I don't think it is for beginners, it is a bit overwhelming. But as a programmer who has to dive into asm every few months (on a variety of architectures!), this will be a very helpful reference to help me reset my frame of mind for the particular architecture.


I feel like 6502 assembly is probably better for beginners, but that might be because that's what I'm trying to teach myself...


This sentiment is one of the reasons why I didn't bother in the past: it's apparently so complex that I have to learn something completely different and useless these days, before I can start learning something useful. I'd much rather just deep dive into x86_64 or ARM or something.

These days I know that older versions are still (partly?) included in x86_64 and that they're often mostly the same, but that was not clear to me when I saw tutorials for ancient architectures of which I didn't see the point.

But then, I've never taken well to the school system where you get taught something stupid first only to see practical applications later. It's why I dropped out of high school and went to do something of which I did see the use (a sysadmin 'study', because I knew I enjoyed working with computers, and that was indeed a good fit for me).


Saying that 6502 is useless is a bit harsh - I wouldn't call it an employable skill, but it's great for hobbyists who are into retrocomputing (like me).

You could deep dive into x86_64 or ARM, but in the general case you would never actually code in those (i.e., most folks trust the compiler) unless you were writing a driver or writing something with crazy performance like MenuetOS.


"Saying that 6502 is useless is a bit harsh. I wouldn't call it an employable skill"

It must be both useful and a job skill for some people:

https://wdc65xx.com/chips/

I wouldn't study it to get a job. There's apparently still utility in it, though, with WDC's versions of it.


What resources are you using to learn?


A little bit of Easy 6502 [0] but mostly Richard Haskell's Apple II-6502 Assembly Language Tutor [1][2] along with Virtual][ [3] . Basically I'm trying to use assorted manuals I could find in my childhood home's basement, most of which seem to be out of print (although there are some books floating around on archive.org).

[0] https://skilldrick.github.io/easy6502/

[1] https://www.amazon.com/Apple-II-6502-assembly-language-tutor...

[2] https://archive.org/details/Apple_II_6502_Assembly_Language_...

[3] http://www.virtualii.com/


Now do that for the GPU.


To be fair to NVIDIA, they have done a pretty good job here.

https://docs.nvidia.com/cuda/parallel-thread-execution/index...


I thought that PTX is an intermediate format, kind of like a Nvidia specific LLVM bitcode for GPUs.


Great share. I have been curious about assembly languages for a while. Hopefully it's not too technical for someone without a CS background.


13-year-old kids figure it out. I was one of them, and I'm an idiot more days than I care to admit. Granted, it was 6502 assembly, but later I'd be able to sift my way through x86, before it went completely nuts.


I think my motivation for learning 65C02 assembly back in the day was necessity: to take advantage of the Apple //e at any reasonable speed, I had to. Besides, 65C02 was simple, as was 16-bit x86. Things got a lot more complicated and the necessity went down.


Great resource! It brings me back to the early days of the web, poring over "Fravia's Pages of Reverse Engineering".


I've never had a desire to learn about Assembly, but I thought the first few pages of this book were kinda interesting, and I might end up learning more than I had ever cared to.


This looks great! Thank you for sharing :)


Nice basic intro to assembler. As a person with some electronics background, I find it fits with my knowledge of simpler architectures.


Thanks for the recommendation.


Thank you. This is a godsend.


Nice!


Nice share! Thanks


This isn't for beginners. What beginner's assembly text covers multiple architectures and assembly flavors?

My recommendation for beginner's assembly on Linux is to write toy code in C and then view the disassembly in gdb or objdump. You have options to switch to Intel syntax from GAS/AT&T if you want.
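
Concretely, the loop looks something like this (file name is hypothetical; the flags are the common ones):

    /* demo.c -- something tiny to disassemble */
    int add(int a, int b) { return a + b; }
    int main(void) { return add(2, 3); }

    /* then, on Linux:
         gcc -O1 -g demo.c -o demo
         objdump -d -M intel demo | less      # whole binary, Intel syntax
         gdb ./demo
           (gdb) set disassembly-flavor intel
           (gdb) disassemble add
    */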

I'm generally against using Windows for anything, but Visual Studio has decent disassembly debug options where you can step through the native assembly code. You could also look at IL code (which is very similar to native assembly) and learn assembly concepts that way. ildasm and ilasm are great tools for that.

Assembly is so low-level and can be intimidating to write from scratch in the beginning. It's better for beginners to write code in a higher-level language like C and then read the compiler-generated assembly. Once you are comfortable with the disassembly of a "hello world" program, write more complicated code and understand its disassembly. Then try to edit and recompile the disassembled code. Once you are comfortable, write your own assembly code from scratch.

Edit: Also, if you have the time and the will, watch the nand2tetris lectures and try their projects. They'll give you a hands-on general overview of hardware to assembly to VM to OO: how native assembly works with hardware, how the VM interacts with native assembly, how you get from OO to the VM. It's a very limited but educational overview of code flow from object-oriented code all the way to circuitry (software circuitry).


The book literally follows this process starting on page 5.


[flagged]


This is spam.



