"Hello, world" in Assembly. (sourceforge.net)
89 points by pavs on Sept 5, 2010 | hide | past | favorite | 33 comments



"System calls in Linux are done through int 0x80. (actually there's a kernel patch allowing system calls to be done via the syscall (sysenter) instruction on newer CPUs, but this thing is still experimental)."

This is no longer the case. While int 0x80 still works, this is not how well behaved binaries should syscall on Linux.

The proper way is to call a magic address on a magic page in the process's own address space, which contains the proper code to do the system call. It will execute int 0x80 on CPUs that don't support sysenter/sysexit and sysenter on those that do.

The page (vdso) is inserted in all processes' address space by the kernel. Although its address is probably fixed, the proper way to find it is to examine the ELF auxiliary vector, which in turn is passed by the kernel via the initial process stack.
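
A minimal sketch of that lookup, in Python rather than assembly so it can be run as-is. It assumes a Linux system where /proc/self/auxv exposes the same auxiliary vector the kernel placed on the initial stack; the helper names parse_auxv and vdso_base are made up for illustration. AT_SYSINFO (32) is the entry point to call on 32-bit x86 and AT_SYSINFO_EHDR (33) is the base of the vDSO ELF image, both as defined in <elf.h>.

```python
import struct

# Auxiliary vector keys, as defined in <elf.h>.
AT_NULL = 0           # terminates the vector
AT_SYSINFO = 32       # 32-bit x86: address of the kernel's syscall entry stub
AT_SYSINFO_EHDR = 33  # base address of the vDSO ELF image


def parse_auxv(raw, word_size=8):
    """Decode raw auxv bytes into a {type: value} dict, stopping at AT_NULL.

    parse_auxv is a hypothetical helper, not part of any library.
    """
    fmt = "=QQ" if word_size == 8 else "=II"
    pair_size = struct.calcsize(fmt)
    usable = len(raw) - len(raw) % pair_size  # ignore a trailing partial pair
    entries = {}
    for a_type, a_val in struct.iter_unpack(fmt, raw[:usable]):
        if a_type == AT_NULL:
            break
        entries[a_type] = a_val
    return entries


def vdso_base(path="/proc/self/auxv"):
    """Return the vDSO base address of this process, or None if unavailable."""
    word_size = struct.calcsize("P")  # pointer size of the running interpreter
    try:
        with open(path, "rb") as f:
            raw = f.read()
    except OSError:
        return None  # not Linux, or /proc is not mounted
    return parse_auxv(raw, word_size).get(AT_SYSINFO_EHDR)


if __name__ == "__main__":
    base = vdso_base()
    print("vDSO base:", hex(base) if base is not None else "not found")
```

A real assembly program would walk the stack past argv and envp to reach the same pairs; reading /proc is just the easiest way to look at them from a high-level language.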


Intel archs have support for call gates that allow jumps to privileged code (ring 0/1/2). It maps nicely to the notion of syscalls, but I wonder why it hasn't been used.


Not sure about the times before the sysenter mechanism (maybe int 0x80 was more portable?), but now that sysenter is in place it is faster than call gates, because most of its behaviour is hardcoded in the CPU itself.


Yesterday I submitted "A Problem Course in Compilation: From Python to x86 Assembly". An 80 page book with worked examples. It got ZERO up votes.

http://news.ycombinator.com/item?id=1662430

Yeah, I am jealous.


HN works best for discovery of interesting fields/material. That's why short pieces like the OP get up-voted: if someone hasn't looked at assembly before, they might learn a thing or two. The problem with what you submitted is that there's no easy way to verify its quality, because 1. The author isn't well-known and 2. It's too long (and there's little incentive) for someone knowledgeable to go through it and post a thumbs up in the comments.

If I seriously want to learn about compilation, I'll just Google site:news.ycombinator.com compiler books, and grab the top recommendations.


With the high frequency of posts on HN, I can imagine a post on HN not making it "big" even if it's good. Digg, Reddit and other democratic content sites suffer from the same problem. They don't guarantee that all good posts will be at the top. They only guarantee that all the posts at the top are good. Most of the time, the good stuff gets enough up-votes to have the same behavior reinforced.

I wonder if spending some of your karma on the post would help. Though, I see no dearth of hackers here with a ton of karma, so it would have to be a substantial "price" to make that worthwhile.


If you want traffic and downloads, work through the book a little at a time as blog posts, with "learn more" type links to your book. That will both satisfy "the masses" as others said, and solve the underlying problem you're trying to solve.

Good luck!


That link currently has more votes than this one, despite the pain to skim and evaluate an 80-page PDF. Thanks for the good find.


Not anywhere in any non-trivial population of people will you ever see anything of serious depth and novelty (aka, not some kind of cargo cult "classic") get nearly as much recognition as LOLcat/Hello World type material.

This is one of my favorite communities, but it's not exceptional in that regard.

That said, I like your link.


I think it may be when a site transforms from basically being a resource to being considered a community. I effectively just said goodbye to Less Wrong, which I have followed longer than HN.


For those interested in how this translates to Win32, there's a site with a wealth of tutorials for Win32 assembly that I had used just about 10 years ago. I just Googled and it still exists.

http://win32assembly.online.fr/tutorials.html


If you test it under OS X, follow the OpenBSD/NetBSD build instructions with the FreeBSD code, but have the assembler produce a 'macho' object file.


Yep, "nasm -f macho hello.s && ld -e _start -o hello hello.o" did the trick for me on Mac OS X 10.6


I take it you mean Mach-O?

Mach-O is the Mach microkernel's object file format.

http://en.wikipedia.org/wiki/Mach-O

In general, BSD assembler tends to be pretty stack heavy for cultural/engineering reasons. I don't think it ultimately matters all that much if you're fiddling around on your personal machine.


"In general, BSD assembler tends to be pretty stack heavy for cultural/engineering reasons. I don't think it ultimately matters all that much if you're fiddling around on your personal machine."

I'm not sure I follow?


Can someone give some insight on the last line?

"That's it. Simple. Now you can launch the hello program by entering ./hello. Look at the binary size -- surprised?"

Without being able to compile the executable myself, is the binary surprisingly large or small? A few months ago, someone posted this article:

http://blog.ksplice.com/2010/03/libc-free-world/

Which leads me to believe that the executable is surprisingly large given the content. Am I understanding that correctly?


I am only guessing but I suspect the binary will be tiny. The earlier article you mentioned described how hello world is surprisingly large with libc overheads. It then sets out to eliminate those overheads and concentrates on the problems involved in doing that. But it doesn't mention how small the binary gets once those overheads are successfully eliminated. My expectation would be just a handful of bytes.
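
As a rough check on that guess, the fixed overhead of a 32-bit ELF executable can be worked out from the header layouts alone. The sketch below (Python, with a hypothetical helper min_elf32_size) assumes one ELF header plus one program header, with no section table, no padding, and no tricks that overlap headers and code. The 34 bytes of code and 14 bytes of message in the example are illustrative guesses, not measurements of the article's binary.

```python
import struct

# Field layouts of Elf32_Ehdr and Elf32_Phdr from the ELF specification.
EHDR_FMT = "<16sHHIIIIIHHHHHH"  # e_ident[16], e_type ... e_shstrndx
PHDR_FMT = "<8I"                # p_type ... p_align
EHDR_SIZE = struct.calcsize(EHDR_FMT)
PHDR_SIZE = struct.calcsize(PHDR_FMT)


def min_elf32_size(code_bytes, data_bytes, segments=1):
    """Lower bound on file size: headers plus raw code and data, nothing else.

    min_elf32_size is a hypothetical helper for this estimate only.
    """
    return EHDR_SIZE + segments * PHDR_SIZE + code_bytes + data_bytes


if __name__ == "__main__":
    print(EHDR_SIZE, PHDR_SIZE)    # 52 32
    print(min_elf32_size(34, 14))  # 132 (assumed code and message sizes)
```

Under those assumptions the floor is a bit over a hundred bytes rather than a handful; section tables, symbol tables, and alignment padding added by a linker come on top of that.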


In my case the binary turned out to be much larger than I would have expected. I was expecting a couple hundred bytes tops, but 8k?

  osx-box:dev esa$ nasm -f macho hello.asm && ld -e _start -o hello hello.o

  osx-box:dev esa$ ls -lh hello*
  -rwxr-xr-x  1 esa  staff   8.1K Sep  5 22:43 hello
  -rw-r--r--  1 esa  staff   721B Sep  5 22:39 hello.asm
  -rw-r--r--  1 esa  staff   389B Sep  5 22:43 hello.o


Also, http://www.int80h.org/smallest/

The rest of int80h is also interesting, but is FreeBSD oriented. The site presents its own Hello World - http://www.int80h.org/bsdasm/#first-program


And here's a Linux version of smallest - http://www.muppetlabs.com/~breadbox/software/tiny/teensy.htm...

Edit: previously submitted - http://news.ycombinator.com/item?id=68056



Here's the "Hello, World" written in VAX Macro32 assembler:

http://labs.hoffmanlabs.com/node/1435

And if you're so inclined, there are (free) OpenVMS licenses for hobbyists, US$30 CD media kits, and (free) VAX hardware emulators for most any OS platform.


Why is this news -- or even interesting? Perhaps assembly feels exotic to a dilettante, but to anyone with a computer science degree this was simply a part (and an early and basic part) of their education...


Computer science curriculums vary wildly. And just because you were taught assembly and how a cpu works, doesn't mean you learned how to do it on unix or even x86. (My college taught with the spim mips emulator. I also taught myself dos assembly when I was younger, but this is the first time I've seen unix assembly.)


Nah, what you're seeing here is relatively poor, old, IA32, AT&T-style (which means GCC targeted), Linux-targeted assembly code.

This is not the article I would use to learn about Linux, operating systems, or assembly (although it's more accurate to say you learn "an ISA" instead of "assembly").


The article uses Nasm not GCC (or rather, not Gas - though Gas supports Intel syntax too now, afaik), and is definitely using Intel syntax. Note for instance the unintuitive order of source and destination in the MOV calls and the lack of delimiters on the operands, amongst other things.

[edit: clarification]


I strongly object to your characterization; it's highly dependent on what you're familiar with. MOV calls map intuitively to assignment, and with assignment, the LHS is the destination and the RHS is the value. And Intel syntax uses type inference to deduce the width of operands, and only requires DWORD PTR / WORD PTR / BYTE PTR etc. to disambiguate when it's necessary.

(I have a very strong personal preference for the Intel syntax.)


I prefer Intel syntax too, in every respect other than the order of operands. That's really just because it differs from the only other assembly language I've ever used (68k). I still think it's unintuitive, but it's no worse than, say, the order of arguments to the Unix 'ln' command. You get used to it.

The comment about the lack of delimiters was not a value judgement; I was just pointing out one of the obvious signs that the article wasn't using AT&T syntax. I think that having to stick $ on immediate operands is particularly annoying, for instance.


And the trick to getting used to things is finding the right mnemonic.

I used to get tripped up by 'ln -s' all the time, until I realized that if I thought about 'cp src dst' but with 'cp' replaced by 'ln -s' it suddenly made sense.

Likewise Intel assembly's mov statement and its similarity with assignment statements that barrkel mentioned.

Whether something seems intuitive or not depends on the analogy you prefer, or even your choice of words. E.g. look at 2D coordinates for a terminal character cell: if you think of them in terms of (x, y), you will find that order intuitive, but if you think (row, col), you'll find the opposite order intuitive.
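
The 'cp' / 'ln -s' parallel mentioned above also shows up in Python's standard library, where shutil.copy and os.symlink both take (src, dst) in that order. A small sketch, assuming a platform that allows unprivileged symlinks; copy_and_link, demo, and the file names are invented for illustration.

```python
import os
import shutil
import tempfile


def copy_and_link(src, copy_dst, link_dst):
    """Copy src and symlink to src, using the same (src, dst) argument order."""
    shutil.copy(src, copy_dst)  # like: cp src copy_dst
    os.symlink(src, link_dst)   # like: ln -s src link_dst


def demo():
    """Run both calls in a scratch directory and check the results."""
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "hello.asm")
        with open(src, "w") as f:
            f.write("; placeholder source file\n")
        copy_dst = os.path.join(d, "copy.asm")
        link_dst = os.path.join(d, "link.asm")
        copy_and_link(src, copy_dst, link_dst)
        # The link points back at src, and the copy is a regular file.
        return os.readlink(link_dst) == src and os.path.isfile(copy_dst)


if __name__ == "__main__":
    print(demo())
```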


Did you find assembly interesting the first time you encountered it? Some people here are still there. A CS degree is not yet a prerequisite for HN user registration!


true, but this type of content does not (at least in my experience) lend itself well to sharing discovery. it comes with the territory of interest.

i guess more remarkably, there is no timeliness to this article.


I would say it's a nightmare programming in assembly without going through your tutorial!


Sweet.

We're covering Hello World in my software dev. class on wednesday.



