The Architecture of Open Source Applications: Nginx (aosabook.org)
292 points by larsvegas_ on Nov 23, 2015 | 126 comments



Lua support in nginx is phenomenal, especially when combined with LuaJIT. It basically allows you to transform nginx into an application server and run arbitrary code that can pretty much do anything.

I would like to mention agentzh and his team, who did an amazing job releasing OpenResty[1], which makes it easy to extend nginx with custom Lua functionality. OpenResty also happens to be the backbone of CloudFlare's architecture, and the core technology used by projects like Kong[2] for microservices management.

[1] http://openresty.org/

[2] https://github.com/Mashape/kong
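For anyone who hasn't seen it, here's a minimal sketch of what this looks like (a hypothetical location and response; content_by_lua_block and the ngx.* API are the documented OpenResty interfaces):

```nginx
# Hypothetical example: a tiny JSON endpoint served straight from nginx,
# with no upstream application server involved.
location /hello {
    default_type application/json;
    content_by_lua_block {
        -- ngx.var.arg_name maps to the "name" query parameter
        local name = ngx.var.arg_name or "world"
        ngx.say('{"greeting": "hello, ', name, '"}')
    }
}
```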


+1

I built an nginx+LuaJIT RTB bidder that did 168k qps on 8 cores. It smashed the C eventlib version, and of course it crushed the Java version. Golang came in a close second, but at the time, with 8 cores, I had to do crazy things to get it to pin to CPUs, since the golang thread scheduling didn't seem very scalable beyond 4 cores.


168K/s? Did you actually decode the bid request (JSON) or just return no bid? That seems too good to be true.


I would also love to know details about this... very impressive stats!


These bidding platforms typically don't provide you with JSON but with some headers or query parameters that are very quick to parse (especially for the native nginx code), so your Lua code can basically boil down to a few table lookups.
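As a sketch of what that can look like (hypothetical parameter names and thresholds; ngx.req.get_uri_args is the documented lua-nginx-module call):

```nginx
location /bid {
    content_by_lua_block {
        -- a single call parses the query string into a Lua table
        local args = ngx.req.get_uri_args()
        -- "decoding" the bid request is then just table lookups
        local floor = tonumber(args.floor) or 0
        if floor > 1.0 then
            return ngx.exit(ngx.HTTP_NO_CONTENT)  -- no bid
        end
        ngx.say("bid")
    }
}
```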


Uh, look up OpenRTB. Except for Doubleclick Adx almost all exchanges encode bid requests in JSON.


Maybe I got it wrong. Is the time spent on the bid processing portion (CPU bound) or the network messaging and connection handling portion (IO bound)? When talking about eventlib and nginx, I assume the network messaging and connection handling are relevant. In that case 168K requests/sec doesn't sound too great.

This benchmark shows nginx+Lua (OpenResty) is more than 5 times slower than the leading ones, which can do more than 6M requests/sec.

https://www.techempower.com/benchmarks/#section=data-r11&hw=...


The problem with nginx's Lua support is that it doesn't go far enough. Apache's mod_lua allows you to create complete modules for Apache, with total access to Apache's API.


I really enjoy nginx, as it's more flexible to configure, but I've never understood why Apache got this slow label...

FTA: In February 2012, the Apache 2.4.x branch was released to the public. Although this latest release of Apache has added new multi-processing core modules and new proxy modules aimed at enhancing scalability and performance, it's too soon to tell if its performance, concurrency and resource utilization are now on par with, or better than, pure event-driven web servers. It would be very nice to see Apache application servers scale better with the new version, though, as it could potentially alleviate bottlenecks on the backend side which still often remain unsolved in typical nginx-plus-Apache web configurations.

I'm using Apache 2.4 with mpm_event + mod_proxy_fcgid and it's doing fine - 99% of the work and time spent is in the FastCGI application anyway, and for static content mpm_event is good enough. I wouldn't run a dedicated static CDN box on Apache, but for everything that can run on a single server Apache can also do the job... even HTTP/2 with mod_h2 works fine as of 2.4.17

A problem with nginx is figuring out what matches in a complex config... it's not straightforward. .htaccess is nice and simple for a shared server with lots of users.

I really like nginx but I guess most people just don't really need it. Migrating to 2.4 and mpm_event should be good enough.


Historical reasons...after nginx started blowing Apache's performance out of the water, Apache addressed some fundamental architectural issues and caught (mostly) up.

As an example, I would refer you to a discussion down this thread about Apache 1.x forking for concurrent connections.


Also, just for context, remember that on Linux forking and threads are basically the same thing, so you can't just switch out one for the other and expect performance to change (it won't).


"forking and threads are basically the same thing"

It's nowhere near that simple. You can fork lightweight like threads or heavy. One of the terrible things about Apache performance was all the heavy forking on every single request for generated pages. That was back when it was really expensive.


I've used Apache forever and have always been able to tune it to get the job done. The thing that made me switch to nginx was the ability to use more than one SSL cert on the same IP. Now that I was "forced" to try it, I'm pretty happy with it.


There was never a time when nginx supported SNI and Apache did not.

Nginx requires an openssl version that supports SNI -- that was added in 0.9.8f in Oct 2007 (if you compiled openssl manually to enable it), and was enabled by default in 0.9.8j in Jan 2009.

Apache supported SNI using mod_gnutls since 2005; and in 2.2.12 (July 2009) using openssl.


Technically correct, however for users that install from OS packages it's possible that they're getting a version of nginx that had SNI enabled and a version of Apache that did not.


Apache supports SNI as well.


.htaccess files are a huge drag on performance. I think it might contribute to the slow label.


Those can be disabled and I always do so. Location specific configuration can be put in the main configuration files.

They should never be used unless you're in a shared hosting environment where you don't have a choice.


> Migrating to 2.4 and mpm_event should be good enough.

Debian, for example, only received 2.4 with Jessie. Meanwhile, a feature-comparable nginx was available at least in Wheezy, maybe earlier.

If I have to choose between "compile my own Apache" and "just install nginx", well… Actually, I had to choose, and I picked nginx.


It got the slow label from all the years BEFORE 2012.....


No matter how many times I see “nginx”, and know that it’s supposed to be “engine X”, I always pronounce it as [ŋɪŋks] in my head!

The way nginx handles requests and responses in an implicit event loop reminds me of a recent talk by Brian Kernighan, in which he mentions the ubiquity of the “pattern–action” model in many domains. I think it’s a very useful architectural pattern to have in mind when you’re designing a configuration system or a DSL.

I also liked this quote:

> …it is worth avoiding the dilution of development efforts on something that is neither the developer’s core competence or the target application.


Until I learnt the correct way of pronouncing it, I always thought it was "en-ginks".

Sometimes there can be a long gap (even many years) between first reading about something and first using its name in conversation. A long time for mispronunciations to stew away in my brain (assuming the person I'm talking to even knows how to pronounce it themselves.)


I spent an embarrassing amount of my life thinking epitome was 'epa tome' instead of 'a pit oh me'.


Here’s a sizable list of words like that: http://english.stackexchange.com/q/1431/1506


Mine was "hitherto." I pronounced it "hit-hurt-oh" once. Once. (It's "hither too" if you didn't know.)


Better than hit-her-too, I guess.


Oh... today I learn.


I still thought that until just now - that there were 2 different words, epitomy and epitome, epitome being the stronger one, suggesting some kind of singular platonic ideal, the former meaning just a good example of this kind of thing.

Just as I used to think misled (miss-led) and misled (my-zled) were 2 different words - the latter implying an element of malice


'cache' being pronounced 'kash', not 'kaysh' is one that's still wrong in my brain.


Oh, don't feel too bad; a few weeks ago, my wife had lunch with the Ivy League-pedigreed nth-generation CEO of a reasonably large midsized company who used the exact same mispronunciation. Paris being worth a mass, she chose not to correct him.


There was a story floating around about a self-taught C# programmer going for a job, and verbally describing the language as "C Pound".


I have been known to call that language "Coctothorpe" when in a biting mood.


Or editorilizing C-hash


I especially recall the old debate in the mid-1990s about whether Linux was pronounced "Lee-nux" or "Lin-ux."



I still always think "an-jye-nex" (like a chemical that would give you angina).


I've taken to calling it "engine eggs" in protest


What have you done! Now nginx will read "engine eggs" to me.


Are you ESL (or bilingual)? I'm curious because word-initial [ŋ] is unpronounceable to most native English speakers.


For others like me who are phonetically-illiterate, [ŋ] is explained here: https://en.wikipedia.org/wiki/Velar_nasal


I’m a native English speaker with a background in phonetics and a silly brain. I speak several languages very poorly. :)

Rather like Larry Wall:

> I started trying to teach myself Japanese about 10 years ago, and I could speak it quite well, because of my phonology and phonetics training–but it’s very hard for me to understand what anybody says. So I can go to Japan and ask for directions, but I can’t really understand the answers!


Everyone I've asked about how they initially pronounced nginx said "enjinx". Maybe 4 people total, all native English speakers.


If it's supposed to be pronounced "engine X" they probably could have named it "engineX". That's what you get for trying to be cool.


I suspect it was more to do with the enginex[.com/.org] domains not being available.


Do you have a link to that talk? Sounds relevant to a side project I'm working on. Thanks!



At the office I always hear it pronounced 'en-Gen-nix'


wait...it's not "in-jin-ix"?


While I like nginx over Apache because I've been burned too many times by strange Apache configs, I have recently found and am growing to love Hiawatha. It's a GPL webserver focused on security, uses PolarSSL (which just got bought out and is now mbed), and it's pretty fast. All the benchmarks I've seen show it comparable to stock nginx and Apache, but once you tack on some of the optimizations, nginx will beat Hiawatha. It also has a very easy config syntax.

Just for anyone interested: https://www.hiawatha-webserver.org/

The dev doesn't do much advertising, so word of mouth on a place like HN really helps.


What optimizations are you talking about? And how do those beat Hiawatha?


Well, I could have sworn I remember seeing a benchmark comparing stock vs basically tuned (config files) nginx, hiawatha, and a few others, where stock hiawatha won but tuned nginx beat it out, but I just spent ten minutes or so looking for it and couldn't find it, so I'm going to have to retract my earlier statement I suppose.

Either way, I think Hiawatha is a great webserver that should get more attention. Especially since I am a GPL proponent and Hiawatha is one of the only currently maintained GPL webservers.

As a bonus here is a benchmark of nginx and hiawatha under attack. (notice the service drop on nginx)

https://www.hiawatha-webserver.org/weblog/64


Every "new" webserver tries to fix Apache's configuration syntax mess, and imho they all fail. Yes, setting up a reverse proxy looks simpler with nginx, or haproxy for that matter. But when it comes to complex configurations they all suck, and I'm not sure a json/yaml config format is going to fix that as long as webservers have such wide scopes (from serving static pages, to proxying traffic, authorizing, authenticating, encrypting...). At least Apache is very modular in this regard, and some credit should be given to it for having survived and evolved along with all the newer options.


The better choice is not a new webserver program with a different, possibly better, configuration language, it's HTTP server libraries that can be used from actual programming languages to do whatever you want. Many languages ship with really good libraries these days.


They indeed are shipping with really good libraries these days but no matter how good they are, there is some level of doubt that they fail to alleviate and they get deployed behind nginx or apache httpd anyway.


It seems more like they are deployed behind load balancers, because load balancing is a specific job that it makes sense to do with a separate piece of software. Admittedly, that separate piece of software is often nginx, but there are other popular options, and I don't think very many people use Apache that way (though I could be wrong).


I guess I should have said Apache Traffic Server rather than httpd. Their plugins and routing configuration are getting pretty enjoyable to work with.


Every third telecom around the world uses Apache to load balance requests for specific applications.

Source: I have to support that SW stack.


Interesting, thanks for the info!


> Every "new" webserver tries to fix apache configuration syntax mess, and imho they all fail.

Fail at what? What were they trying to "fix"?

> Yes, setting up a reverse proxy looks simpler with nginx or haproxy for that matter.

People don't choose nginx and haproxy because of the configuration syntax, they choose it for performance and features.


I wish there was more input as to why Apache 2.4 isn't suitable. It's been 3 years since its event driven model was released and it is a perfectly acceptable web server even for static content.


Apache 2.4 isn't event driven. Apache's Event MPM only handles keep-alive connections asynchronously, while the request processing itself is still synchronous. It solves only one of Apache's problems, but it still doesn't make it as scalable as nginx.


Hi, I'm one of the main authors (along with others, it is a community driven project) of the Event MPM for Apache.

On benchmarks and timelines: I agree that 2.2+ was not widely available for many years due to the update cycle of Linux distributions, but at the time nginx wasn't in the distros either... so it became a thing where people would yum install apache2, get a 2-5 year old version, and benchmark that against an nginx from their latest dev download.

The original work for the Event MPM started around 2004:

http://mail-archives.apache.org/mod_mbox/httpd-dev/200411.mb...

The version in 2.2 was mostly focused on Keep-Alive requests. Apache 2.2.0 was first kicked out on December 1, 2005.

To go beyond Keep-Alive requests, there is a set of features/patches called "Async Write Completion". Much of this work was done in 2006-2007 by Graham Leggett:

https://mail-archives.apache.org/mod_mbox/httpd-dev/201510.m...

Timing wise, most of that work did not find its way into a stable release until 2.4, which came out February 17, 2012. This is the date the article references.


Thanks for your work on Apache. Do you recommend any published benchmarks of modern Apache vs. modern Nginx?



Those look to be a few years old. :-)


> Although Apache provided a solid foundation for future development, it was architected to spawn a copy of itself for each new connection

That's not really correct:

https://httpd.apache.org/docs/2.2/mod/prefork.html

Nginx makes it easier to handle a bunch of concurrent connections, but it's not as if Apache simply forks for each new connection.


You're quoting from a paragraph that's summarizing the history of Apache, and the statement was true about Apache 1.x.

Back in the day, Apache 2.0 took a long time to gain substantial market share, for various reasons. The situation was not too dissimilar from the current Python 2/3 split.

Edit: But I guess Apache never forked for every connection, even in 1.3. It only forked if it needed a new child and the existing ones were busy.


> the statement was true about Apache 1.x.

Apache 1.3 did not fork a process per connection. It forked to handle additional concurrent connections, but that's very different than forking for each and every connection.


> In February 2012, the Apache 2.4.x branch was released to the public. Although this latest release of Apache has added new multi-processing core modules and new proxy modules aimed at enhancing scalability and performance, it's too soon to tell if its performance, concurrency and resource utilization are now on par with, or better than, pure event-driven web servers.

When was this written? Is it still too soon to tell? 2.5 years seems like enough time to tell?


Math? 3.75 years?


More impressive in the same book, how another server (Warp) written in Haskell (GCed lazy functional language, supposedly way slower than C over epoll) achieves the same performance as nginx!

http://www.aosabook.org/en/posa/warp.html


Back when I was tinkering with mod_python performance[0] there was this web server called nxweb[1], which out-performed nginx consistently by quite a bit.

[0] http://grisha.org/blog/2013/11/07/mod-python-performance-rev... [1] https://bitbucket.org/yarosla/nxweb/overview


Why do web servers always seem to invent their own config file format? Whilst nginx seems to do it slightly more sanely than Apache, it still doesn't use something like YAML or JSON; is there a good/obvious reason for this I'm missing?


nginx configs can be difficult enough on their own without constraining them to formats that wouldn't allow directives that take arguments preceding a block, not to mention the extra escaping that would come with quoting already-quoted strings. This is a common formulation:

  location ~* \.(jpe?g|png|gif|ico)$ {
    ...
  }
That would be pretty messy inside of JSON or YAML. You couldn't have the location line be a key for a map/hash, because you can have multiple blocks with the same "key".

Besides, as bad as nginx configs are, just trying to understand which block "traps" a request is where you can spend most of your time; see ifIsEvil [1].

[1] https://www.nginx.com/resources/wiki/start/topics/depth/ifis...
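For example, since JSON keys must be unique and JSON has no bare tokens, that block would have to be pulled apart under some made-up scheme along these lines (note the doubled backslash the regex now needs):

```json
{
  "locations": [
    {
      "modifier": "~*",
      "pattern": "\\.(jpe?g|png|gif|ico)$",
      "directives": { }
    }
  ]
}
```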


Thanks for that comment. Running a complex nginx config with lots of special treatment for subfolders - why and where a request ends up in such a configuration is really difficult to tell.

However, Apache just can't do some of the things that help you shoot yourself in the foot. Maybe the C++ vs. Java comparison is not too far fetched.


nginx is one of those things that requires its own mentality to really get. Like SQL's set logic or CSS's forward-looking selectors, it's not quite like traditional imperative, control flow-based programming. But once you start thinking the way it wants you to think, there are benefits.

I'm in the middle of a project built entirely in nginx and it's astoundingly performant. The restrictions on what I can do in (mainline) nginx force me to think through how I structure the blocks and directives with better logic representative of a web server and not an application server, which is what I'm used to.

DigitalOcean has a decent tutorial on how nginx decides on server and location blocks [1]. And, related to ifIsEvil, this blog post [2] goes a little into explaining how nginx "traps" a request. If someone has better resources, I would appreciate them.

[1] https://www.digitalocean.com/community/tutorials/understandi...

[2] http://agentzh.blogspot.com/2011/03/how-nginx-location-if-wo...


I've written a tool to tell us exactly that, called nginspex... I'm hoping to open source it shortly. We use generated configs with tens of thousands of lines; for me this was an exercise in learning what nginx is up to, as well as being able to say how a request will be handled.

The tool interprets the nginx conf, rather than compiling any of it (as nginx does with the rewrite rules), which makes it easy to log which lines are involved in the processing as it hits them.


> Since its public launch in 2004

YAML's initial release was 2001[0], likely not yet popular or stable enough for consideration.

JSON was likewise "released" in 2002[1] with a similar story.

In all likelihood, we're probably just lucky Igor didn't go with XML.

[0] https://en.wikipedia.org/wiki/YAML

[1] https://en.wikipedia.org/wiki/JSON#History


(s-expression (cough (cough)))


> In all likelihood, we're probably just lucky Igor didn't go with XML.

Okay I'll bite, what's wrong with writing nginx to go with XML?

XML is easily translatable to JSON, so... there's that.


> XML is easily translatable to JSON

Not really. XML is a document markup language, where order is important. JSON is a serialization format, and is a bit more bare-boned. For example, how would you transfer this XML to json, and back to XML?

    <foo bar="baz"><bork>foo1</bork><bork>foo2</bork></foo>


There are many ways to do it, you just need to pick a convention. For example, one convention is that you use "#" as key for element name, "." as key for element children, and attribute name as keys for the attributes. Following those conventions, you can produce:

    {
        "#": "foo",
        "bar": "baz",
        ".": [
            { "#": "bork", ".": "foo1" },
            { "#": "bork", ".": "foo2" }
        ]
    }
Or, for something more verbose (but maybe more intelligible), you could use "$element" as key for element names, and "$children" as key for child elements / text. (The point of choosing $ as a prefix, is it is not a valid character in attribute names, so cannot conflict with them.)


That is pretty hard for a human to parse and understand, though.

I think it could be reasonably easy to come up with a config file standard in XML or JSON, but that the format will have to rely on the strengths of each. Translating between the two just becomes an unreadable mess. If anything, if I were to write an application that allowed for either format, I would come up with a separate standard for each. More code/upkeep, but when the config files are intended for humans and to be hand-written, the focus should be on the user.


JSON is not all that readable for a human either, if that was the main complaint. I guess let me just ask this: Why does XML suck?


It's not just readability. XML has some serious issues when used as a configuration file format rather than as document markup. It has features oriented towards document markup that get in the way of writing data structures, and it lacks convenient features for writing data structures that yaml-like languages have.

First, a quick disclaimer: Apache conf format is not really XML. It leverages XML-like syntax but it's mostly not XML and avoids most of the serious problems that XML tends to bring, which I'll explain in more detail below.

XML was designed to provide structure to documents, it was not designed as a configuration syntax or a data serialization format. XML is meant for a document that already exists in its own right as a document, where the XML is added on as a layer to aid automated semantic understanding of that document's structure. It is not meant to directly represent programming data structures. As such, XML tags are designed to pop out and be visible from significant amounts of text that is not metadata. When there's more tags than text, as is usually the case when you try to use XML to write programming data structures, XML winds up being hopelessly verbose, and it's hard to avoid errors writing it (like misspelling end tags, forgetting a slash, putting end tags in the wrong order, etc.)

When not used for its intended purpose, XML winds up being hard for humans to write directly and hard(er than yaml and json) to write programs to parse it. In yaml and json, there's a standard, mostly direct mapping to common data structures in most high-level languages. With XML you have to make a lot of trivial decisions to make use of features that weren't designed for what you're trying to do. The most obvious examples are the distinction between attributes and tags: what does each one mean? What do tagnames represent? What do attribute values represent? What do attribute names represent? How do you handle CDATA that has more XML in it? XML is designed to elegantly handle something like this:

    <A>first section <B>marked up section</B> second section</A>
But this kind of structure is horrible for a configuration file, unless the CDATA sections are a parsed language of their own and parsed externally, which is essentially what Apache does. If you're trying to use XML to specify data structures like lists and trees, it's messy. Consider this example:

    <VirtualHost>
       <ServerName>my.server.domain.com</ServerName>
       <DocumentRoot>/var/lib/www/my.server.domain.com</DocumentRoot>
    </VirtualHost>
You might envision "VirtualHost" to be an item in a list, where the value of that item is a dictionary with subkeys specifying "ServerName" and "DocumentRoot." But in fact, there's more to it than that. An XML parser also gives you all the whitespace in between those two tags. You can discard it, you can write checks to ensure that nothing ever ends up in that unused CDATA area by mistake, you can write tools to generate the XML-- but no matter what method you choose it's something you have to think about that just doesn't come up if you are using a language designed for writing programming data structures instead of abusing one designed for marking up text documents.

And that example highlights another problem with XML which is a flat out lack of support for common programming data types such as lists and integers. In the example above, how would you know that "<VirtualHost>" represents an item in a list, but "<ServerName>" should be a key in a dictionary? XML doesn't help you there, every parser decides for itself.


Thank you for the extensive answer!


> There are many ways to do it, you just need to pick a convention.

The fact that you have to pick a convention is the crux of the issue. "Easy" is a subjective term, but the fact is there isn't a direct mapping between XML and JSON.

It means that if you're using XML and converting it into a programming data structure, you have to make a bunch of decisions about how to handle the XML. It means that if you're converting a programming data structure to XML, you have to have a bunch of specific rules for how to generate that XML.

With JSON, you only have to make those decisions if you need to use data structures that aren't supported by JSON.


XML is largely human-unreadable, ludicrously verbose, and has weird character escape requirements. Not a trifecta I want from my config files. I don't think anyone who's ever found themselves tampering with a complex XML file in vim ever wanted to repeat the experience.


I am highly disappointed by Nginx in three major areas:

1. They created a paid version that has some additional basic features such as cache purging, dynamic upstream name resolution, and a few others. Charge for support, charge for some fancy management interface or monitoring, but for basic features (most of them available in Tengine [0]) - thanks, but no thanks. You lost me as an evangelist! In fact, in many aspects they are now catching up with Tengine!

2. Instead of making LuaJIT integration standard and avoiding the need to escape Lua in the configuration files, they invented some subpar JavaScript. People already use Lua widely, it's fast, it's great - don't you have anything better to do than invent yet another language!? I really can't believe pragmatic people would have done this, honestly! It speaks so badly of their thought process! Now I know I can expect anything stupid from them!

3. The configuration language is not very intuitive. If they embedded Lua, the whole configuration could be a Lua script that initializes some internal state. This would have been a dream come true!

[0] http://tengine.taobao.org


The real question is: why didn't they use something like Lua for the config file format?

Web servers can have notoriously complex configs, up to the point where designing a mini-language might be a worse idea than stripping down an embeddable language, such as Lua.

Lua, especially when sandboxed, seems like a fitting configuration language: http://stackoverflow.com/questions/1224708/how-can-i-create-...
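To make the argument concrete, a config-as-Lua-script might look something like the sketch below (purely hypothetical; no existing server reads this, and all the names are made up). The point is that static settings are plain tables, and anything dynamic is just ordinary Lua:

```lua
-- Hypothetical Lua configuration file.
-- Static settings are plain table entries...
local conf = {
  worker_processes = 4,
  servers = {},
}

-- ...and repetition or logic is just ordinary Lua code.
for _, name in ipairs({ "example.com", "www.example.com" }) do
  conf.servers[#conf.servers + 1] = {
    server_name = name,
    root        = "/var/www/" .. name,
    listen      = 80,
  }
end

return conf
```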


It's not just that Lua makes a good config file format; it's that serving HTTP requests is a complicated enough business, with enough edge cases and corner cases, that it merits a Turing-complete language.

I actually have some special corner case API endpoints that nginx just simply can not handle in the manner I would prefer. Further, the regex based location syntax, and even the prefix based ones, are not really what you want; you want a "Path" object that's aware of what / in a URL means, and does the right thing if you do/don't add it to the URL. (And it's not as easy as "/foo/bar/baz/?")


In Nginx, you can use Lua for complex routing and responses, with the right extensions. [1]

[1] http://openresty.org/


And we do. We don't use openresty (at least at the moment); we use a custom-built nginx that includes — among other things — the lua module. It's a great boon, but some things are still hard or impossible[1], and we don't use it for every request, opting for the location syntax, and all its flaws, for most.

[1]. Conditionally streaming an upload to a backend (i.e., if auth fails, don't stream) is impossible; nginx will buffer the entire request body, either in memory or on disk, and there is no way to change this behavior.


I invented one and wrote an article about why: https://caddyserver.com/blog/deliberate-caddyfile-syntax

(You're definitely not the only one to ask that question.)


This looks really interesting. Can't believe I haven't run across it before.

I've been toying with getting a couple home servers going (replaced home service w/ business service, just installed two router based DMZ, looking at lightweight hardware -- probably will be Fit-PC products). I was going to run a separate reverse proxy and Lighttpd or similar but a quick glance makes it seem like Caddy could be used for both and more easily. Thanks for the link.

(Edit. BTW, you're not here: https://en.wikipedia.org/wiki/Comparison_of_web_server_softw...)


JSON doesn't allow comments, so it is an ok-ish config format. Yaml would have been better. But they probably wrote everything from scratch to be optimized, and decided that if they were writing a parser for a config file, they might as well invent a config file format too :-)


Not allowing comments doesn't make JSON an "ok-ish" config format, it makes it completely _unsuitable_ for it.


But it is used pretty often, though. That's why it is an -ish format ;-)

You can do things like

   {
     "comment": ["blah blah comment message"],
     "k1" : "v1", "k2" : "v2", ... 
   }
Or some silliness like that.


Nginx was invented before the days of JSON and YAML...


JSON, sure, but I think YAML pre-dated the launch of nginx (if not the initial development).

https://groups.yahoo.com/neo/groups/sml-dev/conversations/to...


YAML initial release: 2001
Nginx initial release: 2004

Well ok. But I'd still argue that Nginx was invented before the popularization of YAML. I think I didn't see YAML until I got in touch with Rails in 2007, which used the format extensively. And even then, almost everyone I met didn't take it seriously, saying that it is a joke until Microsoft, IBM, etc support it.


{ "//This": "works though!" }


JSON is horrible for a configuration language. Needing to put quotation marks around every single literal is insanely irritating if you ever have to write a lot of configuration. There's also a lot of application-specific syntax sugar that you cannot do if you stick with strict JSON.

Note that it's not really fair to call Apache's configuration language "XML". Apache relies on XML for some structured data in its configuration file, but all the individual directives are parsed separately from the XML.


I agree. When people started to move towards the "better" JSON standard from XML, I was puzzled. The inability to have comments, the need to "escape" every key name, the lack (in the past) of a schema and query language standard, etc.


To be fair, while I agree that JSON is bad for config, I would argue that JSON's objects and arrays eliminate the need for DTDs and standard query languages altogether in many cases. Of course, with sufficiently large and complex data stores, you'll want a documented structure and method for accessing that data no matter what format you use, but unlike XML, JSON provides structure to get started on a small scale very easily. Consider this python:

    import json
    import sys
    sys.stdout.write("%s\n" % json.dumps(json.load(sys.stdin).get(sys.argv[1], None)))
This will accept JSON on standard input and will return the value of an object with the key name specified by the first command-line argument.

    $ echo '{ "value1" : { "sub-value": 5 }, "value2": 99 }' | python json-test.py value1
    {"sub-value": 5}
    $ echo '{ "value1" : { "sub-value": 5 }, "value2": 99 }' | python json-test.py value2
    99
Such a program will look similar in any language with a json library that maps objects and arrays to native data structures. Granted, my simple tool will fail if the JSON isn't an object, but it's a very simple matter to extend it to handle lists and literals. For many applications this is a huge advantage over XML, especially if the point of the JSON isn't configuration but rather inter-application communication (aka data serialization).
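One way the tool above might be extended to handle nested objects, arrays, and literals is a dotted-path lookup (the path syntax here is a made-up convention, not part of any standard):

```python
import json

def lookup(data, path):
    """Walk a dotted path through nested JSON objects and arrays."""
    for part in path.split("."):
        if isinstance(data, list):
            data = data[int(part)]   # numeric index into an array
        elif isinstance(data, dict):
            data = data.get(part)    # key into an object
        else:
            return None              # hit a literal before the path ended
    return data

# Reusing the sample document from the shell examples above:
doc = json.loads('{ "value1" : { "sub-value": 5 }, "value2": 99 }')
print(json.dumps(lookup(doc, "value1.sub-value")))  # → 5
```

Wiring this into the stdin/argv driver from the original snippet is a one-line change.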


Have you seen nginScript (1)? It's a syntax for embedding snippets of JavaScript in your NGINX configuration. This allows for scripting application logic into standard NGINX config files via a very common and simple language. There are more details and simple examples in the blog post below.

(1) https://www.nginx.com/blog/launching-nginscript-and-looking-...

(disclaimer: I work at NGINX)


Okay, but you have to escape this in the Nginx config! Same issue with Lua! And this is terrible!


These days there is _by_lua_block.


Thanks! This is really great:

    content_by_lua_block {
        ngx.say("hello, world")
    }
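For context, a minimal sketch of how such a block might sit inside a location in an OpenResty config (the `/hello` path and the `name` query parameter are made up for illustration):

```nginx
location /hello {
    default_type text/plain;
    content_by_lua_block {
        -- ngx.var.arg_name reads the "name" query parameter, if present
        ngx.say("hello, ", ngx.var.arg_name or "world")
    }
}
```

The point of the *_by_lua_block directives is exactly the one raised above: the Lua code is inlined without any string escaping.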


I know it's probably not "open source-y" enough for you, but IIS just uses plain ol' XML files for configuration.


So I'd say Nginx is actually the least annoying web-server to config of those I've used:

1. Nginx

2. IIS 6 (strange metabase thing)

3. Apache (XML, mostly)

4. IIS 7, 7.5 (XML, but with some of the files scattered through your Windows directory, and also some values aren't valid in some of the files).

5. Tomcat (XML plus madness).

I'd say the thing that makes all web-servers a pain is debugging which rules are passing/failing, and where are they sending their results to.

I think the thing which makes Nginx easier than the others is probably that it doesn't try to support the 'shared hosting' scenario, which adds a lot of mess.


Is that a positive thing? Of course, like in many other cases, IIS configuration is supposed to be generated from tools, but it always needs some tweaking and XML makes it a total pain to edit.


It's not just web servers. I would say that it is because they want it to be human readable/editable, but JSON would probably be just as good in that regard.

I would prefer it if it were more like openssh or supervisor. Though I suspect those styles of config would make some of the more advanced configurations a pain.


Does anyone know how this architecture compares to cowboy? I know erlang is known for concurrency, but I'm assuming erlang isn't as fast as the custom-tailored C here. OTOH, I feel slightly less concerned about security issues with erlang.


Well, cowboy is an application library, so it is in a bit of a different league. With cowboy you can start writing your business logic or application code directly and go to work. Nginx can do some of that with its Lua integration, but it is best for serving static files and proxying to back-ends.

I could see for example nginx in front of cowboy. It would strip away ssl, serve static pages, maybe authorization/authentication and then proxy connections to cowboy servers in the backend for application logic.


What's common is both are based on kqueue/epoll. Erlang then provides a very elegant interface on top of it.


I love using nginx but it seems that all the new features lately are only accessible through their enterprise service.


Nginx is a solid piece of software with a lot of people working on it.

I find it sad that people expect good services built on top of it to be free as well. Without an enterprise/paid offering how else do you suppose people fund nginx? Right now the state of open source funding is abysmal.


His point is that all this funding is not leading to any new features in the open-source version.


Are you joking? Check the changelog of the open-source version. http://nginx.org/en/CHANGES


It's a common question. As you correctly noted in the changelog, the vast majority of features have been placed in the open source version. The commercial product has gained great visibility and adoption but open source NGINX is very core to our company. In fact, development and feature releases to open source NGINX have rapidly increased since the inception of NGINX, Inc (the company) because it can support the development efforts :)

(disclaimer: I work at NGINX.)


Just maintaining a project as large as nginx is hard enough, and takes a lot of people to do it.


Sort of unrelated, but I would love to see one of these for Unreal Engine. If there's an internal architecture overview somewhere, I haven't found it.


>These days the Internet is so widespread and ubiquitous it's hard to imagine it wasn't exactly there, as we know it, a decade ago. It has greatly evolved, from simple HTML producing clickable text, based on NCSA and then on Apache web servers, to an always-on communication medium used by more than 2 billion users worldwide.

The Internet[0] has a much richer history and larger ecosystem than just the World Wide Web. The Internet started nearly five decades ago; the web has only been around for a bit more than two.

[0] https://en.wikipedia.org/wiki/Internet


Yeah, I thought that was pretty damn breathless of the author too.

Architecturally and usability-wise, we're more or less at the same place as a decade ago:

In 2004, both nginx and Gmail were released, and people were going ape about exciting "Web 2.0" technologies like DHTML and AJAX, whose paradigms more or less still underpin all modern development. There have been a lot of additions and streamlinings, but "dynamic pages/apps in the browser without a pageload" were the modus operandi then, and are the MO now.


It's not readily apparent, but this item was written in 2012.



