Posted on 05/16/2013 6:39:16 AM PDT by ShadowAce
High Scalability has a fascinating article up that summarizes a talk by Robert Graham of Errata Security on the development choices needed to support 10 million concurrent connections on a single server. From a small data center perspective, the numbers he is talking about seem astronomical, but not unbelievable. With a new era of Internet-connected devices dawning, the time may have come to question the core architecture of Unix, and therefore Linux and BSD as well.
The core of the talk seems to be that the kernel is too inefficient in how it handles threads and packets to maintain the speed and scalability requirements for web scale computing. Graham recommends moving as much of the data processing as possible away from the kernel and into the application. This means writing device drivers, handling threading and multiple cores, and allocating memory yourself. Graham uses the example of scaling Apache to illustrate how depending on the operating system can actually slow the application when handling several thousand connections per second.
Why? Servers could not handle 10K concurrent connections because of O(n^2) algorithms used in the kernel.
Two basic problems in the kernel:
Connection = thread/process. As a packet came in, the kernel would walk all 10K processes to figure out which thread should handle the packet.
Connections = select/poll (single thread). Same scalability problem. Each packet had to walk a list of sockets.
Solution: fix the kernel to make lookups in constant time
Thread context switches are now constant time regardless of the number of threads.
This came with new, scalable epoll()/IOCompletionPort mechanisms for constant-time socket lookup (sketched below).
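To make the difference concrete, here is a minimal sketch of the epoll() model those points describe: the kernel reports only the descriptors that are ready, so the per-event cost does not grow with the total number of registered connections the way a select()/poll() walk does. The port number and buffer size are arbitrary, and error handling is trimmed for brevity.

```c
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/epoll.h>
#include <sys/socket.h>

#define MAX_EVENTS 64

int main(void)
{
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);            /* example port */
    bind(listener, (struct sockaddr *)&addr, sizeof addr);
    listen(listener, SOMAXCONN);

    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listener };
    epoll_ctl(epfd, EPOLL_CTL_ADD, listener, &ev);

    struct epoll_event events[MAX_EVENTS];
    for (;;) {
        /* Returns only the descriptors with pending events -- no walk
         * over every registered socket, unlike select()/poll(). */
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == listener) {
                int client = accept(listener, NULL, NULL);
                struct epoll_event cev = { .events = EPOLLIN, .data.fd = client };
                epoll_ctl(epfd, EPOLL_CTL_ADD, client, &cev);
            } else {
                char buf[4096];
                ssize_t r = read(events[i].data.fd, buf, sizeof buf);
                if (r <= 0)
                    close(events[i].data.fd); /* closing removes the fd from the epoll set */
            }
        }
    }
}
```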
The talk touches on a concept I've been mulling over for months, the inherent complexity of modern data centers. If you are virtualizing, and you probably are, for your application to get to the hardware there are most likely several layers of abstraction that need to be unpacked before the code it is trying to execute actually gets to the CPU, or the data is written to disk. Does virtualization actually solve the problem we have, or is it an approach built from spending far too long in the box? That Graham's solution for building systems that scale for the next decade is to bypass the OS entirely and talk directly to the network and hardware tells me that we might be seeing the first slivers of dusk for the kernel's useful life serving up web applications.
So what would come after Linux? It is possible that researchers in the UK have come up with a solution in Mirage. In a paper quoted on the High Scalability site, the researchers describe Mirage:
Our prototype (dubbed Mirage) is unashamedly academic; it extends the Objective Caml language with storage extensions and a custom run-time to emit binaries that execute as a guest operating system under Xen.
Mirage is, as stated, very academic, and currently very alpha quality, but the idea is compelling: writing applications that compile directly to a complete machine image, something that runs independently without an operating system. Of course, the first objection that comes to mind is that this would lead to writing for specialized hardware, and would mean going back in time thirty years. However, combining a next-generation language with a project like Open Compute would provide open specifications and community-driven development at a low level, ideal for eking out as much performance as possible from the hardware.
No matter which way the industry turns to solve the upcoming challenges of an exploding Internet, the next ten years are sure to be a wild ride.
Today’s application programmers are not capable of writing their own device drivers or handling the stack.
They wouldn’t know malloc() if it bit them in the @ss and they couldn’t free() themselves from a paper bag.
I wrote a proprietary 4GL based on C for multiple platforms for 20 years when there were few standards. It’s doable and maintainable, especially with the standards that exist in the industry today.
The problem, IMHO, is that today’s application programmers - as opposed to systems programmers who write the OS, kernels, device drivers, etc. - are incapable of handling the stack. They code in languages that handle everything for them. It makes for great, object-oriented, reusable code, but it means they’re totally unaware of what’s under the hood. They also have no idea how to open a socket or listen to a port.
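For the sake of illustration, here is a bare-bones sketch of what "opening a socket and listening to a port" actually involves with the plain BSD socket calls the commenter has in mind, roughly what the higher-level frameworks are hiding. The port number and greeting are arbitrary.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(9000);                /* example port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) { perror("bind"); return 1; }
    if (listen(fd, 16) < 0) { perror("listen"); return 1; }

    int client = accept(fd, NULL, NULL);        /* blocks until a connection arrives */
    if (client >= 0) {
        const char *msg = "hello\n";
        write(client, msg, strlen(msg));        /* the program owns this connection... */
        close(client);                          /* ...and must close() what it opens */
    }
    close(fd);
    return 0;
}
```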
The author wrote the article as if the Unix architecture is limited to one kernel/OS per server. It’s not.
Damn straight! Pounding my fist on the desk in agreement.
You are correct; in fact you could write an OS that was a valid DOS application.
I was using TP7 to do this for my OS, which I got to the point of being able to recognize commands and change the screen resolution before I shelved the project [due to school and getting stumped on memory management*].
* I was looking for a way to make the memory manager 'tyrannical' and generic [able to handle the small stuff (like variables for the compiler)], in order to cut down on memory leaks.
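One way to read that "tyrannical, generic" memory manager is a simple arena allocator that owns every small allocation and releases them all in one call, which keeps individual leaks from accumulating. That reading is only a guess at what the poster had in mind, and the sketch below is in C rather than Turbo Pascal.

```c
#include <stdlib.h>
#include <string.h>

struct arena {
    char  *base;
    size_t used;
    size_t cap;
};

static int arena_init(struct arena *a, size_t cap)
{
    a->base = malloc(cap);
    a->used = 0;
    a->cap  = cap;
    return a->base != NULL;
}

static void *arena_alloc(struct arena *a, size_t n)
{
    n = (n + 15) & ~(size_t)15;          /* keep allocations 16-byte aligned */
    if (a->used + n > a->cap) return NULL;
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

static void arena_release(struct arena *a)
{
    free(a->base);                        /* everything goes away at once */
    a->base = NULL;
    a->used = a->cap = 0;
}

int main(void)
{
    struct arena a;
    if (!arena_init(&a, 1 << 20)) return 1;
    char *name = arena_alloc(&a, 32);     /* e.g. a compiler symbol name */
    if (name) strcpy(name, "counter");
    arena_release(&a);                    /* no per-object free() calls to forget */
    return 0;
}
```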
Here's an interesting article on Ada outperforming an experienced assembly programmer.
The thing about optimizing compilers is that they can be 'taught' all sorts of "tricks" that the experienced assembly guy might not know.
Sounds like someone is just fishing for funding. Why would you not scale up to support that many connections, with the added benefits of balancing and redundancy?
Balancing and redundancy are good, but the Unix philosophy is rather hostile to the thing that would unlock the full potential: distributed computing. Remember that that OS is heavily reliant on, and intertwined with, C, and C's take on even threads is more of a "let the user [programmer] handle them" affair (i.e. "fork")... with distributed computing one could have the tasking system assign the task to the system with the lowest load [i.e. maintain a priority queue]. ~~ I'm not sure, but I seem to recall IBM's OS/360 had the ability to keep services going while a particular machine (node) was under repair/replacement; VMS probably has that ability too.
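Here is a toy sketch of the "assign work to the least-loaded node" idea the poster mentions. The node names and load numbers are made up, and a real scheduler would keep a priority queue keyed on reported load rather than the linear scan used here for brevity.

```c
#include <stdio.h>

struct node {
    const char *name;
    double load;            /* e.g. most recently reported load average */
};

/* Pick the node with the smallest current load. */
static struct node *least_loaded(struct node *nodes, int n)
{
    struct node *best = &nodes[0];
    for (int i = 1; i < n; i++)
        if (nodes[i].load < best->load)
            best = &nodes[i];
    return best;
}

int main(void)
{
    struct node cluster[] = {
        { "node-a", 0.72 },
        { "node-b", 0.15 },
        { "node-c", 0.40 },
    };
    struct node *target = least_loaded(cluster, 3);
    printf("dispatching task to %s (load %.2f)\n", target->name, target->load);
    return 0;
}
```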
How smart is an industry which uses C/C++ for systems programming instead of Ada... or even LISP... or FORTH...?*
There is honestly no good reason that systems should be written in C/C++, especially given the number of items that are implementation-dependent.
* -- Being commissioned by the DOD, Ada was designed to allow exact representations so it could interface with hardware that had no standard. LISP was the system-language of the LISP-Machine, which had the ability to debug while running, even the system routines. FORTH is actually pretty amazing, allowing for entire systems to be built "in a matchbox".
Great comments. People don’t realize what they have as far as computing power either on their desk or under it. Instead we create problems to try and solve them with the newest TLA (three-letter acronym for those in Rio Linda).
Like ‘big data’. Our company has to move to some huge NoSQL database to handle the data. Not because we need to, but because someone read an article...
I'm impressed.
It's doable and maintainable, especially with the standards that exist in the industry today.
Maintainable? Can you pull out your C compiler and compile the source w/o modification today? How about using another C-compiler?
The problem, IMHO, is that today's application programmers - as opposed to systems programmers who write the OS, kernels, device drivers, etc. - are incapable of handling the stack. They code in languages that handle everything for them. It makes for great, object-oriented, reusable code, but it means they're totally unaware of what's under the hood. They also have no idea how to open a socket or listen to a port.
I generally agree here -- though let's not kid ourselves: Object Oriented isn't always the best choice.
I think really what we're seeing is a failure in the CS-education system; it is surprising how many languages don't have something like Ada's subtypes -- and how many CS graduates don't grasp how useful it is to be able to exclude values. {I.e., in Ada, Positive is a subtype (of Integer) that has the additional constraint of only allowing values greater than zero.} Any CS battery of coursework ought to include enough math to make the advantages thereof obvious.
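C has nothing like Ada's subtypes, so the closest rough approximation is a wrapper type whose constructor rejects excluded values at run time. That is a pale imitation of what Ada's Positive gives you at the type level, but the sketch below shows the "exclude the bad values" idea the poster is describing; the names and the divide example are invented for illustration.

```c
#include <assert.h>
#include <stdio.h>

typedef struct { int value; } positive_t;    /* invariant: value > 0 */

static positive_t make_positive(int v)
{
    assert(v > 0);                           /* Ada would reject this at the type level */
    return (positive_t){ v };
}

static int divide(int dividend, positive_t divisor)
{
    return dividend / divisor.value;         /* divisor can never be zero here */
}

int main(void)
{
    positive_t three = make_positive(3);
    printf("%d\n", divide(12, three));       /* prints 4 */
    return 0;
}
```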
Here’s an idiot excerpt...
“The talk touches on a concept I've been mulling over for months, the inherent complexity of modern data centers. If you are virtualizing, and you probably are, for your application to get to the hardware there are most likely several layers of abstraction that need to be unpacked before the code it is trying to execute actually gets to the CPU, or the data is written to disk. Does virtualization actually solve the problem we have, or is it an approach built from spending far too long in the box?”
A data center needs to be as complex as it needs to be, no more, no less. Operating systems today all essentially do the SAME things at the bottom end; they allow for sharing of hardware resources between multiple user processes.
Some Information Technology (IT) shops are better managed than others. Shops with serious problems have human management problems that make it very difficult to overcome the hurdles of managing their servers. As far as server administration goes, while M$ products will “run right out of the box”, it's a costly mistake to think that managing them will be easier than managing Unix servers, since M$ products historically have default settings and functionality that are inherently the wrong choice, while Unix basically requires the server administrator to visit the configuration, understand all the options, and make their choices. If one could have a purely Unix server environment, and one spent the time to have every option choice well thought out instead of neglecting the “details”, the pure Unix environment would be far more secure than the pure M$ environment. Inevitably, though, today's server environments are mixed, as dictated by the needs of particular applications that IT is required to support. This, of course, makes the labor cost of server administration far greater in smaller shops.
The impetus behind virtualization...
Used to be IT shops would gradually keep adding servers. Some departments in the company would have their own file server. There are email servers. Then, applications would be purchased, and a new set of servers would be purchased: development, test, production for the app. This is just life.
But you’d find mistakes being made. Performance problem? Don’t correctly tune the application, revisit the design, and understand what you’re trying to achieve and how best to do that - no, buy a faster server.
Since MOST CPU time is spent IDLE, we wind up with millions in capital investment in server hardware sitting there depreciating; unable to run fast enough to satisfy users when the poorly tuned apps run, but sitting idle the rest of the time.
With software advertising being ubiquitous, every department started screaming for new applications that they just had to have - and getting approval directly from the top with IT all but cut out of the loop. Thus the crucial factors of “what present capabilities and plans do we have in terms of our existing IT staff and infrastructure” and “what external directions are there and how will they affect our shop” (i.e., should we be moving in this or that technology direction in terms of both hardware and IT training) are all too often not given enough consideration; perhaps lip service, perhaps IT actually liked the idea of new apps and architectures themselves. But instead of preparing by ensuring IT staff expertise FIRST, the business would plunge into the new technology unaware, outsource the required core expertise, and (typically) allow selected IT staff to have at the juicy new project from a backseat role. All too often these staffers would turn around and leave to catapult their careers higher with their newfound “expertise”.
So the “glass tower” of IT was overrun. No longer could IT dictate when changes or new reports were to be completed, who had access to what, etc.
Now, every department finds out what the most popular software is for their tasks, and says to senior executives - “why aren’t we doing that?”. The senior executives all start asking the same question. The salesman is called in for the dog and pony show, IT gets their marching orders, and new servers come rolling in.
Thus we have IT shops with hundreds and oftentimes thousands of physical servers; maintaining them represents work that must be done (installing upgrades, installing new machines, removing old machines, etc.).
Thus we see the drive for virtualization of servers.
You want hundreds, thousands of servers? Well, IT went out and bought virtualization software, so they can provision you a set of new servers without having to purchase, wait for, and set up new hardware. Just clickety-click, bada bing, there are your new servers, let’s install this new software.
Is there overhead to the virtualization? Sure. But sorry to say for the people who created this article, it’s no showstopper with today’s hardware performance.
The inevitable downside? Of course - since it’s now so much easier to create servers, the decision to create new servers is made MUCH more easily today, with the predictable result that the number of virtual servers increases much faster than the number of physical servers used to increase. So IT departments continue buying hardware and continue struggling to keep up. The virtualization itself provides no direct help for keeping software updates applied to all these virtual servers, so IT can get buried trying to maintain them all. And to solve this problem there is the age-old solution of software-based automation and good old-fashioned figuring out of efficient ways to manage the configurations of the software applications running on all those virtual servers.
Hmm ... come to think of it, seems like he is suggesting Windows to me ;)
That proprietary 4GL is still running today, still being modified and enhanced, still compiled on multiple platforms, and still distributed via binary to multitudes of companies. So, yes.
I think really what we're seeing is a failure in the CS-education system;
That's the truth of it right there. And I agree that OO isn't always the best fit. Therein lies the crux of the issue. They're not taught to solve the actual problem with the best tools. They're taught to write software to work around their own lack of knowledge. They don't know how to analyze a specific problem or requirement and then develop an efficient and effective solution. They write inefficient code because "hardware is cheap." They don't know enough to respect memory and bandwidth as the precious resources they still are.
This is true; but it does open the door to those who do.
They're taught to write software to work around their own lack of knowledge. They don't know how to analyze a specific problem or requirement and then develop an efficient and effective solution.
Tell me about it; I recently ran into a situation where randomization for "selecting candidates from a pool" was done via a loop of: get a random pick, and keep looping if it was already selected... fortunately I was allowed to replace this with a Fisher-Yates shuffle. (This was in actual production code.)
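For anyone who hasn't seen it, here is a minimal sketch of that Fisher-Yates replacement: shuffle the pool once, then take the first k entries, with no retry loop and no "already selected" checks. The pool contents, the size, and k are placeholders, and rand() stands in for whatever RNG production code would actually use.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Fisher-Yates: swap each element with a uniformly chosen earlier (or same) slot. */
static void shuffle(int *a, int n)
{
    for (int i = n - 1; i > 0; i--) {
        int j = rand() % (i + 1);   /* fine for a sketch; use a better RNG in production */
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}

int main(void)
{
    int pool[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    int k = 3;                      /* number of candidates to select */

    srand((unsigned)time(NULL));
    shuffle(pool, 10);
    for (int i = 0; i < k; i++)
        printf("selected candidate %d\n", pool[i]);
    return 0;
}
```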
That's the truth of it right there. And I agree that OO isn't always the best fit. Therein lies the crux of the issue. They're not taught to solve the actual problem with the best tools.
Which is why I'm rather against using C as a systems-level language; I don't think it's the best tool for the job. -- Sadly we're also seeing this sort of "go with the popular" mentality in application (especially Web) development: nothing else explains why anyone would willingly use PHP in any serious endeavor/project.
Agreed. Another problem with “because someone read about it” is the cloud.
I hate the cloud because I’m set in my ways, but mostly because my basic software development philosophy is “I’ll do it myself, thank you very much.”
I’m not keen on integrating third-party apps into my software and if I don’t have “physical” possession of the data, it ain’t my data anymore. IMHO, corporations are putting their entire businesses at risk when they lose control of their own data. The cloud is not appropriate for mission-critical data.
UNIX is no longer any one kernel, it’s a philosophy. A philosophy that has been tested, tweaked, broken, fixed, and pounded upon for decades.
Its strength is in its diversity and easy modification.
Maybe in ten years we’ll all be using GNU’s HURD, but we’ll still call it all “UNIX”.
All you haters better jump on the train.
I think you and I have lived the same life in IT. I’ve experienced the exact same things. Don’t even get me started on having to keep up with software licenses on all those cores.
re: randomization ... *snort*
Don’t get me started telling stories. I once had to fix a bug in some software that loaded all of the records from a file into an array for the sole purpose of counting the number of records in the file. The database had an integrated method that returned the number of records. And then it cleared the array and began reloading the records in order to update the contents of one field in each record.
*chuckle* I’ve written some things in LAMP because the shop required low cost (read free) and needed something quick and dirty. It was ... quick and dirty. Had to grit my teeth.
I’m a C-lover so we disagree there, but I am always open to other tools.
Yeah; sometimes the better/other 'tools' are really amazing, but you just don't know how to use them (I'm like that with FORTH; it's an intriguing little language... but I can't do jack w/ it yet). Ada's got a lot of great stuff in it (I'm still fairly bad at using it) -- but one thing that impresses me about it is that it was designed to be maintainable (part of the "programming as a human activity" ethos), and I think it really shows in how the new Pre- and Post-conditions (which never go stale due to code/[annotated-]comment impedance mismatch), type invariants (e.g. a point on a unit circle always has to have x**2+y**2 = 1; in Ada 2012 you can specify this [or even that the signature in a header is valid]), and the new quantified expressions for all and for some (i.e. there exists) all play together.
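Ada 2012 expresses those contracts declaratively (Pre, Post, Type_Invariant, for all/for some); C can only approximate them with runtime assertions, but the sketch below shows the same ideas applied to the unit-circle example: a precondition, a postcondition, and the x**2+y**2 = 1 invariant. The rotate function and the tolerance are invented for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>

typedef struct { double x, y; } unit_point;

/* Type invariant: the point must stay on the unit circle. */
static int on_unit_circle(unit_point p)
{
    return fabs(p.x * p.x + p.y * p.y - 1.0) < 1e-9;
}

/* Rotate a point on the unit circle by `angle` radians. */
static unit_point rotate(unit_point p, double angle)
{
    assert(on_unit_circle(p));                     /* precondition */
    unit_point r = {
        p.x * cos(angle) - p.y * sin(angle),
        p.x * sin(angle) + p.y * cos(angle),
    };
    assert(on_unit_circle(r));                     /* postcondition: invariant preserved */
    return r;
}

int main(void)
{
    double half_pi = acos(-1.0) / 2.0;
    unit_point p = { 1.0, 0.0 };
    unit_point q = rotate(p, half_pi);
    printf("(%.3f, %.3f)\n", q.x, q.y);            /* roughly (0, 1) */
    return 0;
}
```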
*chuckle* I've written some things in LAMP because the shop required low cost (read free) and needed something quick and dirty. It was ... quick and dirty. Had to grit my teeth.
When I was programming full-time (PHP) we were using LAMP; I'm not sure, but I think that (and maybe your use) might have been a violation of the terms for the free usage license.
Don't get me started telling stories. I once had to fix a bug in some software that loaded all of the records from a file into an array for the sole purpose of counting the number of records in the file. The database had an integrated method that returned the number of records. And then it cleared the array and began reloading the records in order to update the contents of one field in each record.
Too bad; I actually like them -- there's a lot you can learn listening to stories. The Unix-Haters Handbook, for instance, was an amusing and surprisingly insightful collection of stories... despite its age it made some points which are still valid today: one of which is that trying to impose state on a system that was designed to be stateless is... troublesome. (This is why, IMO, HTML 5 [and CSS] is such a bad idea: they're trying to make HTML, which was designed to keep content independent of layout [leaving layout to the browser], carry the layout as well -- IOW, they're trying to go directly against the whole idea of HTML.)
Well said. Much of the rest of your post was pretty much on target as well. I've seen the server creep in virtualized environments first-hand. I think it's amazing how fast virtual server sprawl strikes virtual environments, and how incredibly wasteful it is. Thankfully, it's not as wasteful as physical server sprawl was.
Last job I was at had a huge virtualization push several years ago. One thing that I thought was interesting was some of the stuff that you'd hear from PHBs, because of what they'd been told. They'd look at the Unix folks with a critical eye, and point out that the Windows team was getting almost a hundred-to-one consolidation (or some other ridiculous number), and would ask why consolidation on the Unix side of the house was lucky to get 15:1 (in some cases much, much less). My response was incredulity at their thinking processes. The reason you could cram so many Windows boxes into the virtual environment was because so few of them were actually doing much of anything, because each server was dedicated to a specific application/site or whatever, whereas we'd have a Unix server running Apache that had a hundred IPs plumbed on the box because of the number of sites it was supporting, or we'd have a WebLogic server running multiple clusters that contained multiple JVMs, pretty much maxing out the hardware of the box, both memory and CPU. How the hell do you virtualize that? Well, you can, but you're not gaining much from the consolidation; instead you're doing it because of some of the other capabilities it gives you, such as the abstraction from the hardware itself.
My other question to the PHB would be why they were so happy about what was happening on the Windows side of the house, when what all that consolidation plainly showed was the incredible waste of resources all those single-purpose Windows machines had represented. Yeah, you're saving money now in relation to the flat out waste you had before, but you should keep in mind what had come before, and perhaps consider how that same methodology was impacting all this new 'virtualized' hardware.
Sadly, I never really saw anything come from those discussions, because there was no interest in assigning a dollar cost to the idea of 'one app per server'.