Free Republic
Browse · Search
News/Activism
Topics · Post Article

Rethinking software bloat.
InformationWeek.com | 12/17/01 | Fred Langa

Posted on 12/17/2001 4:33:52 AM PST by damnlimey

Rethinking 'Software Bloat'

Fred Langa takes a trip into his software archives and finds some surprises--at two orders of magnitude.
By Fred Langa

 
Reader Randy King recently performed an unusual experiment that provided some really good end-of-the-year food for thought:
I have an old Gateway here (120 MHz, 32 Mbytes RAM) that I "beefed up" to 128 Mbytes and loaded with--get ready--Win 95 OSR2. OMIGOD! This thing screams. I was in tears laughing at how darn fast that old operating system is. When you really look at it, there's not a whole lot missing from later operating systems that you can't add through some free or low-cost tools (such as an Advanced Launcher toolbar). Of course, Win95 is years before all the slop and bloat was added. I am saddened that more engineering for good solutions isn't performed in Redmond. Instead, it seems to be "code fast, make it work, hardware will catch up with anything we do" mentality.
It was interesting to read about Randy's experiment, but it started an itch somewhere in the back of my mind. Something about it nagged at me, and I concluded there might be more to this than meets the eye. So, in search of an answer, I went digging in the closet where I store old software.

Factors Of 100
It took some rummaging, but there in a dusty 5.25" floppy tray was my set of install floppies for the first truly successful version of Windows--Windows 3.0--from more than a decade ago.

When Windows 3.0 shipped, systems typically operated at around 25 MHz or so. Consider that today's top-of-the-line systems run at about 2 GHz. That's two orders of magnitude--100 times--faster.

But today's software doesn't feel 100 times faster. Some things are faster than I remember in Windows 3.0, yes, but little (if anything) in the routine operations seems to echo the speed gains of the underlying hardware. Why?

The answer--on the surface, no surprise--is in the size and complexity of the software. The complete Windows 3.0 operating system was a little less than 5 Mbytes total; it fit on four 1.2-Mbyte floppies. Compare that to current software. Today's Windows XP Professional comes on a setup CD filled with roughly 100 times as much code, a little less than 500 Mbytes total.

That's an amazing symmetry. Today, we have a new operating system with roughly 100 times as much code as a decade ago, running on systems roughly 100 times as fast as a decade ago.

By themselves, those "factors of 100" are worthy of note, but they raise the question: Are we 100 times more productive than a decade ago? Are our systems 100 times more stable? Are we 100 times better off?

While I believe that today's software is indeed better than that of a decade ago, I can't see how it's anywhere near 100 times better. Mostly, that two-orders-of-magnitude increase in code quantity is not matched by anything close to an equal increase in code quality. And software growth without obvious benefit is the very definition of "code bloat."

What's Behind Today's Bloated Code?
Some of the bloat we commonly see in today's software is, no doubt, due to the tools used to create it. For example, a decade ago, low-level assembly-language programming was far more common. Assembly-language code is compact and blazingly fast, but is hard to produce, is tightly tied to specific platforms, is difficult to debug, and isn't well suited for very large projects. All those factors contribute to the reason why assembly language programs--and programmers--are relatively scarce these days.

Instead, most of today's software is produced with high-level programming languages that often include code-automation tools, debugging routines, the ability to support projects of arbitrary scale, and so on. These tools can add an astonishing amount of baggage to the final code.

This real-life example from the Association for Computing Machinery clearly shows the effects of bloat: A simple "Hello, World" program written in assembly comprises just 408 bytes. But the same "Hello, World" program written in Visual C++ takes fully 10,369 bytes--that's 25 times as much code! (For many more examples, see http://www.latech.edu/~acm/HelloWorld.shtml. Or, for a more humorous but less-accurate look at the same phenomenon, see http://www.infiltec.com/j-h-wrld.htm. And, if you want to dive into Assembly language programming in any depth, you'll find this list of links helpful.)

Human skill also affects bloat. Programming is wonderfully open-ended, with a multitude of ways to accomplish any given task. All the programming solutions may work, but some are far more efficient than others. A true master programmer may be able to accomplish in a couple lines of Zen-pure code what a less-skillful programmer might take dozens of lines to do. But true master programmers are also few and far between. The result is that code libraries get loaded with routines that work, but are less than optimal. The software produced with these libraries then institutionalizes and propagates these inefficiencies.

You And I Are To Blame, Too!
All the above reasons matter, but I suspect that "featuritis"--the tendency to add feature after feature with each new software release--probably has more to do with code bloat than any other single factor. And it's hard to pin the blame for this entirely on the software vendors.

Take Windows. That lean 5-Mbyte version of Windows 3.0 was small, all right, but it couldn't even play a CD without add-on third-party software. Today's Windows can play data and music CDs, and even burn new ones. Windows 3.0 could only make primitive noises (bleeps and bloops) through the system speaker; today's Windows handles all manner of audio and video with relative ease. Early Windows had no built-in networking support; today's version natively supports a wide range of networking types and protocols. These--and many more built-in tools and capabilities we've come to expect--all help bulk up the operating system.

What's more, as each version of Windows gained new features, we insisted that it also retain compatibility with most of the hardware and software that had gone before. This never-ending aggregation of new code atop old eventually resulted in Windows 98, by far the most generally compatible operating system ever--able to run a huge range of software on a vast array of hardware. But what Windows 98 delivered in utility and compatibility came at the expense of simplicity, efficiency, and stability.

It's not just Windows. No operating system is immune to this kind of featuritis. Take Linux, for example. Although Linux can do more with less hardware than can Windows, a full-blown, general-purpose Linux workstation installation (complete with graphical interface and an array of the same kinds of tools and features that we've come to expect on our desktops) is hardly what you'd call "svelte." The current mainstream Red Hat 7.2 distribution, for example, calls for 64 Mbytes of RAM and 1.5-2 Gbytes of disk space, which also happens to be the rock-bottom minimum requirement for Windows XP. Other Linux distributions ship on as many as seven CDs. That's right: Seven! If that's not rampant featuritis, I don't know what is.

Is The Future Fat Or Lean?
So: Some of what we see in today's huge software packages is indeed simple code bloat, and some of it also is the bundling of the features that we want on our desktops. I don't see the latter changing any time soon. We want the features and conveniences to which we've become accustomed.

But there are signs that we may have reached some kind of plateau with the simpler forms of code bloat. For example, with Windows XP, Microsoft has abandoned portions of its legacy support. With fewer variables to contend with, the result is a more stable, reliable operating system. And over time, with fewer and fewer legacy products to support, there's at least the potential for Windows bloat to slow or even stop.

Linux tends to be self-correcting. If code-bloat becomes an issue within the Linux community, someone will develop some kind of a "skinny penguin" distribution that will pare away the needless code. (Indeed, there already are special-purpose Linux distributions that fit on just a floppy or two.)

While it's way too soon to declare that we've seen the end of code bloat, I believe the signs are hopeful. Maybe, just maybe, the "code fast, make it work, hardware will catch up" mentality will die out, and our hardware can finally get ahead of the curve. Maybe, just maybe, software inefficiency won't consume the next couple orders of magnitude of hardware horsepower.

What's your take? What's the worst example of bloat you know of? Are any companies producing lean, tight code anymore? Do you think code bloat is the result of the forces Fred outlines, or is it more a matter of institutional sloppiness on the part of Microsoft and other software vendors? Do you think code bloat will reach a plateau, or will it continue indefinitely? Join in the discussion!



TOPICS: Editorial; Miscellaneous
KEYWORDS:
To: bvw
In practise -- Bloatware grows better than non-bloatware. Why?

There are reasons why, from a source-code maintenance standpoint, C++ can be better than C. On the other hand, the code produced by C++ development systems is often grossly bloated compared with that generated in C, in large part because development systems link in lots of unnecessary junk when using C++. I really wish someone would come up with and popularize some decent development systems which allow the same sorts of design extension as C++, but without having such extensibility add bloat until it's actually used.

101 posted on 12/17/2001 7:58:57 PM PST by supercat
[ Post Reply | Private Reply | To 88 | View Replies]

To: Smogger
If you develop a fantastic piece of software in 2 years using assembly and I develop a competing piece of software in overbloated VB, but I release my working program in 2 months, guess who captures the market?

That's about what I was thinking.

And I believe speed in getting the stuff on as many desktops as possible as quickly as possible is the secret of Microsoft's success, not a superior product.

102 posted on 12/17/2001 8:21:39 PM PST by Age of Reason
[ Post Reply | Private Reply | To 50 | View Replies]

To: discostu
Graphics. Everything has to be shiny with buttons and animations and all kinds of other crap.

Many of these newfangled 'improvements' are downright annoying as well. Even though monitors are bigger than they were a few years ago, I still often consider screen real estate somewhat precious and am very annoyed when a program forces me to waste 20% or more of it.

BTW, does anyone know of a utility to allow the Windows 'task bar' to be placed at the right side of the screen with icon labels appearing below the icons so as not to waste space?

103 posted on 12/17/2001 9:37:48 PM PST by supercat
[ Post Reply | Private Reply | To 68 | View Replies]

To: supercat
It is so personally satisfying to produce a working embedded system with the code cleanly designed and shoehorned into it. A system that no one knows -- because it just does its job, reliably, robustly, completely quietly. With the bit operators and the memory-mapped I/O space, the 8051s and the PICs are especially wonderful at doing their hidden, forgotten, ignored-except-by-the-designers work.

Yet Bloatware -- the complete opposite of tight, highly-crafted, error-free, robust embedded systems -- is extremely successful. In fact, people will buy bloatware just because it is bloatware! They will rarely -- almost never -- buy the embedded or "embedded-style" software for itself. Buyers crave Bloatware. Why?

104 posted on 12/18/2001 1:50:13 AM PST by bvw
[ Post Reply | Private Reply | To 100 | View Replies]

To: 2 Kool 2 Be 4-Gotten
I want to stick to an issue I'm bringing up -- that Bloatware succeeds because the paying customers want it, and asking why people who pay want Bloatware.

Nevertheless, some vanity infects mine own pursuit of that goal -- and I'll comment on your comparison of Linux and Windows where you correctly note that Windows has more stuff in it:

Yes, Windows has far more stuff rolled into its OS distribution than does Linux. Windows is like the snowball that rolls everything into it. But no, it's not quite right to say or suggest that that stuff is unnecessary, extra, or superfluous. And to think that the Linux OS is not also bloated in similar grandiose ways is a misconception that people familiar with microkernels or with building ROM-based systems from the interrupt vectors up do not share.

End of sidetrip.

To repeat my question: "Why do people love to BUY Bloatware?"
105 posted on 12/18/2001 2:04:07 AM PST by bvw
[ Post Reply | Private Reply | To 94 | View Replies]

To: damnlimey
Buyers crave Bloatware. They crave it because it is Bloat-ed.

Programming, engineering and cost considerations are moot. There is something about Bloatware, things about Bloatware, that make people want to buy it.

106 posted on 12/18/2001 2:56:46 AM PST by bvw
[ Post Reply | Private Reply | To 1 | View Replies]

To: bvw
Also bloatware survives and thrives in long-life situations where it supplants well-crafted code serving the same markets.
107 posted on 12/18/2001 3:01:41 AM PST by bvw
[ Post Reply | Private Reply | To 106 | View Replies]

To: bvw
Also bloatware survives and thrives in long-life situations where it supplants well-crafted code serving the same markets.

So do cockroaches.

108 posted on 12/18/2001 5:11:28 AM PST by supercat
[ Post Reply | Private Reply | To 107 | View Replies]

To: supercat
Not at all like cockroaches, Bloatware is Why?
109 posted on 12/18/2001 5:30:45 AM PST by bvw
[ Post Reply | Private Reply | To 108 | View Replies]

To: supercat
My preferred text editor and word processor is Word 95. I have tried Word 97 and Word 2000 and have found that, although their size has ballooned by leaps and bounds, they are no better at core functionality. Word 95 was a little slow to load when I was running it on a 25 MHz 486, but on a 200 MHz PII, its 5-megabyte size means it flies, and I look forward to using it in 2006, when it will be positively blinding on my new, off-lease ThinkPad running at 1 GHz with half a gig of memory.

What's more: in six years of running Word 95, I cannot recall it having EVER crashed.

110 posted on 12/18/2001 5:38:56 AM PST by Petronski
[ Post Reply | Private Reply | To 98 | View Replies]

To: damnlimey
Excellent article.
111 posted on 12/18/2001 6:18:15 AM PST by PatrioticAmerican
[ Post Reply | Private Reply | To 1 | View Replies]

To: krb
I prefer the assembly. At least I know what I am dragging in, and third-party bugs are almost zero. I have written almost as much assembly as anything else, and I enjoy writing it, and the results, far more.
112 posted on 12/18/2001 6:24:52 AM PST by PatrioticAmerican
[ Post Reply | Private Reply | To 74 | View Replies]

To: PatrioticAmerican
I have written almost as much assembly as anything else, and I enjoy writing it, and the results, far more.

I do too, when I'm coding for an 8051, MSP430, 6811, or the like.

But when I want to craft a PC user-interface in order to deliver a turn-key solution that the typical computer user (without decades of embedded experience) can utilize, I am more than happy to whip out Visual C++ and drag in 15+ megabytes of MFC, CRT, Multimedia, and whatever DLLs. Have you ever tried to construct a user-interface with tree-views, context sensitive help, bubble help, hyperlinks, menu-bars, moveable/dockable toolbars, rebars, etc. in assembler? Ugh.

Before you say "Users don't need all that cr@p!" remember that users expect all that cr@p and will select the product (or bidder if we're talking contract work here) who can deliver that stuff. Besides, if done well those accoutrements can add immense value to the program.

Heck, have you ever tried to make even a simple frame window with a basic client area that only responds to WM_CREATE, WM_PAINT, and WM_DESTROY in assembler? Ugh. The IBM OS/2 Programmer's Toolkit actually had sample code to be able to program the Presentation Manager (the 'Windows' on top of OS/2) in assembly. hehehehe.

113 posted on 12/18/2001 6:43:43 AM PST by krb
[ Post Reply | Private Reply | To 112 | View Replies]

To: krb
Actually, several commercial apps that people love because of their speed and stability are written 100% in assembly. It only takes about twice as long to develop them. The reason? No "team" with crappy politics, no slow debugging process because of third-party or nested bugs, and everything is thoroughly thought out. Developers and managers don't shoot off on a tangent just because they easily can. In many cases libraries, such as MFC, are so cumbersome that a simple macro set does 95% of what needs to be done. Many times I have to use Win32 APIs directly because there isn't MFC support. Think of assembly as ATL. It actually does have a framework, and integration of OS calls is very simple.

Of course, time to market, trained developers, etc. are also a large portion of product management, so assembly is only for maybe 1% of the crowd.

114 posted on 12/18/2001 7:56:44 AM PST by PatrioticAmerican
[ Post Reply | Private Reply | To 113 | View Replies]

To: PatrioticAmerican
Wow...what are some commercial apps written in assembly? As a snobby embedded guy, I think I'll buy one just to support the clique! :-)
115 posted on 12/18/2001 11:35:55 AM PST by krb
[ Post Reply | Private Reply | To 114 | View Replies]

To: krb
Can't say. First, you can only buy one of them off the shelf; the others are part of something else. Second, the companies wish to keep their little advantage secret. Rock-solid, high-performance apps built in assembly usually have a nice life span. Even some MS products use 10% assembly.
116 posted on 12/18/2001 1:35:54 PM PST by PatrioticAmerican
[ Post Reply | Private Reply | To 115 | View Replies]

To: PatrioticAmerican
Many times I have to use Win32 APIs directly because there isn't MFC support.

Perhaps someone can explain something to me: given that the Windows APIs exist and (hopefully) work, why is it necessary to add another layer of abstraction? I'm not trying to suggest that some 'helper libraries' wouldn't be a good thing, but what purpose is served by giving every type of Windows handle an associated class of object with its own methods, etc.? I'll admit that parts of the Windows API are annoying, but since understanding the API is often necessary to get good performance, I'd rather learn one API than have to learn both the API and another abstraction layer.

Suppose I want to draw a bunch of overlapped shapes in 40 different colors (with some colors being used multiple times). For efficient operation when using the API, I could create 40 brushes, use them, and then delete them. By contrast, with something like MFC I'd be more apt to simply change the current drawing color repeatedly (MFC would then create, use, and delete each brush). If I'm doing the API calls myself, I'm apt to know when any brush is no longer needed and can delete it then. The MFC in such a situation would have no way of doing so.
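The two strategies in the paragraph above can be compared with a small, portable sketch. It does not call the real Windows API: brush_create, brush_delete, and fill_shape are hypothetical stand-ins (for something like CreateSolidBrush and DeleteObject) that only count how often a brush is made, and the second function merely models the create-use-delete pattern the post describes, without claiming that MFC really behaves that way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a GDI-style brush API. The counters exist
   only so the cost of each strategy can be observed. */
typedef struct { unsigned long color; int live; } Brush;

static int brushes_created = 0;
static int brushes_deleted = 0;

static Brush brush_create(unsigned long color) {
    Brush b = { color, 1 };
    brushes_created++;
    return b;
}

static void brush_delete(Brush *b) {
    assert(b->live);            /* catch double deletion */
    b->live = 0;
    brushes_deleted++;
}

static void fill_shape(const Brush *b) {
    assert(b->live);            /* drawing itself is omitted */
    (void)b;
}

enum { NUM_COLORS = 40 };

/* Strategy 1: create each brush once, draw everything, delete at the end.
   Returns the number of brushes created by this call. */
static int draw_with_cached_brushes(const int *shape_colors, size_t n) {
    Brush palette[NUM_COLORS];
    int before = brushes_created;
    for (int c = 0; c < NUM_COLORS; c++)
        palette[c] = brush_create((unsigned long)c);
    for (size_t i = 0; i < n; i++) {
        assert(shape_colors[i] >= 0 && shape_colors[i] < NUM_COLORS);
        fill_shape(&palette[shape_colors[i]]);
    }
    for (int c = 0; c < NUM_COLORS; c++)
        brush_delete(&palette[c]);
    return brushes_created - before;
}

/* Strategy 2: create, use, and delete a brush for every shape, as a layer
   that only sees "change the current color" might do. */
static int draw_with_per_shape_brushes(const int *shape_colors, size_t n) {
    int before = brushes_created;
    for (size_t i = 0; i < n; i++) {
        Brush b = brush_create((unsigned long)shape_colors[i]);
        fill_shape(&b);
        brush_delete(&b);
    }
    return brushes_created - before;
}
```

With 200 shapes cycling through the 40 colors, the first function creates 40 brushes and the second creates 200; the gap grows with the number of shapes, which is the inefficiency the post is worried about.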

Nowadays, a good C compiler should in many cases be able to code about as efficiently as 'reasonable' assembly language (optimal code often requires that instructions be sequenced a certain way, and various factors can affect the optimal sequence; a good compiler can automatically resequence code as needed). I don't, therefore, see a big difference between coding in 'straight C' vs. assembly language. On the other hand, things like MFC that add layers of abstraction to standard system data types can easily make things much bigger and slower than would otherwise be needed.

117 posted on 12/18/2001 7:19:56 PM PST by supercat
[ Post Reply | Private Reply | To 114 | View Replies]

To: PatrioticAmerican
Even some MS products use 10% assembly.

One of the better games I wrote was designed for a 4.77 MHz XT; it consisted of about 1,000 lines of Pascal and about 20 lines of assembly code. By my estimation, that assembly code was about 10-20 times the speed of Pascal code to do the same thing; additionally, when speed actually mattered the program would be spending over 80% of its time in that one loop [the game ran at 60fps; during each 16ms frame the game would spend about 4ms executing the Pascal part of the code, 1-12ms doing the assembly part, and the balance waiting until it was time for the next frame].

In that case, writing about 2% of the SOURCE code in assembly allowed the game to run at 16ms/frame rather than 124ms/frame, speeding up the entire game eight-fold. While assembly doesn't provide the speed gains it used to, the fact remains that oftentimes a little work optimizing a very small amount of code can reap huge performance benefits; unfortunately, few people bother to make any real effort even when costs would be minimal and benefits huge.
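The 124ms figure follows from the numbers in the previous paragraph if one takes the worst case for the assembly loop (12ms) and the low end of the 10-20x estimate. A minimal sketch of that arithmetic, with function names made up for illustration:

```c
#include <assert.h>

/* Frame time in ms if the hot loop, which takes hot_ms in assembly,
   instead ran `slowdown` times slower in the high-level language.
   other_ms is the time spent in the rest of the program. */
static double frame_ms_without_asm(double other_ms, double hot_ms,
                                   double slowdown) {
    assert(slowdown >= 1.0);
    return other_ms + hot_ms * slowdown;
}

/* How many times faster the short frame is than the long one. */
static double speedup(double slow_frame_ms, double fast_frame_ms) {
    assert(fast_frame_ms > 0.0);
    return slow_frame_ms / fast_frame_ms;
}
```

Plugging in 4ms of Pascal, 12ms of assembly, and a 10x slowdown gives 4 + 12 * 10 = 124ms; against the 16ms frame that is a ratio of 7.75, which the post rounds to eight-fold.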

118 posted on 12/18/2001 8:00:31 PM PST by supercat
[ Post Reply | Private Reply | To 116 | View Replies]

To: Don Joe
Yes, there is something that prevents it with XP. The apps don't run. Second, that is BS about the fault of the programmer. We are not talking about a little Access database with 5-10 data files. The application(s) I am talking about have a minimum of 400 data files. Some of my competition, with pretty good programmers, cannot make it work effectively. My own experience is irrelevant--obviously it would open me up to flames--but I have had pretty good success.

I didn't post to argue. Just to say that certain production environments don't lend themselves well to GUI.

119 posted on 12/19/2001 4:18:09 AM PST by jammer
[ Post Reply | Private Reply | To 11 | View Replies]

To: supercat
"I'd rather learn one API than have to learn both the API and another abstraction layer."

That is a great point about any class library, but a class library of significant use is one that provides units of work that the API doesn't. An example might be a class that allows parsing of an XML file. The API might give the ability to read/write the file, but there is far more to files than that, such as automatic error handling. Such a feature never belongs in an API, but would be at home in a class library.

"I don't, therefore, see a big difference between coding in 'straight C' vs. assembly language"

There are many things that 'C' cannot do compared to an assembly design. Ever seen the disassembly of a switch statement? In assembly it can be an indexed jump table. Other design considerations involve memory usage. Memory pooling is a useful technique used in assembly that isn't used in 'C'. malloc calls are expensive. You could write such a thing in 'C', but it would be clumsy at best, and probably buggy. Size is also a major factor in assembly. 'C' can certainly produce efficient code, but I can squeeze assembly down naturally without much work. Anytime code can be knocked from 1MB to 200K, there will be a significant performance gain. Don't forget that the 'C' runtime library gets dragged in.
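For readers who haven't met memory pooling, here is a rough sketch of what a fixed-size block pool might look like if written in 'C'. It is only one possible design, not anything taken from the products discussed in this thread; the names Pool, pool_init, pool_alloc, and pool_free and the sizes are invented for illustration. All blocks are the same size and are carved out of one array up front, so allocating or freeing is a couple of pointer moves instead of a malloc call.

```c
#include <assert.h>
#include <stddef.h>

enum { POOL_BLOCKS = 64, BLOCK_SIZE = 32 };  /* illustrative sizes */

/* A free block stores the link to the next free block inside its own
   storage, so the free list needs no extra memory. */
typedef union Block {
    union Block *next;
    unsigned char bytes[BLOCK_SIZE];
} Block;

typedef struct {
    Block storage[POOL_BLOCKS];
    Block *free_list;
} Pool;

/* Thread every block onto the free list. */
static void pool_init(Pool *p) {
    for (size_t i = 0; i + 1 < POOL_BLOCKS; i++)
        p->storage[i].next = &p->storage[i + 1];
    p->storage[POOL_BLOCKS - 1].next = NULL;
    p->free_list = &p->storage[0];
}

/* Pop one block off the free list; returns NULL when the pool is empty. */
static void *pool_alloc(Pool *p) {
    Block *b = p->free_list;
    if (b != NULL)
        p->free_list = b->next;
    return b;
}

/* Push a block back; it must have come from this pool. */
static void pool_free(Pool *p, void *ptr) {
    Block *b = (Block *)ptr;
    assert(b >= p->storage && b < p->storage + POOL_BLOCKS);
    b->next = p->free_list;
    p->free_list = b;
}
```

The trade-off is the usual one: every request gets a BLOCK_SIZE-byte block no matter how little it needs, and the pool cannot grow, in exchange for constant-time allocation with no per-call trip into the runtime library's allocator.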

I also find that maintenance is easier. While both have functions, it just seems that while working in assembly more caution is taken in designs, more thought is given to performance, and more consideration is given to necessity.

Assembly isn't for most projects, but for core products that should be stable for 3-5 years it should be considered. Ancillary features might be written in C++, but the core might be assembly. Take a word processor. Data manipulation might be assembly, so might the display manager and the spell checker, while the dialogs might be C++.

The one critical factor in using assembly, for me, personally, is knowledge.  I know what is in my assembly code.  With using a 3gl such as 'C', the behavior might have a quirk in it that causes a serious performance loss or a bug that I cannot get around.  For a core product base, I prefer to know that I can write the assembly without those factors being involved.  No "whoops", "uh-oh", or "I don't know"'s involved.


120 posted on 12/19/2001 6:30:14 AM PST by PatrioticAmerican
[ Post Reply | Private Reply | To 117 | View Replies]



Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.


FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson