Posted on 07/19/2009 12:00:03 PM PDT by Ernest_at_the_Beach
fyi
*************************************EXCERPT*******************************
Embedded problems: exploiting NULL pointer dereferences
Your device could be at risk
By Federico Biancuzzi, SecurityFocus
Interview: Barnaby Jack developed a method for exploiting certain NULL pointer dereferences on the ARM and XScale architectures (and likely PowerPC). This method affects a lot of devices, since most mobile phones and PDAs are ARM-based, and high-end routers often use the XScale architecture.
Could you introduce yourself?
Barnaby Jack: I'm a staff security researcher at Juniper Networks. I've been involved in computer security for a number of years, mostly dealing with operating system internals, reverse engineering, and anything low-level. I've recently started to focus some of my research efforts into embedded systems - I'm having fun with it. I'm a kiwi born and bred, but these days I'm living way across the pond up in the bay area.
Could you describe the vector rewrite attack you have developed?
Barnaby Jack: The Vector Rewrite Attack is a method for exploiting certain NULL pointer dereferences on the ARM and XScale architectures. In general, NULL pointer dereference flaws are considered non-exploitable. On the XScale and ARM architectures the memory address 0 is mapped, and also holds the exception vector table. The exception vector table is a set of branch instructions that correspond to different exceptions, such as software and hardware interrupts. When a case arises that writes to the 0 address with user-defined source data, it is possible to gain execution control by rewriting the exception table.
On many embedded devices, execution is running in Supervisor (SVC) mode so memory access is unrestricted. The PowerPC architecture also stores the vector table at a low address, and is likely vulnerable to this same attack. Research into the PPC architecture is ongoing.
A short paper describing the attack is available here (pdf).
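To make the layout concrete, here is a rough, hypothetical sketch in C (not taken from the paper or from any real exploit) of the classic ARM low-vector arrangement and the kind of unchecked privileged write that makes a NULL pointer dangerous in SVC mode. The struct and function names are invented for illustration.

/* Hypothetical illustration only: the classic ARM exception vector table,
 * mapped at address 0 when "high vectors" are not enabled. Each slot holds
 * a single branch (or LDR pc) instruction. */
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

struct arm_vector_table {
    uint32_t reset;           /* 0x00 */
    uint32_t undefined;       /* 0x04 */
    uint32_t swi;             /* 0x08  software interrupt */
    uint32_t prefetch_abort;  /* 0x0C */
    uint32_t data_abort;      /* 0x10 */
    uint32_t reserved;        /* 0x14 */
    uint32_t irq;             /* 0x18 */
    uint32_t fiq;             /* 0x1C */
};

/* The vulnerable pattern: privileged (SVC-mode) code stores a user-supplied
 * word through a pointer that can be NULL. On MIPS, where the vectors live
 * at a high address, this faults; on ARM/XScale with the table at 0, it
 * quietly rewrites a vector slot, and the next exception branches wherever
 * that word says. */
static void store_word(uint32_t *dst, uint32_t user_value)
{
    *dst = user_value;        /* dst == NULL means "overwrite the vectors" */
}

int main(void)
{
    printf("IRQ vector sits at offset 0x%zx from address 0\n",
           offsetof(struct arm_vector_table, irq));
    (void)store_word;         /* never actually called here */
    return 0;
}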
There were some comments around the net about your attack and its link with the JTAG interface. Could you please explain to us how you used JTAG and how it relates to your attack?
Barnaby Jack: The JTAG interface is a hardware interface that, when used in conjunction with a hardware debugging probe, allows live debugging of the embedded processor. JTAG is simply used as a debugging mechanism. JTAG is in no way required for an attack, and is used for exploit development in the same way a debugger such as OllyDbg would be used on a PC. Most modern cores have JTAG support built into the processor design.
Which architectures are affected?
Barnaby Jack: ARM and XScale architectures, and likely the PowerPC architecture. The MIPS processor maps the vectors to a high address, and is not susceptible to this exploitation method. Any architecture that stores the vector table at 0x0 would be vulnerable to this attack.
Can we consider this a hardware design problem?
Barnaby Jack: This could be considered more of a problem in the architecture design. The MIPS architecture for example bases the exception vectors at a high address, at 0x8000xxxx. Thankfully, with ARM, XScale, and PowerPC - there is an option to map the vectors to a high address.
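For what it's worth, on ARM that remapping option is the "V" (high vectors) bit in the CP15 control register, which moves the table from 0x00000000 to 0xFFFF0000. A minimal sketch, assuming privileged (bootloader or early kernel) context and a GCC-style toolchain; the function name is mine:

#include <stdint.h>

/* Sketch only: must run in a privileged mode; compiles to nothing on
 * non-ARM targets. */
static inline void arm_enable_high_vectors(void)
{
#if defined(__arm__)
    uint32_t sctlr;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(sctlr));  /* read control reg */
    sctlr |= (1u << 13);                                          /* set the V bit    */
    __asm__ volatile("mcr p15, 0, %0, c1, c0, 0" : : "r"(sctlr)); /* write it back    */
#endif
}

With that bit set, the exception table lives at 0xFFFF0000 and a stray NULL write lands in low memory rather than in the vectors.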
0 N035!!!!
Guess I’d better get Windows 7 pretty quick, then.
*********************************EXCERPT********************************
This is yet another weapon to add to an attacker's arsenal. In this case, prevention is fairly simple.
Vendors, take note.
Watch for a message soon from Redmond....
BFL
Well, what do you expect from something coded in C/C++?
The problem is partly caused by the coder, partly by the compiler optimization.
The typical code fragment goes like this:
struct sock *sk = tun->sk;      /* 'tun' is dereferenced here...          */
/* ... probably some more declarations and stuff ... */
if (!tun)                       /* ...but only checked for NULL down here */
        return POLLERR;
The compiler, seeing that the pointer 'tun' has already been dereferenced, assumes it cannot be NULL and treats the check for "!tun" (which is the same as "tun == NULL") as redundant, so it silently removes it. That assumption isn't safe here: if tun really were NULL, the earlier dereference of tun->sk would read from address 0 and return gibberish (or attacker-controlled data, if page 0 has been mapped) instead of faulting.
The solution is to put the dereference of tun->sk after the check.
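A minimal sketch of that fix, loosely modeled on the driver code in question (the names here are illustrative, not the actual kernel source):

#include <stdio.h>

struct sock;                           /* opaque, as in the kernel */
struct tun_struct { struct sock *sk; };
#define POLLERR 0x0008

static unsigned int tun_poll_sketch(struct tun_struct *tun)
{
    struct sock *sk;

    if (!tun)                          /* check the pointer first...             */
        return POLLERR;

    sk = tun->sk;                      /* ...and only dereference it afterwards, */
    (void)sk;                          /* so the compiler has no earlier use to  */
                                       /* "justify" deleting the NULL check      */
    return 0;
}

int main(void)
{
    printf("NULL tun -> 0x%x\n", tun_poll_sketch(NULL));
    return 0;
}

GCC also has a -fno-delete-null-pointer-checks flag that turns off this particular optimization; if I remember right, the kernel folks started building with it after this episode.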
A better solution would be for compilers to emit an error message when a pointer that has not been assigned a value is dereferenced - as in better, more strongly typed languages like Ada.
It would be nice, tho, if all compilers based on the gcc/GNU toolchain could somehow emit a log of “this is what I did in the optimization phase” with the original source (not just the generated asm code) so that the programmer had some clue of changes being made to the code. In the old days, when the PL/I optimizing compiler was all the rage on IBM machines (the late 70’s and early 80’s), we had a joke “Does it.... or doesn’t it? Only your PL/I OPT compiler knows for sure” — because the damn compiler re-wrote so much code on the fly for you.
Another hyper-optimizing compiler that left more than one person scratching their head was the FORTRAN H compiler on IBM machines - and BLISS-32 on VAX systems.
They ended up translating the entire Ada program into assembly by hand.
Those (stack/heap overflow errors) are easily prevented as well... by using a language that uses bounds-checking.
That was indeed a problem in the very early 80’s Ada compilers. That’s certainly not a problem now. There’s now an Ada compiler built on the GCC back-end that works quite nicely and there is active work on it all the time.
The Boeing 777 FMS is written in Ada. They started with C++ and Ada, and by the time they were about 40% of the way into the coding, they realized that the reliability of the Ada code was so much better that they canned the C++ version of the project and converted everyone over to Ada.
Ada got a bad rap from those early compilers, especially the NYU Ada/Ed compiler written in SETL, which was horribly slow (like three lines of source code compiled per minute of CPU time). This allowed an opening for crap like C++ to get a chokehold on the US software industry, with the attendant results we see today.
Ada is a structured, statically typed, imperative, and object-oriented high-level computer programming language, extended from Pascal and other languages. It was originally designed by a team led by Jean Ichbiah of CII Honeywell Bull under contract to the United States Department of Defense (DoD) from 1977 to 1983 to supersede the hundreds of programming languages then used by the DoD. Ada is strongly typed and compilers are validated for reliability in mission-critical applications, such as avionics software.
Well, that description isn't entirely accurate - it isn't really reflective of the origins of the language at all. Ada wasn't an outgrowth of Pascal. Modula and Modula-2 were, but not Ada.
Ada started (ie, "Ada-83") as a strongly typed language, that much is true. It was targeted at replacing the "Tower of Babel" in DoD software contracting, where there were huge systems written in COBOL, PL/I, CMS-2, JOVIAL and FORTRAN, among other languages.
There wasn't a whole lot of object-oriented support in the language at that time (ie, Ada-83). When the Ada specification was revisited in 1995 (ie, "Ada-95"), OOP support was added. As of Ada's next revision in 2005, the OOP facilities have been augmented even more.
One of the best things about Ada (from 1983 clear through to today) is that there are validation suites for the language implementations. This isn’t done just for mission-critical applications - the compilers and the run-time system can be validated against a test suite as a matter of course. C (and C++) stumbled along for a very long time with no formal standard, much less a verification suite of compiler behavior. For a long time, I remember having to deal with the issue of “is my target system a K&R C compiler, or ANSI C compiler?” And the result was that the code was littered with #if’s and #ifdef’s to deal with the differences.
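For anyone who never had the pleasure, the kind of thing that got littered through "portable" C looked roughly like this (parse_record is a made-up example; __STDC__ is the macro ANSI-conforming compilers define):

/* Declarations had to be given both ways... */
#ifdef __STDC__
extern int parse_record(const char *buf, int len);   /* ANSI prototype   */
#else
extern int parse_record();                           /* K&R: no arg info */
#endif

/* ...and so did the definitions. */
#ifdef __STDC__
int parse_record(const char *buf, int len)
#else
int parse_record(buf, len)
    char *buf;
    int len;
#endif
{
    return (buf != 0 && len > 0) ? 0 : -1;
}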
Then there were the fundamental issues of system architecture that landed squarely in the C programmer's lap. During the 80's, there were C compilers where "int" was 16 or 32 bits... and code written for PDP-11's would compile just spiffy on VAX systems where 'int' was 32 bits... and then little integer sign-extension and sizeof problems would start showing up here and there.
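A hypothetical example of the sort of 16-bit-int assumption that traveled quietly from the PDP-11 to 32-bit machines and then bit somebody:

#include <stdio.h>

int main(void)
{
    unsigned char high = 0xFF, low = 0xFF;
    int word = (high << 8) | low;    /* 0xFFFF */

    /* With a 16-bit int (PDP-11), 0xFFFF *is* -1 and this test passed.
     * With a 32-bit int (VAX and later), word is 65535 and it doesn't. */
    if (word == -1)
        printf("16-bit world: 0xFFFF compares equal to -1\n");
    else
        printf("32-bit world: word is %d, not -1\n", word);

    /* The sizeof cousin: malloc(n * 2) "because ints are two bytes"
     * instead of malloc(n * sizeof(int)). */
    return 0;
}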
But perhaps the single largest failing of C (and C++) was the null-terminated string. Using a data value as the “end of string” marker has led to more mistakes and mischief in code than any other mis-feature. Ada, like Pascal, Modula and other strongly typed languages, doesn’t pull this crap with a “null” to terminate a string. They have explicit string datatypes, and set explicit bounds on string sizes and reflect the actual size of a string in a length field in the internal string datatype.
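In C terms, the shape being praised here looks something like the sketch below (type and function names are invented for illustration): a length and a bound travel with the data, and no byte value is overloaded to mean "stop here."

#include <stdio.h>
#include <string.h>

#define BSTR_MAX 64

struct bounded_string {
    unsigned short len;      /* current length              */
    unsigned short max;      /* capacity, fixed at creation */
    char data[BSTR_MAX];     /* no terminator required      */
};

/* Copy with an explicit bound; truncates rather than running off the end. */
static void bstr_set(struct bounded_string *s, const char *src, size_t n)
{
    s->max = BSTR_MAX;
    s->len = (unsigned short)((n > s->max) ? s->max : n);
    memcpy(s->data, src, s->len);
}

int main(void)
{
    const char *msg = "no NUL byte decides where this ends";
    struct bounded_string s;

    bstr_set(&s, msg, strlen(msg));
    printf("%.*s (len=%u)\n", (int)s.len, s.data, s.len);
    return 0;
}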
The net-net effect of the strong modularity, interface typing, data typing, etc in Ada goes like this: you write some code. You then fight the Ada compiler for a rather long time to get your code to compile cleanly. Until you can get it to compile cleanly, it won't generate an object. You have to compile your interface definitions first, then compile your code that implements the interfaces, in that order. If you don't get agreement between the definition and the usage compilations, you fail. If you don't get your type usage correct, or the order of initialization correct, or size agreement correct, your compile fails. By the time you get all these issues swatted down, you're swearing a blue streak and getting heat from your manager that you still appear to be flailing about in basic coding errors. Not good. You're getting heat about this... and you wish you could use something else. Something quick and greasy, that gets your stupid manager off your back.
Then it finally compiles cleanly.... and then links, and lo, it pretty much works. First time. Kinda neat. If there are errors, you notice that the Ada run-time is barking about them — array boundary errors and the like. Moreover, the Ada exception facility allows you to create error handling inside your program to deal with data and run-time exceptions much more easily than C allows you to do so. You actually have a chance to do something reasonably intelligent before pitching the exception up to the OS, which would usually abort your program or puke up a stack trace all over the screen.
Compare this to C, where you can get a clean compile pretty easily. Then you fire up your program, and it crashes. You find the problems (invariably a pointer issue), fix it, re-link, re-launch, crash again further into the program. Lather, rinse, repeat.... again and again and again. After awhile, it seems as tho your program works.... just because it hasn’t crashed in awhile.
After 20+ years of slinging code, I've come to the conclusion that C (and especially C++) are going to be the undoing of the US technological infrastructure one day. The reason why C and C++ got such a foothold is partly the fault of the academics who were pitching languages like Pascal, Modula and early Ada. These languages in the 80's really didn't address real-world interface issues (ie, interfacing to hardware - eg, try writing a device driver in Ada-83, Pascal or Modula-2 — very difficult) and C just waltzed into these applications. After all, C, as it started on PDP-11's, was little more than a high-level assembler. I could look at C code on a PDP-11 and tell you, instruction for instruction, what the compiler was going to emit.
Pascal was really never intended to be a real-world programming language - Wirth always intended it to be a teaching language, a successor to Algol. The first Pascal compilers wouldn’t even allow you to save your object deck - you loaded a deck which was the Pascal compiler, then you loaded up your source deck behind the Pascal compiler deck and the whole thing compiled and ran your program in one shot, tossing your object deck when the run was done. Later Pascal compilers did allow you to save an object deck, but Pascal was lacking any language standard and the language definition lacked any facilities for interfacing with hardware. The follow-on language, Modula, fixed some of these issues, but by that time, C was already gaining a foothold.
The irony is that a suitable programming language with real-world interfaces and enough of what makes Modula and Ada very good large-system languages existed in the late 70’s — “Mesa” - the language used by PARC for their “D-machines” — was a very, very capable language. If Mesa had been published and made available by Xerox for people to study outside of PARC, it could very well have become the system programming and implementation language of choice for the US computer industry. Sadly, like much of the technology at PARC, Xerox sat on it, and even to this day, most of the world has never seen Mesa. I got to dabble in Mesa while helping my then girlfriend (later my wife) when she was working on the last of the D-machines at Xsoft in the early 90’s. When I finally saw Mesa and what it could do, I really wanted to see it sprung loose for use by the world at large - it was, IMO, superior to Modula, much simpler than Ada, but robust enough it could have been a huge improvement over C/C++ as a systems language.
The trouble is, now that we’re trying to create these software systems of almost unconstrained complexity, the C/C++ paradigm of “we’ll give you a very sharp knife — make sure you don’t cut yourself while using it as a screwdriver” doesn’t cut it any more in huge team projects. Languages like Ada, with its support for modules, interfaces, generics and the like, supports big software projects with a “cast of thousands” much better than C/C++ ever will. Trouble is, too many companies are now wed to the C/C++ cancer as their toolchain - for no other reason than that’s what their people know.
And so we get more and more unreliable s/w, with bigger and bigger consequences...
The way you critique C, I take it that the Objective C used by Apple is no better?
Objective-C IS better — than C++. Which is not a terribly high bar to exceed.
C++ is a language that cannot make up its mind what it is: a floor wax, dessert topping, engine cleaner or dietary supplement. You have OOP and programming by template in the same language, and mixing the two paradigms at the same time can result in seriously brain-twisting results. It is a very complicated language and yet in all the complexity, the language designers could not see fit to handle the #1 issue of languages like C++ — garbage collection and dynamic memory allocation. Why? Well, when we read their reasons, they sound a lot more like excuses.
Objective-C picks only one paradigm - OOP, and then does it well relative to C++: you can do only single inheritance (which is a positive, IMO), you can ask any object at any time “Hey, how did you get here?” and the language pushes late binding, which IMO is something that any OOPL should prefer. In Objective-C 2.0, there is a reasonably decent automatic garbage collection system - prior to that was a fairly mickey-mouse reference counting system.
The message system is simple and clean in Objective-C, they try to model the ideas on SmallTalk, which is a better OOPL than most of the “OOP” languages out there (by a long shot, really). If I had to pick one virtue of Objective-C over C++, it is that Objective-C is simpler; it doesn’t try to solve every programming problem in the world, it tries to solve a small set of issues ONE way, and then do it reasonably well.
So is it better? Yes - than C++. But then, in my opinion, C is better than C++. Does Objective-C solve the problems of C, ie, null-terminated strings, the conflation of arrays and pointers, the ability of programmers to botch pointer math, etc? No. Objective-C (as used by NeXT and then Apple) uses its own string class to avoid using C-style strings, but that's not a solution in Obj-C any more than the zillions of string classes are a solution in C++. The problem with null-terminated strings originates in C and the C run-time library, and the only way to *eliminate* this problem from Obj-C or C++ is to break compatibility with C and say "this language will NOT SUPPORT C-strings and you WILL NOT call C libraries" to prevent programmers from shooting themselves in the foot with this particular loaded gun.
C/C++ resulted in one more thing that has seriously compromised system reliability: the assumption of C-based run-time libraries as the de facto run-time library, regardless of the actual implementation language that calls them. This enshrines issues like null-terminated strings and lousy exception handling in stone. In ye good ol' days on a system like VAX/VMS, the run-time libraries had excellent support and were language-independent. Strings, for example, had a "descriptor" that said "this string is currently X long, can become at most M long, and is of type A, B, or C," where the various types were 'static string,' 'dynamic string,' etc. C-style strings could be modeled on these language-independent string-handling routines, albeit with a bunch of fiddling that C programmers really didn't like because the compiler didn't do it for them. The libraries would return errors and throw exceptions based on VMS' very good error/exception handling mechanism, which was also language-independent.
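For the curious, the descriptor idea maps to C roughly like this; the field and constant names below are paraphrased from the general VMS descriptor layout (length, type, class, pointer), not copied from the real system headers:

#include <stdint.h>

enum dsc_class {
    DSC_CLASS_STATIC  = 1,   /* fixed-length string          */
    DSC_CLASS_DYNAMIC = 2    /* string the RTL may resize    */
};

struct string_descriptor {
    uint16_t length;         /* how long the string is right now    */
    uint8_t  dtype;          /* data type code (character text...)  */
    uint8_t  dclass;         /* static, dynamic, and so on          */
    char    *pointer;        /* where the bytes actually live       */
};

/* Every language's RTL calls took one of these, so string length and
 * capacity were always explicit - no sentinel byte to walk off of. */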
Today, we see that the run-time libraries on most Unix-derived systems are standardized versions of the C RTL - with all of C’s problems like null-terminated strings and the atrocious error handling methods of C. It is a huge step backwards for the industry, IMO.
I certainly have to admit that hacking in C (and a bit in C++) made me a nice living and allowed me to retire from the industry early - so in some ways I’m looking a gift horse in the mouth. But as an American, concerned about American security and overall system reliability as people’s lives are increasingly dependent upon system reliability, I have to be honest and say “things don’t look good” and the reason why they don’t, the “let’s scrape all this down to bedrock and identify where we went wrong at the foundation” type of reason is C and C++.
Thanks for the tutorial...having fought with COBOL...I go back a long ways...
NVD. I enjoyed reading your discourses within this post. Brings back many memories of days bygone. Interesting to see Red Hat jumped on the issue at hand.
I agree 100% about how C has been perverted though. Don't get me started on C++.