Free Republic
Browse · Search
General/Chat
Topics · Post Article

No, it’s not always quicker to do things in memory (computer)
ITworld ^ | March 25, 2015 | Phil Johnson

Posted on 03/26/2015 8:27:11 PM PDT by Utilizer

It’s a commonly held belief among software developers that avoiding disk access in favor of doing as much work as possible in-memory will result in shorter runtimes. The growth of big data has made time-saving techniques such as performing operations in-memory more attractive than ever for programmers. New research, though, challenges the notion that in-memory operations are always faster than disk-access approaches and reinforces the need for developers to better understand system-level software.

These findings were recently presented by researchers from the University of Calgary and the University of British Columbia in a paper titled When In-Memory Computing is Slower than Heavy Disk Usage. They tested the assumption that working in-memory is necessarily faster than doing lots of disk writes using a simple example. Specifically, they compared the efficiency of alternative ways to create a 1MB string and write it to disk. An in-memory version concatenated strings of fixed sizes (first 1 byte, then 10, then 1,000, then 1,000,000 bytes) in-memory, then wrote the result to disk (a single write). The disk-only approach wrote the strings directly to disk (e.g., 1,000,000 writes of 1-byte strings, 100,000 writes of 10-byte strings, etc.).
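To make the comparison concrete, here is a minimal Python sketch of the two approaches described above. It is not the researchers' code, which the excerpt does not include; the function names, the temporary file, and the timing helper are assumptions made for illustration, and the timings will vary with the language runtime, the OS, and the disk.

```python
import os
import tempfile
import time

TOTAL_BYTES = 1_000_000  # target string size taken from the article


def build_in_memory_then_write(path, chunk):
    """Concatenate fixed-size chunks in memory, then issue a single write."""
    result = ""
    for _ in range(TOTAL_BYTES // len(chunk)):
        result += chunk
    with open(path, "w") as f:
        f.write(result)


def write_chunks_directly(path, chunk):
    """Issue one write() call per chunk and let the I/O layer do the buffering."""
    with open(path, "w") as f:
        for _ in range(TOTAL_BYTES // len(chunk)):
            f.write(chunk)


def timed(func, chunk_size):
    """Run func against a temporary file; return (seconds, resulting file size)."""
    chunk = "x" * chunk_size
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        start = time.perf_counter()
        func(path, chunk)
        elapsed = time.perf_counter() - start
        size = os.path.getsize(path)
    finally:
        os.remove(path)
    return elapsed, size


if __name__ == "__main__":
    # Chunk sizes listed in the article excerpt.
    for chunk_size in (1, 10, 1_000, 1_000_000):
        t_mem, _ = timed(build_in_memory_then_write, chunk_size)
        t_disk, _ = timed(write_chunks_directly, chunk_size)
        print(f"{chunk_size:>9} bytes/chunk: "
              f"in-memory {t_mem:.4f}s, direct writes {t_disk:.4f}s")
```

Note that the "direct" version still goes through the runtime's and the OS's write buffers, so it measures many write() calls rather than many physical disk operations.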

(Excerpt) Read more at itworld.com ...


TOPICS: Computers/Internet; Reference
KEYWORDS: computers; computing; disks; memory
Navigation: use the links below to view more comments.
first 1-20, 21-40, 41-60, 61-68 next last
Something to think about for the tech/programming lot.
1 posted on 03/26/2015 8:27:11 PM PDT by Utilizer
[ Post Reply | Private Reply | View Replies]

To: Utilizer

I barely know how to turn my computer on, but might a better title to this post be: Can you structure a problem that can be finished faster on disk than in-memory? It pays to be specific.


2 posted on 03/26/2015 8:35:10 PM PDT by Wingy
[ Post Reply | Private Reply | To 1 | View Replies]

To: Utilizer

Given that nearly all operating systems use virtual memory, all bets are off anyway.


3 posted on 03/26/2015 8:35:13 PM PDT by Squawk 8888 (Will steal your comments & post them on Twitter)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Utilizer

Makes sense. Instead of putting the string together in memory and writing it to disk when it’s completed, you are writing to the disk as it’s assembled. So you’re skipping a step.


4 posted on 03/26/2015 8:36:51 PM PDT by GMMC0987
[ Post Reply | Private Reply | To 1 | View Replies]

To: Utilizer

Interesting comments at article. Many saying that the code was poorly written.


5 posted on 03/26/2015 8:37:46 PM PDT by kosciusko51
[ Post Reply | Private Reply | To 1 | View Replies]

To: Utilizer

Cheating. If the objective is to build something on the disk, building it on the disk is going to be faster, duh.


6 posted on 03/26/2015 8:38:08 PM PDT by Paladin2
[ Post Reply | Private Reply | To 1 | View Replies]

To: GMMC0987

Not only that, but when writing to disk it’s actually going to a RAM buffer. So it’s almost as fast anyway.
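A small Python sketch can illustrate the buffering point. The CountingRaw class below is a hypothetical stand-in for a disk, invented for this example; it only counts how often the buffered layer actually hands data down. The 8192-byte buffer size is also an assumption, and a real OS adds its own caching underneath.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical 'device' that records how many writes actually reach it."""

    def __init__(self):
        super().__init__()
        self.write_calls = 0
        self.bytes_written = 0

    def writable(self):
        return True

    def write(self, data):
        self.write_calls += 1
        self.bytes_written += len(data)
        return len(data)  # report that everything was accepted


def count_device_writes(total_bytes, buffer_size=8192):
    """Write total_bytes one byte at a time through a RAM buffer."""
    raw = CountingRaw()
    buffered = io.BufferedWriter(raw, buffer_size=buffer_size)
    for _ in range(total_bytes):
        buffered.write(b"x")  # looks like a one-byte disk write to the program
    buffered.flush()
    return raw.write_calls, raw.bytes_written


if __name__ == "__main__":
    calls, written = count_device_writes(100_000)
    print(f"{written} bytes reached the device in {calls} write calls")
```

The program issues 100,000 one-byte writes, but the device sees only a handful of large ones, which is why the "disk-only" approach is mostly memory work.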


7 posted on 03/26/2015 8:38:54 PM PDT by Squawk 8888 (Will steal your comments & post them on Twitter)
[ Post Reply | Private Reply | To 4 | View Replies]

To: Utilizer

I can’t believe how retarded this is.


8 posted on 03/26/2015 8:44:16 PM PDT by Born to Conserve
[ Post Reply | Private Reply | To 1 | View Replies]

To: Utilizer
It's actually a pretty stupid example.

It makes no sense to concatenate the string in memory and then write it to disk, since in either case you will be writing the string sequentially to disk, anyway. Java, Python, C#, and other "managed" languages will always do this more slowly because their strings are immutable, which any decent coder knows.

Best approach: find out the allocation block size of a file on disk, pre-allocate one buffer of that size, write to that buffer in memory, and flush the whole block to disk when it's full; this avoids the penalty of zillions of memory allocations and garbage collections and writes a block of optimal size.

In most cases, just pre-allocating a moderately sized block of memory without knowing the best block size is good enough and may even be preferable, because the underlying OS is going to optimally block IO, and probably also cache that at a secondary level.

The key point is to avoid over-allocating managed objects, and again, most good coders know to do this, even if people writing stupid research papers don't...
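A rough Python sketch of the pre-allocated buffer approach described above. The helper names and the 4096-byte fallback are assumptions for illustration; st_blksize is only reported on some platforms, so the code falls back to the default when it is missing.

```python
import os


def preferred_block_size(directory, default=4096):
    """Best-effort guess at the filesystem's preferred I/O block size."""
    return getattr(os.stat(directory), "st_blksize", default) or default


def _write_all(f, data):
    """Keep calling write() until every byte of data has been accepted."""
    while len(data):
        written = f.write(data)
        data = data[written:]


def write_with_fixed_buffer(path, pieces, block_size=4096):
    """Copy byte strings into one reusable buffer, flushing each full block."""
    buf = bytearray(block_size)  # allocated once, reused for every block
    view = memoryview(buf)       # lets us slice without copying
    used = 0
    # buffering=0 turns off Python's own buffer so ours is the only one.
    with open(path, "wb", buffering=0) as f:
        for piece in pieces:
            offset = 0
            while offset < len(piece):
                n = min(block_size - used, len(piece) - offset)
                view[used:used + n] = piece[offset:offset + n]
                used += n
                offset += n
                if used == block_size:
                    _write_all(f, view)  # flush one full block
                    used = 0
        if used:
            _write_all(f, view[:used])   # flush the partial last block
```

Usage might look like `write_with_fixed_buffer("out.bin", (b"x" for _ in range(1_000_000)), preferred_block_size("."))`. In practice the default buffered file object already does something very similar, which fits the point that a moderately sized buffer is usually good enough.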

9 posted on 03/26/2015 8:47:16 PM PDT by FredZarguna (It looks just like a Telefunken U-47 -- with leather.)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Utilizer
Java and Python versions of the code were written..

The real issue isn't memory vs. disk, it's what the language you are using does to perform the string concatenation operation.

The fastest technique will be one that does string concatenation in memory while the disk write of the previous string section is completing, so that the disk latencies are used for string building. Oh, and of course the string concatenation code should be designed to run in cache and avoid any virtual memory paging or extra memory copy operations.

The key to performance is understanding how the system works, and writing code at a low enough level to be able to control how it interacts with the system. That's why C and C++ still get used.

The technique of "writing 1 byte at a time" to the disk is really just a way of utilizing the buffering present in the I/O system to queue up disk writes. All the interesting stuff is actually happening in memory; however, it's being done by clever system code written by people who understand how to get high performance.

A well written version of the string concatenation test should be able to write data to the disk as fast as the disk can write data.
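One way to sketch the overlap idea (build the next section while the previous one is being written) in Python is with a single background writer thread. This is only an illustration under assumptions: the function names and section sizes are made up, and it does nothing about cache residency or paging, which would need lower-level code.

```python
from concurrent.futures import ThreadPoolExecutor


def build_sections(total_bytes, section_size):
    """Yield the output in sections, each assembled from one-byte pieces."""
    remaining = total_bytes
    while remaining > 0:
        n = min(section_size, remaining)
        yield b"".join(b"x" for _ in range(n))  # stand-in for string building
        remaining -= n


def write_overlapped(path, sections):
    """Write section k in a worker thread while section k+1 is being built."""
    with open(path, "wb") as f, ThreadPoolExecutor(max_workers=1) as pool:
        pending = None
        for section in sections:   # the next section is built during this step
            if pending is not None:
                pending.result()   # wait for the previous write, surface errors
            pending = pool.submit(f.write, section)
        if pending is not None:
            pending.result()       # wait for the final write
```

For example, `write_overlapped("out.bin", build_sections(1_000_000, 65_536))`. Because at most one write is in flight at a time, the sections land in the file in order.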

10 posted on 03/26/2015 8:50:17 PM PDT by freeandfreezing
[ Post Reply | Private Reply | To 1 | View Replies]

To: Utilizer

I smell a bug.

That’s the way it was written, the test. It saw the flaw of some nature and then wrote a perfectly good set of conflicting code. We called them bugs and the people who exploit them hackers.

I’ve spent too many nights watching and analyzing processor bus activity on a logic analyzer, along with a profiler running in the OS, to believe that they didn’t just find a bug to exploit.


11 posted on 03/26/2015 8:52:00 PM PDT by Usagi_yo (If you're not leading, you're struggling to be relevant.)
[ Post Reply | Private Reply | To 1 | View Replies]

To: FredZarguna
A well written version of the string concatenation test should be able to write data to the disk as fast as the disk can write data.

See Fred's example above...

12 posted on 03/26/2015 8:53:11 PM PDT by freeandfreezing
[ Post Reply | Private Reply | To 9 | View Replies]

To: freeandfreezing

Assembly is still most efficient, especially if the action has to occur frequently in a system of modest capabilities.


13 posted on 03/26/2015 8:54:23 PM PDT by Paladin2
[ Post Reply | Private Reply | To 10 | View Replies]

To: Paladin2

The Story of Mel is still the best.


14 posted on 03/26/2015 8:56:32 PM PDT by Gideon7
[ Post Reply | Private Reply | To 13 | View Replies]

To: Gideon7

Thanks for the hint to history. Back in the day knowledge of hardware opportunities was widely used for system optimization, especially in real time systems. Hacking in HEX ruled...


15 posted on 03/26/2015 9:08:42 PM PDT by Paladin2
[ Post Reply | Private Reply | To 14 | View Replies]

To: Gideon7

How do you think we got to the Moon?


16 posted on 03/26/2015 9:09:38 PM PDT by Paladin2
[ Post Reply | Private Reply | To 14 | View Replies]

To: freeandfreezing

Yep. The VM paging code in Windows is highly optimized, going all the way back to David Cutler’s Windows NT in 1993 (and DEC VAX/VMS before that).

Generally the fastest way to write a file in Windows is to just call ::CreateFileMapping() and ::MapViewOfFile() and scribble away. You avoid the double buffering of ::WriteFile(), and the VM subsystem is smart enough to do readaheads and stride I/O too.
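The same idea can be sketched portably with Python's standard mmap module, which wraps the platform's memory-mapping facility. This is a minimal illustration rather than tuned Win32 code; the function name is made up, and it assumes the final size is known up front and is non-zero, since an empty file cannot be mapped.

```python
import mmap


def write_via_mmap(path, data):
    """Write data by mapping the file into memory and copying into the mapping."""
    size = len(data)
    if size == 0:
        raise ValueError("cannot memory-map an empty file")
    with open(path, "w+b") as f:
        f.truncate(size)  # the file must have its final size before mapping
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_WRITE) as m:
            m[:] = data   # "scribble away": plain memory writes
            m.flush()     # ask the OS to push dirty pages to the file
```

After `write_via_mmap("out.bin", b"x" * 1_000_000)` the file holds the data, and the paging machinery rather than explicit write calls decides when the bytes reach the disk.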


17 posted on 03/26/2015 9:09:42 PM PDT by Gideon7
[ Post Reply | Private Reply | To 10 | View Replies]

To: Paladin2

Or octal...


18 posted on 03/26/2015 9:10:15 PM PDT by Paladin2
[ Post Reply | Private Reply | To 15 | View Replies]

To: Wingy

I am required to post the article with the complete headline as originally printed, so as to not cause any problems with re-posts or searches.


19 posted on 03/26/2015 9:16:17 PM PDT by Utilizer (Bacon A'kbar! - In world today are only peaceful people, and the muzlims trying to kill them)
[ Post Reply | Private Reply | To 2 | View Replies]

To: Utilizer

Once you replace your mechanical hard drive with an SSD, you will then know what fast really is.

I bought a Samsung 850 EVO and wow!

I’m never going back.


20 posted on 03/26/2015 9:20:09 PM PDT by ltc8k6
[ Post Reply | Private Reply | To 1 | View Replies]



Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.


FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson