Free Republic

No, it’s not always quicker to do things in memory (computer)
ITworld ^ | March 25, 2015 | Phil Johnson

Posted on 03/26/2015 8:27:11 PM PDT by Utilizer

It’s a commonly held belief among software developers that avoiding disk access in favor of doing as much work as possible in-memory will result in shorter runtimes. The growth of big data has made time-saving techniques such as performing operations in-memory more attractive than ever for programmers. New research, though, challenges the notion that in-memory operations are always faster than disk-access approaches and reinforces the need for developers to better understand system-level software.

These findings were recently presented by researchers from the University of Calgary and the University of British Columbia in a paper titled When In-Memory Computing is Slower than Heavy Disk Usage. They tested the assumption that working in-memory is necessarily faster than doing lots of disk writes using a simple example. Specifically, they compared the efficiency of alternative ways to create a 1MB string and write it to disk. An in-memory version concatenated strings of fixed sizes (first 1 byte, then 10, then 1,000, then 1,000,000 bytes) in memory, then wrote the result to disk in a single write. The disk-only approach wrote the strings directly to disk (e.g., 1,000,000 writes of 1-byte strings, 100,000 writes of 10-byte strings, etc.).

(Excerpt) Read more at itworld.com ...
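
For readers who want to see the shape of the comparison, here is a rough Python sketch of the two approaches. It is not the researchers' code; the file name, chunk size, and timing loop are illustrative only.

import time

TOTAL = 1_000_000            # build a roughly 1 MB string, as in the test
CHUNK = b"x" * 10            # 10-byte pieces; the paper also tried 1, 1,000, and 1,000,000

def in_memory(path):
    # Concatenate everything in memory, then do a single disk write.
    data = b""
    for _ in range(TOTAL // len(CHUNK)):
        data += CHUNK        # repeated concatenation copies the growing string
    with open(path, "wb") as f:
        f.write(data)

def disk_only(path):
    # Write each small piece straight to the file, one call at a time.
    with open(path, "wb") as f:
        for _ in range(TOTAL // len(CHUNK)):
            f.write(CHUNK)

for fn in (in_memory, disk_only):
    start = time.perf_counter()
    fn("out.bin")
    print(fn.__name__, round(time.perf_counter() - start, 3), "seconds")

The small-write loop benefits from the runtime's own buffered I/O, while the naive concatenation recopies an ever-growing string, which is the effect the paper reports.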


TOPICS: Computers/Internet; Reference
KEYWORDS: computers; computing; disks; memory
To: Squawk 8888

Right. In the disk method, the disk driver is performing at least a partial concatenation before actually writing to disk. The disk driver has more efficient code for concatenation than the code generated by Java (no surprise there) or Python.


41 posted on 03/26/2015 9:50:40 PM PDT by AZLiberty (No tag today.)
[ Post Reply | Private Reply | To 7 | View Replies]

To: TheZMan
In my experience it varies a lot from shop to shop, but in the best I've seen it was about half, and in the worst it was as low as 1 out of 10.
42 posted on 03/26/2015 10:07:12 PM PDT by FredZarguna (It looks just like a Telefunken U-47 -- with leather.)
[ Post Reply | Private Reply | To 24 | View Replies]

To: Utilizer
reinforces the need for developers to better understand system-level software.

Yeah, right. Like that's gonna happen. The monkeys churning out code today probably think Big Endian and Little Endian is a children's book about Native Americans.

43 posted on 03/26/2015 10:21:43 PM PDT by BuckeyeTexan (There are those that break and bend. I'm the other kind. ~Steve Earle)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Squawk 8888

Amen to that! Just a bunch of monkeys.


44 posted on 03/26/2015 10:26:44 PM PDT by BuckeyeTexan (There are those that break and bend. I'm the other kind. ~Steve Earle)
[ Post Reply | Private Reply | To 35 | View Replies]

To: TheZMan

I'd go with 10%.


45 posted on 03/26/2015 10:33:38 PM PDT by Paladin2
[ Post Reply | Private Reply | To 24 | View Replies]

To: Squawk 8888

They leave HDDs in their dust:

http://techreport.com/r.x/samsung-850evo/crystal-read.gif

http://techreport.com/r.x/samsung-850evo/crystal-write.gif

http://techreport.com/r.x/samsung-850evo/db2-read.gif

http://techreport.com/r.x/samsung-850evo/db2-write.gif


46 posted on 03/26/2015 10:39:51 PM PDT by ltc8k6
[ Post Reply | Private Reply | To 33 | View Replies]

To: Utilizer

Doing the operation in memory and then doing a single 1 MB write to disk is still FAR faster than 1 million 1-byte writes, followed by 100,000 10-byte writes, etc.

Even if the in-memory version were built a single byte at a time, it would equate to the 1 million 1-byte writes; the other write patterns would be slower.


47 posted on 03/26/2015 11:10:49 PM PDT by sten (fighting tyranny never goes out of style)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Utilizer
This problem is simple.

Generally, you should set a buffer size of 4-16K and format your app's output directly into the buffer, if possible. You may wish to use multiple 4-16K buffers, so that you are writing into the current buffer while one of your past buffers is being transferred to disk asynchronously. When you fill the current buffer, it should be queued for output, and you should switch your output-formatting activity to scribble on a previous buffer which has already been written. When you are done, you should remember to queue your final buffer for output and wait until all buffers have been written. Then please close the file.

The optimal buffer size and number of buffers should be determined by experiment.
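
For illustration, here is a minimal Python sketch of the double-buffering scheme described above; the chunk source, buffer size, and file name are assumptions, and a background writer thread stands in for true asynchronous I/O.

import queue
import threading

BUF_SIZE = 8 * 1024          # in the 4-16K range suggested above; tune by experiment
NUM_BUFFERS = 2              # how many filled buffers may be queued at once

def writer(path, q):
    # Background thread: drain filled buffers to disk as they arrive.
    with open(path, "wb") as f:
        while True:
            buf = q.get()
            if buf is None:           # sentinel: producer is finished
                break
            f.write(buf)

def produce(path, chunks):
    # chunks: any iterable of bytes (the app's formatted output)
    q = queue.Queue(maxsize=NUM_BUFFERS)
    t = threading.Thread(target=writer, args=(path, q))
    t.start()
    current = bytearray()
    for chunk in chunks:
        current += chunk
        if len(current) >= BUF_SIZE:  # current buffer is full: queue it for output
            q.put(bytes(current))
            current = bytearray()
    if current:                       # remember to queue the final partial buffer
        q.put(bytes(current))
    q.put(None)
    t.join()                          # wait until all buffers have been written

# Example: the article's worst case, a million 1-byte pieces
produce("out.bin", (b"x" for _ in range(1_000_000)))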

48 posted on 03/26/2015 11:45:38 PM PDT by cynwoody
[ Post Reply | Private Reply | To 1 | View Replies]

To: cynwoody

Thanks for the tip, mate. :)


49 posted on 03/26/2015 11:55:09 PM PDT by Utilizer (Bacon A'kbar! - In world today are only peaceful people, and the muzlims trying to kill them)
[ Post Reply | Private Reply | To 48 | View Replies]

To: AZLiberty
In the disk method, the disk driver is performing at least a partial concatenation before actually writing to disk.

Disk drivers don't know squat about concatenation. They just know about "write this block of memory to this chunk of disk blocks". Of course, what happens next will depend on whether the HD is buffered or whether it's not an HD but an SSD, etc.

50 posted on 03/26/2015 11:56:15 PM PDT by cynwoody
[ Post Reply | Private Reply | To 41 | View Replies]

To: Wingy
Can you structure a problem that can be finished faster on disk than in-memory?

Are you talking 1970 or 2015?

If the problem fits in memory, then it can be solved in memory far faster than on disk. If not, then you need a strategy that takes the disparity of access times into account.

E.g., if the problem is sorting the donor file, then you need some sort of algorithm in which sorted subsets are written to disk, then read in and merged, written out again, until you end up with sorted output. Of course, if it's 2015, you just read in the damned file and sort it! Done!

In 2015, your laptop or your smartphone likely has way more RAM than a major glassed-in, raised-floor computer installation of the 1970s or 1980s had in RAM and disk combined.
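
For illustration, a minimal Python sketch of that sort-then-merge (external merge sort) approach; it assumes one newline-terminated record per line, and the run size and file names are illustrative, not from the post.

import heapq
import os
import tempfile

def external_sort(in_path, out_path, max_lines_in_memory=100_000):
    run_paths = []
    # Pass 1: sort chunks that fit in memory, writing each out as a sorted "run".
    with open(in_path) as f:
        while True:
            lines = [line for _, line in zip(range(max_lines_in_memory), f)]
            if not lines:
                break
            lines.sort()
            fd, path = tempfile.mkstemp(text=True)
            with os.fdopen(fd, "w") as run:
                run.writelines(lines)
            run_paths.append(path)
    # Pass 2: read the sorted runs back and merge them into the final output.
    runs = [open(p) for p in run_paths]
    with open(out_path, "w") as out:
        out.writelines(heapq.merge(*runs))
    for r in runs:
        r.close()
    for p in run_paths:
        os.remove(p)

external_sort("donors.txt", "donors_sorted.txt")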

51 posted on 03/27/2015 12:22:26 AM PDT by cynwoody
[ Post Reply | Private Reply | To 2 | View Replies]

To: Utilizer
I am required to post the article with the complete headline as originally printed, so as to not cause any problems with re-posts or searches.

Sorry, but my comment was not directed at you personally, but at the author of the original article. My point being that the premise is false except for contrived tests that use memory at its most inefficient and optimize for the hard drive.

52 posted on 03/27/2015 2:50:18 AM PDT by Wingy
[ Post Reply | Private Reply | To 19 | View Replies]

To: rdb3; Calvinist_Dark_Lord; JosephW; Only1choice____Freedom; amigatec; Ernest_at_the_Beach; ...

53 posted on 03/27/2015 4:13:00 AM PDT by ShadowAce (Linux -- The Ultimate Windows Service Pack)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Utilizer
Performance testing that does disk I/O can produce results in the test that don't translate in production because of disk contention from other processes that may be running alongside the process in production.

One million single writes to disk can be a much different proposition if the test has the disk all to itself than it is on a busy system where every write operation can potentially have to get queued and wait for some other process to release the disk channel.

IMHO

54 posted on 03/27/2015 4:27:38 AM PDT by tacticalogic ("Oh, bother!" said Pooh, as he chambered his last round.)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Squawk 8888; FredZarguna; Paladin2

“25 years ... half a dozen”
“half at best, 1 out of 10 at worst”
“10%”

Thanks for the replies. It’s nice to know I’m not the only one in this camp. My answer is 10%.

If I were a bit more of an entrepreneur, these are the markers I would look for when hiring programmers, because within the 10% the answer tends to be ... 10%, which tells me there is a block of ability that “thinks this way”.

Alas, I work for a living ~


55 posted on 03/27/2015 5:52:40 AM PDT by TheZMan (I am a secessionist.)
[ Post Reply | Private Reply | To 35 | View Replies]

To: Utilizer
Unless you're specifically issuing calls to flush the write to disk without delay, under modern OSes the writes are likely being cached anyway.

I'm sure it is possible to construct very narrowly tailored circumstances where what they are describing makes sense, but it's such an artificial construct that it's not really useful. It's simply a reminder to never use the word 'never'.
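
For illustration, a small Python sketch of that point; the file name is made up. Without an explicit flush and fsync, a "write to disk" typically lands in library and OS buffers first.

import os

with open("example.dat", "wb") as f:
    f.write(b"x" * 1_000_000)   # usually lands in user-space and kernel buffers, not the platter
    f.flush()                   # push Python's user-space buffer to the kernel
    os.fsync(f.fileno())        # ask the kernel to push its page cache out to the device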

56 posted on 03/27/2015 6:45:50 AM PDT by zeugma ( The Clintons Could Find a Loophole in a Stop Sign)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Utilizer
This is crap.

It only proves that you can design a test to do stupid things that don't really apply in the real world.

First and foremost is the fact that memory is everything. In order for a process to write to disk, it must first put that data in a buffer, which is (gasp) MEMORY. In most modern, enterprise-level systems, there is a ton of cache (more memory) sitting in the disk subsystem to receive the data from the operating system prior to it being written to disk.

Let's see them run an application or database doing real-world work and see how their theory holds up. I got $100 that says "not very well."

57 posted on 03/27/2015 7:06:08 AM PDT by BlueMondaySkipper (Involuntarily subsidizing the parasite class since 1981)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Utilizer
Sorry, but the example is kind of silly.

In simple terms, the test was to get a string of bits written to the disk in a given order.

So a one-step operation (write to the disk) is faster than a two-step operation (organize the bits in memory, then write to the disk)....

Gee, that's a shock... (/sarcasm off)

58 posted on 03/27/2015 9:32:46 AM PDT by tophat9000 (An Eye for an Eye, a Word for a Word...nothing more)
[ Post Reply | Private Reply | To 1 | View Replies]

To: kosciusko51

ANYTHING can be done faster in memory than on disk if the problem is properly stated and the program is properly constructed. That said, I can easily imagine situations where either approach could come out faster, given (essentially) unlimited space on disk but not in memory (a problem with “sparse matrices”, for example).


59 posted on 03/28/2015 9:11:05 PM PDT by AFPhys ((Praying for our troops, our citizens, that the Bible and Freedom become basis of the US law again))
[ Post Reply | Private Reply | To 5 | View Replies]

To: Born to Conserve

When performance means money (server time, server sizing, etc...), it can actually make sense to do these types of tests.


60 posted on 03/30/2015 7:54:06 PM PDT by mbj
[ Post Reply | Private Reply | To 8 | View Replies]



