Posted on 02/25/2007 10:01:58 AM PST by texas booster
Several folks have asked to learn more about the Folding@Home project. I will provide information here that talks about the science and the math behind F@H.
Above all, remember that F@H is about finding a cure to the diseases that take the lives and minds of our loved ones. It generates basic research into the causes of Alzheimer's, Parkinson's and BSE, among others.
F@H is an outgrowth of the Genome@Home project started back in the 90's. That project ended April 14, 2004 and Folding@Home is its offspring.
Folding@Home is a combination of Distributed Computing and Serious Math, sent out all over the world to about 190,000 contributors and 1,900,000 computers.
Check out these links to learn more about F@H:
http://folding.stanford.edu/results.html
http://folding.stanford.edu/papers.html
http://fah-web.stanford.edu/cgi-bin/allprojects
http://fahwiki.net/index.php/Runs%2C_Clones_and_Gens
...
First, let's review some basic physics. The key idea is that of a "trajectory." You might recall Newton's Second Law, F = ma, which means that the acceleration a (change in velocity) that a particle experiences is proportional (by its mass m) to the force F it experiences. This means that if we can catalog all the forces on a particle, we can determine its acceleration. If we know the acceleration, then we can use calculus to determine the particle's position as a function of time, for all time. The result is what's called a 'trajectory' -- a kind of map of where the particle has been and where it will be going. By the way, when I say 'particle,' I mean that we could perform this analysis on atoms, protein molecules, baseballs, the space shuttle, the Sun, or anything in between.
The analysis gets a lot harder the more particles there are in the system -- for instance, if you set up a system with the Earth and the Sun as two particles, experiencing each other's gravity, then you can solve Newton's Second Law very easily and write down a function which describes the position of the Sun and the Earth at all times. If you include the moon or other planets, then you can't write down functions like this, though you can solve Newton II numerically. This is what we do for FAH -- solve Newton II numerically for thousands of atoms, thousands of times, with one time step every femtosecond or so (that's "ten-to-the-minus-15" seconds). What we get is a trajectory for the protein atoms.
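To make "solve Newton II numerically" a bit more concrete, here is a minimal sketch in Python. It is not the FAH code: it uses velocity Verlet, a common textbook integrator, on a toy system of one particle on a spring, and the function name, spring constant and step size are all made up for illustration.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Step F = m*a forward in time; return (list of positions, final velocity)."""
    trajectory = [x]
    a = force(x) / mass                      # acceleration from the current force
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # advance the position one time step
        a_new = force(x) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # advance the velocity with the averaged acceleration
        a = a_new
        trajectory.append(x)
    return trajectory, v


# Toy system (an assumption for illustration): one particle on a spring, F = -k*x
k, m = 1.0, 1.0
traj, v_final = velocity_verlet(x=1.0, v=0.0, force=lambda x: -k * x,
                                mass=m, dt=0.01, n_steps=1000)

# Total energy should stay close to its starting value of 0.5
energy = 0.5 * m * v_final ** 2 + 0.5 * k * traj[-1] ** 2
print(len(traj), energy)
```

The list of positions is the "trajectory" in the sense used above. A protein simulation does the same kind of stepping, only with thousands of atoms and far more complicated forces.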
If we're simulating protein folding, then perhaps the trajectory will result in a folded protein. Perhaps not -- we don't have a way to say for sure how this happens for an arbitrary starting conformation. (But we're studying it, obviously, thanks to our Army of Undea -- oops, I mean FAH clients. The Army of Undead is for a different project entirely.)
Now, on my desktop machine at work, I can simulate a system of about 16,000 atoms moving for 1 nanosecond (ns, or "ten-to-the-minus-9" seconds) in one day. But the protein that I'm folding requires (on average) one microsecond ("ten-to-the-minus-6" seconds) to fold -- and this is a system engineered to fold fast. To get to one microsecond on my desktop machine, I'd have to fold for 1,000 days. Forget about "average" proteins, which might take hundreds of microseconds, or milliseconds, to fold.
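The arithmetic behind those numbers, as a quick Python sketch. The 1 ns per day rate and the folding times come from the paragraph above; the function name is just for illustration.

```python
NS_PER_DAY = 1.0  # desktop throughput quoted above: about 1 ns of simulation per day


def days_needed(target_ns, ns_per_day=NS_PER_DAY):
    """Wall-clock days for ONE machine to simulate target_ns of a single trajectory."""
    return target_ns / ns_per_day


fast_folder = days_needed(1_000)         # 1 microsecond = 1,000 ns
slow_folder = days_needed(100_000)       # 100 microseconds
very_slow   = days_needed(1_000_000)     # 1 millisecond

print(fast_folder)            # 1000.0 days
print(slow_folder / 365.25)   # about 274 years
print(very_slow / 365.25)     # a few thousand years
```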
Maybe I'd get lucky and the protein would fold in that time; maybe I wouldn't, and they'd find me 35 years later, in some sub-subbasement below the chemistry building at Stanford, a raving lunatic lost to the dregs of Ph.D. research, sneaking out only at night to feed on spilled yeast extract and collecting discarded NMR tubes to wear as primitive jewelry. (I heard this happened to a guy.)
To avoid life-wasting tragedy, we (and when I say "we" I mean, "Someone besides me, but who I know") have recruited hundreds of thousands of generous and interested persons ("you guys") to give us a hand with some of this work. I could run a trajectory for 1,000 days, but instead we've taken a shortcut and decided to run 1,000 or 10,000 or 100,000 trajectories for a few days (or months or years) instead. On average, a few of these trajectories will result in a folded protein (and we have ways of yielding interesting and important information from all of the work done on FAH).
Okay, here it is: The CLONE numbers are labels for each trajectory that we run. Each GENeration is another chunk of time along that trajectory. So, say that I benchmark CLONE0, GEN0 (the first 4 ns). That WU is then done, and the FAH software builds a new WU with starting coordinates (and velocities and stuff) where mine left off. Then the new WU -- GEN1 of CLONE0 -- gets sent to you, and you simulate the next 4 ns. And so on. So CLONE is a label for an individual trajectory, and GENerations are time steps along that trajectory.
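Here is one way to picture the hand-off from one GEN to the next, as a small Python sketch. The class and field names are hypothetical and say nothing about how the real FAH server software stores things; the only idea taken from the text is that the next WU starts where the previous one left off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkUnit:
    run: int
    clone: int
    gen: int
    start_state: tuple  # stand-in for the coordinates, velocities "and stuff"


def build_next(wu, end_state):
    """Build the next GEN of the same CLONE, starting from where this WU ended."""
    return WorkUnit(wu.run, wu.clone, wu.gen + 1, end_state)


# CLONE0, GEN0 is crunched and comes back with some final state (made-up numbers)
first = WorkUnit(run=0, clone=0, gen=0, start_state=(0.0,))
second = build_next(first, end_state=(1.5,))

print(second)  # same RUN and CLONE, GEN 1, starting at the returned state
```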
RUNs are groups of similar CLONEs. All the CLONEs in a RUN have the exact same atoms, the exact same atom positions, the same temperature, etc. The difference is the starting velocities -- the initial motions of all the atoms in the protein are randomized. Although statistically the velocities are determined by the temperature, there are countless ways of partitioning the velocities to the atoms, so we try out 100 or so CLONEs to get a good feel for the sample space. Assigning different velocity sets to the atoms turns out to be wildly important: if the conformation we start with happens to represent the transition state (sort of halfway from folded, halfway from unfolded) then about 50 of our 100 CLONEs will fold, and about 50 won't.
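For the curious, a rough sketch of what "same positions, different random velocities" could look like in Python. It assumes the usual textbook approach of drawing each velocity component from a Maxwell-Boltzmann (Gaussian) distribution at the chosen temperature; the function name, the all-carbon toy "protein" and the use of the CLONE number as random seed are assumptions for illustration, not how FAH actually does it.

```python
import random

K_B = 1.380649e-23        # Boltzmann constant, J/K
CARBON_MASS = 1.9926e-26  # kg, approximate mass of one carbon-12 atom


def random_velocities(masses_kg, temperature_k, seed):
    """Draw one (vx, vy, vz) per atom from a Maxwell-Boltzmann distribution."""
    rng = random.Random(seed)
    velocities = []
    for m in masses_kg:
        sigma = (K_B * temperature_k / m) ** 0.5  # spread of each component, m/s
        velocities.append(tuple(rng.gauss(0.0, sigma) for _ in range(3)))
    return velocities


# Toy "protein": 1,000 identical atoms; 100 CLONEs that differ only in the seed
masses = [CARBON_MASS] * 1000
clones = {clone_id: random_velocities(masses, 300.0, seed=clone_id)
          for clone_id in range(100)}

# Average kinetic energy per atom should come out near (3/2) * k_B * T
mean_ke = sum(0.5 * CARBON_MASS * (vx * vx + vy * vy + vz * vz)
              for vx, vy, vz in clones[0]) / len(masses)
print(mean_ke / (1.5 * K_B * 300.0))  # close to 1
```

Every CLONE gets the same temperature on average but a different set of individual atom velocities, which is the point of the paragraph above.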
The different RUNs in a PROJect might, in their simplest form, represent different starting conformations. So, we could start off 100 RUNs of different partially unfolded structures and try to find the one for which half of its CLONEs fold -- then that RUN has the conformation of a representative of the transition state.
So why is this transition state doohickey so important? The folded state is relatively easy to identify, especially if experimentalists have determined the structure for the protein under scrutiny, or for a very similar one. The "unfolded state" is a bit harder, but we can generate unfolded conformations by, say, simulating the folded protein at high temperatures so it "melts," or we can thread the amino acid sequence on a set of randomly coiled noodles, or whatever. But the path which connects "unfolded" protein with folded protein is not so easy to get to -- but if we identify the transition state, then we've found (at least one of) the paths by which proteins fold, and that's research in protein folding.
The RUNs might also represent slightly different proteins -- for instance, different mutants of some protein. They might represent other things that I haven't thought of, but whatever they are they are similar enough to other RUNs in the same PROJect, that, well, they're part of the same project.
So to summarize, when I'm setting up a project, I might do the following:
1. Pick 100 different unfolded or partially unfolded conformations of my protein of interest. These become my RUNs.
2. Then, I set up 100 different CLONEs for each RUN. (Well, I don't actually set them up myself, I just run a program. But I run it really well. And intelligently. And I look good doing it.) Each CLONE contains one WU at this point.
3. Then, I let the (100 RUNs) x (100 CLONEs) = 10,000 WUs loose on the world ("you guys").
4. Then, I go have lunch.
5. I come back weeks later to find WUs crunched and GENerations progressing -- each of the original 10,000 WUs was the beginning of one trajectory, so at the end, I have 10,000 trajectories of 50 or 100 or more ns.
6. Finally, I sift through the data and learn something new about protein folding!
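The bookkeeping in that summary can be sketched in a few lines of Python. The counts (100 RUNs, 100 CLONEs, 50 ns per trajectory) are taken from the steps above; the variable names are only illustrative.

```python
from itertools import product

N_RUNS, N_CLONES = 100, 100
NS_PER_TRAJECTORY = 50  # the low end of "50 or 100 or more ns"

# One starting work unit (GEN 0) for every (RUN, CLONE) pair
starting_wus = [(run, clone, 0)
                for run, clone in product(range(N_RUNS), range(N_CLONES))]

n_trajectories = len(starting_wus)
total_ns = n_trajectories * NS_PER_TRAJECTORY

print(n_trajectories)      # 10000
print(total_ns / 1_000)    # 500.0 microseconds of simulation, summed over all trajectories
```

Note that the total is spread over many short trajectories rather than one long one, which is exactly the shortcut described earlier.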
And so it goes. I'm still new at this, so I haven't actually done steps 4, 5, or 6 yet, but I've got a good handle on 1, 2, and 3, and now it's a matter of waiting (and doing 1, 2, and 3 a lot more).
...
Bruce has just correctly pointed out to me that this isn't always true (although it's true nearly all of the time). In some instances -- when different trajectories are made to interact -- the "next generation" can't be built until all the other CLONEs have returned WUs of the same generation.
This happens for instance when doing "Replica Exchange Molecular Dynamics," for which the different CLONEs would be trajectories run at different temperatures (at least I think this is how it works ...). Sometimes, the atom coordinates between different trajectories need to be swapped in REMD, and hence you need to wait for the CLONEs to all have generation n finished to build GEN n+1 WUs.
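A tiny Python sketch of that waiting rule, under the assumption that the server simply tracks which (CLONE, GEN) results have come back. The function and variable names are hypothetical; the actual swap step of REMD is not shown.

```python
def ready_for_next_gen(returned, n_clones, gen):
    """True only when every CLONE has returned its WU for generation `gen`."""
    return all((clone, gen) in returned for clone in range(n_clones))


# Four CLONEs; CLONE 3 has not yet returned GEN 0
returned = {(0, 0), (1, 0), (2, 0)}
waiting = ready_for_next_gen(returned, n_clones=4, gen=0)

returned.add((3, 0))
go = ready_for_next_gen(returned, n_clones=4, gen=0)

print(waiting, go)  # False True
```

One slow machine holds up every CLONE in the set, which is one reason this kind of run is a poor fit for a volunteer network.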
I think. Try
http://folding.stanford.edu/papers/rhee_MREMD_2003biophys.pdf
(hope I got that right). In the end, AIUI, doing REMD with FAH is a pain compared to just doing it on a supercomputer -- we'd rather use FAH for its strengths ("a freaking lot of processors").
For those wanting to know more about the math behind F@H. I have tried to keep it simple.
Somebody better ping the math and science dudes and dudettes to this thread. My advanced math was 30 years ago and was average at best.
BTTT
= Greek
Sciencespeak for "Sh_t happens"
And I was really hoping to make it at least somewhat comprehensible!
Folding@Home is a project (like SETI) that uses our computers, when they are idle, to perform serious calculations for basic medical research.
About 200 regular FReepers are now part of the team and contribute nearly 1,050 systems to the effort.
Please hang around and we will help fill in the gaps.
The sad part is that I have heard of guys that work forever on their Ph.D. only to be rejected.
Wouldn't THAT be a bummer?
How does it work?: You download a safe, tested program (see link below) that is certified by Stanford University. It gets work from Stanford, runs calculations using your spare computer power, and sends the results back to the University.
Is it safe? Yes! Folding@Home rarely affects computer performance and won't compromise your privacy in any way. It only uses the computing power you aren't using so it doesn't slow down other programs.
How do I get started folding for Team FreeRepublic?:
1.) Download the folding program from Stanford University's folding download page (Folding@home Client Download). Type in your desired username.
2.) Type in 36120 for the team number. THIS IS VERY IMPORTANT - if you get the number wrong, you won't be folding for team FreeRepublic!
3.) The third question asks, "Launch automatically at machine startup, installing this as a service?" - We recommend you answer YES. Otherwise you will have to manually start the program after every reboot.
How can my computer help? Even if they were given exclusive access to all of the world's supercomputers, Stanford still wouldn't have as much processing power as they get from the supercluster of people's desktop systems Folding@home relies on. Modern supercomputers are essentially a cluster of hundreds of processors linked by fast networking. But Stanford needed the power of hundreds of thousands of processors, not just hundreds.
There's no reason to not get involved! It's free, easy, and you can know you're helping every minute without lifting a finger.
*******************************************
List of Relevant Folding Links
Why Fold - Watch This !!
Extreme Overclockers Stats for FreeRepublic
*******************************************
Competition (Not!!) Dummies ..Daily Kos
Dummie Folding Threads #7 #8 #9 #10 #11 #12
**************************************************
Other Useful Stuff - Links
How much are those work units worth? And what are they?
All Projects Listed
Point Summary for Workunits
Fahmon Third Party Monitoring Software
**************************************
Past FreeRepublic Folding threads
#1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17 #17 #18 #19 #20 #21 #22 #23 #24 #25 #26 #27 #28 #29 #30 #31
ping
(Long ago, in a galaxy far, far away.)
You have made grey_whiskers purr.
Cheers!
LOL...I got keyboard imprints on my forehead from falling asleep and resting my head while reading about the physics.
In any case, thanks for the post.
FOLD one for the GIPPER!
If you're interested in tracking your folding machine(s) over the web, please Freepmail me.
Available features include:
Bumpity-bump
TB,
Is it possible for the old thread to get a final message when a new thread is started? I usually find myself a few days behind on new threads as I keep the old one up consistently. Might also help other new people.
Many thanks in advance and check your six,
JosephW
F@H had a project that solved Schroedinger's equation last year on these proteins!
I still remember the basics but realize that I can not keep up with real math any more. Still, nice to stay in touch with the smart folks.
If you have a spare computer to toss into the effort please join us. It is really a great way to make use of the spare cycles on your system.
Better yet, get involved and become a beta tester here:
http://forum.folding-community.org/ftopic3045.html
Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.