Posted on 12/22/2014 8:25:08 AM PST by Academiadotorg
It wasn't until I took a graduate-level statistics course built entirely around random-sampling equations for polling that I really got it.
The ‘myth’ is that there are tons of jobs and not enough ‘qualified’ candidates. Part of the reason is that company HR departments care more about resume keywords and degrees than hiring people who produce results.
Putting HR in charge of technical hiring is like putting the government in charge of health care.
“...simply stolen.”
That’s what they’re best at. They steal your women, children, weapons, and ideas. Probably your goats, too.
A majority of pre-med students are either biology or chemistry majors. I'd say that a biology major is “useful” (although not essential) for the study of medicine, and med schools tend to agree with me.
And I'll ask one of your questions to you: For what is a general degree in physics useful?
Not everyone needs to solve problems like this, but if you do, recursion is one of the fastest ways to do it. At least that's how I'd tackle it.
A general physics degree is useful for a PhD. And my point was: the ratio of males to females in my PhD year was 11:1; the prior year had no women, and the subsequent year none either. Consequently, a statistic citing a general prevalence of female undergrads in physics would have no meaning. Neither does a BS in biology lead us to any conclusion about the prevalence of females in biological fields, because the degree is only meaningful as a precursor to either advanced study or (less likely) medical school.
Finally, if your post "was not intended to be snarky," then you're not very articulate. You don't begin a post accusing your readers of being clueless dinosaurs unless you either have a malicious intent or no social skills to speak of.
Not even close.
Go back and read the post to which I was replying. Anecdotal evidence from 30 years ago was dredged up to support an inference that women are underrepresented today in STEM subjects. I reported data from 2012 and 2013 that contradicts that claim for the STEM subjects I listed. I also acknowledged in my post to you that, indeed, some STEM majors have relatively few women.
Instead of criticizing my communication skills, you might question your own reading comprehension.
There isn’t a shortage of STEM workers.
There is a shortage of STEM workers willing to work for eight dollars an hour.
The poster made a claim about mathematically oriented engineering disciplines, and you posted a comment about biology, medicine, and chemistry, gratuitously assuming he "hadn't been on a college campus in 30 years," simply because he made reference to something that happened 30 years ago.
I think it's you who need to check your reading comprehension.
My post stands unrefuted. You took a gratuitous swipe at another FReeper and you've been called on it.
How big a list? I think I’d probably just go with a merge sort but as I’m a tester these days and haven’t written code in a good 6 years I’m probably forgetting something.
Sorreee!
Thirty years later I’m still recovering too....
The size of the input file is immaterial [that is actually a hint as to the correct answer, which I'll post in a moment.]
I think I'd probably just go with a merge sort
No.
Remember the problem definition calls for the FASTEST method. What is the FASTEST sorting method? [The answer is both provably correct in raw mathematical terms and just plain old common sense.]
Let me know when you want the answer.
I’ve used heapsort in the past. It is not as fast as quicksort, but it was easier to program quickly for me.
These days, I’d use the C++ STL list, and use the built-in sort() function...
The built-in C++ STL function is a modified version of Quicksort, which gets around its worst-case performance, and because Quicksort's inner loop is shorter than heapsort's, it's faster in practice [although both are O(n log n) overall].
Even so, that is not the way to do this problem. Here is a hint. The fastest sorting method there is is O(n). That's the one you use. What is it? And how do you use it here?
Sorry, I did get that wrong. It has been a while since I did my first one (and I constantly re-use old code), so I couldn't remember the speed/complexity issue. I think I got confused when I read that the STL uses quicksort, figuring, of course, they would use the faster algorithm.
So I guess coders have awful memories (re: your first post). Full disclosure: I am not a CS person; I'm an ME who does a lot of programming.
From what I have read, unless the data has particular limits on it, it is near impossible to beat O(n log n). Is there some restriction on the data you have that allows O(n)?
Ah, I've re-read your first e-mail. I was thinking of a generic algorithm.
If you are limited to the first 1 million numbers, you create an array of 1 million integers initialized to 0, and as you loop through the list, you increment the array at the location for that integer.
i.e., if the number is 1, you increment array item 1 (Fortran) or 0 (C/C++).
Once you have gone through the list, you loop through the array to determine the first in the list.
Correct?
The fastest general sort algorithm with a discrete set of elements [which you have here] is a "bucket sort." Typically, you implement it with either a) a lot of memory or b) a hash table.
In the present case, because you are sorting integers, a) and b) are the same thing, and you don't need much memory. Without duplicates, you need only 1 megabyte. With up to 2^64 duplicates per integer, you need at most eight megabytes. Your hashing function is nothing more than the index of the integer. Because the set is discrete, sorting is the same as storing an entry. Storing all n entries is O(n), and retrieving them is also O(n). This is easily provable to be the fastest possible method: you must read all the inputs, which is in and of itself O(n) and cannot be eliminated, so an O(n) sort is the fastest possible.
The common sense answer: when you need to file papers on different subjects, the fastest way to do that is by throwing them on the floor in piles. That's the essence of a bucket sort. When you pick up the piles, the sort is complete. All you need is a big enough floor.
Here are two solutions in C++. One if there are duplicates, one if there aren't.
// Sort from an input list containing a subset
// of the first million integers, with or without dupes.
#include &lt;iostream&gt;
#include &lt;cstring&gt;

#define MAXRANGE 1000000

// version with no dupes
void sortnodupes() {
    // assume the input is in stdin, for simplicity
    char *bucket = new char[MAXRANGE];
    memset(bucket, 0, MAXRANGE);
    int nextint;
    // read input
    while (std::cin >> nextint) {
        bucket[nextint] = 1;
    }
    // write output
    for (int i = 0; i < MAXRANGE; i++) {
        if (bucket[i])
            std::cout << i << '\n';
    }
    delete[] bucket;
}

// version with dupes
void sortwithdupes() {
    // assume the input is in stdin, for simplicity
    unsigned long *bucket = new unsigned long[MAXRANGE];
    memset(bucket, 0, MAXRANGE * sizeof(bucket[0]));
    int nextint;
    // read input
    while (std::cin >> nextint) {
        ++bucket[nextint];
    }
    // write output: emit each integer once per occurrence
    for (int i = 0; i < MAXRANGE; i++) {
        unsigned long ii = bucket[i];
        for (unsigned long j = 0; j < ii; j++)
            std::cout << i << '\n';
    }
    delete[] bucket;
}
We've even tried a VERY simple variation on this question, which says: you have ALL of the first million positive integers in a file. Write a program to sort them and write the sorted output. Occasionally, someone gets that right. The "sort" algorithm is so stupid it's laughable.
#include &lt;iostream&gt;

int main() {
    for (int i = 0; i < 1000000; i++)
        std::cout << i << '\n';
}
Oh, and I should mention, in response to JenB's question, if there are no duplicates, the input file size is immaterial. One megabyte of bucket will sort a file of any size. If there are duplicates, an unsigned long will sort a file of size 1 million x 2^64, which is roughly 16 million exabytes. I believe that is about 2000 times the currently estimated storage of all of the hard drives on earth. So in practical terms, there is no size limit, even with dupes...
Correct.