Here is what I am planning. I welcome suggestions:
Fundamentally I want to look into the distribution of "adverse outcomes" (death or hospitalization) based on manufacturer and lot.
Is one vaccine better or worse than the others? For this I'll graph total deaths, broken out as Moderna, Pfizer, or Janssen. That in itself doesn't tell us too much. We really need the death rates. For example, as of 28 Oct in the US there have been 245,237,611 Pfizer doses given, 156,602,727 Moderna, and 15,521,154 Janssen. Since the mRNA vaccines are two-dose, I'll have to cut those numbers in half to estimate people vaccinated. Yes, that assumes that most people who got at least one mRNA dose finished the series, and that there was negligible crossover. I'm also discounting booster doses.
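To make the denominator arithmetic concrete, here is a minimal sketch. The dose totals are the ones quoted above; the function names and the per-100,000 scaling are just my choices for illustration, and the halving carries the same assumptions (series completed, no crossover, no boosters).

```python
# Doses administered in the US as of 28 Oct (figures quoted in the post).
DOSES = {
    "Pfizer": 245_237_611,
    "Moderna": 156_602_727,
    "Janssen": 15_521_154,
}

# Assumption from the post: mRNA vaccines are two doses per person,
# with no crossover and no boosters; Janssen is a single dose.
DOSES_PER_PERSON = {"Pfizer": 2, "Moderna": 2, "Janssen": 1}


def people_vaccinated(manufacturer):
    """Estimate people vaccinated as doses divided by doses per person."""
    return DOSES[manufacturer] / DOSES_PER_PERSON[manufacturer]


def rate_per_100k(report_count, manufacturer):
    """Reports per 100,000 estimated vaccinated people (hypothetical helper)."""
    return report_count / people_vaccinated(manufacturer) * 100_000


if __name__ == "__main__":
    for name in DOSES:
        print(f"{name}: ~{people_vaccinated(name):,.0f} people")
```

The report counts per manufacturer would come from the data file; I haven't plugged any in here since I don't want to suggest numbers I haven't pulled yet.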
Is there more or less variability between vaccines? For this I'll look at total bad outcomes by lot, then get the standard deviation of those per-lot totals. If the bad outcomes are relatively random, then the lots should be relatively even in bad outcomes and the standard deviation across lots should be relatively small. Granted, I don't know if all lots are the same size. I don't know if some are still in use and thus the data is incomplete. We also don't know the distribution: did some go to relatively high-risk areas (e.g. older population, less healthy population, etc.)? So this may not tell us much for any one vaccine (Moderna/Pfizer/Janssen), but it will tell us if one stands out from the others, good or bad.
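A rough sketch of the per-lot spread calculation, using only the standard library. The records below are made-up placeholders (manufacturers "A"/"B" and the lot names are not real), and the coefficient of variation (SD divided by mean) is my own addition: since the manufacturers have very different volumes, a raw standard deviation may not compare well across them.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev

# Made-up placeholder records: one (manufacturer, lot) pair per report.
reports = [
    ("A", "lot1"), ("A", "lot1"), ("A", "lot2"), ("A", "lot2"),
    ("B", "lot9"), ("B", "lot9"), ("B", "lot9"), ("B", "lot9"),
    ("B", "lot8"),
]


def lot_spread(records):
    """Summarize how evenly reports are spread across each maker's lots."""
    per_lot = Counter(records)  # reports per (manufacturer, lot)
    by_maker = defaultdict(list)
    for (maker, _lot), n in per_lot.items():
        by_maker[maker].append(n)

    summary = {}
    for maker, counts in by_maker.items():
        m = mean(counts)
        sd = pstdev(counts)  # population SD over the lots seen in the data
        summary[maker] = {"lots": len(counts), "mean": m, "sd": sd, "cv": sd / m}
    return summary


result = lot_spread(reports)
for maker, stats in result.items():
    print(maker, stats)
```

One caveat with this approach: a lot with zero reports never shows up in the data, so the spread is computed only over lots that have at least one report.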
I'd like to know the relative rates of bad outcomes compared to other vaccines. For that I'll total all the covid vaccine data and get a rate based on total vaccinated people. Then I'll pull the reports for the other vaccines and see if I can find or estimate the total number of those vaccines given. I'm looking to see if the covid vaccines are better, worse, or about the same for rates of bad outcomes. I suspect they are worse - merely because they are new, rushed into use, and not refined or well understood. They are being recommended to everyone without regard (or knowledge, yet) of who should avoid them.
Then I want to look at the time delta between vaccination and onset of the bad outcome. This is available as days post vaccine. I'll pull out min, max, mean, and standard deviation. Mostly I want to see what percentage of bad outcomes occur within the 14 day window where the patient is considered "not fully vaccinated." Because you know those patients were admitted as un-vaxxed, and almost certainly counted as covid patients since they'd have antibodies. I just want to confirm that the 14 day window is a gift to the pro-vax types encompassing virtually all bad outcomes, accounting for them as un-vaxxed.
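The onset-delay summary could look something like this. The list of days is made-up placeholder data, and treating the window as days 0 through 13 (strictly less than 14) is an assumption about where the boundary falls; missing values are skipped.

```python
from statistics import mean, stdev

# Made-up placeholder values: days from vaccination to onset, None = missing.
onset_days = [0, 1, 1, 3, 7, 13, 14, 30, None]


def onset_summary(days, window=14):
    """Min, max, mean, sample SD, and share of onsets inside the window."""
    known = [d for d in days if d is not None]
    inside = sum(1 for d in known if d < window)  # assumes days 0..window-1
    return {
        "n": len(known),
        "min": min(known),
        "max": max(known),
        "mean": mean(known),
        "sd": stdev(known),
        "share_within_window": inside / len(known),
    }


summary = onset_summary(onset_days)
print(summary)
```

Changing `window` makes it easy to check how sensitive the percentage is to the cutoff, e.g. 7 days versus 14.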
Per the suggestion on inferring lot size I'll see if there is data available. I think it would also be interesting to get the date range of bad outcomes associated with each lot. That might also suggest how big the lot is/was - bigger lots would be in use longer while smaller ones would be exhausted quicker. Of course that assumes roughly equal rates of delivery. Maybe useful, maybe not. But when you're just starting out analyzing a pile of data, you poke around and look at anything interesting to see if it is useful.
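For the date range per lot, a small sketch along these lines would do. The lot names and dates are made-up placeholders, and the span in days is only a crude proxy for how long a lot was in use, under the same equal-delivery-rate assumption.

```python
from datetime import date

# Made-up placeholder records: (lot, date of the reported bad outcome).
records = [
    ("lotA", date(2021, 1, 5)),
    ("lotA", date(2021, 3, 1)),
    ("lotA", date(2021, 2, 10)),
    ("lotB", date(2021, 4, 1)),
]


def lot_date_ranges(rows):
    """Return {lot: (first_date, last_date, span_in_days)}."""
    ranges = {}
    for lot, day in rows:
        first, last = ranges.get(lot, (day, day))
        ranges[lot] = (min(first, day), max(last, day))
    return {
        lot: (first, last, (last - first).days)
        for lot, (first, last) in ranges.items()
    }


spans = lot_date_ranges(records)
for lot, (first, last, days) in spans.items():
    print(lot, first, last, days)
```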
Finally, I'll see if I can do these for both deaths (ultimate bad outcome) and "mere" ER visits or hospitalizations.
Fair amount of grovelling through data, but this is what I do. Heck, the base data file is only about 500 MB and a little over 600K entries. In my "day job" I routinely work with double-digit GB data sets with millions of records. Of course there I have a 64 core machine with 256 GB of memory and a RAID disk subsystem. I'm going to try this on my Raspberry Pi just for kicks... ;-)
When I looked a few months ago, I think two vaccines had a similar rate of adverse events (AE), and the other was different. The difference was big enough to look significant.
If there is a cumulative effect to the vaccines, the boosters might have higher AE rates. The boosters may not have been around long enough to have enough data.
Somebody ought to have a list of lot sizes. And knowledge of the manufacturing process. Whether they’ll talk is another story.
I did a rough and conservative estimate of the AE rate for these vaccines compared to other vaccines. If I remember correctly, I had 4 billion doses of the other vaccines. The Covid vaccines had AEs 6 times as often, which makes them less safe than the others.
I found clumping of AEs close to vaccination date. I focused on the first week. I would expect reporting to diminish over time, but the clumping was pronounced.
My Windows 7 machine was laboring with the Excel file I had.
I’d be delighted to hear what you come up with.