I just looked at Wikipedia, and according to it, GPUs and FPGAs in general have higher transistor counts than CPUs. I wasn’t expecting that.
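GPUs have lots of parallelism and on-chip memory, both of which inflate transistor counts cheaply: the same core gets replicated many times over, and an SRAM cell typically takes about six transistors per bit.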
FPGAs either won’t use a large number of the gates on the die (they’ll be disconnected during programming or simply ignored), or at best, the way an FPGA implements a logic circuit can be non-optimal. That’s the trade-off with FPGAs: you get field programmability (meaning you don’t need to design a custom chip and then get it fabbed), but you give up an optimized design.
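
To make the "non-optimal" part concrete, here is a rough Python sketch of how a lookup table (LUT), the usual FPGA logic element, would implement a plain 2-input AND. The helper names (`make_lut`, `lut_eval`) are made up for illustration, and the transistor figures are assumptions: six transistors per configuration bit, and a static CMOS AND built as a NAND plus an inverter. Real parts differ in the details.

```python
from itertools import product

def make_lut(func, k):
    """Return the 2**k configuration bits that make a k-input LUT compute func."""
    return [int(func(*bits)) for bits in product((0, 1), repeat=k)]

def lut_eval(table, inputs):
    """Evaluate a LUT: the input bits form an index into the stored bit table."""
    index = 0
    for bit in inputs:
        index = (index << 1) | bit
    return table[index]

# A 2-input AND mapped onto a 4-input LUT; inputs c and d are simply unused.
and_table = make_lut(lambda a, b, c, d: a & b, 4)

# Rough cost comparison (assumed figures, for illustration only).
TRANSISTORS_PER_CONFIG_BIT = 6   # assumption: one 6T SRAM cell per config bit
CMOS_AND2_TRANSISTORS = 6        # NAND2 (4 transistors) + inverter (2)

# Storage alone, not counting the multiplexer that selects the output bit.
lut_storage_transistors = len(and_table) * TRANSISTORS_PER_CONFIG_BIT

print(f"LUT config bits: {len(and_table)}")
print(f"LUT storage transistors (assumed): {lut_storage_transistors}")
print(f"Dedicated CMOS AND2 transistors: {CMOS_AND2_TRANSISTORS}")
```

Under those assumptions the LUT spends all 16 configuration bits on a function that a custom chip would build from a handful of transistors, which is the kind of overhead that pushes FPGA transistor counts up.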