consider giving room to 64 bit variables
My friend, I’m not sure what you are referring to here. The 32 bits I mentioned above come from ARGB32 having four channels, each in the range [0, 255], so each channel needs only 8 bits, which makes 32 bits in total. As for modern computers, they all have dedicated instructions for processing 8/16/32-bit data efficiently. If you look at the assembly, the instructions have suffixes for 8/16/32-bit data [1]. There is also a good video on this topic if you are interested [2].
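To make the arithmetic concrete, here is a minimal sketch in Python of how four 8-bit channels fit into a single 32-bit value. The function names `pack_argb32` and `unpack_argb32` are made up for illustration and are not part of GD, and placing alpha in the top byte is an assumption based on the usual ARGB32 convention.

```python
def pack_argb32(a, r, g, b):
    """Pack four 8-bit channels into one 32-bit integer.

    Assumed layout (hypothetical, not taken from GD): alpha in the top
    byte, then red, green, and blue in the lowest byte.
    """
    for channel in (a, r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be in [0, 255]")
    return (a << 24) | (r << 16) | (g << 8) | b


def unpack_argb32(pixel):
    """Split a 32-bit ARGB value back into its (a, r, g, b) channels."""
    return (
        (pixel >> 24) & 0xFF,
        (pixel >> 16) & 0xFF,
        (pixel >> 8) & 0xFF,
        pixel & 0xFF,
    )


pixel = pack_argb32(255, 18, 52, 86)
print(hex(pixel))          # 0xff123456
print(pixel.bit_length())  # 32
```

Even with every channel at its maximum of 255, the packed value is 0xFFFFFFFF, which still fits in 32 bits, so a 64-bit variable is not needed to hold one pixel.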
Anyway, I don’t think these low-level details matter to the problem at hand. Since we have settled all the details, maybe you could put together a PR? Let this be your first code change to GD.