The cap will act as a limit on bytes; if it is surpassed, the downscale occurs for that unique icon.
There is no difference between “limiting bytes” and “limiting image sizes”. Once an image is loaded into memory, it is represented as strides (rows) of pixels, so its memory footprint follows directly from its dimensions and pixel format. In fact, it can be calculated by hand. A 64×64 image stored in memory as ARGB32 consumes this much memory:
32 bits * 64 * 64 = 131072 bits = 16384 bytes = 16 KiB.
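The same arithmetic as a minimal Python sketch, which also shows the reverse direction: a byte cap implies a maximum image size. The function names `image_bytes` and `max_square_side` are made up for illustration, and the sketch assumes tightly packed rows (no stride padding) and a pixel format whose bit depth is a multiple of 8.

```python
import math


def image_bytes(width: int, height: int, bits_per_pixel: int = 32) -> int:
    """Bytes used by an in-memory image, assuming no row padding."""
    return width * height * bits_per_pixel // 8


def max_square_side(byte_cap: int, bits_per_pixel: int = 32) -> int:
    """Largest side length of a square image that fits within byte_cap."""
    bytes_per_pixel = bits_per_pixel // 8
    return math.isqrt(byte_cap // bytes_per_pixel)


print(image_bytes(64, 64))          # 16384 bytes, i.e. 16 KiB
print(max_square_side(16 * 1024))   # 64
```

So a cap of 16 KiB on ARGB32 data is, under these assumptions, the same thing as a cap of 64×64 pixels for a square icon.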
I think just downscaling to a fixed size is a simple and effective way to avoid all the trouble.