Light falls onto a sensor as irradiance, which has units of watts per square meter. A pixel has an area in square meters, and the light is collected for some number of seconds (the exposure time), so the signal (gray level) has units of energy. Just do the math on the units. Since a watt is a joule per second:
[joules/(sec * m^2)] * m^2 * sec = joules.
So one definition of energy could be to simply sum up all the gray levels:
energy = sum(grayImage(:));
Of course that's only a proportionality - the sum is proportional to the energy collected, but the value is not the number of joules directly.
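
Here is a minimal sketch of that sum and of the proportionality, using a tiny made-up image so it runs without any file. The variable joulesPerGrayLevel is a hypothetical calibration factor I invented for illustration. A real value would have to come from calibrating your own camera. I also cast to double before summing, just to be safe about the data type of the result with integer images.

```matlab
% Small synthetic 8-bit image so the sketch runs without reading a file.
grayImage = uint8([10 20; 30 40]);

% Cast to double first so the total is accumulated in floating point.
energy = sum(double(grayImage(:)));  % sum of all gray levels

% ASSUMPTION: hypothetical calibration factor in joules per gray level.
% This number is made up; it is not a property of any real sensor.
joulesPerGrayLevel = 1e-15;

% Scale the gray-level sum to get an estimate in joules.
energyInJoules = energy * joulesPerGrayLevel;
```

Without a calibration factor like that, the sum is still fine for comparing images taken with the same camera and settings, since the unknown constant is the same for all of them.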