So I read in the 3D performance page that "The less amount of different materials in the scene, the faster the rendering will be." and looked into baking/atlas textures in Blender.
It seemed fairly straightforward, so after converting an object with 8 PBR materials totalling 110 MB into an object with 1 baked PBR material at 12 MB, I assumed I would be able to measure a difference in performance (albeit a small one).
I loaded up a blank scene, created 45 dupes of each to crank up the impact, turned on some lights, and tested toggling the non-baked and baked versions of the models on and off:

This struck me as a bit of a surprise: draw calls went from ~2.5k+ to ~246 with basically no impact on the FPS. Similarly, shader changes, material changes and surface changes all dropped in line with the toggle.
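For reference, here's the back-of-the-envelope model I have in my head for how the draw call counter might scale. It assumes (and this is purely my assumption, not something I've confirmed in the docs) that each instance costs one draw call per material surface per render pass. The function name and the `passes` parameter are made up for illustration.

```python
def estimated_draw_calls(instances, surfaces_per_instance, passes=1):
    """Naive estimate: one draw call per surface, per instance, per pass.

    ASSUMPTION: this is a guess at how the counter behaves, not Godot's
    actual accounting. `passes` is a hypothetical stand-in for extra
    work such as shadow or light passes.
    """
    return instances * surfaces_per_instance * passes


# 45 duplicates of the 8-material model vs. 45 of the baked 1-material model
unbaked = estimated_draw_calls(instances=45, surfaces_per_instance=8)
baked = estimated_draw_calls(instances=45, surfaces_per_instance=1)

print(unbaked, baked, unbaked / baked)
```

Under that guess the ratio would be 8:1, which is at least in the same ballpark as the roughly 10:1 drop I measured (~2.5k vs ~246). The absolute numbers in the monitor are several times higher than this gives, which I'd guess comes from extra passes for the lights, but I haven't verified that.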
I guess my questions are:
Is this even a valid stress test when duplicating the models? Godot might just be reusing the same materials, so really I'm still comparing 8 vs 1, which is essentially negligible.
If (as I suspect) it's not, how CAN I test the impact? I don't intend to spend hours in Blender batching UV maps/textures up if the difference is essentially immeasurable, so I was hoping I could test what kind of performance gain to expect before trying. I couldn't find anyone else discussing it, but if anyone's done this before and has some numbers, that'd be boss.
If Godot IS just reusing materials, why does the debugger still register the changes, or am I just misinterpreting it? My incredibly poor understanding of what a draw call is suggests that if it were reusing materials for an identical object, the draw calls would stay the same when duplicating it (or does it still need to be 'joined' to be considered the same mesh)? Which leads me to:
Is there somewhere other than the official docs website that explains what all the debugging options mean? This strikes me as more than a little sparse:

Thanks