
From the perspective of using synthetic imagery to train machine vision systems, I think that fidelity (i.e., how closely synthetic images resemble real ones) is less than half the story, and has the potential to be dangerously misleading.

Of greater concern are quality measures that look across the entire dataset. Here are some hypothetical metrics which, although impossible to compute in practice, should help you think about the problem in the right way.

- How does the synthetic image manifold compare to the natural image manifold?

- Are there any points on the synthetic image manifold where the local dimensionality is significantly lower than at the corresponding point on the natural image manifold? (This would indicate an inability to generalise across that particular mode of variation in that part of feature space.)

- Are there any points on the synthetic image manifold where the distance to the natural image manifold is large AND the variance of the synthetic image manifold in the direction of that gap is small? (This would indicate an inability to generalise across the synthetic-to-real gap at that point on the manifold.)

- Does your synthetic data systematically capture the correlations that you wish your learning algorithm to learn?

- Does your synthetic data systematically eliminate the confounding correlations that may be present in nature but which do not necessarily indicate the presence of your target of interest?
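
The manifold questions above cannot be answered exactly, but a rough proxy can make them more concrete. The sketch below is only an illustration under stated assumptions: it assumes each image has already been mapped to a feature vector by some embedding model of your choosing, and it uses local PCA over nearest neighbours as a crude stand-in for "local dimensionality" and "variance in the direction of the gap". The function names, the neighbourhood size `k` and the `energy` threshold are all hypothetical choices, not established metrics.

```python
import numpy as np

def nearest_neighbours(points, query, k):
    """Return the k rows of `points` closest to `query` (Euclidean distance)."""
    dists = np.linalg.norm(points - query, axis=1)
    return points[np.argsort(dists)[:k]]

def local_dimension(points, query, k=50, energy=0.95):
    """Crude local-PCA proxy for local dimensionality: the number of
    principal components needed to explain `energy` of the variance
    among the k nearest neighbours of `query`."""
    nbrs = nearest_neighbours(points, query, k)
    # Eigenvalues of the neighbourhood covariance, largest first.
    evals = np.linalg.eigvalsh(np.cov(nbrs, rowvar=False))[::-1]
    evals = np.clip(evals, 0.0, None)  # guard against tiny negative values
    cumulative = np.cumsum(evals) / evals.sum()
    return int(np.searchsorted(cumulative, energy) + 1)

def gap_to_spread_ratio(synth, real, query, k=50):
    """Ratio of the local synthetic-to-real gap to the spread of the
    synthetic neighbourhood along the direction of that gap.
    A large value means the gap is big AND the synthetic data barely
    varies in that direction."""
    syn_nbrs = nearest_neighbours(synth, query, k)
    real_nbrs = nearest_neighbours(real, query, k)
    gap = real_nbrs.mean(axis=0) - syn_nbrs.mean(axis=0)
    gap_len = np.linalg.norm(gap)
    if gap_len == 0.0:
        return 0.0
    # Standard deviation of synthetic neighbours projected onto the gap.
    spread = np.std(syn_nbrs @ (gap / gap_len))
    return float(gap_len / (spread + 1e-12))
```

Comparing `local_dimension` on synthetic and real features around the same query point gives a hint about missing modes of variation, while a large `gap_to_spread_ratio` flags regions where the synthetic data sits far from the real data without varying towards it. Both numbers depend heavily on the embedding and on `k`, so they are better read as relative indicators than as measurements.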

Engineering with synthetic data is not data mining. It is much more akin to feature engineering.
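
In that spirit, the two correlation questions in the list can be checked directly against the generator's own metadata, since with synthetic data you know every scene parameter for every image. The snippet below is a minimal sketch, assuming you have a per-image numeric scene factor (for example a hypothetical background-brightness value) and a binary label; `label_correlation` is an illustrative helper, not part of any library.

```python
import numpy as np

def label_correlation(factor, labels):
    """Pearson correlation between a per-image scene factor and the label.

    A value near zero for a nuisance factor suggests the generator has
    decoupled it from the target; a large magnitude suggests a confound
    the model could learn instead of the target itself."""
    factor = np.asarray(factor, dtype=float)
    labels = np.asarray(labels, dtype=float)
    return float(np.corrcoef(factor, labels)[0, 1])
```

Running this over every nuisance factor the generator controls gives a quick table of which confounds have actually been randomised away. Pearson correlation only captures linear dependence, so a near-zero value is a necessary check rather than proof of independence.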