NeurIPS 2022 Workshop on Human Evaluation of Generative Models
Rapid advances in generative models for language and vision have made these models increasingly popular in both the public and private sectors. For example, governments use generative models such as chatbots to better serve citizens. It is therefore critical that we not only evaluate whether these models are safe enough to deploy, but also ensure that the evaluation systems themselves are reliable. Often, these models are evaluated by humans. Our goal is to call attention to the question of how to best perform reliable