Joined 1 year ago
Cake day: August 7th, 2023

  • This article grossly overstates the paper’s findings. It’s true that bad generated data hurts model performance, but the same is true of bad human-written data. The paper used OPT-125M as its generator, a very small research model whose outputs are fairly low quality and often incoherent. The higher-quality generated text that makes up the majority of generated text online is far less of an issue. Using generated data to improve output consistency is common practice for both text and image models.