• pirat@lemmy.world · 18 hours ago

      Because it has fewer parameters and (in some cases) it’s quantized. The hardware needed to run local inference on the full model just isn’t feasible for most people. Still, the release will likely have a wide impact on the quality of upcoming smaller models distilled from it, trained on synthetic data generated by it, merged with it, etc.
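
      To make the hardware point concrete, here’s a rough back-of-the-envelope sketch (the 70B parameter count is just a hypothetical example, not the actual model size): weight memory scales linearly with parameter count and bits per weight, which is why fewer parameters plus quantization is what brings a model within reach of consumer GPUs.

      ```python
      def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
          """Approximate memory for model weights alone, in GB.

          Ignores activations, KV cache, and runtime overhead, so real
          requirements are higher.
          """
          # params * bits / 8 = bytes; divide by 1e9 for GB
          return params_billions * bits_per_weight / 8

      # Hypothetical 70B-parameter model:
      print(weight_memory_gb(70, 16))  # fp16  -> 140.0 GB
      print(weight_memory_gb(70, 4))   # 4-bit -> 35.0 GB
      ```

      So a 4-bit quant cuts weight memory to a quarter of fp16, and a smaller distilled model shrinks it further still.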