• 0 Posts
  • 166 Comments
Joined 2 years ago
Cake day: July 18th, 2023







  • If SSNs are used as a primary key (a unique identifier for a row of data), then they’d have to be repeated in other tables, as foreign keys, to be able to merge the data together.

    However, even if they aren’t using SSN as an identifier because it’s sensitive information, it’s not uncommon to repeat data: for speed/performance, for simplicity in table design, because it sits in a lookup table, or because the tables are disconnected.

    Having a value repeated doesn’t tell you anything about fraud risk, efficiency, or really anything else. Using it as the primary piece of evidence for a claim isn’t a strong argument.
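To make the point about repeated keys concrete, here is a minimal sketch in plain Python. The two tables, the field names, the `join_on_ssn` and `ssn_counts` helpers, and all the values are made up for illustration and don't describe any real system. It only shows that a key which is unique in one table will naturally show up many times in another table and in any merged result.

```python
from collections import Counter

# Hypothetical toy tables; every name and value here is invented for illustration.
people = [
    {"ssn": "000-00-0001", "name": "Alice"},
    {"ssn": "000-00-0002", "name": "Bob"},
]
payments = [
    {"ssn": "000-00-0001", "month": "2024-01", "amount": 100},
    {"ssn": "000-00-0001", "month": "2024-02", "amount": 100},
    {"ssn": "000-00-0002", "month": "2024-01", "amount": 250},
]


def join_on_ssn(people, payments):
    """Merge each payment with the matching person, using SSN as the key."""
    by_ssn = {person["ssn"]: person for person in people}
    return [
        {**by_ssn[pay["ssn"]], **pay}
        for pay in payments
        if pay["ssn"] in by_ssn
    ]


def ssn_counts(rows):
    """Count how many rows each SSN appears in."""
    return Counter(row["ssn"] for row in rows)


joined = join_on_ssn(people, payments)
counts = ssn_counts(joined)

# The SSN is unique in `people`, but repeats in `payments` and in the join.
for ssn, n in sorted(counts.items()):
    print(ssn, n)
```

Each SSN appears once in `people`, yet Alice's appears twice after the merge simply because she has two payments. A raw count of repeated values looks the same whether the cause is ordinary table design or something suspicious.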









  • I mean, using proprietary data has been an issue with models for as long as I’ve worked in the space. It’s always been a mixture of open weights, open data, and open architecture.

    I admit that it became more obvious when images/videos/audio became more accessible, but everything from facial recognition to pose estimation has used proprietary datasets to build the models.

    So this isn’t a new issue, and from my perspective not an issue at all. We just need to acknowledge that not all elements of a model may be open.