• 0 Posts
  • 31 Comments
Joined 1 year ago
Cake day: July 8th, 2023

  • I figure since big tech spent quite a bit of money building those datasets and since they were built before the law, they will be able to keep using them as long as they don’t add anything new but I can’t be certain.

    This is a very weird assumption you are making, man. The quoted text you sent above pretty much says the opposite. It says everyone who wants to train their models with copyrighted data needs to get permission from the copyright holders. That is great for me, period. No one, not a big company nor the open source community, gets to steal the work of people producing art, code, etc. I honestly don’t get why you assume all the data scraped before would be exempt. Again, very weird assumption.

    As for ML algorithms having uses, of course they do. Hell, pretty much every company I have worked with has used them for decades. But take a look at the examples you provided. None of them requires you or your company to scrape a bunch of information from randoms on the internet. Especially not copyrighted art, literature, or code. And that’s the point here: you are acting like all of that stops with these laws, but that’s ridiculous.


  • So you are saying that content scraped before the law is fair game to train new models? If so, it’s fucking terrible. But again, I doubt this is the case since it would be against the interests of the big copyright holders. And if it’s not the case, you are just creating a storm in a glass of water, since this affects the companies too.

    As a side point, I’m really curious about LLM uses. As a programmer, the only useful product I have seen so far is Copilot and similar tools. And I ended up disabling the fucking thing because it produces too much garbage hahaha. But I’m the first to admit I haven’t been following this hype cycle hahahaha, so I’m really curious what the big things will be. You clearly know so much, so do you want to enlighten me?



  • My man, I think you are mixing up a lot of things. Let’s take it in parts.

    First, you are right that almost all websites get some copyright rights when you post on their platforms. At best, some license the content under Creative Commons or similar licenses. But that’s not new, it has been this way forever. If people are surprised that they are paying with their data at this point, I don’t know what to say hahaha. The change with this law would be that no one, big tech companies or open source, gets to use this content for free to train new models, right?

    Which brings me back to my previous question: this law applies to old data too, right? You say “new data is not needed” (which is not true for chat LLMs that want to include new data, for example), but old data is still needed to use the new methods or to curate the datasets. And most of this old data was acquired by ignoring copyright laws. What I get from this law is that no one, including these companies, gets to keep using this “illegally” acquired data now, right? I mean, I’m pretty sure this is the case since movie studios and similar are the ones pushing for this law; they will not go like “it’s ok you stole all our previous libraries, just don’t steal the new stuff” hahahaha.

    I do get your point that the most likely end result is that movie studios, record labels, social media platforms, etc., will just start selling the rights to train on their data, and the only companies who will be able to afford this are the big tech companies. But still, I think this is a net positive (weird times for me to be on the side of these awful companies hahaha).

    First of all, it means no one, including big tech companies, gets to steal content that is not theirs or given to them willingly. I’m particularly interested in open source code, but the same applies to indie art and any other form of art outside of the big companies. When we say that we want to stop the plagiarism, it’s not a joke. Tech companies are using LLMs to attack the open source community by stealing the code under the excuse of LLMs being transformative (bullshit of course). Any law that stops this is a positive to me.

    And second of all, consider the two futures we have in front of us. Option one is we get laws like this, forcing AI to comply with copyright law, which basically means we maintain the current status quo for intellectual property. Not great obviously, but the alternative is so much worse. Option two is we allow people to use LLMs to steal all the intellectual property they want, which puts an end to basically any market incentive to produce art by humans. Again, the current copyright system is awful. But why do you guys want a system where we as individuals have to keep complying with copyright, but any company can bypass that with an LLM? Or how do you guys think this is going to pan out if we just don’t regulate AI?


  • Maybe I’m missing something, but I don’t understand what you guys mean by “the river cannot be dammed”. The LLM models need to be retrained all the time to include new data and in general to get them to change their behavior in any way. Wouldn’t this bill apply to all these companies as soon as they retrain their models?

    I mean, I get the point that old models would be exempt from the law since laws can’t be retroactive. But I don’t get how that’s such a big deal. These companies would be stuck with old models if they refuse to train new ones. And as much hype as there is around AI, current models are still shit for the most part.

    Also, can you explain why you guys think this would stop open source models? I have always thought that the best solution to stop these fucking plagiarism machines was for the open source community to create an open source training set where people contribute their art/text/whatever. Does this law prevent this? Honestly, to me this panic sounds like people without any artistic talent wanted to steal the work of artists and are now mad they can’t do it.



  • What a load of BS hahahaha. LLMs are not conversation engines (wtf is that lol, more PR bullshit hahahaha). LLMs are just statistical autocomplete machines. Literally, they just predict the next token based on the previous tokens and their training data. Stop trying to make them more than they are.

    You can make them autocomplete a conversation and use them as chatbots, but they are not designed to be conversation engines hahahaha. You literally have to feed everything in the conversation, including the LLM’s own previous outputs, back to the LLM to get it to autocomplete a coherent conversation. And it’s only coherent if all you care about is shape. When you care about content, they are pathetically wrong all the time. It’s just a hack to create smoke and mirrors, and it only works because humans are great at anthropomorphizing machines, and objects, and …
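
    To make that concrete, here is a minimal sketch (the complete() function is a stand-in for whatever completion backend you like, not any real API) of how a “chatbot” is really just repeated autocomplete over the whole concatenated history:

    ```python
    # Minimal sketch of "chat as autocomplete". complete() is a stand-in for
    # whatever next-text completion model you plug in; it is not a real API.

    def complete(prompt: str) -> str:
        """Given the text so far, return the model's most likely continuation."""
        raise NotImplementedError  # plug in a completion backend here

    history = ""

    def chat(user_message: str) -> str:
        """One chat turn: append the user message, then let the model autocomplete."""
        global history
        # The ENTIRE conversation, including the model's own earlier replies,
        # is re-sent as plain text every turn; the model has no memory of its own.
        history += f"User: {user_message}\nAssistant:"
        reply = complete(history)
        history += f" {reply}\n"
        return reply
    ```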

    Then you go and compare ChatGPT to literally the worst search feature in Google. Like, have you met anyone using the “I’m Feeling Lucky” button in Google in the last 10 years? Don’t get me wrong, fuck Google and their abysmal search quality. But ChatGPT is not even close to being comparable to that, which is pathetic.

    And then you handwave the real issue with these stupid models when it comes to search results. Like getting 10 or so equally convincing, equally good-looking, equally full-of-bullshit answers from an LLM is equivalent to getting 10 links in a search engine hahahaha. Come on man, the way I filter search engine results is by the reputation of the linked sites, by looking at the content surrounding the “matched” text that google/bing/whatever shows, etc. None of that is available in an LLM output. You would just get 10 equally plausible answers, good luck telling them apart.

    I’m stopping here, but jesus christ. What a bunch of BS you are saying.




  • Hahahahahahahahahaha. Just like the crypto obsession gave us cheap crypto chips in all personal computers? Get real dude, this is just a way for them to steal even more data from people. Nothing good will come from this. Quite the opposite, we will see yet another price increase in hardware because all the AI dumbasses will hoard it yet again. Don’t forget these are the same morons that were pushing crypto and NFTs hahaha. These are people who believe in magic free money, and too bad they sometimes get it because our society is built by morons. I really can’t stand people looking for a positive in this disaster …


  • Nah dude. TPMs have always been about implementing DRM. These companies hate that they can’t control our PCs; they want to be sure we can only run their approved apps, like it works on iOS and (to a lesser degree, for now) on Android. And even there they are pushing hard for even more restrictive DRM.

    For example, some years ago I worked with a SaaS that implemented and sold some security products. One of our customers was a big retailer (for specialized products, not going into more details to avoid doxxing) that was having a problem with scalpers buying all their inventory as soon as they released it. So they were trying to put on a show for regulators about stopping scalpers because their customers were complaining. We suggested that the only real solution was to have some real-life verification of purchases. But in the end they went with the awful attestation APIs offered by Apple and Google to “fix” this. And in case you are not familiar, these APIs are just TPM-based DRM. So now, if you have a rooted/jailbroken phone, you can’t even buy from this retailer anymore.
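
    Just to illustrate the pattern (the verify_attestation() helper and verdict fields below are hypothetical stand-ins, not the real Play Integrity / App Attest APIs), the backend logic ends up looking roughly like this:

    ```python
    # Rough sketch of attestation-gated purchases, as described above.
    # verify_attestation() and the verdict fields are hypothetical stand-ins,
    # NOT the actual Google/Apple attestation APIs.

    def verify_attestation(token: str) -> dict:
        """Stand-in for a call to the vendor's attestation verification service."""
        raise NotImplementedError

    def create_order(request: dict) -> str:
        """Stand-in for normal order processing."""
        raise NotImplementedError

    def handle_purchase(request: dict) -> dict:
        verdict = verify_attestation(request["attestation_token"])
        # If the device fails the hardware-backed integrity check (rooted,
        # jailbroken, or simply unrecognized), the purchase is refused outright.
        if not verdict.get("device_integrity_ok", False):
            return {"status": 403, "error": "device failed integrity check"}
        return {"status": 200, "order_id": create_order(request)}
    ```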

    Note that this company wasn’t trying to fuck customers directly; they were just lazy and incentivised to not really fix the problem (a sale is a sale, even if it’s to a scalper). But even then, the end result is that their customers got their digital freedom restricted. It’s just a terrible technology IMO, the incentives for companies are all terrible. And that’s before we start considering the real intentions of awful companies like Microsoft, Apple, and Google. IMO they are actually pushing for techno-feudalism, but that’s my conspiracy theory hahaha.

    So no, I doubt they were thinking about security with this Recall bullshit. As other people have explained in their comments, it doesn’t really protect much in practice. Plus this whole AI push has just been a stupid scramble by these companies to grab a big piece of the stupid AI pie from other companies hahaha; there is no long-term plan here, don’t lie to yourself and us.




  • Can you point me to the paper/article/whatever where this is being discussed, please? I’m actually interested in learning about it. Even if I don’t like the way they are using the technology, I’m still a programmer at heart and would love to read about this.

    To the point of the conversation, honestly man, that was just one example of the many problems I see with this. But you have to understand that people like you keep asking us for proof that LLMs are not smart. Come on man, you are the ones claiming you solved the hard problem of mind, on the first try no less hahaha. You are the ones with the burden of proof here, and you have provided nothing of the sort. Do better, people, or stop trying to confuse us with rhetoric.



  • Sorry for the late response, busy day hahaha. A few things:

    1. Please don’t get hung out on the particular examole I picked. I just googled Monsanto seed lawsuit and picked the first example. But there are so many many more examples.

    2. I mean, you don’t see that’s the problem I was pointing out exactly? Again, I’m not against GMOs themselves (though again, totally unneducated opinion). My concern, as someone from a third world country, is precisely with the laws and economic pressure these companies use to exploit people in our countries using this technology.

    Let me explain how this works in my experience:

    1. Monsanto or any of these companies creates a new GMO. This GMO is usually actually better at something than traditional crops, though here “better” usually means economically better, as in cheaper to produce.
    2. These companies start pressuring every farmer in our countries to use their seeds and crops. Usually this is done through economic pressure, that is, they price their seeds so they are cheaper to use than traditional crops (on its own, not terrible). There is usually some pressure through laws and marketing to push people to switch too.
    3. The farmers using these new crops will outperform, in an economic sense, the farmers that keep using the traditional crops. They will produce better crops for less money for a while. Usually the ones who survive this are the big farmers; most family farms can’t compete here. After some time of this we end up in a situation where all the crops have been replaced with the new patented GMO crop, giving these companies a monopoly over our food.

    If things ended here it would be ok-ish, though I would still hate it hahaha. But we all know that companies will always exploit their monopoly positions as much as possible. So this usually ends with even more hunger in our countries, even though we now technically have better crops. So yeah, I think you are wrong. If our only options are to continue using old “inefficient” crops or this shit, I prefer the traditional crops. So good on Greenpeace for blocking this.


  • Thanks for a nice response! My main problem with checking news sites is that it’s quite a bit of work and I’m very lazy hahaha. Especially because of what I said: for me this is mostly about my curiosity, so I can’t invest the time required to properly inform myself this way. Plus there is the issue of bias; I only speak Spanish and English, so my sources are biased that way by default (fuck Putin, just in case anyone mistakes this for support for that prick lol).

    In my naivety I was hoping to get a few websites with, like, a minute-to-minute account of the developments. I remember seeing several beautiful sites like that for the Syrian civil war, and I assume there are some for this conflict too. Obviously, as you pointed out, these sites will surely be biased (some to one side, some to the other) and incomplete. But if I have a few of them, I can at least go and take a look at them every time I see a headline like this and form my own uneducated opinion lol.