he/him

Materials Science PhD candidate in Pittsburgh, PA, USA

My profile picture is the cover art from Not A Lot of Reasons to Sing, But Enough, and was drawn by Casper Pham (recolor by me).

  • 1 Post
  • 9 Comments
Joined 1 year ago
Cake day: June 7th, 2023


  • Agreed. Strong (and effectively enforced) worker protections are just as important as tech-specific safety regulations. Nobody should feel like they need to put themselves into a risky situation to make work happen faster, regardless of whether their employer explicitly asks them to take that risk or (more likely) uses other means like unrealistic quotas to pressure them indirectly.

    There are certainly ways to make working around robots safer, e.g. soft robots, machine vision to avoid unexpected obstacles in the path of travel, and inherent limits on the force a robot can exert. I’m all for moving in the direction of better inherent safety, but we also need to make sure that safer systems don’t become an excuse for employers to expose their workers to riskier situations (i.e. the paradox of safety).


  • It seems like you’re working under the core assumption that the trained model itself, rather than just the products thereof, cannot be infringing?

    Generally, if someone else wants to do something with your copyrighted work (for example, your newspaper article), they need a license to do so. This isn’t only the case for direct distribution; it also covers things like the creation of electronic copies (which must have been made during training), adaptations, and derivative works. NYT did not grant OpenAI a license to adapt their articles into a training dataset for their models. To use a copyrighted work without a license, your use needs to qualify as fair use. That’s why it’s relevant: is it fair use to make electronic copies of a copyrighted work and adapt them into a training dataset for an LLM?

    You also seem to be assuming that a generative AI model training on a dataset is legally the same as a human learning from those same works. If that’s the case then the answer to my question in the last paragraph is definitely, “yes,” since a human reading the newspaper and learning from it is something that, as you say, “any intelligent rational human being” would agree is fine. However, as far as I know there’s not been any kind of ruling to support the idea that those things are legally equivalent at this point.

    Now, if you’d like to start citing code or case law go ahead, I’m happy to be wrong. Who knows, this is the internet, maybe you’re actually a lawyer specializing in copyright law and you’ll point out some fundamental detail of one of these laws that makes my whole comment seem silly (and if so I’d honestly love to read it). I’m not trying to claim that NYT is definitely going to win or anything. My argument is just that this is not especially cut-and-dried, at least from the perspective of a non-expert.


  • Really interesting writeup, thank you for sharing! Many of the technical details go well over my head, but it’s still very interesting to hear some of these success stories, and it also sheds light on how much work running an instance with a lot of users actually is. Here’s hoping that future versions of Lemmy with, e.g., more optimized database code will make life easier for all the folks on the operations team!