It’s clear that companies are currently unable to make chatbots like ChatGPT comply with EU law when processing data about individuals. If a system cannot produce accurate and transparent results, it cannot be used to generate data about individuals. The technology has to follow the legal requirements, not the other way around.

  • CarbonatedPastaSauce@lemmy.world · 5 months ago

    They are capable of accessing and recalling the contents of that information repository, and relaying information from that repository to an end user.

    This is not correct, based on my understanding of LLMs, though I am certainly not an expert. As I understand it, it’s basically a statistics exercise in deciding what order to put words in. They don’t ‘look stuff up’ in their training data; once training is complete, the model generally has no access to that data at all. These models are trained on terabytes of text, yet the finished model is small enough to fit in memory, so it cannot possibly still contain all of that. But even if it did, that wouldn’t matter, because that’s not how they work.
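    A toy sketch may make the point concrete. Real LLMs are neural networks, not lookup tables, and the bigram counting below is a deliberately crude stand-in — but it illustrates how a model can keep only statistics derived from text, with the text itself discarded:

```python
from collections import Counter, defaultdict

# Toy "training": count which word follows which (a bigram model).
# Real LLMs learn billions of neural-network weights rather than
# explicit counts, but the takeaway is the same: the finished model
# stores statistics derived from the text, not the text itself.
corpus = "the cat sat on the mat the dog sat on the rug".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

# "Prediction" picks a statistically likely next word; it is not a
# lookup into any stored document, because no document is stored.
def next_word(word):
    followers = bigrams.get(word)
    return followers.most_common(1)[0][0] if followers else None

print(next_word("sat"))  # 'on' - the only word ever seen after 'sat'
```

    Once the counts exist, the corpus could be deleted and the model would behave identically, which is why “recalling the contents of that information repository” is the wrong mental model.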

    • gedaliyah@lemmy.world (OP, Mod) · 5 months ago

      To me that sounds like a distinction without a difference. A JPEG is not an image but a set of data that can be algorithmically processed and rendered as an image - which is why it fits in a smaller space than a BMP. Despite the technical differences, a JPEG and a BMP are legally equivalent: if something is illegal in a BMP, it’s also illegal in a JPEG. The same laws apply to EVs and gas vehicles, and to vinyl records and cassette tapes. The law does not care about the mechanism.*

      *for the most part

      • CarbonatedPastaSauce@lemmy.world · 5 months ago

        You’re illustrating the issue so many people have with this technology. Without a fundamental understanding of how it works, people will attempt to use it in ways it shouldn’t be used, and won’t understand why it isn’t giving them correct information. It simply doesn’t have the ability to do anything but put words in an order that will statistically resemble how a human might answer the question.

        LLMs don’t know anything. They can’t tell fact from fiction (and are incapable of even trying), and they don’t understand concepts such as verifying info when requested. That’s the problem: they don’t ‘understand’ anything, including what they are telling you. But they do spit out words in a statistically probable order, even if the result is complete bullshit, and they do it so well that they can fool most people into thinking the computer actually knows what it’s telling you.
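        In the same toy spirit (with the same caveat that real LLMs are vastly more sophisticated), a word-by-word sampler shows how output can be locally plausible while nothing in the process checks for truth — the corpus and its claims below are invented purely for illustration:

```python
import random
from collections import Counter, defaultdict

# Each output word is merely "statistically probable given the
# previous word". There is no fact store to verify anything against.
corpus = ("the report says the project is on track "
          "the project says the report is late").split()

model = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    model[prev][nxt] += 1

def generate(start, length, seed=0):
    random.seed(seed)
    out = [start]
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:
            break
        words, weights = zip(*followers.items())
        out.append(random.choices(words, weights=weights)[0])
    return " ".join(out)

# Reads like a sentence, but whether it says the report is late or
# the project is on track is an accident of sampling, not knowledge.
print(generate("the", 8))
```

        Every transition in the output is individually probable, which is exactly why the result can sound confident while being complete bullshit.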

      • Grimy@lemmy.world · 5 months ago

        Here’s a better metaphor, because yours completely misses the mark when it comes to the difference between an LLM and an actual encyclopedia.

        A painter will spend years honing his craft by studying other paintings, as well as photos and real life. If you ask him to paint you a house from memory and then try to build it from what he gives you, that makes you an idiot; it doesn’t make him a bad architect.

        ChatGPT is not an encyclopedia, and anything it says that is remotely important to your personal or work life should be verified. They explicitly tell you it can and will give false responses.

      • kurwa@lemmy.world · 5 months ago

        JPEG is a lossy compression algorithm. The statistical probability of words occurring in sequence is not compression. That’s like saying generative images are compression - they aren’t. The model isn’t producing blurry matches of stored images; it’s producing something “novel”. Otherwise, that would be considered overfitting the data.