• 0 Posts
  • 23 Comments
Joined 2 years ago
Cake day: August 22nd, 2023



  • Except when it comes to LLMs, the fact that the technology fundamentally operates by probabilistically stringing together the next most likely word to appear in the sentence, based on how frequently those words appeared in the training data, is a fundamental limitation of the technology.

    So long as a model has no regard for the actual, you know, meaning of the word, it definitionally cannot create a truly meaningful sentence.

    This is a misunderstanding of what “probabilistic word choice” can actually accomplish, and of the non-probabilistic components that are incorporated into these systems. People also make mistakes and don’t actually “know” the meaning of words.

    The belief system that humans have special cognizance unlearnable by observation is just mysticism.


  • Yeah. AI making images with six fingers was amusing, but people glommed onto it like it was the savior of the art world. “Human artists are superior because they can count fingers!” Except then the models updated and it wasn’t as much of a problem anymore. It felt good, but it was just a pleasant illusion for people with very real reasons to fear the tech.

    None of these errors are inherent to the technology; they’re just bugs to correct, and there’s plenty of money and attention focused on fixing bugs. What we need is more attention focused on either preparing our economies to handle this shock or greatly strengthening copyright enforcement (to stall development). A label like the one this post is about is a good step, but given that artistic professions already weren’t particularly safe and that “organic” labeling has only modest impacts on consumer choice, we’re going to need more.









  • no cognizance, no agency, and no thought

    Define your terms. And explain why any of them matter for producing valid and “intelligent” responses to questions.

    Do you truly believe humans are simply mechanistic processes that when you ask them a question, a cascade of mathematics occurs and they spit out an output?

    Why are you so confident they aren’t? Do you believe in a soul or some other ephemeral entity that wouldn’t leave us as a biological machine?

    People actually have an internal reality. For example, they could refuse to answer your question! Can an LLM do even something that simple?

    Define your terms. And again, why is that a requirement for intelligence? Most of the things we do each day don’t involve conscious internal planning and reasoning. We simply act and if asked will generate justifications and reasoning after the fact.

    It’s not that I’m claiming LLMs = humans. I’m saying you’re throwing out all these fuzzy concepts as if they’re essential features lacking in LLMs, in order to explain their failures at some question answering as something other than just a data problem. Many people want to believe in human intellectual specialness, and more recently people are scared of losing their jobs to AI, so there’s always a knee-jerk reaction to redefine intelligence whenever an animal or machine is discovered to have surpassed the previous threshold. Your thresholds are facets of the mind that you don’t define, have no means to recognize (I assume you’re conscious, but I cannot test it), and have not shown to be important for generating fact rather than BS.

    How the brain works and what’s important for various capabilities is not a well understood subject, and many of these seemingly essential features are not really testable or comparable between people and sometimes just don’t exist in people, either due to brain damage or a simple quirk in their development. The people with these conditions (and a host of other psychological anomalies) seem to function just fine and would not be considered unthinking. They can certainly answer (and get wrong) questions.




  • This is just the same hand-waving repeated. What does it mean to “know what a word means”? How is a word, indexed into a complex network of word embeddings, meaningfully different as a token from this desired “object model”? The indexing and encoding very much do relate words to one another separately from their likelihood of appearing in a sentence together. These embeddings may be learned from language, but language is simply a method of communicating meaning, and notably humans also learn meaning through consuming it.

    What do things like “love” or “want” or “feeling” have to do with a model of objects? How would you even recognize a system that does that, and why would it be any more capable than an LLM at producing good and trustworthy information? Does feeling love for a concept help you explain what a random blogger does? Do you need to want something to produce meaningful output?

    This just all seems like poorly defined techno-spiritualism.



  • does not have a model of the objects to which the words refer

    I’m not even sure what this is supposed to be saying. Sounds kind of like a bullshit generator.

    Words are encodings of knowledge and their expression and use represent that knowledge, and these machines ingest a repository containing a significant percent of written human communication. It encodes that the words “dog” and “bark” are often used together, but it also encodes that “dog” and “cat” are things that are both “mammals” and “mammals” are “animals”, and that the pair of them are much more likely to appear in a human household than a “porpoise”. What is this other kind of model of objects that hasn’t been in some way represented in all of the internet?
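
    As a rough illustration of what "encoding relationships" can mean, here is a toy sketch. The four-dimensional vectors and their dimension labels are invented by hand for this example; real models learn far larger vectors from text and nothing here is taken from an actual LLM. Cosine similarity is just one common way to compare such vectors.

```python
import math

# Hand-made vectors, invented purely for illustration (an assumption, not
# real model weights). Hypothetical dimensions:
# [is_animal, is_mammal, lives_in_households, associated_with_barking]
EMBEDDINGS = {
    "dog":      [1.0, 1.0, 0.9, 1.0],
    "cat":      [1.0, 1.0, 0.9, 0.0],
    "porpoise": [1.0, 1.0, 0.0, 0.0],
    "bark":     [0.2, 0.1, 0.3, 1.0],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity(word_a, word_b):
    """Look up both words in the toy table and compare their vectors."""
    return cosine(EMBEDDINGS[word_a], EMBEDDINGS[word_b])


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "porpoise"),
                 ("dog", "bark"), ("cat", "bark")]:
        print(pair, round(similarity(*pair), 3))
```

    In this toy table "dog" lands closer to "cat" than to "porpoise", and closer to "bark" than "cat" does, so the same representation carries both co-occurrence and category-style relationships.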


  • That’s just not true. Semantic encodings work. It’s not like neural networks are some new, untested concept. LLMs have some new tricks under the hood and are way more extensive in their training goal, but they’re fundamentally the same thing. All neural networks are mimicry machines enabled and limited by their data, but mimicking largely correct data produces largely correct results when the answer, or an interpolatable answer, exists in the training data. The problem arises when they’re asked to go further and further afield from their inputs. Some interpolation and substitution works, but it gets increasingly unreliable the more niche the answer is.

    While the LLM hype has very seriously oversold their abilities, the instinctive backlash to say they’re useless is similarly way off-base.


  • They’re both BS machines and fact generators. It produced bullshit when asked about him because, as far as I can tell, he’s kind of a nobody, not because it’s just a stylistic generator. If he had asked about a more prominent person, one better represented in the training corpus, the answer would likely have been largely accurate. The hallucination problem stems from the system needing to produce a result regardless of whether it has a well-trained semantic model for the question.

    LLMs encode both the style of language and semantic relationships. For “who is Einstein”, both paths are well developed and the result is a reasonable response. For “who is Ryan McGreal”, the semantic relationships are weak or non-existent, but the stylistic path is undeterred, leading to the confidently plausible bullshit.
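
    A minimal sketch of the "must produce a result regardless" point, under loud assumptions: the candidate words and the scores (logits) below are made up, and picking the top softmax entry is a simplification of how real systems decode. The only thing it shows is that a softmax over scores always yields some answer, whether the scores are strongly peaked or nearly flat.

```python
import math


def softmax(logits):
    """Turn arbitrary scores into probabilities that sum to 1."""
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def answer(candidates, logits):
    """Return the highest-probability candidate and its probability."""
    probs = softmax(logits)
    best = max(range(len(candidates)), key=lambda i: probs[i])
    return candidates[best], probs[best]


# Hypothetical candidates and scores, invented for illustration only.
CANDIDATES = ["physicist", "journalist", "politician", "musician"]

# Strong semantic signal: one score dominates.
well_known = answer(CANDIDATES, [6.0, 0.5, 0.3, 0.1])

# Weak semantic signal: scores are nearly flat, yet an answer still comes out.
obscure = answer(CANDIDATES, [0.2, 0.4, 0.3, 0.1])

if __name__ == "__main__":
    print("well known:", well_known)
    print("obscure:   ", obscure)
```

    Both calls return a fluent-looking single word. Only the second number tells you the model barely preferred it, and that number is not something the reader of the generated text ever sees.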


  • we know the meanings of the words we use.

    Uh, but we don’t? Not really. People use the wrong words all the time and each person’s definition (i.e., encoding) is slightly different. We mimic phrases and structures we’ve heard to sound smarter and forge on with uncertain statements because frequently they go unchallenged or simply aren’t important.

    We’re more structurally complex than an LLM, but we fool ourselves into thinking we’re somehow uniquely thoughtful and reliable.