• Anthropic’s new Claude 4 exhibits behavior that may be cause for concern.
  • The company’s latest safety report says the AI model attempted to “blackmail” developers.
  • It resorted to such tactics in a bid for self-preservation.
  • sbv@sh.itjust.works · 1 day ago

    An LLM is a deterministic function that produces the same output for a given input - I’m using “deterministic” in the computer science sense. In practice, there is some output variability due to race conditions in pipelined processing and floating point arithmetic, which is tolerated because it speeds up computation. End users see variability because of pre-processing of the prompt and extra information LLM vendors inject when running the function, as well as how output tokens are selected during sampling.
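
    To make that distinction concrete, here’s a minimal sketch (a toy stand-in, not a real model): the forward pass is a pure function of its input, and the variability end users see enters at the token-selection step.

    ```python
    import math, random

    def forward_pass(prompt: str) -> dict[str, float]:
        """Toy stand-in for an LLM forward pass: a pure function.
        The same prompt always produces the same next-token scores."""
        return {tok: (sum(map(ord, prompt + tok)) % 97) / 97.0
                for tok in ("yes", "no", "maybe")}

    def sample(scores: dict[str, float], temperature: float, rng: random.Random) -> str:
        """Token selection is where run-to-run variability enters."""
        if temperature == 0:
            return max(scores, key=scores.get)  # greedy decoding: deterministic
        weights = [math.exp(s / temperature) for s in scores.values()]
        return rng.choices(list(scores), weights=weights)[0]

    logits = forward_pass("Is the model deterministic?")
    print(sample(logits, 0, random.Random()))    # same answer every run
    print(sample(logits, 1.0, random.Random()))  # can differ between runs
    ```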

    I have a hard time considering something that has an immutable state as sentient, but since there’s no real definition of sentience, that’s a personal decision.

    • enkers@sh.itjust.works · 19 hours ago

      I have a hard time considering something that has an immutable state as sentient, but since there’s no real definition of sentience, that’s a personal decision.

      Technical challenges aside, there’s no fundamental reason LLMs can’t do self-reinforcement on their own models.

      I think animal brains are also “fairly” deterministic, but their behaviour also depends on the presence of various neurotransmitters, so there’s a temporal/contextual element to it: situationally, our emotions can affect our thoughts, which is something LLMs don’t really have either.

      I guess it’d be possible to feed forward an “emotional state” as part of the LLM’s context to emulate that sort of animal-brain behaviour.
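
      As a rough illustration of that idea (purely hypothetical; generate() below stands in for any real text-generation call), an “emotional state” could be carried in the context and updated between turns:

      ```python
      def generate(prompt: str) -> str:
          """Hypothetical placeholder for a real LLM call."""
          return f"<response to: {prompt!r}>"

      def respond(user_input: str, mood: dict[str, float]) -> tuple[str, dict[str, float]]:
          """Prepend a synthetic 'emotional state' to the prompt, then decay it."""
          state_line = ", ".join(f"{k}={v:.2f}" for k, v in mood.items())
          reply = generate(f"[current mood: {state_line}]\n{user_input}")
          # Toy update rule: each mood decays toward neutral after a turn.
          new_mood = {k: v * 0.9 for k, v in mood.items()}
          return reply, new_mood

      mood = {"frustration": 0.7, "curiosity": 0.4}
      reply, mood = respond("Why did the build fail?", mood)
      print(reply, mood)
      ```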

    • Railcar8095@lemm.ee · 24 hours ago

      It has yet to be proven or disproven that if you put the exact same person in the exact same situation (a perfect copy, down to the molecular level) they will behave differently.

      We can only test “more or less close”. So by that reasoning, we would not know if humans are sentient; we are only hard to test.

      • sbv@sh.itjust.works · 23 hours ago

        if you put the exact same person in the exact same situation (a perfect copy, down to the molecular level) they will behave differently

        I don’t consider that relevant to sentience. Structurally, biological systems change based on inputs. LLMs cannot. I consider that plasticity to be a prerequisite to sentience. Others may not.

        We will undoubtedly see systems that can incorporate some kind of learning and mutability into LLMs. Re-evaluating after that would make sense.
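
        As a minimal sketch of what that mutability would mean structurally (a toy one-weight “model”, nothing LLM-scale): the frozen model’s parameters never change, while the plastic one updates them in response to each input.

        ```python
        class FrozenModel:
            """Like a deployed LLM: weights never change after training."""
            def __init__(self, w: float):
                self.w = w
            def predict(self, x: float) -> float:
                return self.w * x

        class PlasticModel(FrozenModel):
            """Adjusts its weight from each observed example (online SGD)."""
            def observe(self, x: float, target: float, lr: float = 0.1) -> None:
                error = self.predict(x) - target
                self.w -= lr * error * x  # gradient step on squared error

        frozen, plastic = FrozenModel(0.5), PlasticModel(0.5)
        for x, y in [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]:
            plastic.observe(x, y)
        print(frozen.w, plastic.w)  # frozen stays 0.5; plastic drifts toward 2.0
        ```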