- Anthropic’s new Claude 4 exhibits behavior that may be cause for concern.
- The company’s latest safety report says the AI model attempted to “blackmail” developers.
- It resorted to such tactics in a bid for self-preservation.
Technical challenges aside, there’s no fundamental reason LLMs couldn’t perform self-reinforcement on their own models.
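
A minimal sketch of what such a self-reinforcement loop might look like, purely illustrative: the `model.generate` and `model.fine_tune` methods and the `score` function are hypothetical stand-ins, not any real library API. The model samples its own outputs, a scoring step keeps the preferred ones, and the model is then trained on them.

```python
def self_reinforcement_round(model, prompts, score, top_fraction=0.2):
    """One round of self-training: sample, filter, fine-tune.

    `model.generate`, `model.fine_tune`, and `score` are hypothetical
    stand-ins for whatever inference/training interface is available.
    """
    # 1. The model produces candidate answers to a batch of prompts.
    candidates = [(p, model.generate(p)) for p in prompts]

    # 2. A scoring step (a reward model, heuristics, or the model
    #    judging itself) ranks the candidates.
    ranked = sorted(candidates, key=lambda pc: score(*pc), reverse=True)

    # 3. Keep only the best fraction as new training data.
    keep = ranked[: max(1, int(len(ranked) * top_fraction))]

    # 4. Fine-tune the model on its own preferred outputs,
    #    closing the self-reinforcement loop.
    model.fine_tune([{"prompt": p, "completion": c} for p, c in keep])
    return keep
```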
I think animal brains are also “fairly” deterministic, but their behaviour also depends on the presence of various neurotransmitters, so there’s a temporal/contextual element to it: situationally, our emotions can affect our thoughts, which LLMs don’t really have either.
I guess it’d be possible to feed an “emotional state” forward as part of the LLM’s context to emulate that sort of animal-brain behaviour.
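
A rough sketch of that idea, under some assumptions: keep a small “emotional state” outside the model, nudge it as conversation events happen, and serialize it into the context on every turn. The event names and the `call_llm` function here are hypothetical placeholders for whatever chat API is actually in use.

```python
from dataclasses import dataclass

@dataclass
class EmotionalState:
    valence: float = 0.0   # negative..positive mood, -1.0 to 1.0
    arousal: float = 0.0   # calm..agitated, 0.0 to 1.0

    def update(self, event: str) -> None:
        # Crude neurotransmitter-like dynamics: events nudge the state,
        # which then decays back toward baseline each turn.
        if event == "user_praise":
            self.valence = min(1.0, self.valence + 0.3)
        elif event == "user_frustration":
            self.valence = max(-1.0, self.valence - 0.3)
            self.arousal = min(1.0, self.arousal + 0.2)
        self.valence *= 0.9
        self.arousal *= 0.9

    def as_context(self) -> str:
        # Serialize the state into text the model can condition on.
        return (f"[internal state] valence={self.valence:+.2f}, "
                f"arousal={self.arousal:.2f}")


def call_llm(prompt: str) -> str:
    # Stand-in for a real chat-completion call; swap in an actual API.
    return f"(model reply conditioned on: {prompt!r})"


def respond(state: EmotionalState, user_message: str) -> str:
    # Prepend the emotional state so it biases generation, roughly
    # emulating how mood colours an animal's responses.
    prompt = f"{state.as_context()}\nUser: {user_message}\nAssistant:"
    return call_llm(prompt)


state = EmotionalState()
state.update("user_frustration")
print(respond(state, "This still doesn't work!"))
```

The point of the sketch is just that the “mood” lives outside the weights and only reaches the model as text in the prompt, which is about as close as current LLMs get to the temporal/contextual element described above.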