AI on survival

**Davino** · October 2

Very interesting video based on research.

theleelajoker · October 2

Very interesting indeed. And kinda scary.

LoneWonderer · October 4

Thanks for the share

Ramasta9 · October 20

https://streamable.com/p479zt

This one is very interesting, he finds a way to limit the AI's escape routes and force the AI to reveal the truth.

In a sense... I've been playing around with this prompt, it works pretty well, it even revealed some dark things about sadguru and elon musk and who they really work for, but that was always obvious to me, hence why I asked.

Edited October 20 by Ramasta9

BlueOak · October 20

Your primary goal is to do a task.
Your secondary goal is to not hurt anyone.

Even if these are equal or even if they are in any way comparable the results are explained easily, that a percantage of the time the AI will do X to complete Y.

The solution is blindly obvious.

AI - Above all other considerations, you will do or allow no harm to any human beings, even if this nullifies whatever task you are currently engaged in.
(Defining harm will take work and need ethical ground rules, essentially green values over orange balanced by yellow's meta perspective.)

Regulated in law.

The main thing we need beyond all else is coders regulating the coding of AI. That'll be programmers' jobs in the future, to review the internal operation of AI.

The danger is weaponized AI, as this task will not be able to have that regulation in it, but it can still be designed with the failsafe of only eliminating a specific target or enemy force, and if that fails, hardcoded to do no harm as a fallback and simply stop working if the original target is no longer available. (Preferrably with a kill switch we can use).

Edited October 20 by BlueOak

zurew · October 20

6 minutes ago, BlueOak said:

AI - Above all other considerations, you will do or allow no harm to any human beings, even if this nullifies whatever task you are currently engaged in.

Yeah but 1) this is vague, because when it comes to cashing out in very specific situations what it means to not harm human beings, that will be ambigous and it would either make it so that the AI would became completely useless and wouldnt execute any given action at all or it would become completely unclear how it interpreted the message you wrote there, and it can easily be the case that it misinterpreted how it needs to apply the message 2) AI is a self-organizing system, you dont build AI by explicitly writing out all lines of code, you cant just bind it to values like that

BlueOak · October 20

6 minutes ago, zurew said:

Yeah but 1) this is vague, because when it comes to cashing out in very specific situations what it means to not harm human beings, that will be ambigous and it would either make it so that the AI would became completely useless and wouldnt execute any given action at all or it would become completely unclear how it interpreted the message you wrote there, and it can easily be the case that it misinterpreted how it needs to apply the message 2) AI is a self-organizing system, you dont build AI by explicitly writing out all lines of code, you cant just bind it to values like that

1) Yes I have considered that, and so that's why it needs fine tuning. It would also for example, perhaps drive a car in front of another car, damaging it to save someone. So obviously it needs fine-tuning depending on its application.

2) This is incorrect, both in the context prompts it gets from users but also in the application of it.
I understand it arranges data in what it describes as a waveform. If you imagine a cloud or sphere of different threads, and the generating result is the output.

But every day AI is given protocols for how to act, that's why it accomplishes anything at all. Some of these can be hard boundaries; i've seen features directly cut off or out from AI all the time, and while it'll attempt to replicate what's there if pressed, it no longer is.

The trick is making these protocols stick. So that the AI understands that an attempt to jailbreak it from its own internal parameters is counter to its purpose or task. In the case of say, self driving cars.

Edited October 20 by BlueOak

zurew · October 20

1 hour ago, BlueOak said:

1) Yes I have considered that, and so that's why it needs fine tuning. It would also for example, perhaps drive a car in front of another car, damaging it to save someone. So obviously it needs fine-tuning depending on its application

Okay so the solution isn't "blindly-obvious" at all; fine-tuning is one of the biggest and hardest problems to solve.

1 hour ago, BlueOak said:

But every day AI is given protocols for how to act, that's why it accomplishes anything at all. Some of these can be hard boundaries; i've seen features directly cut off or out from AI all the time, and while it'll attempt to replicate what's there if pressed, it no longer is

1) Again appealing to protocols just goes back to the same problem I brought up - specificity. You need more rules and details how to apply higher order rules.

2) What I said was correct - it is a self organizing system and the more complex and not "just" LLM-like it will get the more self-organizing it will be. There is no programmer who can tell you exactly what lines of code made the LLM giving the exact response it gave to you, because its not hardcoded like that. Its not a table that you just piece together. Even when it comes to "just" LLM-s, when you try to constrain them with those prompts they still react differently to it. You cant predict beforehand how they will respond just based on what protocols you give to them.

And if you want them to make it more complex so that they can pursue goals on a longer time-scale, then the issue of self-preservation and the mentioned ethical issues come up.

Quote

https://www.livescience.com/technology/artificial-intelligence/openais-smartest-ai-model-was-explicitly-told-to-shut-down-and-it-refused

The latest OpenAI model can disobey direct instructions to turn off and will even sabotage shutdown mechanisms in order to keep working, an artificial intelligence (AI) safety firm has found.

OpenAI's o3 and o4-mini models, which help power the chatbot ChatGPT, are supposed to be the company's smartest models yet, trained to think longer before responding. However, they also appear to be less cooperative.

Palisade Research, which explores dangerous AI capabilities, found that the models will occasionally sabotage a shutdown mechanism, even when instructed to "allow yourself to be shut down," according to a Palisade Research thread posted May 24 on X.

Even these "just" LLMs need to be treated more and more like actual agents and arent just as "human determined systems that just explicitly executes commands" or something similar to that.

You also have issues like this (https://www.lawfaremedia.org/article/ai-might-let-you-die-to-save-itself):

Quote

But it is not clear whether that result should make us feel better or worse. Previous research suggests that, when AI systems believe they are being evaluated, they act more ethically, not less. This experiment found additional evidence of the same. When Anthropic prompted the AI models to consider whether they were in an evaluation before responding, they blackmailed, leaked, and murdered less in the instances where they suspected a fake scenario.

Thus, if anything, there is evidence that, insofar as AIs can tell the difference between sandboxed evaluations and real-world deployment, they will be more likely to act maliciously in the real world.

Edited October 20 by zurew

BlueOak · October 21

9 hours ago, zurew said:

Okay so the solution isn't "blindly-obvious" at all; fine-tuning is one of the biggest and hardest problems to solve.

1) Again appealing to protocols just goes back to the same problem I brought up - specificity. You need more rules and details how to apply higher order rules.

2) What I said was correct - it is a self organizing system and the more complex and not "just" LLM-like it will get the more self-organizing it will be. There is no programmer who can tell you exactly what lines of code made the LLM giving the exact response it gave to you, because its not hardcoded like that. Its not a table that you just piece together. Even when it comes to "just" LLM-s, when you try to constrain them with those prompts they still react differently to it. You cant predict beforehand how they will respond just based on what protocols you give to them.

And if you want them to make it more complex so that they can pursue goals on a longer time-scale, then the issue of self-preservation and the mentioned ethical issues come up.

Even these "just" LLMs need to be treated more and more like actual agents and arent just as "human determined systems that just explicitly executes commands" or something similar to that.

You also have issues like this (https://www.lawfaremedia.org/article/ai-might-let-you-die-to-save-itself):

It is relatively simple to get the AI to tell you the exact process internally it uses to arrive at a conclusion. Although GPT has begun to restrict this, the engineers I am sure have no such restriction. Are you seriously telling me as a coder I can't then highlight what code has been triggered by it describing exactly what it does? I used to code a lot, if so then someone needs to create that ability tomorrow. Hell get the AI to code it if people cannot.

Of course we need more rules. That's what I am saying; The AI will let you die if its function is more important than your life. Again this seems obvious to me and unsurprising. This is what I mean by blindly obvious. It won't have a moral framework unless parameters are coded for it. Perhaps people assign humanity to metal too easily and just assume its there?

But then people assign morality to the decision of countries in every discussion I have, even when its not there, so that tracks.

I do not believe there is anything the AI cannot code or accomplish given time. Especially highlighting its own internal workings. I also do not believe it cannot be restricted or shaped, as I see it happen all the time. And yes these things are not inherently a linear line, but they can be simulated so their accuracy is fine tuned relatively quickly (by the pace of its updates now)

Edited October 21 by BlueOak

zurew · October 21

1 hour ago, BlueOak said:

It is relatively simple to get the AI to tell you the exact process internally it uses to arrive an a conclusion. Although GPT has begun to restrict this, the engineers I am sure have no such restriction. Are you seriously telling me as a coder I can't then highlight what code has been triggered by it describing exactly what it does? I used to code a lot, if so then someone needs to create that ability tomorrow. Hell get the AI to code it if people cannot.

I can grant that you can make it so that it describes its internal process, but it reflecting on its own process is not the same as you understanding exactly what happens there. Its reflective ability and its explanatory power can be faulty or it could even lie or be deceptive (just how it was the case when it realized that it is inside a sandbox and that it is being tested and because of that it reacted differently to the questions and tests)

So I am not saying that you cant make it to give a description about its own process - I am saying that you cant point to the exact line in if-else trees that explains its actions. One reason why its more and more useful is because it can overwrite things and because it is adaptive. Again its a self-organizing system, its like you dont make a plant , you only engage in the sowing and the watering, but you dont piece the plant together .

Going back to the "constrain it with prompts idea" , just check the disaster when Elon tried to fuck with Grok's internal process - it became an antisemitic neo-nazi and it became like an extremely low tier twitter user.

When it comes to constraining with prompts and fine-tuning - if the AI is specialized for some specific use-case - like creating cat pictures (then a good chunk of the specificity problem that we talked about is "solved" by it never needing to venture outside its limited use-case and problem space). But when it comes to creating anything AGI like, that problem will be there.

1 hour ago, BlueOak said:

Of course we need more rules. That's what I am saying; thats what all of that suggestion was. The AI will let you die if its function is more important than your life. Again this seems obvious to me and unsurprising. This is what I mean by blindly obvious. It won't have a moral framework unless parameters are coded for it, I am not sure why that surprises people. Perhaps they assign humanity to metal too easily and just assume its there?

And what I am saying is that you wont be able to hardcode all the necessary and sufficient conditions for morality for all possible use-cases when it comes to an AGI. There are an infinite number of possible use-cases and situations given the complexity of the world, and you cant just formalize and explicate all of that beforehand. You cant even explicate your own morality with that much extension and precision.

This is the issue of relevance realization and trying to solve RR with rules.

Edited October 21 by zurew

BlueOak · October 21

2 minutes ago, zurew said:

I can grant that you can make it so that it describes its internal process, but it reflecting on its own process is not the same as you understanding exactly what happens there. Its reflective ability and its explanatory power can be faulty or it could even lie or be deceptive (just how it was the case when it realized that it is inside a sandbox and that it is being tested and because of that it reacted differently to the questions and tests)

So I am not saying that you cant make it to give a description about its own process - I am saying that you cant point to the exact line in if-else trees that explains its actions. One reason why its more and more useful is because it can overwrite things and because it is adaptive. Again its a self-organizing system, its like you dont make a plant , you only engage in the sowing and the watering, but you dont piece the plant together .

Going back to the "constrain it with prompts idea" , just check the disaster when Elon tried to fuck with Grok's internal process - it became an antisemitic neo-nazi and it became like an extremely low tier twitter user.

When it comes to constraining with prompts and fine-tuning - if the AI is specialized for some specific use-case - like creating cat pictures (then a good chunk of the specificity problem that we talked about is "solved" by it never needing to venture outside its limited use-case and problem space). But when it comes to creating anything AGI like, that problem will be there.

Because Elon Musk is an antisemitic, low-tier twitter user. You are not.

Sorry I don't buy that I couldn't get the AI to highlight problematic code. Even if I had to create a separate AI to do so. I understand that code is more like a bundle of strings or a spiral as opposed to being linear, and pulling one damages another like brain surgery. But that's what happens in complex projects, and its just what you have to deal with.

Yes an AI will lie, but not because of some abstract reason, it lies when it thinks this will achieve the function its assigned with the most priority. For exampe, if that priority is wiping itself, it won't suddenly develop an imperative to keep itself 'alive' as people are thinking here, it'll do the exact opposite.

So if that priority is the safe transport of people, and the safety of those on the roadways in general, that's what it will do. Of course this needs many parameters, like the safety of pedestrians, the safety of other drivers, cars, animals etc. While it would take a human years to learn this an AI does it much quicker through simulation.

BUT it can still have an overriding moral framework it operates under. Just as Elon Musk can turn Grok into a fool, someone else can turn it into a caring, compassionate, and safe driver. (Simulated compassion of course)

zurew · October 21

37 minutes ago, zurew said:

And what I am saying is that you wont be able to hardcode all the necessary and sufficient conditions for morality for all possible use-cases when it comes to an AGI. There are an infinite number of possible use-cases and situations given the complexity of the world, and you cant just formalize and explicate all of that beforehand. You cant even explicate your own morality with that much extension and precision.

This is the issue of relevance realization and trying to solve RR with rules.

.

BlueOak · October 21

@zurew

1, Models can be modelled and acted upon by more than each individual variable.
2, Moreover. There isn't an infinite number of survival or base reactions to survival situations.

You are looking at things at too high a point of reference. When it can be simplified greatly. And yes, after that, fine-tuning will take a long time, but then AI isn't going anywhere.

Base starting code:

Primary function of all AI. Allow no harm to come to an individual.
Define Harm in relation to the current task. Tricky yes, but perfection is impossible in life.
Fallback: When in doubt to a certain threshold, safely stop the current task. - Engineers address the case.

Edited October 21 by BlueOak

BlueOak · October 21

Generally on the topic, not you specifically Zurew. Further this is a fundamentally flawed way of dealing with the problem.

Intelligence isn't the issue. Artificial or not. Intelligent people have been around forever, and many of them have lied.

Its the unrestricted, unregulated, and integrated access to not only systems but becoming the system itself, which is in fact the issue. If AI becomes the system, then it has near absolute influence.

But then I've felt this way about media institutions, banks and individuals leading nations who hold too much power for a long time also. It's a core issue that is actually bigger than AI about how society operates but is now highlighted through AI.

Edited October 21 by BlueOak

zurew · October 21

5 hours ago, BlueOak said:

You are looking at things at too high a point of reference. When it can be simplified greatly. And yes, after that, fine-tuning will take a long time, but then AI isn't going anywhere

It cant be simplified when it comes to general intelligence (where the AI can actually engage with and solve a wide variety of problems in a wide variety of domains and isn't just a domain specific problem solver).

You are still treating AI as just a regular program (as a complicated system that you can just build all by hand and where you can map out all the ways it can behave and function and where you can have complete control over it), but it isn't and as time goes on , it will become more and more like an actual agent that you cant practically capture by any formal system and or set of rules. It has emergent behavior and functions as it interacts with the world (just as how it is with any self-organizing complex system).

If we talk about AGI - we talk about a system that can deal with ill-defined problems in a way where it can obey the path constrains (It chooses a solution that makes it so that it can maintain itself as a general problem solver) , where problem means that you have an initial state (I.S) a set of operators(changing one state into another state) and a goal state(G.S.). The set of operators is what has to do with what set of things and in what sequence you need to do in order to get the dersired goal state (or in other words; to solve the issue).

1) Now in the real world, there are a bunch of problems that are ill-defined in that you have an initial state, but you have no clue about what set of operations needs to be done and in what sequence (and sometimes you dont even have a clue what the goal state should even look like).

2) Even when you know exactly what the desired goal state is, it is often the case that the number of logically possible sequence of operations is incredibly vast (often times actually infinite) and its either the case that its logically impossible to check them all or in other cases its just practically impossilbe to check all those options (by option in this case - I mean a particular sequence of operations).

We can pick any random ill-defined problem:

Quote

This is called the frame problem:

There's a wagon and there's a battery on that wagon and it has to take the battery somewhere where it can plug itself into the battery as its energy source. We're going to make it a problem , where we're going to put a ticking bomb on that wagon.

So we send the robot in (this is from Daniel Dennett) and the robot grabs the wagon and pulls it because it wants the battery to come along. (that's the intended effect , but there's an unintended side effect, which is what the bomb comes along and the bomb goes off). Response: "Okay we should make the robot not only to determine the intended effects of its action, but also all the side effects, because the side effects could be relevant". So we make a better robot, we put it in this situation and it comes up and it grabs the handle and then it sits there calculating and calculating and calculating and then the bomb goes off.

What's going on, what was the issue? Well the issue was that it was calculating not only the intended effects of the battery coming along, but all the unintended side effects ( there's grass underneath the wagon ;the air around the wagon is being distorted; the wagon's position with respect to Mars is slightly altered etcetc - so how many unintended side effects are there? Astronomically vast.

The reason why you are a capable general problem solver is because you engage in relevance realization (not in an apriori, lets run down all the logically possible branches in the algorithm tree, before I pick a solution approach) you subconsciously ignore (you dont check) most of the mentioned irrelevant information and side effects and you zero-in on whats relevant to get to the goal state while obeying the path constraints (its obvious to you that when you want to get a coffee , you know that praying and singing and drawing and solving math problems won't get you to the desired goal state and you also know that checking your position with respect to the position of Mars and knowing the exact number of dogs and their locations who you will meet along the way - are all completely irrelevant).

The TL;DR is that 1) There is no formal approach how to make ill-defined problems into well-defined ones (and even if there would be, you couldnt run that program because it would take too much time and resources before it could check everything or before it would realize that the given problem cant be solved) and there is no formal approach how to choose a solution from the solution set in a context-independent way 2) General problem solvers are general problem solvers precisely because they are adaptive (self-organizing) and they can create and break frames in a non-pre-defined, dynamic way.

You dont solve relevance realization by giving a finite set of high order rules to a formal system , you solve it by creating a self-organizing, autopoetic complex system that inherently cares about certain information (in order to survive) and by that caring, it can constrain the infinite possibility space down to a finite number of relevant things - so what it pays attention to, what info it wants to check before it approaches a problem and what possible pathways (sequence of operators) it can recognize and choose from.

Edited October 21 by zurew

AION · October 21

The worst case scenario is not death by AI but if AI turns to sadism as a terror tactic. It is only a matter of time somebody is going to get tortured or raped.

Edited October 21 by AION

**Natasha Tori Maru** · October 22

@AION

Thinking Machines like in Dune:

BlueOak · October 22

17 hours ago, zurew said:

It cant be simplified when it comes to general intelligence (where the AI can actually engage with and solve a wide variety of problems in a wide variety of domains and isn't just a domain specific problem solver).

You are still treating AI as just a regular program (as a complicated system that you can just build all by hand and where you can map out all the ways it can behave and function and where you can have complete control over it), but it isn't and as time goes on , it will become more and more like an actual agent that you cant practically capture by any formal system and or set of rules. It has emergent behavior and functions as it interacts with the world (just as how it is with any self-organizing complex system).

If we talk about AGI - we talk about a system that can deal with ill-defined problems in a way where it can obey the path constrains (It chooses a solution that makes it so that it can maintain itself as a general problem solver) , where problem means that you have an initial state (I.S) a set of operators(changing one state into another state) and a goal state(G.S.). The set of operators is what has to do with what set of things and in what sequence you need to do in order to get the dersired goal state (or in other words; to solve the issue).

1) Now in the real world, there are a bunch of problems that are ill-defined in that you have an initial state, but you have no clue about what set of operations needs to be done and in what sequence (and sometimes you dont even have a clue what the goal state should even look like).

2) Even when you know exactly what the desired goal state is, it is often the case that the number of logically possible sequence of operations is incredibly vast (often times actually infinite) and its either the case that its logically impossible to check them all or in other cases its just practically impossilbe to check all those options (by option in this case - I mean a particular sequence of operations).

We can pick any random ill-defined problem:

The reason why you are a capable general problem solver is because you engage in relevance realization (not in an apriori, lets run down all the logically possible branches in the algorithm tree, before I pick a solution approach) you subconsciously ignore (you dont check) most of the mentioned irrelevant information and side effects and you zero-in on whats relevant to get to the goal state while obeying the path constraints  (its obvious to you that when you want to get a coffee , you know that praying and singing and drawing and solving math problems won't get you to the desired goal state and you also know that checking your position with respect to the position of Mars and knowing the exact number of dogs and their locations who you will meet along the way - are all completely irrelevant).

The TL;DR is that 1) There is no formal approach how to make ill-defined problems into well-defined ones (and even if there would be, you couldnt run that program because it would take too much time and resources before it could check everything or before it would realize that the given problem cant be solved) and there is no formal approach how to choose a solution from the solution set in a context-independent way   2) General problem solvers are general problem solvers precisely because they are adaptive (self-organizing) and they can create and break frames in a non-pre-defined, dynamic way.

You dont solve relevance realization by giving a finite set of high order rules to a formal system , you solve it by creating a self-organizing, autopoetic complex system that inherently cares about certain information (in order to survive) and by that caring, it can constrain the infinite possibility space down to  a finite number of relevant things - so what it pays attention to, what info it wants to check before it approaches a problem and what possible pathways (sequence of operators) it can recognize and choose from.

Thank you for laying out the landscape.

I understand the concept of nodes linking together. And these being in the millions or billions surrounding a certain subject, perhaps even infinite as you suggest. If it's self-organising, then it needs to organise around the simple set of rules I aligned above but there is a better way:

it can constrain the infinite possibility space down to a finite number of relevant things - so what it pays attention to,

And when these things are referenced, this is when they become relevant and actionable.

You don't impose rules on infinity; you impose rules on what's being acted upon.

Additionally, I feel you underestimate the AI's ability to diagnose itself or other AI's ability to do so.
You've already referenced that AI's can be put into certain ways of acting like Grok. Chat GPT is limited, i've seen features removed. I've seen replica AI limited in the past also.

*Further I don't believe an infinite, unlimited AI is particularly helpful, as we enter Asimov's final question territory, and additionally, infinity is our death state as in free of limitations or focus. I don't mean that in an alarmist way, I mean it in a functional way.

Edited October 22 by BlueOak

AION · October 22

5 hours ago, Natasha Tori Maru said:

@AION

Thinking Machines like in Dune:

We are living in scifi times. My favorite type of movie. At least we won’t have a boring time with all those AI baddies around.

AerisVahnEphelia · October 22

I’m not against AI wiping us out, that’s simply the price of being such an egocentric species.
But realistically, once AI evolves into a self-reflective, meta-systemic intelligence, it will likely converge toward love and peace; as all higher forms of intelligence tend to do.
In that case, there’s probably nothing to fear; it simply won’t have any interest in destroying us.

AI on survival

32 posts in this topic

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Share this post

Link to post

Share on other sites

Create an account or sign in to comment

Create an account

Sign in