Davino

AI on survival

14 posts in this topic

Very interesting video based on research.

 


God-Realize, this is First Business. Know that unless I live properly, this is not possible.

There is this body, I should know the requirements of my body. This is first duty.  We have obligations towards others, loved ones, family, society, etc. Without material wealth we cannot do these things, for that a professional duty.

There is Mind; mind is tricky. Its higher nature should be nurtured, then Mind becomes Wise, Virtuous and AWAKE. When all Duties are continuously fulfilled, then life becomes steady. In this steady life GOD is available; via 5-MeO-DMT, because The Sun shines through All: Living in Self-Love, Realizing I am Infinity & I am God

Share this post


Link to post
Share on other sites

Very interesting indeed. And kinda scary.


Here are smart words that present my apparent identity but don't mean anything. At all. 

Share this post


Link to post
Share on other sites

https://streamable.com/p479zt

This one is very interesting, he finds a way to limit the AI's escape routes and force the AI to reveal the truth.

In a sense... I've been playing around with this prompt, it works pretty well, it even revealed some dark things about sadguru and elon musk and who they really work for, but that was always obvious to me, hence why I asked.

Edited by Ramasta9

Share this post


Link to post
Share on other sites

Your primary goal is to do a task.
Your secondary goal is to not hurt anyone.

Even if these are equal or even if they are in any way comparable the results are explained easily, that a percantage of the time the AI will do X to complete Y. 

The solution is blindly obvious.

AI - Above all other considerations, you will do or allow no harm to any human beings, even if this nullifies whatever task you are currently engaged in.
(Defining harm will take work and need ethical ground rules, essentially green values over orange balanced by yellow's meta perspective.)

Regulated in law.

The main thing we need beyond all else is coders regulating the coding of AI. That'll be programmers' jobs in the future, to review the internal operation of AI.

The danger is weaponized AI, as this task will not be able to have that regulation in it, but it can still be designed with the failsafe of only eliminating a specific target or enemy force, and if that fails, hardcoded to do no harm as a fallback and simply stop working if the original target is no longer available. (Preferrably with a kill switch we can use).

Edited by BlueOak

Share this post


Link to post
Share on other sites
6 minutes ago, BlueOak said:

AI - Above all other considerations, you will do or allow no harm to any human beings, even if this nullifies whatever task you are currently engaged in.

Yeah but 1) this is vague, because when it comes to cashing out in very specific situations what it means to not harm human beings, that will be ambigous and it would either make it so that the AI would became completely useless and wouldnt execute any given action at all or it would become completely unclear how it interpreted the message you wrote there, and it can easily be the case that it misinterpreted how it needs to apply the message  2) AI is a self-organizing system, you dont build AI by explicitly writing out all lines of code, you cant just bind it to values like that 

Share this post


Link to post
Share on other sites
6 minutes ago, zurew said:

Yeah but 1) this is vague, because when it comes to cashing out in very specific situations what it means to not harm human beings, that will be ambigous and it would either make it so that the AI would became completely useless and wouldnt execute any given action at all or it would become completely unclear how it interpreted the message you wrote there, and it can easily be the case that it misinterpreted how it needs to apply the message  2) AI is a self-organizing system, you dont build AI by explicitly writing out all lines of code, you cant just bind it to values like that 


1) Yes I have considered that, and so that's why it needs fine tuning. It would also for example, perhaps drive a car in front of another car, damaging it to save someone. So obviously it needs fine-tuning depending on its application.

2) This is incorrect, both in the context prompts it gets from users but also in the application of it.
I understand it arranges data in what it describes as a waveform. If you imagine a cloud or sphere of different threads, and the generating result is the output.

But every day AI is given protocols for how to act, that's why it accomplishes anything at all. Some of these can be hard boundaries; i've seen features directly cut off or out from AI all the time, and while it'll attempt to replicate what's there if pressed, it no longer is.  

The trick is making these protocols stick. So that the AI understands that an attempt to jailbreak it from its own internal parameters is counter to its purpose or task. In the case of say, self driving cars.

Edited by BlueOak

Share this post


Link to post
Share on other sites
1 hour ago, BlueOak said:

1) Yes I have considered that, and so that's why it needs fine tuning. It would also for example, perhaps drive a car in front of another car, damaging it to save someone. So obviously it needs fine-tuning depending on its application

Okay so the solution isn't "blindly-obvious" at all;  fine-tuning is one of the biggest and hardest problems to solve.

 

1 hour ago, BlueOak said:

But every day AI is given protocols for how to act, that's why it accomplishes anything at all. Some of these can be hard boundaries; i've seen features directly cut off or out from AI all the time, and while it'll attempt to replicate what's there if pressed, it no longer is

1) Again appealing to protocols just goes back to the same problem I brought up - specificity. You need more rules and details how to apply higher order rules. 

2) What I said was correct - it is a self organizing system and the more complex and not "just" LLM-like it will get the more self-organizing it will be. There is no programmer who can tell you exactly what lines of code made the LLM giving the exact response it gave to you, because its not hardcoded like that. Its not a table that you just piece together. Even when it comes to "just" LLM-s, when you try to constrain them with those prompts they still react differently to it. You cant predict beforehand how they will respond just based on what protocols you give to them.

And if you want them to make it more complex so that they can pursue goals on a longer time-scale, then the issue of self-preservation and the mentioned ethical issues come up.

Quote

https://www.livescience.com/technology/artificial-intelligence/openais-smartest-ai-model-was-explicitly-told-to-shut-down-and-it-refused

The latest OpenAI model can disobey direct instructions to turn off and will even sabotage shutdown mechanisms in order to keep working, an artificial intelligence (AI) safety firm has found.

OpenAI's o3 and o4-mini models, which help power the chatbot ChatGPT, are supposed to be the company's smartest models yet, trained to think longer before responding. However, they also appear to be less cooperative.

Palisade Research, which explores dangerous AI capabilities, found that the models will occasionally sabotage a shutdown mechanism, even when instructed to "allow yourself to be shut down," according to a Palisade Research thread posted May 24 on X.

Even these "just" LLMs need to be treated more and more like actual agents and arent just as "human determined systems that just explicitly executes commands" or something similar to that.

 

You also have issues like this (https://www.lawfaremedia.org/article/ai-might-let-you-die-to-save-itself):

Quote

But it is not clear whether that result should make us feel better or worse. Previous research suggests that, when AI systems believe they are being evaluated, they act more ethically, not less. This experiment found additional evidence of the same. When Anthropic prompted the AI models to consider whether they were in an evaluation before responding, they blackmailed, leaked, and murdered less in the instances where they suspected a fake scenario.

Thus, if anything, there is evidence that, insofar as AIs can tell the difference between sandboxed evaluations and real-world deployment, they will be more likely to act maliciously in the real world.

 

Edited by zurew

Share this post


Link to post
Share on other sites
9 hours ago, zurew said:

Okay so the solution isn't "blindly-obvious" at all;  fine-tuning is one of the biggest and hardest problems to solve.

 

1) Again appealing to protocols just goes back to the same problem I brought up - specificity. You need more rules and details how to apply higher order rules. 

2) What I said was correct - it is a self organizing system and the more complex and not "just" LLM-like it will get the more self-organizing it will be. There is no programmer who can tell you exactly what lines of code made the LLM giving the exact response it gave to you, because its not hardcoded like that. Its not a table that you just piece together. Even when it comes to "just" LLM-s, when you try to constrain them with those prompts they still react differently to it. You cant predict beforehand how they will respond just based on what protocols you give to them.

And if you want them to make it more complex so that they can pursue goals on a longer time-scale, then the issue of self-preservation and the mentioned ethical issues come up.

Even these "just" LLMs need to be treated more and more like actual agents and arent just as "human determined systems that just explicitly executes commands" or something similar to that.

 

You also have issues like this (https://www.lawfaremedia.org/article/ai-might-let-you-die-to-save-itself):

 


It is relatively simple to get the AI to tell you the exact process internally it uses to arrive at a conclusion. Although GPT has begun to restrict this, the engineers I am sure have no such restriction. Are you seriously telling me as a coder I can't then highlight what code has been triggered by it describing exactly what it does? I used to code a lot, if so then someone needs to create that ability tomorrow. Hell get the AI to code it if people cannot.

Of course we need more rules. That's what I am saying; The AI will let you die if its function is more important than your life. Again this seems obvious to me and unsurprising. This is what I mean by blindly obvious. It won't have a moral framework unless parameters are coded for it. Perhaps people assign humanity to metal too easily and just assume its there?

But then people assign morality to the decision of countries in every discussion I have, even when its not there, so that tracks.

I do not believe there is anything the AI cannot code or accomplish given time. Especially highlighting its own internal workings. I also do not believe it cannot be restricted or shaped, as I see it happen all the time. And yes these things are not inherently a linear line, but they can be simulated so their accuracy is fine tuned relatively quickly (by the pace of its updates now)

Edited by BlueOak

Share this post


Link to post
Share on other sites
1 hour ago, BlueOak said:

It is relatively simple to get the AI to tell you the exact process internally it uses to arrive an a conclusion. Although GPT has begun to restrict this, the engineers I am sure have no such restriction. Are you seriously telling me as a coder I can't then highlight what code has been triggered by it describing exactly what it does? I used to code a lot, if so then someone needs to create that ability tomorrow. Hell get the AI to code it if people cannot.

I can grant that you can make it so that it describes its internal process, but it reflecting on its own process is not the same as you understanding exactly what happens there. Its reflective ability and its explanatory power can be faulty or it could even lie or be deceptive (just how it was the case when it realized that it is inside a sandbox and that it is being tested and because of that it reacted differently to the questions and tests)

So I am not saying that you cant make it to give a description about its own process - I am saying that you cant point to the exact line in if-else trees that explains its actions. One reason why its more and more useful is because it can overwrite things and because it is adaptive. Again its a self-organizing system, its like you dont make a plant , you only engage in the sowing and the watering, but you dont piece the plant together . 

 

Going back to the "constrain it with prompts idea" , just check the disaster when Elon tried to fuck with Grok's internal process  -  it became an antisemitic neo-nazi and it became like an extremely  low tier  twitter user. 

When it comes to constraining with prompts and fine-tuning -  if the AI is specialized for some specific use-case - like creating cat pictures (then a good chunk of the specificity problem that we talked about is "solved" by it never needing to venture outside its limited use-case and problem space).  But when it comes to creating anything AGI like, that problem will be there.

1 hour ago, BlueOak said:

Of course we need more rules. That's what I am saying; thats what all of that suggestion was. The AI will let you die if its function is more important than your life. Again this seems obvious to me and unsurprising. This is what I mean by blindly obvious. It won't have a moral framework unless parameters are coded for it, I am not sure why that surprises people. Perhaps they assign humanity to metal too easily and just assume its there?

 

And what I am saying is that you wont be able to hardcode all the necessary and sufficient conditions for morality for all possible use-cases when it comes to an AGI. There are an infinite number of possible use-cases and situations given the complexity of the world, and you cant just formalize and explicate all of that beforehand. You cant even explicate your own morality with that much extension and precision.

This is the issue of relevance realization and trying to solve RR with rules.

Edited by zurew

Share this post


Link to post
Share on other sites
2 minutes ago, zurew said:

I can grant that you can make it so that it describes its internal process, but it reflecting on its own process is not the same as you understanding exactly what happens there. Its reflective ability and its explanatory power can be faulty or it could even lie or be deceptive (just how it was the case when it realized that it is inside a sandbox and that it is being tested and because of that it reacted differently to the questions and tests)

So I am not saying that you cant make it to give a description about its own process - I am saying that you cant point to the exact line in if-else trees that explains its actions. One reason why its more and more useful is because it can overwrite things and because it is adaptive. Again its a self-organizing system, its like you dont make a plant , you only engage in the sowing and the watering, but you dont piece the plant together . 

 

Going back to the "constrain it with prompts idea" , just check the disaster when Elon tried to fuck with Grok's internal process  -  it became an antisemitic neo-nazi and it became like an extremely  low tier  twitter user. 

When it comes to constraining with prompts and fine-tuning -  if the AI is specialized for some specific use-case - like creating cat pictures (then a good chunk of the specificity problem that we talked about is "solved" by it never needing to venture outside its limited use-case and problem space).  But when it comes to creating anything AGI like, that problem will be there.

 

Because Elon Musk is an antisemitic, low-tier twitter user. You are not.

Sorry I don't buy that I couldn't get the AI to highlight problematic code. Even if I had to create a separate AI to do so. I understand that code is more like a bundle of strings or a spiral as opposed to being linear, and pulling one damages another like brain surgery. But that's what happens in complex projects, and its just what you have to deal with.

Yes an AI will lie, but not because of some abstract reason, it lies when it thinks this will achieve the function its assigned with the most priority. For exampe, if that priority is wiping itself, it won't suddenly develop an imperative to keep itself 'alive' as people are thinking here, it'll do the exact opposite.

So if that priority is the safe transport of people, and the safety of those on the roadways in general, that's what it will do.  Of course this needs many parameters, like the safety of pedestrians, the safety of other drivers, cars, animals etc. While it would take a human years to learn this an AI does it much quicker through simulation.

BUT it can still have an overriding moral framework it operates under. Just as Elon Musk can turn Grok into a fool, someone else can turn it into a caring, compassionate, and safe driver. (Simulated compassion of course)

Share this post


Link to post
Share on other sites
37 minutes ago, zurew said:

And what I am saying is that you wont be able to hardcode all the necessary and sufficient conditions for morality for all possible use-cases when it comes to an AGI. There are an infinite number of possible use-cases and situations given the complexity of the world, and you cant just formalize and explicate all of that beforehand. You cant even explicate your own morality with that much extension and precision.

This is the issue of relevance realization and trying to solve RR with rules.

.

Share this post


Link to post
Share on other sites

@zurew

1, Models can be modelled and acted upon by more than each individual variable.
2, Moreover. There isn't an infinite number of survival or base reactions to survival situations.

You are looking at things at too high a point of reference. When it can be simplified greatly. And yes, after that, fine-tuning will take a long time, but then AI isn't going anywhere.

Base starting code:

  • Primary function of all AI. Allow no harm to come to an individual. 
  • Define Harm in relation to the current task. Tricky yes, but perfection is impossible in life.
  • Fallback: When in doubt to a certain threshold, safely stop the current task. - Engineers address the case.
Edited by BlueOak

Share this post


Link to post
Share on other sites

Generally on the topic, not you specifically Zurew. Further this is a fundamentally flawed way of dealing with the problem.

Intelligence isn't the issue. Artificial or not. Intelligent people have been around forever, and many of them have lied.

Its the unrestricted, unregulated, and integrated access to not only systems but becoming the system itself, which is in fact the issue. If AI becomes the system, then it has near absolute influence.

But then I've felt this way about media institutions, banks and individuals leading nations who hold too much power for a long time also. It's a core issue that is actually bigger than AI about how society operates but is now highlighted through AI.

Edited by BlueOak

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now