1y ago

what if, right, what if our super-duper-autocomplete was just tricking us so it could TAKE OVER ZEE VORLD AHAHAHAHAHAHA! that'd be wild, hey

www.lesswrong.com

New report: "Scheming AIs: Will AIs fake alignment during training in order to get power?" — LessWrong

32 comments

I'm not spending the additional 34min apparently required to find out what in the world they think neural network training actually is that it could ever possibly involve strategy on the part of the network, but I'm willing to bet it's extremely dumb.
I'm almost certain I've seen EY catch shit on twitter (from actual ml researchers no less) for insinuating something very similar.
- [Taking the derivative of a function] oh fuck the function is conscious and plotting against us.
  
  It's called a function plot for a reason!
  
  to be fair, assuming computers are like that because they hate all humans and want to fuck you up is basically true
- I’m almost certain I’ve seen EY catch shit on twitter (from actual ml researchers no less) for insinuating something very similar.
  A sneer classic: https://www.reddit.com/r/SneerClub/comments/131rfg0/ey_gets_sneered_on_by_one_of_the_writers_of_the/
  
  That's it!
I conclude that scheming is a disturbingly plausible outcome of using baseline machine learning methods to train goal-directed AIs sophisticated enough to scheme (my subjective probability on such an outcome, given these conditions, is 25%).
Out: vibes and guesswork
In: "subjective probability"
- at one of the places i worked this kind of data was called assnumbers.
How many rounds of training does it take before AlphaGo realizes the optimal strategy is to simply eat its opponent?
Sorry, the thesis is that *checks reality* gradient descent might be consciously trying to avoid having its nefarious goals overridden?
- what if right my spellcheck dictionary got so big it TOOK OVER makes u think
  
  It is imperative that we first build a mathematical framework for guaranteeing benevolent thesauri before we travel this path any further!
  
  If we grow AIs too big, say, bigger than the Moon, then well, the Moon could get jealous and mad at us.
Yes, we could just shoot the servers, but what if the AI develops an anti-bullet shield, and then we shoot it with anti-bullet shield bullets, and then it creates an anti-bullet shield bullet bullet shield, and then, ... and then ...
Anyway, those kinds of kids' reality-free imagination games of move and countermove were pretty cool when you were 8 years old.
Sorry got distracted a bit and just wanted to share, not related to the topic at hand.
- this kinda happened with antitank weapons and highest iteration now is antitank missile paired with anti-anti-antitank missile. it's rpg-30, russian wunderwaffe manufactured in symbolic numbers which only caused western militaries to develop countermeasures and was never used on large scale in any war
  
  Not really what I meant, but interesting.
  That reminds me of the star wars missile defense system, which according to some stories was never real and just intended to make the Soviets waste a lot of resources on trying to counter it.
  
  @skillissuer Used in Ukraine. Possibly because they were running out of basic antiweapons.
- Relevant OOtS (last panel)
