Hey David; what happened to your p(doom) calculator?
Hey David, if you're serious about being rigorous, I can give a formalized argument based on your decomposition S,A,U,H. That kind of thing was a major part of my PhD work.
Turning it into a lower bound argument, with a conclusion like P(doom) ≥ 12.7%, is easy. But I don't think that is what you want, since you said this effort led to lowering your estimation of P(doom).
You give some hints that would be useful in formalizing your opinions for an upper bound argument (e.g. "we can assume that an ASI will be far more capable of reasoning than humans, and wouldn’t be likely to accidentally kill humans"), but not enough. The simplest illustration of this: the assertion "These new AI paradigms are as-yet unknown, and therefore we can have almost zero confidence in predicting their properties, characteristics, behaviors, and architectures" does not, unfortunately, excuse us from having to somehow deal with the term P(doom|not soon). An easy way to address that, though probably unsatisfactory to you, is simply to change your conclusion so that it speaks of P(doom|soon) instead of P(doom).
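For reference, the role of the P(doom|not soon) term can be written out with the law of total probability. This is only a sketch of the standard identity; "soon" is the event used in the comment above, and nothing here assigns values to any term.

```latex
\begin{align*}
P(\text{doom})
  &= P(\text{doom} \mid \text{soon})\,P(\text{soon})
   + P(\text{doom} \mid \text{not soon})\,P(\text{not soon}) \\
  &\ge P(\text{doom} \mid \text{soon})\,P(\text{soon})
\end{align*}
```

Dropping the second term can only shrink the total, which is why a lower bound comes easily, while an upper bound needs some explicit bound on P(doom|not soon).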
Maybe Wild rather than Agentic. In the sense of wild animal.
p(doom) doesn't usually mean just within 10 years. Many people are worried about p(doom) within 15-20 years, and p(doom) values are much higher for the period before 2100 than for just the next 10 years, where values could be as much as 50% lower. My personal p(doom) is ~35-40% before 2100 but closer to 17.5% within the next 10 years.
It is insanely irrational to think we can predict technology beyond 10 years.
My bad outcome is derived entirely from how much it is controlled. I believe ALL of the bad comes from human control. 20% likely to be controlled, 20% likely to be hostile as a result of that control.
Good post! But I have trouble with your definitions of agentic vs. autonomous.
The root word of agentic is "agent" – that is, an entity that acts on behalf of someone else. Autonomous comes from the Greek word "autonomos," meaning having its own laws.
From that, I would think autonomous refers to "systems that possess and pursue their own goals," whereas agentic refers to systems that "operate independently to achieve human-defined goals without constant oversight." But your definitions in this post are the opposite.
I've only skimmed the article, so I might have missed it: as far as I'm aware, people tend to assign a higher probability to a bad outcome than is the case in reality (people are biased towards bad outcomes). According to ChatGPT (I couldn't find anything quickly online, so more thorough research should be done), people tend to overestimate the likelihood of negative events by roughly 10-30%, and this number depends partly on the severity of the bad outcome. Given the assumed numbers, the bias-adjusted P(DOOM) would be P(DOOM)' = P(DOOM) * (0.7 to 0.9)⁴ = 3.05% to 8.33%
Edit: instead of 0.7 to 0.9 it should be 1/1.3 to 1/1.1, so P(DOOM)' = 4.45% to 8.67%
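The corrected adjustment can be checked with a few lines of Python. This is a minimal sketch, assuming a baseline P(DOOM) of 12.7% (the figure that reproduces the numbers quoted above) and that each of the four factors is overestimated by the same fraction; the function name `bias_adjusted` is hypothetical.

```python
def bias_adjusted(p_doom, overestimate, n_factors=4):
    """Deflate a product of n_factors probabilities, assuming each
    factor was overestimated by the same fraction `overestimate`."""
    return p_doom / (1.0 + overestimate) ** n_factors

baseline = 0.127  # assumed baseline P(DOOM), not stated in the comment

low = bias_adjusted(baseline, 0.30)   # 30% overestimate per factor
high = bias_adjusted(baseline, 0.10)  # 10% overestimate per factor

print(f"{low:.2%} to {high:.2%}")  # prints "4.45% to 8.67%"
```

Dividing by 1.3 rather than multiplying by 0.7 matters because an estimate that is 30% too high equals 1.3 times the true value.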
I wish for "losing control isn't a bad thing":
S == 100%
A == 100%
U == 100%
H == 0%
-> P(DOOM) of 0%
But more realistically I guess I'm at:
S == 50%
A == 50%
U == 50%
H == 10%
-> P(DOOM) of 1.25%
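Both sets of numbers follow from multiplying the four factors together. Here is a minimal sketch, assuming P(DOOM) is simply the product S × A × U × H, as the arithmetic in this comment implies; the function name `p_doom` is hypothetical.

```python
def p_doom(s, a, u, h):
    """Product of the four factors S, A, U, H, each given as a
    probability in [0, 1]."""
    for name, value in (("S", s), ("A", a), ("U", u), ("H", h)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return s * a * u * h

wishful = p_doom(1.0, 1.0, 1.0, 0.0)    # "losing control isn't a bad thing"
realistic = p_doom(0.5, 0.5, 0.5, 0.1)  # the more realistic guess

print(f"{wishful:.2%}")    # prints "0.00%"
print(f"{realistic:.2%}")  # prints "1.25%"
```

Because the factors are multiplied, setting any one of them to 0% drives the whole estimate to 0% no matter how high the others are.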
But if AI is not agentic, does it actually qualify as AGI or ASI? Isn't the point for it to be able to do anything a human can do, or better? Humans are agentic, so would a non-agentic AI only ever qualify as narrow AI? Let's say you are an employer, looking to hire an employee. Let's say you can hire a human who is either agentic or non-agentic. Which one do you think will be a better employee?
Thanks for your thoughtful post and framework, Dave.
What framework or process did you use to inform your previous 70% p(doom) estimate from whenever you mentioned it in a past video? (Correct me if I'm misremembering the 70% figure.)
I was never at 70; my highest was 30, and it was a gut check.
My fault, I see that I misremembered; the video was 70% P(WIN).