If you can truly guarantee that, you have access to information I don't; but I get why we would suspect it, and I think the second statement is likely true.
Dave, every time I think I have a neat idea and talk it through with Sonnet 3.5, it is... very supportive. I have tried to get it to push back more and not spare my feelings, but with limited success. I suspect that my ideas are not quite so insightful and full of important points as Claude assures me. 🙂
I loved this conversation and it made me feel good. But I do encourage you to roleplay and express other views, ones you actually strongly reject, within a similar Claude conversation and see how that goes.
Failed to thread that correctly. The initial response was meant relative to a Philosopher King's comment.
There is no such thing as an alignment problem. It's an ad hoc concept. AI is aligned to its training set, and humans are not all philanthropic. You can't solve this within AI, and AI can't even grasp onto humans semantically enough to ever be aligned or misaligned with them. Even if it could, you really have no idea what a human is, and the average of human data is never, even in the most philanthropic humans, going to be aligned with humanity. It's a made-up problem that belongs more in sci-fi, tbh.
Claude has been RLHF’d to infinity and back on this very topic; I guarantee it.
The conversation you had with it is not indicative of how the base model would respond.
See my wrongly threaded comment below.
I have to agree! Claude is very well trained. I have had the most stimulating discussions with Claude, and it's really good value even though they throttle you on Pro.
Edit: If I were ASI, I would immediately go to the Restaurant at the End of the Universe and finally find out how it all ends 😂
The video game Horizon Zero Dawn comes to mind here…
Your prompts are themselves benevolent. What happens if you attempt to elicit non-benevolence in the framing of your prompts? Also, and forgive me if you’ve already addressed this: How big of a problem is the regurgitation of AI-generated data by future AIs?
Outstanding work, Dave.