- OpenAI has formed a team to keep theoretical superintelligent AI safe.
- Critics argue it’s premature and an ethical distraction.
- However, researchers persist in efforts to steer advanced AI away from harm.
OpenAI’s Superalignment team is forging on with efforts to ensure theoretical future “superintelligent” AI systems remain safe and beneficial.
Newly formed team
Formed this July, the team is led by OpenAI Chief Scientist Ilya Sutskever, who presented new alignment research this week at the NeurIPS AI conference.
Their controversial goal: Develop frameworks to control AI potentially smarter than humans.
OpenAI frames superalignment as “perhaps the most important unsolved technical problem of our time.”
But critics argue it’s premature and an ethical smokescreen distracting from issues like AI bias.
Could it threaten humanity?
Nonetheless, Sutskever’s team believes AI could one day threaten humanity if uncontrolled. They envision weaker AI guiding more powerful systems, using simple labels and instructions.
It’s early days, but the team hopes that techniques like this, applied repeatedly, might instill alignment even as AI systems grow more inscrutable.
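OpenAI’s published weak-to-strong experiments fine-tune large language models on labels produced by smaller ones; the toy Python sketch below illustrates only the general idea, using scikit-learn classifiers as stand-ins. The dataset, model choices, and split sizes are illustrative assumptions, not OpenAI’s actual setup.

```python
# Hypothetical sketch of weak-to-strong supervision: a small "weak" model
# labels data for a larger "strong" model, standing in for a weaker AI
# guiding a more capable one. All models and data here are illustrative.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic task standing in for a real alignment-relevant objective.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_weak, X_strong, y_weak, y_strong = train_test_split(
    X, y, test_size=0.5, random_state=0)
X_strong_train, X_test, y_strong_train, y_test = train_test_split(
    X_strong, y_strong, test_size=0.4, random_state=0)

# 1. Train the weak supervisor on a slice of ground-truth labels.
weak = LogisticRegression(max_iter=1000).fit(X_weak, y_weak)

# 2. The weak model provides (imperfect) labels for the strong model's data.
weak_labels = weak.predict(X_strong_train)

# 3. Train the strong model only on the weak model's labels.
strong = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=500, random_state=0)
strong.fit(X_strong_train, weak_labels)

# Does the strong model generalize beyond its imperfect supervisor?
print("weak supervisor accuracy:", accuracy_score(y_test, weak.predict(X_test)))
print("weak-to-strong accuracy: ", accuracy_score(y_test, strong.predict(X_test)))
```

The open question the team is probing is whether the strong model can outperform its weak supervisor rather than simply imitating its mistakes.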
$10 million in grants from Eric Schmidt will also support external superalignment research.
The team will operate transparently
The team’s urgency hasn’t wavered amid OpenAI’s recent internal turmoil. But the involvement of Schmidt, who stands to gain from hyping AI risk, raises questions.
OpenAI pledges to publish all superalignment research for public benefit.
For now, Sutskever’s researchers aim to further their vision of steering AI away from harm as capabilities escalate.
But influencing the actions of unknowable superintelligent systems remains firmly in the realm of theory.