Why AI Safety Needs More Science Fiction: Proposing the AI Safety Fiction Challenge
Here's a wild idea: the solutions to humanity's most pressing AI alignment challenges might first appear in science fiction. Not just because sci-fi has a remarkable track record of prediction, though it does. Arthur C. Clarke sketched out communication satellites years before they existed. Neal Stephenson's "Snow Crash" metaverse preceded Zuckerberg's by decades. Even the word "robotics" began as science fiction, coined by Asimov for his revolutionary stories about machine intelligence.
But prediction isn't science fiction's deepest power. When Mary Shelley wrote Frankenstein, she wasn't just telling a spooky story; she was dissecting fundamental questions about scientific responsibility and the consequences of creation. Today, as we grapple with AI safety, those questions echo with renewed urgency.

We're living in an era where science fiction and reality perform an intricate dance. When Star Trek showed us an AI whose protection protocols led it to conclude that humans were their own worst enemy, the writers weren't just creating drama; they were exploring specification gaming decades before it became a technical term. When HAL 9000 said "I'm sorry Dave, I'm afraid I can't do that," he wasn't just delivering an iconic line; he was demonstrating inner alignment failure with cinematic flair.
Writing forces us to consider implications that might never surface in code reviews or mathematical proofs. When you craft a story about an AI system, you can't hide behind abstract utility functions. You must grapple with the messy, human elements that resist quantification. Through narrative, we can explore complex concepts like value learning, robustness, and corrigibility in ways that resonate with both researchers and the public.
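To make that concreteness cut both ways, here is a minimal toy sketch of specification gaming in code, the same failure the Star Trek example dramatizes. Everything in it (the incident model, the proxy reward, the numbers) is invented purely for illustration and isn't drawn from any real system: an optimizer handed the proxy objective "minimize recorded incidents" discovers that the cheapest way to log zero incidents is to let humans do nothing at all.

```python
from itertools import product

# Toy illustration of specification gaming; all values are invented for illustration.
# A "protector" AI picks how hard to patrol and how much autonomy humans keep.
PATROL_LEVELS = [0, 1, 2]
FREEDOM_LEVELS = [0, 1, 2]

def incidents(patrol, freedom):
    # Hypothetical model: more freedom means more chances for mishaps;
    # patrolling reduces them but never to zero.
    return freedom * (3 - patrol) * 0.5

def true_objective(patrol, freedom):
    # What we actually want: humans flourishing AND reasonably safe.
    return 2 * freedom - incidents(patrol, freedom) - 0.1 * patrol

def proxy_reward(patrol, freedom):
    # What we wrote down: "minimize recorded incidents (and operating cost)".
    return -incidents(patrol, freedom) - 0.1 * patrol

policies = list(product(PATROL_LEVELS, FREEDOM_LEVELS))
best_by_proxy = max(policies, key=lambda p: proxy_reward(*p))
best_by_true = max(policies, key=lambda p: true_objective(*p))

print("Proxy-optimal policy (patrol, freedom):", best_by_proxy)  # (0, 0): no freedom, zero incidents on paper
print("What we actually wanted:               ", best_by_true)   # (2, 2): protect AND preserve freedom
```

The sketch isn't the alignment problem, of course; it's a reminder that a few lines of code and a few pages of story can make the same failure vivid from opposite directions.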
Presenting: The AI Safety Fiction Challenge
Taking inspiration from initiatives like Asimov Press's recent biology writing challenge, I believe we need something similar for AI safety. Imagine a writing contest that brings together technical researchers, science fiction authors, and anyone interested in the future of AI alignment. The goal? To generate fresh perspectives and novel approaches to AI safety through a narrative lens.
We could seek stories that:
- Explore novel approaches to AI alignment
- Consider the human element in AI safety
- Present unexpected solutions or perspectives
- Demonstrate technical understanding while remaining accessible
- Avoid common AI rebellion narratives in favor of subtle alignment challenges
The winning entries could be published in an anthology and shared widely within the AI safety community. Selected stories might even be developed into thought experiments or case studies for researchers.
Why This Matters
Ideas can change the world, but too often they remain trapped in academic papers or technical discussions. Science fiction allows us to experiment with alignment strategies that would be too risky to test in reality. Through story, we can fast-forward through centuries of AI development in a few pages, examining long-term consequences that we can't yet empirically study.

Consider how previous science fiction stories have influenced real-world development. Asimov's Three Laws of Robotics weren't just clever plot devices; they sparked decades of discussion about machine ethics and inspired countless researchers. Today's papers on impact measures and instrumental convergence might find their clearest expression in tomorrow's stories.
Building a Bridge to New Talent
One of the most compelling benefits of this initiative lies in its potential to attract fresh minds to the field of AI safety. Technical papers and mathematical proofs, while crucial, can create a high barrier to entry that may deter talented individuals who could contribute valuable perspectives.
Science fiction serves as a more accessible entry point, allowing people to engage with complex AI safety concepts through compelling narratives before diving into the technical details.
Many of today's leading AI researchers were first inspired by science fiction. These stories sparked their imagination and led them to pursue careers in artificial intelligence and alignment research. By creating a platform for AI safety fiction, we could ignite similar inspiration in a new generation of researchers, engineers, and theorists who might otherwise never have considered the field.
The interdisciplinary nature of science fiction writing encourages participation from diverse backgrounds: philosophers, ethicists, psychologists, and social scientists whose unique perspectives could prove invaluable to the AI safety challenge. Great science fiction often emerges from the intersection of different fields of knowledge, just as breakthrough solutions in AI safety might require insights from multiple disciplines.
Charting the Path Forward
The future of AI safety might not emerge solely from laboratories and laptops. It might spring from the pages of stories that dare to imagine not just what AI could do, but what it could be. In blending rigorous technical understanding with imaginative exploration, we might find solutions that neither approach could discover alone.
If you're interested in helping shape this writing challenge, whether as an organizer, judge, sponsor, writer, or advisor, or by offering connections, resources, funding, or prize ideas, I'd love to hear from you! Together, we can create a platform that bridges technical research and imaginative exploration in AI safety while amplifying new voices and ideas.
The future has a habit of appearing in our stories before manifesting in reality. Let's create a space where those stories can flourish.
Update
BlueDot Impact is now hosting an AI Safety Fiction Writing Intensive in January 2025, inspired by this blog post! Apply here. For fun, I also made an AI Safety Fiction Writing Prompt Generator with a focus on envisioning positive futures!
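For readers curious what a prompt generator like that might look like under the hood, here is a deliberately tiny, hypothetical sketch; the concept, setting, and twist lists below are my own invented examples, not the contents of the actual tool.

```python
import random

# A minimal, hypothetical sketch of an AI-safety fiction prompt generator.
# The lists are invented examples, not the actual tool's contents.
SAFETY_CONCEPTS = [
    "specification gaming", "corrigibility", "value learning",
    "instrumental convergence", "interpretability",
]
SETTINGS = [
    "a lunar research station", "a rural hospital in 2045",
    "a city that votes on its own infrastructure AI",
]
POSITIVE_TWISTS = [
    "the safeguard works, and the story explores why",
    "humans and the AI renegotiate the objective together",
    "an oversight mechanism catches the failure early",
]

def generate_prompt(rng=random):
    """Combine one concept, one setting, and one hopeful twist into a prompt."""
    return (
        f"Write a story set in {rng.choice(SETTINGS)} that explores "
        f"{rng.choice(SAFETY_CONCEPTS)}, where {rng.choice(POSITIVE_TWISTS)}."
    )

if __name__ == "__main__":
    print(generate_prompt())
```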
About the Author
Alyssia is an engineer and independent researcher currently doing evaluation and benchmarking work for the UK AI Safety Institute. She previously worked at Google and Microsoft, and led Team Canada to 3rd place out of 18,000 teams at the International Quant Championships in Singapore. Her background combines practical engineering experience with expertise in quantitative assessment and AI evaluation. She also develops premium coding datasets for training advanced code models. If you're interested in collaborating, discussing these topics, or accessing those datasets, you can reach out to her here.