The RoboCup challenge is an ambitious competition. Its goal is to create an autonomous robot soccer team that can beat the best human teams by 2050. In its present state, however, RoboCup games are both painful and comical to watch. The players move so slowly and clumsily that it’s easy to dismiss the entire concept as hopeless. But those who form such snap judgments forget their history: all first attempts at a new technology look silly. Those who witnessed the first Wright brothers’ plane could never have imagined riding first class in a 747, and operators of the ENIAC computer couldn’t have imagined that middle schoolers would one day carry vastly more powerful computers in their pockets.
Dr. Peter Stone is the founder and director of the Learning Agents Research Group (LARG) within the Artificial Intelligence Laboratory at The University of Texas at Austin. He’s focused on creating autonomous collaborative machines. While the news media fantasize about autonomous vehicles, self-driving cars are child’s play compared with the things he is working on…like RoboCup robots.
A few weeks ago, I had the pleasure of hearing Dr. Stone describe his research in reinforcement learning. “Innovative algorithms are better than brute force computation,” he said, explaining how RoboCup robots get better results through a trial-and-error learning process rather than being hardwired to react the same way in similar situations. In other words:
- The robot has a goal.
- It chooses an action based on its situation.
- It receives either a reward or a punishment as a result of that action.
- Rinse & repeat.
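The loop above is the heart of reinforcement learning. Here is a minimal sketch of it in Python, using tabular Q-learning on a made-up five-state “corridor” where the robot earns a reward only upon reaching the goal. Every name and number here is illustrative; this is not Dr. Stone’s actual code.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal
ACTIONS = [-1, +1]    # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # learning rate, discount, exploration

# The "policy" starts out blank: no action looks better than any other.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def choose_action(state):
    # Mostly exploit the best-known action; occasionally explore a random one.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

random.seed(0)
for episode in range(200):
    state = 0
    while state != N_STATES - 1:
        action = choose_action(state)                      # choose an action
        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0  # reward or nothing
        # Nudge the estimate toward reward plus discounted future value.
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = next_state                                 # rinse & repeat

# The learned policy: the best action in each non-goal state.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
```

After a couple hundred episodes of trial and error, the learned policy steps right (toward the goal) in every state, even though nobody hardwired that behavior.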
Dr. Stone’s robots seek to maximize reward over time, which isn’t as simple as always grabbing the highest immediate reward. Sometimes choosing a lower reward or even a punishment, like sacrificing a pawn in chess, puts a robot into a better position to achieve the highest reward over time. And that’s when Dr. Stone said something that struck at my storytelling core. He said that his robots sought policies for how to react in certain situations.
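The pawn-sacrifice idea can be shown with simple arithmetic: compare the discounted sum of rewards for two hypothetical action sequences, one greedy and one that accepts a small loss now for a larger payoff later. The reward numbers and the discount factor below are invented for illustration.

```python
GAMMA = 0.9  # discount factor: future rewards count slightly less than present ones

greedy = [5, 0, 0]        # grab the big immediate reward, then nothing
sacrifice = [-1, 0, 10]   # take a small penalty now (the pawn), big reward later

def value(rewards, gamma=GAMMA):
    # Discounted return: r0 + gamma*r1 + gamma^2*r2 + ...
    return sum(r * gamma**t for t, r in enumerate(rewards))
```

Here `value(greedy)` is 5.0, while `value(sacrifice)` comes to about 7.1, so the sequence with the worse immediate outcome is the better policy over time.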
I’d never thought about using the word that way. I’d always associated policy with mind-numbing subjects such as insurance, economics, or foreign relations. But with this definition, I stopped seeing Dr. Stone as an artificial intelligence researcher and started seeing him as a storyteller. Because storytellers are policy wonks who put characters (or robots) into situations, assign them policies, and see how they react.
Characters, much like Dr. Stone’s robots, must solve problems within a situational context where nothing comes for free. There are always constraints and tradeoffs for one’s actions. But here’s the rub. It’s the constraints and tradeoffs that make the best stories. If characters possessed all of the resources required to get what they want, we’d have no story to tell.
Policies are derived from our core beliefs. We test those beliefs during the course of real life and our characters test their policies during the course of a story. We either adjust our beliefs and grow or don’t and remain one-dimensional. The characters in our stories do the same thing. They either adjust their policies and succeed or remain steadfast and have the story end in tragedy.
So, what are your policies? What are your customers’ constraints? What tradeoffs do both of you make every day? The answers to these questions will lead you to much better stories.
Photo Credit: United States Office For Emergency Management, Dixon, Royden, photographer. Blackwell Smith, assistant director of priorities in charge of policy. United States, None. [Between 1940 and 1946] Photograph. Retrieved from the Library of Congress, https://www.loc.gov/item/oem2002000308/PP/.