The original version of this story appeared in Quanta Magazine.
In the movie Oppenheimer, Niels Bohr challenges the physicist early in his career:
Bohr: Algebra is like sheet music. The important thing isn’t “can you read music?” It’s “can you hear it?” Can you hear the music, Robert?
Oppenheimer: Yes, I can.
I can’t hear the algebra, but I feel the machine.
I felt the machine even before I touched a computer. In the 1970s I awaited the arrival of my first one, a Radio Shack TRS-80, imagining how it would function. I wrote some simple programs on paper and could feel the machine I didn’t yet have processing each step. It was almost a disappointment to finally type in the program and just get the output without experiencing the process going on inside.
Even today, I don’t visualize or hear the machine, but it sings to me; I feel it humming along, updating variables, looping, branching, searching, until it arrives at its destination and provides an answer. To me, a program isn’t static code, it’s the embodiment of a living creature that follows my instructions to a (hopefully) successful conclusion. I know computers don’t physically work this way, but that doesn’t stop my metaphorical machine.
Once you start thinking about computation, you start to see it everywhere. Take mailing a letter through the postal service. Put the letter in an envelope with an address and a stamp on it, and stick it in a mailbox, and somehow it will end up in the recipient’s mailbox. That is a computational process—a series of operations that move the letter from one place to another until it reaches its final destination. This routing process is not unlike what happens with electronic mail or any other piece of data sent through the internet. Seeing the world in this way may seem odd, but as Friedrich Nietzsche is reputed to have said, “Those who were seen dancing were thought to be insane by those who could not hear the music.”
This innate sense of a machine at work can lend a computational perspective to almost any phenomenon, even one as seemingly inscrutable as the concept of randomness. Something seemingly random, like a coin flip, can be fully described by some complex computational process that yields an unpredictable outcome of heads or tails. The outcome depends on myriad variables: the force and angle and height of the flip; the weight, diameter, thickness, and distribution of mass of the coin; air resistance; gravity; the hardness of the landing surface; and so on. It’s similar for shuffling a deck of cards, rolling dice, or spinning a roulette wheel—or generating “random” numbers on a computer, which just involves running some purposely complicated function. None of these is a truly random process.
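To make the point about computer-generated "random" numbers concrete, here is a minimal sketch of one classic approach, a linear congruential generator. The function names and the choice of constants (a widely published set for a 32-bit generator) are illustrative assumptions, not a description of what any particular computer actually runs.

```python
# Illustrative constants for a 32-bit linear congruential generator.
MODULUS = 2**32
MULTIPLIER = 1664525
INCREMENT = 1013904223


def next_state(state: int) -> int:
    """Apply one fixed, fully deterministic update to the internal state."""
    return (MULTIPLIER * state + INCREMENT) % MODULUS


def coin_flips(seed: int, count: int) -> str:
    """Turn successive states into a string of H/T 'coin flips'."""
    state = seed
    flips = []
    for _ in range(count):
        state = next_state(state)
        # Read the highest-order bit; the low-order bits of this kind of
        # generator follow short, easily spotted cycles.
        flips.append("H" if (state >> 31) & 1 else "T")
    return "".join(flips)


print(coin_flips(seed=2024, count=20))
print(coin_flips(seed=2024, count=20))  # same seed, same "random" flips
```

Running it twice with the same seed prints the same sequence both times. The flips look haphazard only to an observer who does not know the seed and the function.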
The idea goes back centuries. In 1814, in his Philosophical Essay on Probabilities, Pierre-Simon Laplace first described an intelligence, now known as Laplace’s demon, that could predict these outcomes:
We ought to regard the present state of the universe as the effect of its antecedent state and as the cause of the state that is to follow. An intelligence knowing all the forces acting in nature at a given instant, as well as the momentary positions of all things in the universe, would be able to comprehend in one single formula the motions of the largest bodies as well as the lightest atoms in the world, provided that its intellect were sufficiently powerful.
The reverse implication is that for someone without a vast enough intellect, processes such as a coin flip would appear random. The language of computation lets us formalize this connection.
Earlier this year, Avi Wigderson received the Turing Award, the “Nobel Prize of computing,” partly for formally connecting randomness with mathematical functions that are hard to compute. He and his colleagues created a process that takes a suitably complex function and outputs “pseudorandom” bits that can’t be efficiently distinguished from truly random bits. Randomness, it seems, is just computation we cannot predict.
Do we have a way to manage this randomness and complexity? The recent progress we have seen in artificial intelligence through machine learning gives us a glimpse into what it would mean to do just that. Information can be split into a structured part and a random part. Take English, for example. There is an underlying complex structure that describes the language, and the sentences that society has produced over time are, in effect, a random sampling from that structure. Recent advances in machine learning have allowed us to take these random samples and recover much of the underlying structure. Often that structure appears opaque, but we can still use it to simulate the random samples, generating new English sentences on demand.
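The idea of recovering structure from samples and then sampling from that structure can be sketched with a toy model. This is a deliberately crude stand-in: it only records which word follows which, whereas modern language models learn far richer structure with neural nets. The three training sentences, the function names, and the `<s>` and `</s>` boundary markers are all made up for illustration.

```python
import random
from collections import defaultdict


def learn_bigrams(sentences):
    """Record, for every word, the words observed to follow it."""
    followers = defaultdict(list)
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for current, following in zip(words, words[1:]):
            followers[current].append(following)
    return followers


def generate(followers, rng, max_words=20):
    """Sample a new sentence by repeatedly picking an observed follower."""
    word, output = "<s>", []
    for _ in range(max_words):
        word = rng.choice(followers[word])
        if word == "</s>":  # the model chose to end the sentence
            break
        output.append(word)
    return " ".join(output)


# Hypothetical "random samples" of English.
samples = [
    "the machine sings to me",
    "the machine follows my instructions",
    "the letter reaches the mailbox",
]
model = learn_bigrams(samples)
print(generate(model, random.Random(0)))
```

The generated sentence need not match any training sentence, yet every adjacent pair of words in it was seen in the samples. In this toy model, that table of word pairs is the "structure" that has been recovered.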
Consider the problem of translation. Imagine a woman, Sophie, who grew up speaking English and French and now works as a translator. She can easily take an English text, fully understand it, and produce the equivalent in French. Computationally speaking, the machine in this case is Sophie’s brain, as it must follow some process that converts English into French. Sophie likely doesn’t understand the entire process, or even think of it as a process, but it’s happening nevertheless.
Suppose now we want to translate text on a computer. Simply using a French-English dictionary to translate word by word doesn’t work, as different languages have different structures, and words have different meanings in different contexts. Applying linguistic tools only goes so far; the computational process of understanding language goes beyond what we can describe.
Sophie understands the languages because she grew up in a bilingual household, being exposed to both languages and all their complexities. Machine learning takes a similar approach, training language models on large amounts of data. These models consist of a complex neural net, a collection of artificial neurons cleverly connected to each other, and these connections have associated weights, which alter the signals moving through the system. When properly trained, the neural net will predict the probability of the next word in a sequence being translated from English to French.
While we typically cannot understand the underlying process of a trained neural net any more than Sophie understands her complete translation process, we can easily simulate that process to get the probability of the next word. If the neural net is trained perfectly, it will be impossible to distinguish the probability of the next word it predicts from the probability of what Sophie would say. Just as Wigderson connected complex functions and pseudorandomness, predicting the probabilities of the next word lets us capture the complex calculations behind it.
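What "getting the probability of the next word" looks like in practice can be sketched in a few lines. A common final step, assumed here, is to convert the raw scores a network assigns to each candidate word into probabilities with the softmax function. The candidate words and their scores below are invented for illustration; a real model would compute scores for tens of thousands of candidates.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {word: math.exp(s - peak) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}


# Hypothetical scores for the next French word after "Le",
# when translating the English text "The cat".
scores = {"chat": 3.0, "chien": 1.0, "maison": 0.0}

for word, p in softmax(scores).items():
    print(f"{word}: {p:.3f}")
```

The highest-scoring word gets the largest share of the probability, but the alternatives keep a nonzero chance. This distribution is what would be compared against what Sophie would say.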
English and French are themselves “random” samples from a concept known as human language, and the newest tools have discovered enough of this underlying structure to allow reasonable translation even between relatively obscure languages.
The learning algorithms themselves are processes, and I feel the weights updating as we feed in more and more examples to train the models. The advances we have seen in machine learning over the past decades have helped us carry out complex processes, from human ones like translation, vision, art, and conversation, to biological ones like protein folding.
Machine learning models are still prone to mistakes and misinformation, and they still have trouble with basic reasoning tasks. Nevertheless, we’ve entered an era where we can use computation itself to help us manage the randomness that arises from complex systems.
I’ve been very lucky. I could build a research career around the machines that encompass the way I feel the world. I have found my calling—or, more precisely, it has found me. Whether you hear the music, the algebra, computation, biology, magic, art, or some other way of understanding the world, listen to it. Who knows what secrets you may learn?
Original story reprinted with permission from Quanta Magazine, an editorially independent publication of the Simons Foundation whose mission is to enhance public understanding of science by covering research developments and trends in mathematics and the physical and life sciences.