The mummy returns

After 3,000 years, we can hear the “voice” of a mummified Egyptian priest

It's a single vowel sound, not a running string of speech. But it's a start.

The mummy of Nesyamun, a priest who lived in Thebes about 3,000 years ago, is ready for his CT scan.
Leeds Teaching Hospitals/Leeds Museums and Galleries

Around 1100 BC, during the reign of Ramses XI, an Egyptian scribe and priest named Nesyamun spent his life singing and chanting during liturgies at the Karnak temple in Thebes. As was the custom in those times, Nesyamun was mummified upon his death and sealed in a coffin bearing the inscription "Nesyamun, True of Voice (maat kheru)." His mummy has become one of the best-studied artifacts of the last 200 years. We know he suffered from gum disease, for instance, and may have died in his 50s from some kind of allergic reaction. The coffin inscription also expressed a desire that Nesyamun's soul would be able to speak to his gods from the afterlife.

And now, Nesyamun is getting his dearest wish. A team of scientists has reproduced the "sound" of the Egyptian priest's voice by creating a 3D-printed version of his vocal tract and connecting it to a loudspeaker. The researchers revealed all the gory details behind their project in a new paper in Scientific Reports.

"He had a desire that his voice would be everlasting," co-author David Howard of Royal Holloway University of London told IEEE Spectrum. "In a sense, you could argue we've heeded that call, which is a slightly strange thing, but there we are."

Studying vocal tracts is a very active area of research. For instance, voice scientist Ingo Titze of the University of Utah has experimented with the excised vocal tracts of lions and Siberian tigers. (The body parts were acquired from animals that died of natural causes at various zoos.) In one memorable 2006 study, Titze mounted an excised tiger larynx (which is three times the size of a human vocal tract) onto a lab bench for a series of experiments.

In 2016 Italian scientists reconstructed Ötzi the Iceman’s vocal tract.
Rolando Fustos

One experiment involved blowing air through the structure while taking CT scans. From that, Titze was able to build a computer model capable of simulating the four types of tiger vocalizations: the roar, growl, moan, and "prusten." Sure, the computerized vocalizations sounded more like a cow experiencing extreme gastric distress, but such studies still yield insight into the intricacies of how vocal tracts function across different species.

Multiple studies have placed human singers in MRI scanners to monitor the mechanics at play, including work showcasing polyphonic singing and how the larynx changes to produce different singing styles. For instance, in 2016, German baritone Michael Volle performed "Song to the Evening Star" from Wagner’s Tannhäuser during an MRI scan. More relevant to the current paper: also in 2016, a team of Italian researchers reconstructed Ötzi the Iceman's vocal tract and used it to reproduce what his voice may have sounded like. (He mostly sounded like he was burping.) Many prior efforts have tried to recreate the voice of an ancient person by using software to animate a reconstructed image of the person's face, yielding a good approximation of what they might have sounded like.

But according to Howard, no one alive today has heard human speech from before the earliest audio recordings, which date to the mid- to late 19th century. That's one reason he and his colleagues chose Nesyamun as their subject, encouraged further by the priest's clear desire to have his voice live on. "Given Nesyamun's stated desire to have his voice heard in the afterlife in order to live forever, the fulfillment of his beliefs through the synthesis of his vocal function allow us to make direct contact with ancient Egypt by listening to a sound from a vocal tract that has not been heard for over 3,000 years," the authors wrote.

Human beings produce sound via the vocal cords, or folds. Air from the lungs passes through, and the folds vibrate to produce sounds that are subsequently modified by the shape of the vocal tract. The positions of the lips and tongue, and the soft palate, also influence what kinds of sound are produced.

This latest work is based in part on Howard's development of a "vocal tract organ" in 2013, a device that plays vowel sounds through a 3D-printed replica of a larynx. It caught the attention of an archaeologist at the University of York, co-author John Schofield, and an interdisciplinary project was born. Nesyamun's mummy was the perfect choice because it was remarkably well-preserved, and the shape of his vocal tract was largely intact, although the actual tissue had dried up.

First, Howard et al. took CT scans of the mummy and used those images to create a digital model, which dictated the shape of the 3D-printed vocal tract. Then they synthesized an input signal, relying on modern speech synthesis, and played it through a loudspeaker into the artificial larynx. There were some additional touches, such as adding a coupling cylinder to connect the end of the larynx with the loudspeaker and tailoring the resulting frequency range to something akin to a male voice's falling intonation.
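For readers who want to play with the principle rather than the priest, the same source-and-filter idea can be sketched in software. The Python below is an illustration only, not the method used by Howard et al., who played their signal through a physical 3D-printed tract. It builds a buzzy pulse train with a falling pitch, then passes it through a chain of simple two-pole resonators standing in for the resonances (formants) of a vocal tract. The function names and the values in HYPOTHETICAL_FORMANTS are assumptions picked to land roughly between the vowels in "bed" and "bad"; they are not measurements from the paper.

```python
import math

# Assumed (center frequency Hz, bandwidth Hz) pairs for illustration only.
# These are NOT values from the Nesyamun study.
HYPOTHETICAL_FORMANTS = [(600.0, 80.0), (1780.0, 100.0), (2450.0, 120.0)]


def glottal_source(duration_s, sample_rate, f0_start, f0_end):
    """Return an impulse train whose pitch glides linearly from f0_start to f0_end."""
    n = int(duration_s * sample_rate)
    samples = [0.0] * n
    phase = 0.0
    for i in range(n):
        # Instantaneous pitch at this sample (falling intonation if f0_end < f0_start).
        f0 = f0_start + (f0_end - f0_start) * i / max(n - 1, 1)
        phase += f0 / sample_rate
        if phase >= 1.0:
            # One full pitch period has elapsed: emit a pulse.
            phase -= 1.0
            samples[i] = 1.0
    return samples


def resonator(signal, sample_rate, freq_hz, bandwidth_hz):
    """Filter the signal through a two-pole resonator modeling one formant."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, always < 1
    theta = 2.0 * math.pi * freq_hz / sample_rate        # pole angle
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    gain = 1.0 - a1 - a2  # scales the filter so its gain at 0 Hz is 1
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = gain * x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(formants, duration_s=0.5, sample_rate=16000,
                     f0_start=120.0, f0_end=90.0):
    """Run the pulse train through each formant resonator, then normalize to +/-1."""
    signal = glottal_source(duration_s, sample_rate, f0_start, f0_end)
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, sample_rate, freq_hz, bandwidth_hz)
    peak = max(abs(s) for s in signal) or 1.0  # avoid dividing by zero on silence
    return [s / peak for s in signal]


samples = synthesize_vowel(HYPOTHETICAL_FORMANTS)
```

The resulting list of samples could be written to a WAV file with Python's standard `wave` module to hear the result. Swapping in different formant values changes the vowel, which is the software counterpart of reshaping the printed tract.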

In this way, Howard et al. were able to reproduce one sound in particular, falling somewhere between the vowel sounds in the English words "bed" and "bad." (The BBC likened the sound to a sheep's bleat.) It sounds a bit buzzy, given that the vocal tract is made of plastic, but Howard maintains that this is nonetheless very close to how the priest would have uttered those same vowel sounds, citing prior studies using 3D vocal tracts of living people. "I can recreate my vocal tract and then you can hear it next to me and tell if it's similar or not," he told IEEE Spectrum. "The answer is: It is. We are using that fact to transpose this back 3,000 years and say we have something like Nesyamun would have sounded."

This is certainly very cool research, tailor-made to generate irresistibly clickable fodder for our Internet age. But a few caveats are in order. Howard acknowledges, for instance, that their 3D model is based on the priest's vocal tract shape while lying down, not standing in the temple chanting or singing. The neck is tilted backward, the tongue is positioned on top of the lower teeth, and there isn't any actual air passing through (unlike Titze's experiments with the Siberian tiger larynx). And even though Nesyamun's vocal tract was very well preserved, he had no soft palate, and the tongue had atrophied.

"Overall, I think it was a well-made study," Daniel Aalto of the University of Alberta—who was not involved in the study—told Scientific American. "[But] what makes a voice recognizable in humans—and what creates our unique voice—is not only the vocal tract shape but how we are using our vocal cords." The audio in the current study was created via a mechanical source, not living vocal cords. Thus, it's still only an approximation of Nesyamun's voice.

“Even if we have the precise 3-D-geometric description of the voice system of the mummy, we would not be able to rebuild precisely his original voice,” speech scientist Piero Cosi, of the Institute of Cognitive Sciences and Technologies in Italy, told The New York Times. Cosi was a member of the team that reproduced the "voice" of Ötzi the Iceman.

The next step in the project is to replicate the sound of Nesyamun chanting, which would require precise computational manipulation to change the shape of the vocal tract as needed—possibly even having him "speak" the original words inscribed on the priest's coffin. “He certainly can’t speak at the moment,” Howard acknowledged to the Times. “But I think it’s perfectly plausible to suggest that one day it will be possible to produce words that are as close as we can make them to what he would have sounded like.”

DOI: Scientific Reports, 2020. 10.1038/s41598-019-56316-y.