The Case for Free-Range Lab Mice

A growing body of research suggests that the unnatural lives of laboratory animals can undermine science.
Illustration by Laurie Rowan

The experiment that became known as the Elephant Man trial began one spring morning, in 2006, when clinicians at London’s Northwick Park Hospital infused six healthy young men with an experimental drug. Developers hoped to market TGN-1412, a genetically engineered monoclonal antibody, as a treatment for lymphocytic leukemia and rheumatoid arthritis. In just over an hour, however, the men grew restless. “They began tearing their shirts off complaining of fever,” one trial participant, who received a placebo, told a London tabloid. “Some screamed out that their heads were going to explode. After that they started fainting, vomiting and writhing around in their beds.” The heads of some of the subjects swelled to elephantine proportions. Within sixteen hours, all six were in the intensive-care unit suffering from multiple organ failure. They had narrowly survived a potentially fatal inflammatory response known as a cytokine storm.

The trial grabbed headlines and sent a “shock wave” through the scientific community, as one of the developers of the drug later wrote. A subsequent review found a few sloppy medical records and an underqualified physician associated with the study, but nothing that could explain a central mystery: the drug had already been tested on rodents and monkeys. Lab animals had tolerated doses that—after adjusting for the animals’ weights—were five hundred times greater than the ones that nearly killed the young men. Why did animal experiments fail to warn scientists that TGN-1412 was dangerous?

Because so many of our genes are shared with other vertebrates, scientists have generally assumed that whatever harms lab animals is likely to harm humans, too. The Food and Drug Administration requires preclinical tests, traditionally on two species of non-human animals, before drugs can be tested on people. Yet a 2014 analysis of more than two thousand drugs found that animal tests were “highly inconsistent” predictors of toxic responses in humans and “little better than what would result merely by chance.” More than eighty per cent of novel drugs fail in Phase I and Phase II trials—when they’re first tried in healthy volunteers and patients—and others fail in Phase III, the large-scale efficacy trials; as of 2009, these unsuccessful human trials were consuming seventy-five per cent of drug research-and-development costs. Fifteen per cent of drugs, including blockbuster remedies for conditions such as depression and arthritis, turn out to have dangerous toxicities even after they’re approved by the F.D.A.

When lab-animal studies fail to predict human responses, scientists typically scrutinize them for mistakes (maybe lab workers contaminated cell lines; perhaps they failed to authenticate reagents) or blame the differences between species. “A mouse is not a person” has become a running joke. The problems with animal experimentation, however, go deeper than that: some studies of standardized lab animals can’t even be replicated on identically standardized lab animals. In 2012, a Nature paper revealed that scientists at Amgen, a multibillion-dollar biotech company, had spent a decade trying to repeat landmark animal studies and had succeeded only eleven per cent of the time. The following year, at an N.I.H. review board meeting, Elias Zerhouni, a pharmaceutical executive who had directed the N.I.H. during the Bush Administration, likened science’s reliance on lab-animal research to a mass hallucination. “We all drank the Kool-Aid on that one, me included,” he said. “It’s time we stopped dancing around the problem.” (Later, after an outcry from advocates of the biomedical-research industry, Zerhouni walked back his comments.)

The global animal-testing industry is worth billions of dollars and counting. Scientists experiment on some hundred and twenty million lab mice and rats per year. But, as the industry continues to grow, problematic results continue to emerge. Last May, European scientists reported in the journal PLOS Biology that they had conducted an identical experiment on identical mice in three separate labs. They found that the mice behaved differently in each setting, a result that they could only attribute to Rumsfeldian “interactions between known but also unknown factors we are not even aware of.” Can animal experiments still be trusted?

Scientists have been experimenting on animals for centuries to solve anatomical and physiological mysteries. In the twentieth century, researchers used animals to calibrate therapeutic doses: one “rabbit unit,” for example, was the amount of insulin required to produce convulsions in a rabbit. However, animals from the same species varied in their responses to drugs, in part because scientists acquired them from pet breeders and hobbyists. One study in the forties found that a batch of diphtheria antitoxin protected some guinea pigs from the disease, but not others, depending on whether they’d been reared on green vegetables or beets. The British Medical Journal published an article with the title “Wanted—standard guinea-pigs.”

Many mid-century scientists viewed lab animals as lower creatures, even automatons; some hoped to breed them into “pure” and “uniform” animals, as the geneticist Clarence Cook Little put it during a congressional hearing in 1937. They assumed that variation between animals was determined by genes and germs, so they bred mouse siblings with one another, shielded the mouse offspring from a range of microbes, and then repeated the process for many generations. (James A. Reyniers, who was later nominated for the Nobel Prize in Physiology or Medicine, went so far as to surgically remove animals from the wombs of their mothers and rear them in airtight steel chambers; in 1949, Life published photographs of monkeys in his lab and declared, “The research possibilities are virtually limitless.”)

Commercial suppliers marketed lab animals to all manner of scientists—geneticists, immunologists, neuroscientists, oncologists—in thick catalogues that described their technical specifications as though they were test tubes or Bunsen burners. Standards for the certification and transportation of lab animals were codified by UNESCO. Experiments on standardized lab animals spread across the globe and led to new insights into human biology, accelerated the development of breakthrough medical products such as vaccines and cancer drugs, and earned lab-animal researchers dozens of Nobel Prizes.

Animal experiments rested on the notion that humans and other mammals are kindred creatures, but for many scientists that kinship was solely physical, not mental. They tended to dismiss the idea that animals have minds and emotions that are comparable to our own, as Charles Darwin argued in the nineteenth century, or that “each and every living thing is a subject that lives in its own world,” as the Estonian biologist Jakob Johann von Uexküll wrote, in 1934. Such beliefs were even caricatured as symptomatic of “zoophil-psychosis,” a supposed psychiatric condition defined in 1909 as “an inordinate and exaggerated sympathy for the lower animals” and the “delusion that they are persecuted by man.”

This may be why puzzling irregularities in early studies did not prevent lab-animal experiments from becoming an industry standard. A 1954 Nature paper, for example, reported that when scientists injected inbred mice with sedatives, the animals took wildly different times to fall into a stupor, whereas hybrid mice reacted to the drugs within a more predictable window of time. Just because two mice have near-identical genes does not mean that they will develop the same physical traits, the authors wrote; they may even be “strikingly more variable” than genetically diverse mice. That same year, another paper reported that lab animals with nearly indistinguishable genes had dramatically different skeletal structures—a finding that the British geneticist Hans Grüneberg vaguely blamed on “intangible factors” and “accidents of development.” But as long as animal studies were unlocking new biomedical insights and therapies, there were few incentives to contemplate the lives of lab mice.

The idiosyncrasies of lab animals garnered new attention after the explosive Amgen paper in Nature, in 2012. In a wave of subsequent papers, other scientists described failures to reproduce published research in medicine, psychology, and many other fields. In 2014, as concern about a “replication crisis” grew, a cover story in the medical journal The BMJ declared animal research a “shaky basis for predicting human benefits.” A growing body of evidence was suggesting that a variety of subtle, uncontrolled factors affected lab animals’ bodies and behaviors.

Rodents respond differently to experimental drugs depending on the levels of phytoestrogens in their chow—levels that can vary between different batches from the same vender. Their microbiomes, which contribute to their immune function, vary from vender to vender and from lab to lab. Many lab mice today come from an inbred strain known as C57BL/6, or Black 6, which originated with a pair who were mated in the nineteen-tens or twenties. Yet “there is no such thing as a Black 6 mouse,” Joseph Garner, a professor of comparative medicine at Stanford, argued recently when we spoke via Zoom. “There’s the Black 6 mouse in my lab, on my diet, in my cages, with my noise exposure, my light exposure, and my technician. And literally in the lab down the hall, the Black 6 mouse is different.” The dream of scientists like Little, of animals that had completely lost their individuality, never came true.

Standardized laboratory conditions turn out to affect the animals that scientists are trying to study, potentially distorting the results. According to a recent meta-analysis co-authored by Georgia Mason, the director of the Campbell Centre for the Study of Animal Welfare, at the University of Guelph, who mentored Garner, the standard lab-mouse cage—a plastic container the size of a shoebox—sickens its inhabitants and increases their risk of death. These cages can make their inhabitants cognitively pessimistic, mess up their sleep, and reduce their physiological resilience, compared with rodents who are given the opportunity to burrow, explore, and exercise. Researchers have also found that mice experience a spike in stress hormones when their cages are moved, and their behavior can change depending on the height at which their cages are stacked. The ambient temperature in lab-animal facilities, though comfortable for humans, inflicts chronic thermal stress on rodents; Cindy Buckmaster, a former director of the Center for Comparative Medicine at Baylor College of Medicine, compared their experience to that of a human unclothed in forty-five-degree-Fahrenheit weather. Imagine a study in which subjects are chronically cold, sleep-deprived, inbred, and held captive in cramped conditions. If the subjects were human, the scientific establishment would dismiss such a study as not only unethical but also irrelevant to normal human biology. Yet, if the subjects were non-human, the study could be treated as perfectly valid.

Jeffrey Mogil is a neuroscientist at McGill University who studies pain perception. In 2010, he and his collaborators filmed mice before and after they received shots of pain-inducing acetic acid. They used the footage to develop a “Mouse Grimace Scale,” which uses mouse facial expressions to measure their level of pain. Then, in 2014, one of his postdocs told him about a strange occurrence in the lab. The postdoc had administered a pain-inducing chemical to lab mice, but the mice had failed to lick themselves in response. Then he turned his back to depart, and they started licking. “They were just waiting for me to leave the room,” he told Mogil.

The mice’s pain response, Mogil said, seemed to be more than a mindless reflex: they seemed to adjust it in response to a human’s presence. “People at meetings for a number of years had sort of whispered about this,” Mogil told me. In a series of subsequent experiments, his team observed fewer “pain behaviors” when a man—or even a T-shirt that a man had worn—was nearby. An editorial accompanying these findings noted their “extremely wide-ranging implications for physiological and behavioral research.” When Mogil went back and analyzed his past work, he found that in all his experiments, the mice had shown a higher threshold for pain when handled by male researchers. If the same pattern holds in other labs, animal studies of painkillers or drugs with painful side effects could contain systematic errors, simply because of the makeup of a laboratory’s staff.

We will probably never know all of the factors that influence the lives of lab rodents. Some rodents can sense magnetic objects and ultrasounds that go undetected by humans. Even animals that have lived indoors for hundreds of generations, under artificial light and in hermetically sealed cages, notice the seasons and adjust their behavior accordingly. “Are they detecting odors that come in through the outside air, and they can, like, smell the smell of new leaves?” Mason asked. “Is it something to do with sunspots? I’ve got no idea.” Animal-welfare specialists such as Mason and Garner argue that scientists should worry about these hidden variables not only because they could influence research but also because of their ethical implications. If even an inbred and microbially sterile lab mouse, living in isolation, has subjective experiences that go beyond human detection—something that Darwin and von Uexküll argued long ago—then perhaps we need to treat the lab animal as more than biological machinery. Perhaps we should see her as a sentient being entitled to our moral attention.

There’s no straightforward way to purge the literature of the innumerable lab-animal studies that in retrospect seem suspect. “It’s like the Titanic,” a retired pharmaceutical-industry insider, who requested anonymity to speak openly about problems in lab-animal research, said. “We found it, but getting it up from the bottom of the ocean is gonna be impossible.” Flawed studies may have smothered life-saving insights and interventions, because animal experiments rendered a misleading negative result. Others have likely sent researchers down scientific dead ends. “Young researchers could base an entire career on that path and then find out much, much later, ‘Well, wait a minute, I can’t replicate this,’ ” Buckmaster told me. “What if your whole career was based on what was found previously, and you’ve got hundreds of papers? Do you have to retract those papers? Do you lose your job? Do you lose your reputation and standing?”

Still, some critics of animal experimentation have difficulty imagining a future without any lab-animal experiments. When scientists try to understand complex biological systems by simplifying them—an approach known as reductionism—an experiment on standardized lab animals can sometimes work “beautifully,” Mason told me. Mice share a large majority of their genes with humans and suffer many of the same illnesses, and they can be bred cheaply and have a conveniently compressed lifespan. It’s true that many lab-animal experiments render indeterminate results, Buckmaster told me. But “there’s this sort of small percentage of really translatable stuff happening,” she said, which is why “millions and millions of people are still alive and healthy.” She argued that incremental refinements in study design and animal husbandry could boost that percentage, and also improve animal welfare.

Over time, scientists have adopted expensive and complicated safeguards to separate fact from fiction: increased sample sizes, elaborate statistical techniques, arduous peer review. Papers that survive this gantlet are subject to follow-up studies in other labs, using different protocols and different species. At the same time, government agencies have acknowledged the need to reduce reliance on animal experiments. The F.D.A. has set aside five million dollars in funding aimed at developing reliable alternatives to animal trials, such as in-vitro experiments and “organs on a chip,” which are three-dimensional organoids grown from human stem cells. The N.I.H. encourages biomedical researchers to use fewer lab animals and minimize their suffering, if it’s possible to do so without impinging on scientific integrity—and also recommends that the rest of us temper any starry-eyed expectations of sweeping scientific progress. “All research is not expected to translate to human treatments, as there is no perfect model,” a working group that advises the N.I.H. on animal research declared in a 2021 report. “Scientific process is as much about failure as it is about success.”

A few scientists are experimenting with a more radical approach. In 2013, Stephan Rosshart, then a postdoc at the N.I.H., tried to find out why animal experiments had failed to warn scientists about TGN-1412, the experimental drug in the Elephant Man trial. He wondered whether sterile lab conditions had stunted the animals’ immune systems, so he developed a new kind of lab mouse by trapping wild female mice, implanting lab-mouse embryos into their wombs, and raising the offspring in a remote quarantine facility in Poolesville, Maryland. His superiors probably saw him as “one crazy postdoc,” Rosshart told me—but his mice, which he dubbed “wildlings” after independent tribes in “Game of Thrones,” proved him right. When he exposed them to TGN-1412, the level of cytokines in their blood spiked, just as it had in the unfortunate volunteers at Northwick Park Hospital. His wildlings also correctly predicted the human response to an experimental sepsis treatment, which had appeared effective in standard laboratory mice but had failed when tested on sepsis patients. If researchers had these mice, they “could have potentially prevented these trials from happening,” Rosshart, who now directs the department of microbiome research at the University of Freiburg Medical Center, in Germany, said at a recent conference.

Early successes like Rosshart’s suggest a striking possibility—that in some cases, scientists could learn more about human biology from animals that live less sterile and more natural lives. A handful of scientists have co-housed lab mice with pet-store mice, or have given them fecal transplants from wild mice to naturalize their immune systems. At the University of Utah, the biologist Wayne Potts has released some of his lab mice into barn-like structures, where they can socialize and mate; so far, his cage-free mice have accurately predicted the health effects of high-fructose corn syrup, the statin Baycol, and the antidepressant Paxil. (They failed to predict the side effects of a discontinued arthritis drug, Vioxx.)

Neither Rosshart nor Potts is calling for an end to traditional experiments on caged lab animals, which they still consider useful for reductionist research. Still, their vision of the uncaged lab animal acknowledges that lab animals have many of the same needs as humans—which could undermine the usual justifications for standardizing their bodies and their lives in the first place. What if the best animal to model our shared biology is one that lives a relatively unconstrained life, like we do? Garet Lahvis, a neuroscientist who abandoned animal experiments a few years ago, has urged his former peers to move lab animals into “research barns,” and to treat them as sentient beings, not “psychologically inert automatons.” Mogil, the McGill neuroscientist, told me, “I do see a day where we might be able to do experiments without ever interrupting the normal or semi-normal social life of the animals we’re testing.” Such animals would suffer less; humans might learn more.

On a gold-tinted morning last July, I visited Andrea L. Graham, a Princeton evolutionary immunologist, at her field site in the hills of central New Jersey. We stood by a waist-high fence that encircled a clearing in the woods. Graham, who wore a floppy hat and purple examination gloves, told me about the mice that lived there. Just a few weeks earlier, their cages had been stacked in the basement of a biology building at Princeton. Graham had whisked them away in a white van and, in a process she describes as rewilding, released them into her grassy enclosure.

Graham’s team had risen at dawn to collect mice from traps inside the area. Afterward, Alec Downie, a gangly graduate student with a bushy red ponytail, sat at a folding table and inspected a small female mouse with soft, gray-black fur. “Yeah—this is non-standard procedure,” Downie told me.

As the mouse furiously sniffed the air, I thought about the smooth plastic walls within which she and hundreds of generations of her ancestors had spent their entire lives. In this clearing, she had smelled fresh air and felt rain on her fur for the first time.

Exactly how these experiences might have changed her was not immediately apparent. But Graham’s team would extract some of the mouse’s blood, test it in a field lab that they had set up in a nearby barn, and compare the results with the blood of laboratory mice that had never been outdoors. So far, Graham has found that even brief spells in natural settings transform the immune systems of lab mice, making them more like the immune systems of humans. The difference between a rewilded mouse and a lab mouse can be even more dramatic, she told me, than the difference between two lab mice from separate genetic lines. In a 2021 review paper, Graham called on her colleagues to venture beyond their immaculate labs and “go where the wild things are.”

When their field work wound down for the day, Graham’s team fetched cans of seltzer and retreated to folding chairs. As they relaxed, a sharp call emanated from the trees.

“That’s a weird noise,” Graham said quietly. “It sounds like a raptor.”

The group visibly tensed. A few put down their seltzers and craned at the foliage; one offered that the call might have come from a harmless blue jay. Graham, unconvinced, called out to any predators that might be nearby: “Go away!” But there’s only so much that can be done to protect a rewilded animal.

That day, despite the team’s best efforts, about a third of Graham’s free-range mice eluded recapture. Perhaps some were hiding in the leaf litter or had fallen victim to birds of prey. Later that week, the team would try to retrieve more. This was not the shining vision of limitless science that once appeared in Life magazine but a murky thicket in which animals themselves were helping to inscribe borders around what science can know. I imagined lab mice that had dug into the soft earth, tunnelled under the metal fenceposts, and slipped into the woods. They took their data with them, and were lost to science forever. ♦