Editorial: Examining our assumptions: lies, damned lies, and statistics

by Geoff Hart

Previously published as: Hart, G. 2007. Editorial: Examining our assumptions: lies, damned lies, and statistics. the Exchange 14(4):2, 9–11.

Every so often, it's worth our while to take a large step back from what we're doing and ask a few important questions. To me, the most difficult one to answer requires an unusual form of introspection. Ask yourself the following question: To someone watching us "from the outside", without any stake in what we're doing, what assumptions do we appear to be making? From there, equally important and equally challenging follow-up questions arise.

If I tell you that these questions originate in the field of cultural studies, you might be tempted to stop reading right now, but stick with me a little longer. Among other things, cultural studies research is based on the assumption that our thoughts and actions are strongly shaped by the context (the culture) in which we create, interpret, and build upon those thoughts. Academics with deep roots in cultural studies and related fields such as social construction are frequently mocked by science communicators because some of them—oblivious to the irony of their position—appear to consider themselves somehow above the constraints of social construction and thus free to remain ignorant of their own assumptions when they critique our work. Most have a more balanced view, and if you want to find out more about why I believe we have much to learn from them, check out my recent article on the subject (Hart 2007). In the context of this editorial, the important point is that we science folk also subscribe to certain constraining assumptions that are, perhaps, equally worthy of mockery and re-examination.

Modern science represents the triumph of mathematics and abstract reasoning over straightforward observation. Whereas the naturalists of Darwin's day could still achieve fame and shape the course of their field solely by observing nature and reporting what they saw, the modern scientist will have a hard time presenting their data in a peer-reviewed journal, at a conference, or in any other respectable forum unless they can support their observations with a body of hard data that is both large and robust, supplemented by rigorous statistical testing of the probability that each experimental outcome is real rather than merely an unfortunate coincidence. On the whole, this change has been a good thing. Among other things, it has replaced "I think" with "I have reason to think", and has forced us to develop new ways to test whether what we think we have found is what we really found.

It's worth interrupting with a brief historical footnote: even in Darwin's time, observation alone was not sufficient. Researchers such as Gregor Mendel, famed as the father of genetics, relied heavily on large quantities of accumulated data on which to base their hypotheses and theories. The difference between Mendel's time and our own lies in the development of a science of statistics. Not only do we collect data, but we also collect data about that data. The power of statistics is that it tells us not just what we think, but also how strongly we have reason to believe what we're thinking. Statistics, according to the prevailing modern dogma, makes knowledge and our confidence about the reality of that knowledge objective.

Or so we tend to assume. In fact, it would be more accurate to say that statistics makes us more objective, or perhaps less subjective. For example, most statistical tests are considered to produce significant (i.e., real) results at a probability level of 5%; that is, statisticians have made the arbitrary and entirely subjective decision that if an experimental result could have occurred by chance only 5% of the time, we can be reasonably confident that the result is real. But why not 10% or 1%? A 5% threshold carries half the risk of accepting a chance result that a 10% threshold does, but five times the risk that a 1% threshold does. Even the reliance on a percentage scale is a subjective and unfounded assumption: surely a scale of 1 to 1000 would be even better because of the higher precision it affords? Clearly, the importance of statistics is not, as some seem to think, that it provides certainty. Rather, its importance lies in its ability to provide a consistent, standardized, mathematically rigorous measure of our uncertainty.
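That trade-off can be made concrete with a quick simulation. The following sketch (my own illustrative Python, not part of the original editorial) runs many coin-flip "experiments" in which nothing real is happening at all, and counts how often each candidate significance threshold nonetheless declares a chance result "real". The false-positive rate simply tracks whichever threshold we subjectively chose.

```python
import math
import random

random.seed(42)

def p_value(heads, n):
    """Two-sided p-value for observing `heads` heads in `n` fair-coin
    flips, using the normal approximation to the binomial distribution."""
    z = abs(heads - n / 2) / math.sqrt(n / 4)
    return math.erfc(z / math.sqrt(2))  # two-sided normal tail probability

# Simulate many experiments where the null hypothesis is TRUE (a fair coin),
# then count how often each threshold flags a purely random result as "real".
trials, n = 20_000, 100
pvals = [p_value(sum(random.random() < 0.5 for _ in range(n)), n)
         for _ in range(trials)]

for alpha in (0.10, 0.05, 0.01):
    false_positives = sum(p < alpha for p in pvals) / trials
    # Each fraction lands near its alpha (give or take sampling error and
    # the discreteness of coin counts): the "risk of being fooled" is
    # exactly the level we chose, not something nature handed us.
    print(f"alpha = {alpha:4.2f}: chance results called real = {false_positives:.3f}")
```

None of the three thresholds is more "objective" than the others; the simulation merely shows that whichever one we pick, we have agreed in advance to be fooled that often.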

There's also a cliché that science is an objective process and that scientists are more objective than non-scientists. Though this cliché contains much truth, it's also a myth we have created about our profession. Consider even the limited example of statistics, and you'll see both the assumption and how over-reliance on that assumption can lead us astray. The assumption is that because the tools of mathematics allow for no subjectivity, statistics eliminates subjectivity: a number is a number, independent of the observer, and the same set of data will produce the same statistical results no matter who analyzes the data. This is true as far as it goes, and it's particularly true for the clean data generated by a rigorously designed experiment.

The problem is that most data are not clean. Results that are perfect or "too good to believe" are a common red flag that suggests not everything is kosher; for example, a graph with a series of data points that fall almost exactly along the curve described by a mathematical equation is unlikely to represent real data outside of the relatively predictable worlds of physics and chemistry. Outside the lab, human error, random variations in the environment in which data are collected, errors inherent in the measurement devices and the process by which we use them, and factors unaccounted for in designing an equation (such as neglecting to include certain key parameters) introduce variation into our measurements and predictions. The better the researcher and the equipment, the better our knowledge of the physical processes, and the more carefully controlled the experiment, the smaller this variation will be; but particularly in early experiments—those first tentative explorations of a dark area as yet unilluminated by knowledge—some variation is inevitable. Indeed, in most graphs of experimental results, you'll actually see a cloud of data, concentrated to a greater or lesser degree around a trend line that nominally describes (in objective mathematical form) the process that was observed. Equally often, you'll see an occasional data point that lies far outside the main cloud. These points are called outliers, though sometimes outright errors might be a more honest description.

In messy sciences such as biology, where seemingly infinite factors in the environment of the phenomenon being studied can vary, thereby influencing the results, researchers tend to ignore outliers on the assumption that they represent random errors, usually because some uncontrolled aspect of the environment exerted a stronger than usual influence on the system being observed. Because that influence is unusual (witness its absence in the cloud of points lying nearer to the trend line), they assume that it can be safely ignored. Often this is a reasonable assumption, and the correct one, but it's no less an assumption despite its reasonableness, and because it's an assumption, it bears further examination.
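To see that assumption operating in miniature, consider the hypothetical Python sketch below (again my own illustration, not from the original essay). It fits a trend line to data and flags any point whose residual exceeds a conventional cutoff of three standard deviations. Both the least-squares fit and the three-sigma rule are conventions, not laws of nature, and a flagged point deserves an attempted explanation before it is quietly discarded.

```python
import statistics

def fit_line(xs, ys):
    """Ordinary least-squares fit of the line y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

def flag_outliers(xs, ys, k=3.0):
    """Return indices of points whose residual from the trend line exceeds
    k standard deviations. The 3-sigma cutoff is a common rule of thumb,
    i.e., a convention chosen by people rather than dictated by the data."""
    a, b = fit_line(xs, ys)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = statistics.stdev(residuals)
    return [i for i, r in enumerate(residuals) if abs(r) > k * sd]

# A tidy trend with one point lying far outside the cloud:
xs = list(range(30))
ys = [2 * x + 1 for x in xs]
ys[15] += 40  # the "outlier" — error, or an unaccounted-for factor?
print(flag_outliers(xs, ys))  # flags index 15
```

Note that the code only identifies the point; whether it represents a random error or a genuinely interesting phenomenon is a judgment the statistics cannot make for us.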

Careful scientists recognize this, and most concede that before one can ignore an outlier, it's necessary to at least attempt a reasonable explanation. After all, the factors unaccounted for in the equation bear watching because they can sometimes surprise us. Physics, in marked contrast with biology, is considered a clean science because it's so much easier to control the external variables than it is in messy biology. Because of this degree of control, physicists know that the part of the universe they're observing usually functions with the smooth precision of a clock, and that any deviations from that smooth function have a reason—and possibly an important one. In physics, researchers are more likely to recognize that outliers cannot simply be ignored, nor can they simply be "explained away". On the contrary, the explanation itself becomes an important hypothesis that must be tested in the hope that the results will reveal the cause of the discrepancy. Those results can sometimes lead to important findings that a less rigorous thinker might miss. For example, astrophysicists long believed that they had nailed down the basic workings of galaxies, until a few troubling "outliers" led to the discovery that galaxies are accelerating away from each other faster than originally predicted (http://en.wikipedia.org/wiki/Cosmic_acceleration), and that the vast majority of the mass in the universe has not yet been accounted for (http://en.wikipedia.org/wiki/Dark_matter).

Please note that I'm not suggesting any inherent superiority of physics over biology. As a former biologist, I clearly understand that the two fields of research operate under very different constraints, and that those constraints lead to very different working assumptions. Indeed, the physicist's belief in a clockwork universe is also an assumption—though by the evidence collected thus far, it's a darned good one. I'm using this comparison solely to help focus on the differences in the working assumptions and what those assumptions mean for the advancement of knowledge.

How does all this relate to scientific communication? For one thing, working with scientists all the time may lead us to assume that it is sufficient for us to honor the conventions of science—hard data, logic, and an attempt to eliminate subjectivity from our conclusions and any decisions we make based on those conclusions. It's not. Except in the limiting case in which we are writing for scientists, this assumption leads us astray because we forget how different our real audience may be from our scientist colleagues. Arguments based solely on numbers and rigorous logic have great power, but cannot by themselves persuade audiences who don't share our assumptions about that power. Without understanding the emotional, cultural, and historical contexts of our non-scientist audiences, we risk a disastrous communication failure.

A recent example makes this clear. When it was announced that scientists at the Relativistic Heavy Ion Collider (RHIC) of the Brookhaven National Laboratory expected that some of their experiments would create microscopic black holes (www.bnl.gov/rhic/black_holes.htm), they recognized that this would raise some concerns. If, as the popular stereotype suggests, black holes will devour everything in their vicinity, wouldn't those black holes then devour the accelerator, the scientists, the lab, and shortly thereafter, the entire planet and everyone on it? "No," the scientists blithely replied. "There's only a very small chance of that happening."

I'll pause here for a moment while you digest that "reassuring" statement.

What the scientists meant, of course, was that the laws of physics make it inevitable that such small black holes will "evaporate" long before they attain enough mass to become self-sustaining. Thus, based on their confidence that they understand the laws of physics at that scale, they felt there was little fear that any other process could intervene and lead to an insoluble and quite fatal problem for our world. Of course, as I've noted above, there are always outliers in the data, and in physics, those outliers represent opportunities to improve our understanding of the universe by revealing new phenomena. Being good physicists, they felt it was important to explicitly acknowledge this possibility—the small chance that they were wrong and that something... interesting... might happen. That the consequences of that something would be unthinkable to anyone but a physicist overjoyed at the prospect of new horizons of discovery appears to have escaped them.

As scientific communicators, it's exactly this kind of problem we must be aware of and must explicitly confront. In any effort to communicate, we must remain very aware of our assumptions and the possibility that those assumptions may lead us astray—possibly not to the extent that our world is devoured by a black hole, but occasionally to the extent that the audience reaction will prove to be an unpleasant surprise, as the RHIC physicists discovered.

References

Hart, G. 2007. Bridging the gap between cultural studies theory and the world of the working practitioner. KnowGenesis International Journal for Technical Communication 2(3):14–31.


My essays on scientific communication have now been collected in the following book:

Hart, G. 2011. Exchanges: 10 years of essays on scientific communication. Diaskeuasis Publishing, Pointe-Claire, Que. Printed version, 242 p.; eBook in PDF format, 327 p.


©2004–2017 Geoffrey Hart. All rights reserved