Sense & Falsifiability

This post is my attempt to make sense of the concept of falsifiability as concretely as possible, and to determine the merit of different types of unfalsifiability accusations against evolutionary psychology.

Acknowledgements: At one point, the Buss lab was trying to publish a paper responding to critiques of evolutionary psychological hypotheses as unfalsifiable. Though we ultimately decided to give up on publishing the paper, I became quite attached to my sections, so I decided to publish them here. Many thanks to David Buss, Courtney Crosby, and Patrick Durkee for all the work they put into this paper and their edits to the sections included here.

Criteria for Falsifiability

In the simplest sense, a statement is falsifiable if and only if another statement can logically contradict it. In a scientific context, one way to frame this criterion is that any given hypothesis must have negative predictions as well as positive: some things must not be the case if the hypothesis is true. Philosopher of science Karl Popper, popularizer of the falsifiability criterion (Popper, 1959), referred to hypotheses with this characteristic as prohibitive. But in order to explain how to evaluate whether a hypothesis is prohibitive, I need to clear up a poorly understood concept: the distinction between predictions and hypotheses.

Despite the term’s inconsistent use in practice, most science textbooks recognize a hypothesis as a tentative explanation for an observed phenomenon (Gibbs & Lawson, 1992). In other words, it is a description of a potential causal mechanism. A quick way to check if this definition applies is to try to formulate the hypothesis with some version of “because.” For example, take this Lamarckian hypothesis:

[H1] Species change over time because traits acquired during an individual organism’s lifetime are passed on to their offspring.

“Species change over time” is an observed phenomenon, for which “traits acquired during an individual organism’s lifetime are passed on to their offspring” is offered as a potential causal explanation.

Based on the simple definition of falsifiability offered above, we run into a problem when evaluating certain hypotheses: weak or underspecified hypotheses are not easy to logically oppose with concrete observations in the form of basic statements. For instance, the hypothesis of inheritance of acquired characteristics formulated in H1 says only, “traits acquired during an individual organism’s lifetime”; it doesn’t specify how the traits may be acquired. Thus, H1 would not be directly contradicted by observing that, for instance, experimentally manipulated traits aren’t passed on to offspring. The Lamarckian could argue that traits will only be passed on if they occur as the result of endogenous changes, in which case the failed results of the experiment could be dismissed. It is easy to see how the goalposts could be continually shifted to accommodate such an imprecise hypothesis.

But what is it about a vague, imprecisely framed hypothesis that makes it difficult to refute directly with evidence? The relevant feature is that its imprecision doesn’t allow it to logically entail predictions that can be contradicted. While a hypothesis is a tentative explanation, predictions ideally take the form of explicit, concrete statements about what would happen in a highly specific context (e.g., an experiment) if the hypothesis were correct. The distinction and relationship between hypothesis and prediction are crucial because a refuted prediction, by itself, can only tell us about observable facts. Counterintuitively, the strength of the scientific method does not lie in the accumulation of empirical observations. Its strength is in leveraging those controlled observations to inform causal explanations for broader observable phenomena: explanations that are, by their nature, unobservable on their own. If we could directly observe how everything in the universe worked, we wouldn’t need science.

So, hypotheses that make concrete predictions are the backbone of science. The hypothesis:

[H2] Species change over time because traits that are altered by exogenous influences during an individual organism’s lifetime are passed on to their offspring.

is more precise than the previous formulation, H1 — but its precision affects falsifiability in a specific way. The new formulation H2 now logically entails the prediction that, if a trait is influenced by the environment, the organism’s offspring will have the modified version of the trait. As a consequence, the formulation in H2 would be contradicted by an observation that an experimentally induced change was not passed on, since the hypothesis’ inherent prediction cannot logically coexist with the observation — and therefore with the hypothesis.

Alternatively, we could have made H1 more precise but in a different way:

[H3] Species change over time because traits that an organism develops spontaneously in response to novel environmental stimulus during their lifetime are passed on to their offspring.

But had we made this alteration, we would be faced with the verification problem that initially inspired Karl Popper to devise his falsifiability criterion. Popper illustrated the flaw in verificationism by pointing out that the statement “all swans are white” can never be empirically verified (though it can be falsified by finding one black swan). Similarly, we will never be able to say for certain that no organism can pass on such an acquired trait. As long as we never observe an event where an organism spontaneously acquires a trait in response to a novel environmental stimulus, we will never contradict H3. Thus, while H3 may sound more precise than H1, this change would not improve the falsifiability of the hypothesis because it does not add a prediction that can be negated by collectable evidence. Precision in a hypothesis is necessary — though not sufficient — to guarantee that empirical predictions will axiomatically follow from it.
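The asymmetry between verification and falsification described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name status and the idea of recording observations as a list of colour strings are hypothetical conveniences, not anything taken from Popper.

```python
def status(observed_colors):
    """Classify the claim 'all swans are white' against a finite list of observed swan colours."""
    if any(color != "white" for color in observed_colors):
        # A single counterexample is enough to contradict the universal claim.
        return "falsified"
    # No finite run of white swans can verify the claim: unobserved swans remain.
    return "not yet falsified"


print(status(["white"] * 1000))              # not yet falsified
print(status(["white"] * 1000 + ["black"]))  # falsified
print(status([]))                            # not yet falsified
```

The last call mirrors the situation with H3: if no qualifying event is ever observed, the list of relevant observations stays empty, and the hypothesis can never move out of the "not yet falsified" state.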

So, in order for a hypothesis to be falsifiable, its predictions must (a) logically follow from the hypothesis that generated them, and (b) be explicit and concrete in such a way that they can be contradicted unambiguously by feasibly collectable evidence. The second criterion is simple to assess: reverse the prediction and ask whether the reversal is a possible observation in the experiment. If so, the prediction is in good shape. While slightly more abstruse, the first criterion could be assured equally easily through disproof by contraposition: asking, “If the prediction turns out to be empirically false, could the hypothesis still be true?” If the answer to that question is no, then the hypothesis is falsifiable. This drives home the importance of hypotheses being prohibitive — generating predictions that would contradict the hypothesis. When a hypothesis is prohibitive, empirical data should also be able to discriminate it from other hypotheses. But when a hypothesis can generate few negative predictions, it becomes difficult to falsify.
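The contraposition check can be made concrete with a brute-force truth table. What follows is a minimal Python sketch rather than a formal treatment; the helper names implies and valid are hypothetical and introduced only for this illustration. It checks two argument forms: refuting a hypothesis through a failed prediction (modus tollens) and "confirming" a hypothesis through a successful prediction (affirming the consequent).

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b


def valid(premises, conclusion) -> bool:
    """An argument form is valid if the conclusion holds in every row where all premises hold."""
    return all(
        conclusion(h, p)
        for h, p in product([True, False], repeat=2)
        if all(premise(h, p) for premise in premises)
    )


# h: the hypothesis is true; p: the prediction is observed to hold.
def entails(h: bool, p: bool) -> bool:
    return implies(h, p)


# Modus tollens: (H -> P), not P, therefore not H.
refutation_is_valid = valid([entails, lambda h, p: not p], lambda h, p: not h)

# Affirming the consequent: (H -> P), P, therefore H.
confirmation_is_valid = valid([entails, lambda h, p: p], lambda h, p: h)

print(refutation_is_valid)    # True
print(confirmation_is_valid)  # False
```

Only the first form is valid, which is why a prediction that is logically entailed by its hypothesis can sink that hypothesis when it fails, while a successful prediction leaves the hypothesis merely unrefuted.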

In summary, a hypothesis should be considered falsifiable if its veracity hinges on the accuracy of the concrete predictions that are entailed by it. This may sound obvious, but these criteria are often not met in practice. As a result, some hypotheses can be saved from nearly any empirical test. And when predictions and hypotheses are conflated, either the “explicit, concrete, and specific” criteria are left by the wayside, or the predictions are not tied clearly enough to their hypotheses.

Complications to Falsificationism

Unfortunately, like many neat and tidy explanations, the preceding section is an oversimplification. Earlier I mentioned that it would be easy to shift the goalposts to accommodate weak hypotheses, and this is often done with auxiliary hypotheses. Auxiliary hypotheses generated to qualify the predictions of a seemingly falsified hypothesis are often viewed with disdain, and rightly so in some cases. But despite the ferocity with which critics assail the falsifiability of isolated hypotheses, the truth is that auxiliary hypotheses are and must be routinely postulated in all areas of science, both to explain anomalies and to advance our scientific knowledge.

In 1970, Imre Lakatos logically demonstrated that the results of any given hypothesis test cannot invalidate the focal hypothesis itself without first ensuring — to a reasonable degree — that no sensible extraneous factors influenced the results. This may sound obvious, but he showed also that the process of establishing this ceteris paribus clause in a hypothesis test necessarily involves (a) tentatively accepting the focal hypothesis and (b) generating auxiliary hypotheses.[i] Certainly, when an auxiliary clause is used to explain away falsifying observations without improving a theory’s predictive or generative ability, the result is not a revised or replaced hypothesis: it is a post-hoc rationalization disguised as an auxiliary hypothesis. But so long as they improve and expand upon the overarching theory and are themselves ultimately tested empirically, it is counterproductive to denounce the proposition of auxiliary hypotheses that allow a reevaluation of the relationship between the focal hypothesis and the empirical result.

A good demonstration of the proper use of auxiliary hypotheses comes from particle physics. At one point in time, the phenomenon of beta decay threatened to undermine the explanatory framework behind the law of conservation of energy, since beta decay appeared to be the decay of a neutron into a proton and an electron that, together, carried less energy than the neutron. However, Wolfgang Pauli and later Enrico Fermi generated the auxiliary hypothesis that the phenomenon was explained by the existence of what would ultimately become known as the neutrino (Bilenky, 2013). This auxiliary hypothesis was then tested and supported over the course of the following decades. Had the scientific process not made room for auxiliary hypotheses, a highly accurate model of the world would have been thrown out with the bathwater. Popper (1959) recognized the importance of auxiliary hypotheses, to the extent that these hypotheses could be tested. When they improve and expand upon overarching theory and are themselves tested empirically, auxiliary hypotheses are desirable and necessary for scientific progress.

Misconceptions About Unfalsifiability in Evolutionary Psychology

Though many critiques of evolutionary psychology related to falsifiability are perfectly valid, several implicit misconceptions about evolutionary hypotheses may guide some of these accusations. Schaller and Conway (2000) note that scientists sometimes intuitively mistake unverifiability for unfalsifiability. For example, while it is certainly true that some claims about events millions of years in the past cannot be verified,[ii] such claims can still make predictions about what should be the case today. Hypotheses about the past are therefore no less falsifiable than hypotheses about unobservable proximate causal mechanisms, which, as I pointed out earlier, are the main purview of science.

Beyond the issue of unverifiability of past events, critics of evolutionary psychology sometimes seem to view the replacement of one adaptationist hypothesis with another as post-hoc goalpost shifting. This is especially true in mainstream psychological articles that pit a single evolutionary or “biological” hypothesis against numerous “non-evolutionary” hypotheses (e.g., Bourgeois & Perkins, 2003): the idea that there is a single evolutionary or biological account for any given phenomenon is a prevalent misconception. For example, though the kin altruism hypothesis of male homosexuality has been largely falsified, the front-runner in explaining the evolution of male homosexuality is now the sexually antagonistic selection hypothesis, which suggests that male homosexuality is related to genes that increase fitness when they occur in female relatives (Zietsch et al., 2008). These hypotheses are distinct, but some critics of evolutionary psychology treat hypothesis revisions and replacements like these as attempts to sidestep disconfirmation.[iii]

Another source of perceptions of unfalsifiability in evolutionary psychology is the premature impression of “just-so-storyism.” In 1978, Stephen Jay Gould borrowed the title of Kipling’s “just-so stories” — a series of fanciful accounts of the origins of animal traits — to characterize the speculative, almost narrative quality of many contemporary adaptationist hypotheses. This comparison highlighted the pitfall of generating plausible adaptive explanations without proper scientific verification. But the problem with just-so stories from a scientific perspective is not that they are fanciful accounts of the past: the problem is that they remain speculative or make no testable predictions. I agree with Gould and Lewontin (1979) that those who pose untested or untestable hypotheses as certitudes or assumptions are behaving unscientifically. Yet any sufficiently imaginative hypothesis in any domain of science is a just-so story only until it is tested.

The just-so story critique often conflates “the process [and] products of scientific inquiry” by equating the prolific generation of competing hypotheses in the early stages of scientific inquiry with unscientific speculation (Ketelaar & Ellis, 2000). With respect to evolutionary hypotheses in particular, Schaller and Conway (2000) suggested that this misconstrual ignores the distinction between accounts about the distant past and the claims about our modern psychology that are derived from those ultimate accounts, illustrating the importance of the distinction between hypothesis and prediction. An adaptive explanation for a modern phenomenon will be supported or refuted by tests of predictions about what would currently be the case if that adaptive explanation were true, as compared to the predictions of other explanations on a similar timescale. Although it is true that some historical claims are unverifiable, the proximate hypotheses abduced from these historical accounts are on equal footing with any proximately derived psychological hypotheses.

Some hypotheses may be superficially convincing but not scientifically testable; others sound far-fetched but may ultimately be empirically fruitful. In either case, any far-reaching explanatory framework “can be handled scientifically or unscientifically” (Godfrey-Smith, 2003, p. 71). To handle a theory scientifically is to expose its tenets to refutation. But the human mind is drawn to information that supports our favored hypotheses (Klayman, 1995), so we are fighting our very nature in our scientific quest. Scientists in all domains and from all perspectives must dedicate their research efforts to honestly attempting to falsify specific empirical predictions from the hypotheses they generate. Your goal as a scientist should not be to prove yourself right, but to prove yourself wrong.


Bilenky, S. M. (2013). Neutrino. History of a unique particle. The European Physical Journal H, 38(3), 345-404.

Bourgeois, M. J., & Perkins, J. (2003). A test of evolutionary and sociocultural explanations of reactions to sexual harassment. Sex Roles, 49(7-8), 343-351.

Gibbs, A., & Lawson, A. E. (1992). The nature of scientific thinking as reflected by the work of biologists & by biology textbooks. The American Biology Teacher, 54(3), 137-152.

Godfrey-Smith, P. (2003). Theory and reality: An introduction to the philosophy of science. University of Chicago Press.

Gould, S. J. (1978). Sociobiology: The art of storytelling. New Scientist, 80(1129), 530-533.

Gould, S. J., & Lewontin, R. C. (1979). The spandrels of San Marco and the Panglossian paradigm: A critique of the adaptationist programme. Proceedings of the Royal Society of London. Series B. Biological Sciences, 205(1161), 581-598.

Ketelaar, T., & Ellis, B. J. (2000). Are evolutionary explanations unfalsifiable? Evolutionary psychology and the Lakatosian philosophy of science. Psychological Inquiry, 11(1), 1-21.

Klayman, J. (1995). Varieties of confirmation bias. Psychology of Learning and Motivation, 32, 385-418. Academic Press.

Lakatos, I. (1970). Falsification and the methodology of scientific research programmes. In I. Lakatos & A. Musgrave (Eds.), Criticism and the growth of knowledge: Proceedings of the International Colloquium in the Philosophy of Science, London, 1965 (Vol. 4). Cambridge University Press.

Popper, K. R. (1959). The logic of scientific discovery. University Press.

Schaller, M., & Conway, L. G. (2000). The illusion of unfalsifiability and why it matters. Psychological Inquiry, 11(1), 49-52.

Zietsch, B. P., Morley, K. I., Shekar, S. N., Verweij, K. J., Keller, M. C., Macgregor, S., … & Martin, N. G. (2008). Genetic factors predisposing to homosexuality may increase mating success in heterosexuals. Evolution and Human Behavior, 29(6), 424-433.

[i] An observation that seems to contradict your theory is actually contradicting the triple conjunction of [(A) the overarching theory & (B) the known initial conditions (i.e., background knowledge) & (C) the statement that no other factors are involved (ceteris paribus clause)]. Once you rigorously investigate the known initial conditions, you can assume that (B) is TRUE. However, the contradictory observation from earlier means that the triple conjunction is FALSE; and if the triple conjunction is FALSE but one of the conjuncts is TRUE, then the double conjunction of the other two (A&C) must be FALSE:

~(A&B&C) [assumption]

~A v ~B v ~C [De Morgan]

~B v (~A v ~C) [commutativity and associativity]

B [assumption]

(~A v ~C) [disjunctive syllogism]
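The derivation above can also be checked mechanically. The following Python sketch is an illustrative check, not part of Lakatos's argument; the function name conclusion_follows is hypothetical. It enumerates every truth assignment to A, B, and C and confirms that whenever both assumptions hold, the conclusion (~A v ~C) holds as well.

```python
from itertools import product


def conclusion_follows() -> bool:
    """Check that ~(A&B&C) together with B entails (~A v ~C) on every truth assignment."""
    for a, b, c in product([True, False], repeat=3):
        premises_hold = (not (a and b and c)) and b
        conclusion_holds = (not a) or (not c)
        if premises_hold and not conclusion_holds:
            return False  # a counterexample row would make the inference invalid
    return True


print(conclusion_follows())  # True
```

Note that the conclusion only says that the theory (A) or the ceteris paribus clause (C) has failed; the logic alone cannot say which, and that is the gap auxiliary hypotheses are meant to address.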

[ii] In colloquial use, “verified” is used quite differently than in science and philosophy of science. Verification here means empirical verification: in other words, being verifiable through the senses. Since the past is inherently unobservable, one can only make empirical (first-hand) observations about what is currently the case and, from those, infer things about the past. We know many things with a high degree of scientific certainty, but for the reasons explicated by Popper (1959), this is not the same thing as verification.

[iii] This confusion may stem from a narrow conception of the scope of evolutionary psychology, reducing it to specific mid-level theories (e.g., sexual strategies theory) or domains (e.g., mating). Alternatively, the conflation could arise from a view of distinct evolutionary hypotheses as two versions of the broader hypothesis that evolution by natural selection has had some direct causal influence on the construct in question.
