Monday, February 3, 2020

Yet another decline effect in the deception cue literature

Last year, Lessons from Pinocchio, my paper critically reviewing the literature on deception cues, was published (Luke, 2019). Much of that paper involved reexamining data from the widely cited meta-analytic review of cues to deception by DePaulo et al. (2003). I won’t summarize the whole paper here, but one of my many observations about the deception literature was that the absolute values of the meta-analytic effect sizes for deception cues were negatively related both to the number of studies examining each cue and to the total number of participants across those studies (see Bond, Levine, & Hartwig, 2015). In other words, the more a cue is studied, the weaker it tends to appear. This pattern is easy to see when the total sample size used to study each cue is plotted against the cue’s effect size.

As I argued in Pinocchio, there is really no good reason for meta-analytic effects for deception cues to be distributed like this. Since this isn’t a funnel plot, there is no a priori reason the effects should become more tightly grouped as the sample sizes increase. One possible explanation is that the deception cue literature has been sampling from statistical noise. That would explain why there is a lot of spread in the effects at small sample sizes, whereas at larger sample sizes, the effects are tightly clumped around zero.
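A quick way to see why sampling from noise produces exactly this shape is to simulate it. The sketch below is my own toy simulation (not the code from the paper; the sample sizes and number of replications are arbitrary): it draws Cohen's d for a cue whose true effect is zero, at several per-group sample sizes.

```python
import random
import statistics

random.seed(1)

def observed_d(n_per_group):
    """One study's Cohen's d for a cue whose true effect is zero."""
    liars = [random.gauss(0, 1) for _ in range(n_per_group)]
    truth_tellers = [random.gauss(0, 1) for _ in range(n_per_group)]
    pooled_sd = statistics.pstdev(liars + truth_tellers)  # crude pooled SD
    return (statistics.mean(liars) - statistics.mean(truth_tellers)) / pooled_sd

# Even with no real cues, small studies produce large spurious effects,
# while large studies cluster tightly around zero.
for n in (10, 50, 500):
    mean_abs_d = statistics.mean(abs(observed_d(n)) for _ in range(2000))
    print(f"n per group = {n:3d}: mean |d| = {mean_abs_d:.2f}")
```

The spread of observed effects shrinks roughly with the square root of the sample size, so under a pure-noise account the funnel-like pattern in the plot is exactly what we should expect.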

A few months ago, Tim Levine aptly pointed out to me that these estimates include a lot of imputed values for nonsignificant effects that could be calculated precisely. In these cases, DePaulo and her colleagues imputed d = 0 (or d = -.01 or .01 in cases where the direction could be discerned). Thus, the effects would naturally tend to move toward zero when more of these imputed values are added.
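The pull of that imputation rule can be illustrated with a toy simulation (my own sketch, not DePaulo et al.'s actual procedure; the true effect of d = 0.3, the sample size, and the crude z test are all assumptions for illustration):

```python
import random
import statistics

random.seed(2)

def study(true_d=0.3, n=20):
    """One small study; returns (observed d, 'significant' by a crude z test)."""
    liars = [random.gauss(true_d, 1) for _ in range(n)]
    truth_tellers = [random.gauss(0, 1) for _ in range(n)]
    d = (statistics.mean(liars) - statistics.mean(truth_tellers)) / \
        statistics.pstdev(liars + truth_tellers)
    se = (2 / n) ** 0.5  # approximate standard error of d
    return d, abs(d / se) > 1.96

studies = [study() for _ in range(5000)]
exact = statistics.mean(d for d, _ in studies)                       # all effects known
imputed = statistics.mean(d if sig else 0.0 for d, sig in studies)   # nonsig set to 0
sig_only = statistics.mean(d for d, sig in studies if sig)           # nonsig dropped
print(f"exact: {exact:.2f}  imputed zeros: {imputed:.2f}  significant only: {sig_only:.2f}")
```

In this sketch, imputing zeros for nonsignificant results pulls the pooled estimate toward zero, while dropping those results entirely and keeping only the precisely extractable effects biases the estimate upward. Both distortions matter for interpreting the plots here.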

If we remove those imputed values, this is what the plot looks like:

You can see the same basic pattern, but it’s substantially less clear-cut than when using the full dataset. For example, the effects for two of the most-studied cues are quite large. It’s important to bear in mind, however, that we’ve lost a substantial amount of information here by excluding effects that were too small to be significant. Because only the effects that could be extracted precisely are plotted, the effect sizes in this dataset are all biased upward. Nevertheless, we still see a tendency for effects to decline at larger sample sizes, and the largest effects are observed almost exclusively at smaller samples.

Indeed, if you plot these effect sizes over the “Land of Toys” simulation from Pinocchio, which assumes no true cues to deception and no publication bias, you can see that the distributions continue to overlap substantially. For some added perspective, bear in mind that this simulation assumed all researchers were on their very best behavior, reporting everything without bias. There is plenty of evidence that this is not the case in the deception cue literature.

Recently, I was poking around in the data from DePaulo et al. (2003) again, and I was fascinated to discover that a pattern similar to what we see above occurs within studies. There is a negative relationship between the number of cues a study measured and the absolute value of its across-cue mean effect size. That is, the more cues a given study measured, the closer the average of all its cue effects was to zero. You can see this in the figure below.

Each point in this figure is a sample from DePaulo et al. (2003). Its mean effect size (calculated across all the cues it measured) is plotted on the horizontal axis, and the number of cues it measured is plotted on the vertical axis. The pattern is striking. With few exceptions, large within-study average effects are only observed when the number of cues is relatively small.

This pattern mostly holds even when we remove the imputed effect sizes. Again, bear in mind that with this dataset, the effects are virtually guaranteed to be overestimates.

As with the cue-level analysis, there is no a priori reason to expect this relationship between the number of cues in a study and the average effect in that study. The only plausible explanation I can think of for this pattern is that deception researchers have been studying effects that are mostly or entirely statistical noise. That is, if you assume you’re sampling noise, when studying just one or a few cues, it is possible to observe large spurious effects, as a function of sampling error. But as you add cues, the average effect you observe converges on zero, since the random errors cancel each other out in the long run. This pattern in the empirical data isn’t positive evidence that all deception cues are noise (or are, at best, very small), but it is yet another observation that is consistent with that possibility and difficult to explain otherwise.
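That convergence is easy to demonstrate. The sketch below is a toy simulation with arbitrary parameters (and it treats cues as independent, whereas cues measured on the same sample would be correlated): each simulated study measures some number of pure-noise cues and averages their effects.

```python
import random
import statistics

random.seed(3)

def mean_abs_study_effect(n_cues, n=20, reps=1000):
    """Average |within-study mean d| when a study measures n_cues noise cues."""
    totals = []
    for _ in range(reps):
        ds = []
        for _ in range(n_cues):
            liars = [random.gauss(0, 1) for _ in range(n)]
            truth_tellers = [random.gauss(0, 1) for _ in range(n)]
            ds.append((statistics.mean(liars) - statistics.mean(truth_tellers))
                      / statistics.pstdev(liars + truth_tellers))
        totals.append(abs(statistics.mean(ds)))
    return statistics.mean(totals)

# With no true effects, the across-cue average converges on zero as cues accumulate.
for k in (1, 5, 25):
    print(f"{k:2d} cues: mean |study effect| = {mean_abs_study_effect(k):.2f}")
```

Studies measuring a single noise cue can easily show a large average effect by chance, but studies measuring many cues almost never can, which matches the wedge-shaped pattern in the figure.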

I didn’t report these patterns in Pinocchio because I hadn’t noticed them until recently. But the previous conclusions hold. In fact, the situation looks even bleaker than I had previously described.


Bond, C. F., Levine, T. R., & Hartwig, M. (2015). New findings in nonverbal lie detection. In Deception detection: Current challenges and new directions (pp. 37–58). Wiley.

DePaulo, B. M., Lindsay, J. J., Malone, B. E., Muhlenbruck, L., Charlton, K., & Cooper, H. (2003). Cues to deception. Psychological Bulletin, 129(1), 74–118.

Luke, T. J. (2019). Lessons from Pinocchio: Cues to deception may be highly exaggerated. Perspectives on Psychological Science, 14(4), 646–671. doi: 10.1177/1745691619838258
