The puzzle that just isn't


Refuting: «Exchange Rates, Interest Rates, and the Risk Premium» by Charles Engel (AER, 2016, 106(2) pp. 436-74)


.... a scientist must also be absolutely like a child. If he sees a thing, he must say he sees it, whether it was what he thought he was going to see or not. See first, think later, then test. But always see first. Otherwise you will only see what you were expecting.

Wonko the Sane




The so-called forward premium puzzle is notorious for at least two reasons. First, it challenges the notion of efficient financial markets and, second, it is elusive by nature. In a recent AER article Charles Engel has made a renewed effort to shed some light on this puzzle. Given that the evidence for the puzzle is largely inconclusive in the first place, solid proof of its existence appears imperative before moving on to explaining it. Based on a specific sample, Engel does indeed present some evidence in favour of the puzzle. However, a simple robustness check reveals that this evidence rests almost entirely on a few observations more than thirty years in the past. Slightly shifting the same data set towards more recent and hence more relevant observations shows that the puzzle, and with it basically all other results, vanishes.

Given the factual absence of the puzzle, an explanation seems very much unwarranted if not nonsensical. Instead, an explanation of the inconclusiveness of the empirical evidence itself is needed. Some of the explanation can be found here and here. A detailed reconciliation tailored to Engel's article is coming up soon.


Recursive analyses of Engel's regressions


In the absence of an objective stochastic process, the constructivist approach to economics seems very appropriate. Econometric methods based on the positivist view are, by contrast, bound to fail to produce reliable, robust evidence for whatever theory is put forth. Therefore, the Fama regressions presented in Engel (2016) had to be expected to deliver only fragile, inconclusive evidence. The figure below shows that this is indeed the case. It displays the results of standard robustness checks that systematically vary the sample size in Engel's (2016) regression analyses.

The figure displays estimated lower bounds for the 90-percent confidence intervals for the four key coefficients in Engel's (2016) regressions (4), (7), (8), (9) (from top). The various points correspond to different sample definitions. Any crossing of the zero line is evidence for non-robustness of the coefficient estimates. The lower bound for the key coefficient in (9) is multiplied by -1. Equations (7), (8), (9): bootstrapped percentile intervals.
  Figure: Recursive analyses of «Exchange Rates, Interest Rates, and the Risk Premium»  

The figure illustrates that a lack of robustness must be conceded for all key regressions (equations (4), (7), (8) and (9)) reported in tables 1 and 3 through 5, respectively, in Engel (2016).

To that end, the sample start is varied between June 1979 and June 1984. Obviously, a conclusion drawn from a sample that covers 1979:6 through 2009:10 should be identical to the conclusion based on the sample 1984:6-2009:10 or any sample in between. Minor deviations of the results should be allowed for in one out of ten cases or so, depending on the actual level of significance.
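The recursive exercise described above can be sketched as follows. This is a minimal illustration on synthetic data, not the procedure or the data behind the figure: the function names `ols_lower_bound` and `recursive_lower_bounds` are hypothetical, the regression is a plain bivariate OLS, and the interval uses a normal approximation rather than a bootstrap.

```python
import numpy as np

def ols_lower_bound(y, x, z=1.645):
    """Lower bound of a two-sided 90% normal-approximation CI for the OLS slope."""
    X = np.column_stack([np.ones_like(x), x])       # intercept and regressor
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)    # OLS coefficients
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof                    # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)           # homoskedastic covariance
    return beta[1] - z * np.sqrt(cov[1, 1])

def recursive_lower_bounds(y, x, max_drop):
    """Re-estimate after dropping the first 0, 1, ..., max_drop observations."""
    return np.array([ols_lower_bound(y[k:], x[k:]) for k in range(max_drop + 1)])

# Synthetic stand-in data (assumption): 364 monthly observations.
rng = np.random.default_rng(0)
n = 364
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

# One lower bound per sample start, from the full sample down to 304 observations.
bounds = recursive_lower_bounds(y, x, max_drop=60)
```

Plotting `bounds` against the number of dropped observations gives one line of the kind shown in each panel; a robust result would keep that line on one side of zero throughout.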

The figure above shows that these deviations are far too frequent to be acceptable by any meaningful margin of error. The panels depict the lower bounds of the confidence intervals of the key coefficient estimates, based on the interval bootstrap procedures where applicable. The confidence intervals cover 90 percent of the possible true parameter values according to the respective estimations.
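For readers unfamiliar with percentile intervals, a lower bound of this kind can be sketched as below. The sketch assumes a simple pairs bootstrap on a bivariate regression, which is a stand-in and not necessarily the bootstrap procedure used for equations (7), (8) and (9); with serially dependent data a block bootstrap would be the more natural choice. The names `slope` and `percentile_lower_bound` and the synthetic data are hypothetical.

```python
import numpy as np

def slope(y, x):
    """OLS slope of y on x (with intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

def percentile_lower_bound(y, x, n_boot=2000, level=0.90, seed=0):
    """Lower end of a two-sided percentile bootstrap interval for the slope."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)      # resample (y, x) pairs with replacement
        draws[b] = slope(y[idx], x[idx])
    alpha = 1.0 - level
    return float(np.quantile(draws, alpha / 2))   # 5th percentile for a 90% interval

# Synthetic stand-in data (assumption), not the exchange rate series.
rng = np.random.default_rng(1)
x = rng.normal(size=304)
y = 0.3 * x + rng.normal(size=304)

lower = percentile_lower_bound(y, x)
```

A coefficient counts as significantly positive at the 10 percent level in this sense only if `lower` is above zero.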

All coefficients must be positive, with the exception of equation (9), to support the claims of the paper. Graphically, this means that their lower bounds are supposed to lie above the zero line. In equation (9) the key coefficient should be negative, which is why the lower bounds are multiplied by minus one in order to simplify the interpretation of the corresponding (bottom) panel.

Apparently, with the exception of equation (8), which corresponds to the results in table 4 of Engel (2016), all regression results depend heavily on the specific sample definition. To see this, consider first the left-most points in all panels. They represent the results shown in Engel's article. In the majority of cases the lower bounds are above zero and hence support Engel's line of argument. As we drop one observation after the other, however, the evidence deteriorates.

Arguably, the second panel from the top shows the worst results. This panel checks the robustness of the Fama regressions in real terms. With the full sample range (364 observations) the forward premium puzzle in real terms finds support with the exception of Canada and Italy (10 percent level of significance, interval bootstrap). Dropping 60 observations from the beginning of the sample (leaving 304 observations) makes all confidence intervals include zero. Therefore, based on the sample 1984:6-2009:10, the forward premium puzzle in real terms disappears entirely.

The «central empirical finding» (p. 450) of Engel (2016) is scrutinised in the bottom panel. Again, changing the sample definition topples the evidence. While the 1979 - 2009 sample by and large supports the main finding (all cases except France and the UK), the sample 1984 - 2009 clearly contradicts it (with the exception of Germany and - again - the UK). In between, some countries' lower bounds drop below zero and bounce back, and vice versa, quite often. These multiple crossings of the zero line also safeguard against possible sample size effects. Dropping some observations should, in principle, widen the confidence bands, but in the present situation that does not lead to a systematic loss of significance. In summary, the empirical evidence presented in Engel (2016) must be considered very fragile.
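The notion of multiple crossings can be made concrete by counting sign changes in a sequence of lower bounds. The helper `zero_crossings` and the numbers in `example` are purely illustrative assumptions; they are not taken from the figure.

```python
import numpy as np

def zero_crossings(lower_bounds):
    """Count sign changes along a sequence of lower bounds (exact zeros are skipped)."""
    signs = np.sign(np.asarray(lower_bounds, dtype=float))
    signs = signs[signs != 0]
    return int(np.sum(signs[1:] != signs[:-1]))

# Hypothetical lower bounds for one country as the sample start moves forward.
example = [0.12, 0.05, -0.02, 0.01, -0.03, -0.04]
crossings = zero_crossings(example)   # three sign changes in this sequence
```

A pure sample size effect would typically produce a single crossing, as the band widens once and stays wide; repeated crossings in both directions point to sensitivity to individual observations instead.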



Choosing the paper's particular sample definition certainly permits the conclusions drawn by the author. In the light of the aggregate of evidence the whole argument falls apart, however.




Christian Mueller, April 2016