Essays in Experimental Macroeconomics
submitted by
Muhammed Bulutay, M.Res.
ORCID: 0000-0002-8148-8815
to Faculty VII - Economics and Management
of Technische Universität Berlin
for the award of the academic degree
Doktor der Wirtschaftswissenschaften
(Dr. rer. oec.)
approved dissertation
Doctoral committee:
Chair: Prof. Dr. Dorothea Kübler
Reviewer: Prof. Dr. Frank Heinemann
Reviewer: Prof. Dr. Georg Weizsäcker
Date of the scientific defense: 18 July 2024
Berlin 2024
Acknowledgements
I would like to begin by thanking my principal supervisor, Frank Heinemann, for
giving me the opportunity and autonomy to pursue my research aspirations. His
relentless rigor, patience, availability, and adeptness at seeing between the lines have
enriched my research. I am also grateful to Georg Weizsäcker for agreeing to be the
second reader of this dissertation. His intellectual guidance has been elemental from
the very beginning of my doctoral studies. I could not have asked for better role
models in my career than my supervisors.
I am deeply indebted to my co-authors, Lisa Bruttel, Camille Cornand, Dave
Hales, Frank Heinemann, Patrick Julius, Weiwei Tasch, and Adam Zylbersztejn.
Their contributions have improved the chapters of this thesis and resulted in several
publications, of which I am proud. I am grateful to my colleagues, friends, and
mentors at the Berlin School of Economics for their feedback and support, including
Ciril Bosch-Rosa, Michael Burda, Francesco Capozza, Dirk Engelmann, Samuel
Fahim, Gökhan Ider, Dorothea Kübler, Levent Neyse, and Verena Fuhr. I have
also received help and advice from many people outside of Berlin during seminars,
summer schools, and conferences, including John Duffy, Nicolas Jacquemet, Ryan
Rholes, and Tobias Schmidt.
Last but not least, I would like to express my gratitude to all those who gave
me moral support and encouragement when I needed it. My friends have made this
journey a joyful one. My mother Şemsigül and my brother Fatih have made me the
man I am and have never wavered in their support. My partner Lâl Yolgeçenli has
left her fingerprints on every part of this thesis, from brainstorming research ideas
to polishing every draft. Her calming presence has carried me through the ups and
downs. I dedicate this dissertation to her.
Summary
This dissertation consists of four essays that use controlled experiments to address
macroeconomic issues.
Chapter 1 (coauthored with Camille Cornand and Adam Zylbersztejn) investigates
the robustness of experience effects in laboratory experiments. We compare
two scenarios in a beauty-contest game characterized by strategic complementarity:
a baseline in which the same shock hits the economy four times, and a treatment
in which the shocks are heterogeneous. We find that convergence to equilibrium
accelerates with each shock in both scenarios. We then run a horse-race exercise
between models of expectation formation.
Chapter 2 (coauthored with David Hales, Patrick Julius, and Weiwei Tasch)
examines the sources of asymmetric price transmission. We show that prices respond
asymmetrically to cost shocks in laboratory markets where subjects act as producers
and demand is computerized. We show that subjects engage in tacit collusion, which
is accentuated after negative shocks and is not expectation driven. Increasing the
number of competitors from 3 to 10 does not significantly change the degree of
asymmetry.
Chapter 3 (coauthored with Lisa Bruttel, Camille Cornand, Adam Zylbersztejn,
and Frank Heinemann) develops an experimental method for measuring strategic-uncertainty
attitudes. We compare certainty equivalents and beliefs across lotteries
where the payoff depends on either the outcome of a stag-hunt game, a market-entry
game, an ambiguous lottery, or a risky lottery. We use a model to determine whether
participants exhibit optimism/pessimism or strategic-uncertainty aversion/seeking.
We find that the median player is pessimistic in the stag-hunt game and optimistic
in the entry game, but neutral toward strategic uncertainty.
Chapter 4 employs survey experiments to measure German households’ beliefs
about the European Central Bank’s inflation forecasts. I find that the accuracy
of these forecasts is severely underestimated. Information treatments are effective
in changing inflation expectations and uncertainty. A causal mechanism analysis
shows that the intervention increases trust in the central bank, which in turn changes
inflation expectations.
Chapter 5 concludes with a discussion of these results.
Summary (Zusammenfassung)
This dissertation comprises four essays that use controlled experiments to address
macroeconomic questions.
Chapter 1 (coauthored with Camille Cornand and Adam Zylbersztejn) examines
the robustness of experience effects in laboratory experiments. We compare two
scenarios in a beauty-contest game characterized by strategic complementarity:
a baseline scenario in which the same economic shock occurs four times, and a
treatment in which the shocks are heterogeneous. We find that convergence to
equilibrium becomes faster with each shock. We then compare the predictions of
different models of expectation formation.
Chapter 2 (coauthored with David Hales, Patrick Julius, and Weiwei Tasch)
examines the causes of asymmetric price transmission. We show that prices respond
asymmetrically to cost shocks in laboratory markets in which subjects act as
producers and demand is computerized. We show that subjects engage in tacit
collusion, which intensifies after negative shocks and is not expectation driven.
Increasing the number of competitors from 3 to 10 does not substantially change
the degree of asymmetry.
Chapter 3 (coauthored with Lisa Bruttel, Camille Cornand, Adam Zylbersztejn,
and Frank Heinemann) develops an experimental method for measuring attitudes
toward strategic uncertainty. We compare subjects’ certainty equivalents and
beliefs for lotteries whose payoffs depend on the outcome of either a stag-hunt game,
a market-entry game, a lottery with given probabilities, or a lottery with unknown
probabilities. Using a model, we determine whether participants exhibit
optimism/pessimism or an aversion to/preference for strategic uncertainty. We
find that the average player is pessimistic in the stag-hunt game and optimistic
in the entry game, but neutral toward strategic uncertainty.
Chapter 4 uses survey experiments to measure German households’ beliefs about
the European Central Bank’s inflation forecasts. The accuracy of these forecasts
is severely underestimated. Information treatments prove effective in changing
inflation expectations and the associated uncertainty. An analysis of the causal
mechanism shows that the intervention increases trust in the central bank, which
in turn changes inflation expectations.
Chapter 5 concludes with a discussion of these results.
Contents
0 Introduction
1 Learning to Deal with Repeated Shocks under Strategic Complementarity: An Experiment
1.1 Introduction
1.2 Related Literature and Hypotheses
1.3 Method
1.3.1 Guessing Game under Strategic Complementarity
1.3.2 Experimental Design
1.3.3 Procedures
1.4 Results
1.4.1 Adjustment Dynamics
1.4.2 Expectation Formation
1.5 Discussion
1.6 Conclusion
A Appendix
A.1 Experimental Material
A.1.1 Instructions and Comprehension Questions Translated to English
A.1.2 Tests Translated to English
A.1.3 Score Measurement, Procedures and References for Tests
A.2 Additional Figures and Tables
A.3 Robustness Analyses
A.3.1 Calibration of Free Parameters in HSM
A.3.2 Cognitive Skills and Individual Expectations
2 Imperfect Tacit Collusion and Asymmetric Price Transmission
2.1 Introduction
2.2 Related Literature
2.2.1 Field Evidence
2.2.2 Theoretical Explanations
2.2.3 APT and Experiments
2.3 Method
2.3.1 Pricing Game
2.3.2 Experimental Design
2.3.3 Procedures
2.4 Hypotheses
2.5 Results
2.5.1 Estimation of the Asymmetry
2.5.2 Market Power
2.5.3 Deviations from Best-Response
2.6 Discussion
B Appendix
B.1 Quadratic Utility and Linear Demand
B.2 Experimental Material
B.2.1 Instructions
B.2.2 Comprehension Questions
B.2.3 Payoff Table and Experimental Interface
B.3 Additional Analysis
B.3.1 Regression Results of Asymmetry
B.3.2 Pass-through Rates
B.3.3 Non-parametric Test on Excess Market Power
3 Measuring Strategic-Uncertainty Attitudes
3.1 Introduction
3.2 Related Literature
3.3 Experimental Design and Procedures
3.3.1 Treatments
3.3.2 Implementation Details
3.4 Theoretical Framework
3.4.1 Identification and Uncertainty Attitudes
3.4.2 Hypotheses
3.5 Results
3.5.1 Data Selection
3.5.2 Comparison of Certainty Equivalents
3.5.3 Main Results
3.6 Conclusion
C Appendix
C.1 Instructions
C.1.1 Instructions for Games
C.1.2 Example of Comprehension Quiz for the Treatment
C.2 Additional Tables and Figures
C.3 Individual Underpinnings of Attitudes towards Uncertainty
C.4 Screenshots
4 Better than Perceived? Correcting Misperceptions about Central Bank Inflation Forecasts
4.1 Introduction
4.2 First Experiment
4.2.1 Design and Implementation
4.2.2 Results
4.3 Second Experiment
4.3.1 Design and Implementation
4.3.2 Results
4.4 Conclusion
D Appendix
D.1 Experimental Materials
D.2 A Model of Belief Updating with Trust
D.2.1 Model
D.2.2 Hypotheses
D.3 Supporting Material
D.3.1 First Experiment
D.3.2 Second Experiment
5 Conclusion
List of Figures
1.1 Decision Screen
1.2 Guesses by treatments and rounds
1.3 Median individual guesses
1.4 Impact of heuristics across periods
A.1 Test picture in the RMET
A.2 Individual guesses across periods
A.3 Cluster guesses across periods
A.4 Impact factors from various HSMs
2.1 Average pricing behavior across periods and group sizes
2.2 Cumulative response to shocks
2.3 Excess market power across periods and group size
2.4 Deviations from best-response action and errors in expectations
B.1 Payoff matrix
B.2 Screen: Price setting
B.3 Screen: Guess setting
B.4 Screen: Feedback
3.1 Cumulative density functions of uncertainty attitude parameters across conditions
C.1 Screen used in RISK treatment
C.2 Screen used in STRATEGIC UNCERTAINTY treatment (stag-hunt game)
4.1 Flow of the first experiment
4.2 Beliefs about inaccuracy and the actual distribution of forecast errors
4.3 Directed acyclic graph showing the causal mechanisms
4.4 Flow of the first experiment
4.5 Expected inflation and perceived inflation forecast
D.1 Expected inflation (for Germany) and perceived inflation forecast
List of Tables
1.1 Convergence in the previous experiments
1.2 Experimental design parameters
1.3 Post-shock deviations from NE
1.4 The effect of heterogeneity in shocks on guesses of last post-shock phase
1.5 Description of selected expectation rules for comparison
1.6 RMSE of expectation rules
A.1 Description of subject pool
A.2 Improvements from similarity refinement in the initial post-shock period
A.3 Average of impact factors in the HSM
A.4 RMSE of HSM under different values of free parameters
A.5 RMSE of expectation rules and individual characteristics
A.6 Prediction errors and individual test scores
2.1 Experimental design parameters
2.2 Asymmetry in the immediate pass-through rates
2.3 Excess market power
2.4 Deviations from best-response
B.1 Estimation of asymmetry
B.2 Asymmetry in the pass-through rates after 14 periods
B.3 Non-parametric test on excess market power
3.1 Game 1 and associated payoffs
3.2 Game 2 and associated payoffs
3.3 Decision table in the RISK treatment
3.4 Comparison of certainty equivalents
3.5 Summary of estimated uncertainty attitudes
3.6 Uncertainty attitudes across treatments: parametric estimates from seemingly unrelated regressions
3.7 Results of mean testing across specifications
3.8 Nonparametric comparisons of uncertainty attitudes across treatments
C.1 Seemingly unrelated regressions with treatment order effects
C.2 Nonparametric comparisons of strategic uncertainty attitudes across treatments
C.3 Seemingly unrelated regressions with individual characteristics: restricted sample
C.4 Seemingly unrelated regressions with individual characteristics: unrestricted sample
4.1 Treatment effects on learning and uncertainty
4.2 Causal Mechanisms with 2SLS
4.3 Treatment effects on trust, attention, and consumption one month later
4.4 Statements used to indirectly measure trust in the ECB
4.5 Perceived direction of the ECB’s forecast error and trust in the ECB
4.6 Effects of information treatments on the public’s perception of the ECB
D.1 Demographic profile
D.2 Ordered logistic regression on the ECB perception
D.3 Demographic characteristics of respondents by the level of misperception
D.4 Robustness regressions for reduced-form treatment effects: With controls and without winsorization
D.5 Robustness regressions for the causal mechanisms: Only T4
D.6 Robustness regressions for the causal mechanisms: Only cjeu trust as instrument
D.7 Demographic profile
D.8 Correlation matrix for different facets of public trust in the ECB
Chapter 0
Introduction
(I would) rather discover one cause than
gain the Kingdom of Persia.
– Democritus¹
Many questions in macroeconomics, such as what drives (dis)inflation or how
fiscal stimulus affects economic outcomes, are causal in nature. The most credi-
ble approach to establishing causality is through controlled experiments, because
this method allows researchers to manipulate isolated variables at will. Tradition-
ally, many have viewed such a method as unavailable for macroeconomic research.
Despite these reservations, the utility of controlled experiments in macroeconomic
research is increasingly recognized (see Cornand and Heinemann 2019a; Hommes
2021 for reviews). Experiments can be used to evaluate models under ideal con-
ditions, to test behavioral assumptions underlying microfoundations, to estimate
otherwise unobservable parameters, and to resolve theoretical ambiguities, such as
those regarding equilibrium selection. This dissertation consists of four papers that
use controlled experiments, both in and out of the laboratory, to demonstrate how
experiments can help address macroeconomic issues.
Chapter 1 (coauthored with Camille Cornand and Adam Zylbersztejn) contributes
to the long-standing debate about whether individual irrationality can have
aggregate effects. A common argument of those who claim that irrationality has
weak consequences is that biases cannot persist in the long run due to learning. In-
deed, a large literature shows that inertia in convergence to equilibrium in repeated
games fades as subjects gain experience with the environment. In this chapter,
we investigate the robustness of such experience effects, documented in controlled
experiments, to a real-world complication. Subjects in our experiments play a repeated
beauty contest game characterized by strategic complementarity.² We shock
the Nash equilibrium in two ways. In the baseline, the Nash equilibrium is repeatedly
shifted by the same shock, and in the treatment, the shocks are non-identical.
We hypothesize that non-identical shocks lead to reduced transfer of learning among
adaptive learners. Our data do not support this hypothesis. Subjects in our treatment
and control groups learn to deal with shocks at similar rates. Thus, we confirm the
robustness of the experience effect to our particular real-world complication. The rest
of the chapter is devoted to determining the best-fitting expectation formation model.
¹ Democritus, fragment 118, in Ancilla to the Pre-Socratic Philosophers: A Complete Translation of the Fragments in Diels, Fragmente der Vorsokratiker, translated by Kathleen Freeman (Cambridge: Harvard University Press, 1948).
In Chapter 2 (coauthored with David Hales, Patrick Julius, and Weiwei Tasch),
we use another laboratory experiment to investigate the sources of the so-called
“rockets and feathers” effect. This effect refers to the observation that producer
prices rise rapidly but fall slowly in response to corresponding changes in marginal
costs. It is difficult to reconcile such an asymmetric response function with the
monopolistic competition at the core of most contemporary DSGE models. Many
explanations have been put forward to address this puzzle. Most attribute it to
market frictions, while some see imperfect competition as the root cause. However,
econometric tests based on observational data are inconclusive, especially due to the
tacit nature of collusion and the joint presence of frictions and imperfect competition.
In this experiment, we address these challenges by setting up a frictionless price-
setting game with exogenously determined marginal costs and number of producers.
Participants interact for a finite number of periods and experience positive and
negative shocks to the marginal cost of the unique good they sell, while demand is
computerized. We document asymmetric price transmission in most markets. This
asymmetry is robust to the number of sellers in a market, except in the case of
duopoly, where we observe near-perfect collusion. In other markets, we also observe
periods of “excess” market power that become more pronounced after negative cost shocks. We
use reported beliefs about other sellers’ prices to show that asymmetric increases in
market power are not driven by expectational errors.
Chapter 3 (coauthored with Lisa Bruttel, Camille Cornand, Adam Zylbersztejn,
and Frank Heinemann) focuses on the role of strategic-uncertainty attitudes in
equilibrium selection in coordination games. Coordination problems underlie many
macroeconomic phenomena, including bank runs, debt crises, liquidity crises, and
speculative attacks. A major challenge is that standard theoretical frameworks, such
as Nash equilibrium, offer limited guidance on which equilibria are likely to prevail.
Strategic uncertainty can be used as a refinement tool, especially if individuals exhibit
systematic attitudes toward strategic uncertainty. However, such attitudes
are difficult to measure because one must compare the utility of a game with the
utility of an appropriately matched non-game situation and take beliefs and risk
attitudes into account. In this study, we develop a method with these properties.
We test the method experimentally on two coordination games that differ only in
their strategic environment: a stag-hunt game (complements) and a market-entry
game (substitutes). We measure certainty equivalents and beliefs for lotteries whose
payoff depends on either the outcome of a stag-hunt game, a market-entry game, an
ambiguous lottery, or a risky lottery. Using a model, we decompose differences in
certainty equivalents into either (a) optimism or pessimism about the probability of
achieving the desired payoff, or (b) a fixed (dis)utility caused by the source of
uncertainty. Overall, we find that participants exhibit optimism in the stag-hunt game
and pessimism in the entry game while they do not experience a flat (dis)utility due
to the game.
² Mauersberger and Nagel (2018) argue that this game is analogous to New Keynesian dynamic stochastic general equilibrium (DSGE) models that incorporate sentiment (e.g., Angeletos and La’O, 2010).
Chapter 4 examines the sources of disagreement between households and central
banks about future inflation. Such disagreement is bad for central banks, which are
trying to anchor private expectations, and for households, which have lower quality
information. To understand the driving forces, I survey German households’ beliefs
about the European Central Bank’s (ECB’s) inflation forecasts by embedding two
modules in the Bundesbank’s Survey on Consumer Expectations in 2022 and 2023.
Notably, the survey measures perceived forecast accuracy and the expected direction
of forecast error. The results indicate a widespread underestimation of the ECB’s
inflation forecast accuracy. Households also consider the ECB’s forecast in 2022 to
be overly optimistic. Both beliefs are strongly correlated with self-reported trust
in the ECB, even after controlling for socio-economic characteristics. Challenging
misperceptions about forecast accuracy reduces households’ disagreement with the
ECB’s inflation forecast, reduces uncertainty about future inflation, improves self-
reported trust in the ECB, and has some effect on consumption plans. Further
analysis shows that trust in the ECB is causally related to inflation expectations.
The remainder of this chapter is devoted to understanding which specific aspects
of public opinion about the ECB are changed by the communication of inflation
forecasts and how they are related to trust.
The last chapter provides a brief discussion of these papers and an outlook on
future research.
Chapter 1
Learning to Deal with Repeated
Shocks under Strategic
Complementarity: An Experiment
An earlier version of this chapter is published as: Bulutay, M., Cornand, C. and
Zylbersztejn, A., 2022. Learning to deal with repeated shocks under strategic
complementarity: An experiment. Journal of Economic Behavior & Organization, 200,
pp. 1318-1343.
1.1 Introduction
How long would it take for market outcomes to fully adjust to the new equilibrium
level in response to an exogenous shock? In a seminal paper on the rational expectations
(RE) hypothesis, Muth (1961) demonstrates that convergence to equilibrium
is instantaneous in a frictionless economy if the errors in agents’ expectations are
not highly correlated, because they then cancel out at the aggregate level. However, the
empirical evidence points to systematic errors due to heuristic-based reasoning under
which the aggregate outcomes may exhibit substantial inertia. Whether and how
the adjustment would be delayed in the presence of nonrational expectations is a
key question for policy-makers – central banks that aim at engineering structural
changes – and for actors in markets where equilibrium is frequently shifting due to
shocks. If adjustment is sluggish and shocks occur frequently, aggregates may rarely
be in accordance with the equilibrium path predictions generated by the impulse-
response analyses of RE-based models.
Early experimental evidence from double auctions shows that equilibrium prices
emerge within a few periods (Smith, 1962). Convergence occurs even in the presence
of zero-intelligence computer traders who submit random bids and asks, provided these bids
are constrained by a budget (Gode and Sunder, 1993). Nonetheless, persistent
deviations from equilibrium are reported in different types of competitive markets
(e.g., asset market experiments, AMEs henceforth, Smith et al. 1988). Thus, the
extent to which limited rationality influences market outcomes depends on the char-
acteristics of the market.
The type of strategic environment governing the market is one of the key char-
acteristics determining the impact of limited rationality on behavior and outcomes.
Following the theoretical work of Haltiwanger and Waldman (1985, 1989), Fehr and
Tyran (2005, 2008) experimentally test the role of the strategic environment on the
adjustment dynamics after a monetary shock. In accordance with the theoretical
predictions, the adjustment is immediate when actions are strategic substitutes,
and gradual when actions are strategic complements. The role of the strategic
environment has been further experimentally investigated in Learning-to-Forecast
Experiments (LtFEs, Heemeijer et al. 2009, Bao et al. 2012), guessing games (Sutan
and Willinger 2009, Cooper et al. 2017, Hanaki et al. 2019) and duopoly games
(Potters and Suetens, 2009).¹ The main pattern emerging from these studies is that
deviations from equilibrium tend to be larger and more persistent under strategic
complementarity as compared to strategic substitutability.²
Herein, we focus on strategic complementarity, which is an important
feature of various economic contexts including macroeconomic coordination, bank
runs, and oligopoly competition.³ As argued by Hommes (2006), strategic complementarity
is crucial for modeling asset markets characterized by a positive feedback
mechanism between expectations on asset prices and the realizations of these prices.
The literature still lacks consensus on how repeated shocks (whether they are
identical or not) could affect adjustment under strategic complementarity. On the
one hand, the initial deviations from RE may subsequently disappear due to expe-
rience effects, as commonly reported in AMEs (e.g., Smith et al. 1988). In a recent
study, Cooper et al. (2017) show that these results can be extended to guessing
games. They periodically introduce three identical shocks to the Nash equilibrium (NE)
and report slight acceleration in the adjustment speed over shocks. On
the other hand, experimental studies based on AMEs and LtFEs question the robustness
of experience effects (Kopányi-Peuker and Weber 2021; Shestakova et al. 2019).
Hussam et al. (2008) argue that experience effects critically rely on the stationarity
of the environment. Accordingly, both Cooper et al. (2017, p. 207) and Fehr and
Tyran (2008, p. 387) conjecture that in case of repeated nonidentical shocks, the
impact of nonrational expectations would persist. However, neither paper provides
an empirical test of this conjecture. Our work aims at filling this gap.
¹ See Hommes (2011) and Arifovic and Duffy (2018) for an overview of the Learning-to-Forecast literature.
² Hanaki et al. (2019) are the first to term this phenomenon the strategic environment effect.
³ See Milgrom and Roberts (1990) for more examples.
We experimentally test the conjecture that the impact of nonrational expectations is
more persistent under nonidentical shocks, using a guessing game with strategic
complementarity (based on Cooper et al., 2017). We introduce large periodic negative
shocks to the NE and compare adjustment
dynamics between two experimental conditions: one where shocks are identical
and another where they are not. During the first and last post-shock phases, the NE
are the same in both conditions. Through this design, we are able to measure the
treatment effect of experiencing nonidentical shocks (i) on the aggregate adjustment
speed, and (ii) on the way individuals form expectations. Related to (i), we find
that post-shock adjustment accelerates due to repetition. Compared to the initial
post-shock adjustment, it takes fewer periods for the adjustment to occur after fur-
ther shocks. However, we fail to identify a significant effect of nonidentical shocks
on the pace of adjustment. Related to (ii), our results show that experience may
not be enough to eliminate naïvety, at least not within four repetitions of the game.
Our contribution to the literature is twofold. Firstly, we document the robustness
of the findings of Cooper et al. (2017) in the context of identical shocks, and fur-
ther extend their findings to a more complex environment with nonidentical shocks.
Based on this experimental variation of negative shocks, we report that the inertia
in adjustment is a robust feature of markets governed by strategic complementarity
and that it does not depend on the stationarity of periodic shocks. Secondly, the
data on expectations across subjects and over time allow us to study the individual
underpinnings of the observed aggregate dynamics. To avoid arbitrariness in model
selection, we consider a wide range of backward-looking expectation rules and take
their predictions to the experimental data. This novel horse race exercise reveals
that augmenting expectation rules with a similarity-based learning approach improves
their predictive power under identical shocks. Notably, the best performing model
is a simple nonparametric reformulation of naïve expectations with similarity-based
learning (first proposed by Cooper et al., 2017). We discuss its behavioral foundations
and relate it to the previous literature.
The remainder of this chapter is organized as follows. Section 1.2 outlines our
research hypotheses. Section 1.3 presents our methodology: the guessing game and
the way we implement it in the lab. Section 1.4 summarizes the main results which
are then discussed in Section 1.5. Lastly, Section 1.6 concludes by summarizing the
main findings, as well as the implications and limitations of the study.
1.2 Related Literature and Hypotheses
Table 1.1: Convergence in the previous experiments
Study                          Type of environment         Shock size (in %)    Convergence period
Fehr and Tyran (2001)¹         Pricing decision            -67% & +100%         13 & 4
Fehr and Tyran (2008)²         Pricing decision            -50%                 9
Davis and Korenok (2011)³      Monopolistic competition    +100%                21
Petersen and Winn (2014)¹      Pricing decision            -67% & +92%          8 & 4
Cooper et al. (2017)⁴          Guessing game               -77%                 8
Notes:
¹ Shock as the change in average equilibrium price in the nominal treatment with human opponents.
² Shock as the change in average equilibrium price in the nominal treatment.
³ Shock as the change in monopolistically competitive prices in the BASE/PUB treatment. Prices remain significantly
different from the competitive level in the first reported 20 post-shock periods.
⁴ Shock as the change in the NE guess in the first round.
Table 1.1 summarizes the evidence from experimental studies that investigate the
dynamics of convergence following large shocks when actions are strategic complements.
Here, the "Convergence period" reported in column 4 indicates the number
of periods it takes for the general activity level (price, guess, etc.) to become statistically
indistinguishable (at the 5% level) from the post-shock theoretical equilibrium value.⁴
The general pattern in those data is that convergence takes time when actions are
strategic complements. In particular, adjustment to the NE tends to be slow after
the initial shock, even though acceleration may still occur when markets are repeated
(Cooper et al., 2017).⁵ We expect to observe the same pattern in a slightly
modified environment.⁶
Hypothesis 1: When shocks are identical, adjustment to the NE is slow and
gradual after the initial shock, but accelerates over repetition of the same market.
Albeit robust in stationary environments, experience effects are argued to be
sensitive to the complexity of the environment. For instance, in the AME and LtFE
studies by Kopányi-Peuker and Weber (2021) and Hussam et al. (2008), bubbles
do not disappear despite repetition.⁷ Hussam et al. (2008) also report that bubbles
reignite even with twice-experienced subjects following drastic changes in the
environment (e.g., the amount of liquidity in the market). Moreover, Cooper et al.
(2017) and Fehr and Tyran (2008) conjecture that nonidentical shocks may thwart
experience effects, which constitutes one rationale for our second hypothesis.⁸ The
other rationale builds on the theoretical result of Cooper et al. (2017) on adjustment
acceleration under identical shocks. Extending their reasoning to nonidentical shocks,
we formulate the following hypothesis.
⁴ We retrieved the information about the convergence period directly from each article. Thus,
the table does not account for the differences in experimental designs and statistical methods used
across these studies.
⁵ This is also a standard finding across AMEs. For instance, Smith et al. (1988), Dufwenberg
et al. (2005) and Haruvy et al. (2007) show that repeating market interactions three times
eliminates bubbles.
⁶ The scope of these modifications along with their rationale are described in Section 1.3.
⁷ According to Kopányi-Peuker and Weber (2021), a possible explanation is that this occurs
because interactions in their experiment have an indefinite horizon.
Hypothesis 2: The adjustment accelerates with market repetition at a slower
pace in the presence of nonidentical shocks than with identical shocks.
We now turn to the possible explanations of adjustment dynamics. Several
studies provide a descriptive explanation of the observed inertia based on nonrational
expectations. Yet, they strongly diverge in terms of the best fitting model. For
instance, Fehr and Tyran (2008) report that their data are best organized by a model
in which all agents exhibit naïve expectations.⁹ Cooper et al. (2017), in turn, obtain
the best fit with heterogeneous groups: one rational player and three nonrational
players whose expectations follow a version of the naïve expectations rule adapted to a
repeated shocks design. Other studies point to trend-following expectations (Haruvy
et al. 2007) or even RE (Marquardt et al. 2019) as best describing their experimental
evidence.
We note, however, that the aforementioned studies either do not compare the fit
of their model with other expectation rules, or only consider a relatively narrow set
of competing rules.¹⁰ More systematic comparisons exist in the LtFE literature.
The design of LtFEs is particularly well-suited for investigating expectations since
the experimental task is to forecast prices one period ahead. Trend-following
has been repeatedly shown to outperform all other rules under homogeneous expectations
(Bao et al. 2012; Anufriev et al. 2013; Heemeijer et al. 2009). Pfajfar
and Žakelj (2014) estimate the share of RE and simple expectations in their New
Keynesian LtFEs. They arrive at the conclusion that RE (simple rules) cannot
be rejected for 30-45% (35-50%) of subjects. This finding has been subsequently
confirmed by Marquardt et al. (2019). In the context of the evolutionary heuristic
switching model (HSM, Anufriev and Hommes 2012), Cornea-Madeira et al. (2019)
estimate the weights of naïve and fundamentalist rules in inflation expectations using
U.S. inflation data spanning from 1968:Q4 to 2015:Q2. Despite substantial
time variation, they find that 65% of individuals form naïve expectations, the share
of which increases in reaction to large inflationary shocks, thus creating self-fulfilling
inflation persistence.
⁸ Bao et al. (2012) study large nonidentical shocks in LtFEs by introducing two large shocks
to the rational expectations (RE) equilibrium. However, their design does not propose a way to
test the effect of nonidentical shocks with respect to identical ones. To the best of our knowledge,
our study is the first to directly test the impact of the heterogeneity of shocks in a controlled
environment.
⁹ In their model of fully adaptive expectations, players expect the outcome of the last period to
reoccur.
¹⁰ For instance, Marquardt et al. (2019) only consider three models: myopic, trend and RE.
Moreover, the parameters of their trend model resemble what other studies denote as a strong
trend-following rule (e.g., Anufriev and Hommes 2012). In Section 1.5, we discuss why strong
trend-following may not be a suitable rule for environments like AMEs.
Based on this body of empirical literature, we conclude the following. First, the
best fitting expectation models vary across different experimental settings. Second,
for the experimental settings closest to ours (i.e., guessing games and LtFEs) simple
backward-looking expectation models outperform RE. This observation leads us to
our third hypothesis.
Hypothesis 3: Backward-looking expectation rules in the form of heuristics fit
the data better than RE.
Finally, we provide the first out-of-sample test of the relative performance of
the expectation rule proposed by Cooper et al. (2017). This rule seems promising in
the context of repeated shocks since it echoes the similarity-based learning approach
(Gilboa and Schmeidler, 1995; Plonsky et al., 2015). Under this rule, a player expects
the outcome of the last period to reoccur in stable phases. After observing a shock,
the player reviews all the past periods and expects the outcome of the period following
the previous occurrence of the same shock. We denote this rule as similarity-based
naïve expectations (SBNE). In the same vein, we extend the adaptive and
trend-following models and denote them respectively as similarity-based adaptive
expectations (SBAE) and similarity-based trend-following expectations (SBTF). We
conjecture that this class of rules best explains behavior under repeated identical
shocks.
Hypothesis 4: Under identical shocks, the rules that are augmented with the
similarity-based learning approach provide the best fit to the experimental data.
1.3 Method
1.3.1 Guessing Game under Strategic Complementarity
To investigate whether repeating identical shocks improves the speed of adjustment,
and whether nonidentical shocks slow down this process, we refer to a repeated
guessing game under strategic complementarity that is adapted from Nagel (1995).
Our experimental game also resembles those used in LtFEs with positive feedback.
The fundamental difference between these two designs is that while guessing games
provide full information on the game structure (including the parameters), LtFEs
provide only qualitative information about the market structure. Nevertheless, Sonnemans
and Tuinstra (2010) show that convergence dynamics are similar when the
feedback parameters are equal.¹¹
In each period $t \in [1, T]$, a group of $N$ players simultaneously choose a number
$p_{i,t}$ (rounded up to two decimals) from the closed set $[0, 100]$, where $i = 1, \dots, N$.
Each player $i$ has a target number $y_{i,t}$ that is calculated as
\[ y_{i,t} = b\,\bar{p}_{-i,t} + a + \xi_t, \tag{1.1} \]
where $\bar{p}_{-i,t}$ is the average number chosen by the remaining players¹² in period $t$, $a$
and $b$ are positive constants with $b \in (0, 1)$, and $\xi_t$ is a deterministic large
shock which takes the values
\[ \xi_t = \begin{cases} 0, & \text{if } t \le T/2, \\ \bar{\xi}, & \text{if } t > T/2. \end{cases} \tag{1.2} \]
The constant $b$ generates strategic complementarity among the players'
actions. The player with the smallest guessing error $|y_{i,t} - p_{i,t}|$ wins the fixed stage
game payoff $F$. In case of a tie, the payoff is equally split among the winners.
This game has a unique Nash equilibrium which corresponds to an interior solution:
$p^{NE}_t = \frac{a + \xi_t}{1 - b}$.¹³ Here, $p^{NE}_t$ is invariant for the first $T/2$ periods, called the
pre-shock periods. The negative shock $\bar{\xi}$ shifts the equilibrium downwards in period
$T/2 + 1$.¹⁴ The remaining periods with the new equilibrium are the post-shock periods.
In addition, shocks are repeated: a sequence of $T$ periods (pre- and post-shock)
is repeated over $R$ rounds.
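The equilibrium formula can be checked numerically. The following is a minimal sketch, not part of the experimental software: the function names are illustrative, and the parameter values (a = 15, b = 0.75, shock of −9) are taken from the calibration reported in Table 1.2.

```python
def nash_equilibrium(a, b, xi):
    """Interior NE guess p = (a + xi) / (1 - b), valid for 0 < b < 1."""
    if not 0 < b < 1:
        raise ValueError("b must lie in (0, 1)")
    return (a + xi) / (1 - b)


def target(avg_others, a, b, xi):
    """Target number y = b * avg_others + a + xi, as in equation (1.1)."""
    return b * avg_others + a + xi


a, b = 15, 0.75                       # calibration from Table 1.2
pre = nash_equilibrium(a, b, 0)       # pre-shock phase, no shock
post = nash_equilibrium(a, b, -9)     # post-shock phase, shock of -9

# Fixed-point property: if all other players guess the NE,
# a player's target number is the NE itself.
assert abs(target(pre, a, b, 0) - pre) < 1e-9
assert abs(target(post, a, b, -9) - post) < 1e-9
```

The fixed-point check makes the logic of the equilibrium explicit: the NE is the only guess that remains a best response when everyone else plays it.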
1.3.2 Experimental Design
Our experimental manipulation consists in varying the value of $\bar{\xi}_r$ over rounds.
For identical shocks (baseline), $\bar{\xi}_r = \bar{\xi}$ for all $r \in [1, R]$. For nonidentical shocks
(treatment), the size of the shock varies across rounds. Importantly, the equilibrium
solution outlined above applies to both cases, so that players are always incentivized
to play the NE.
¹¹ See Sonnemans and Tuinstra (2010) for a detailed comparison of guessing games and LtFEs.
¹² Sutan and Willinger (2009) report that the inclusion of own guesses causes a significant amount
of confusion among subjects. Therefore, we opted for excluding the player's own guess from the target
formula.
¹³ The proof is based on the iterated elimination of dominated strategies. See Nagel (1995)
for details. This equilibrium is also a RE equilibrium. Bray (1983) shows that when $b < 1$, a
misspecified expectation rule – ordinary least-squares learning – almost surely converges to the
RE equilibrium. However, she also emphasizes that this does not imply unbiased expectations. As
she notes (Bray 1982, p. 330), "[r]ational expectations are, if anything, a long run rather than a
short run phenomenon."
¹⁴ Of course, future studies could consider alternative implementations, such as positive shocks.
The calibration of the experimental game is summarized in Table 1.2. A group
of 5 participants play the guessing game for 4 rounds, and each round is composed
of 16 periods. This yields a total of 64 guessing decisions per player. In period 9 of
each round, a negative shock $\bar{\xi}$ shifts parameter $a$ from 15 to a value that depends
on the experimental condition.
In baseline, shocks are identical and the shock component equals −9 in every
round. In treatment, shocks are not identical and the shock component is characterized
by the sequence (−9, −6, −12, −9) in rounds (1, 2, 3, 4). Thus, the post-shock
NE is the same in the first and last rounds in both conditions. This allows us to cap-
ture the effect of experiencing nonidentical shocks by comparing adjustment speeds
in round 4.
Table 1.2: Experimental design parameters
General parameters
Number of periods per round               $T = 16$
Number of rounds per session              $R = 4$
Group size                                $N = 5$
Stage game prize                          $F = 4.40$ euros
Slope of target formula                   $b = 0.75$
Pre-shock value of constant               $a = 15$
Pre-shock equilibrium                     $p^{NE}_{pre} = 60$
Post-shock equilibrium ($p^{NE}_{post}$)  Baseline    Treatment
Round 1                                   24          24
Round 2                                   24          36
Round 3                                   24          12
Round 4                                   24          24
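The post-shock equilibria in Table 1.2 follow directly from the shock sequences and the NE formula. The sketch below recomputes them. It also traces a purely hypothetical benchmark path in which every player best-responds to last period's average (homogeneous naïve expectations), to illustrate why adjustment is gradual when b = 0.75. The names `SHOCKS`, `post_shock_ne` and `naive_path` are illustrative, and the benchmark path is not an estimate from the experimental data.

```python
A, B = 15, 0.75                      # pre-shock constant and slope (Table 1.2)
SHOCKS = {
    "baseline": (-9, -9, -9, -9),    # identical shocks in rounds 1-4
    "treatment": (-9, -6, -12, -9),  # nonidentical shocks in rounds 1-4
}


def post_shock_ne(xi, a=A, b=B):
    """Post-shock NE guess (a + xi) / (1 - b)."""
    return (a + xi) / (1 - b)


def naive_path(start, xi, periods=8, a=A, b=B):
    """Hypothetical benchmark: all players best-respond to last period's average."""
    path, p = [], start
    for _ in range(periods):
        p = a + xi + b * p           # best response to expecting p again
        path.append(p)
    return path


table = {cond: [post_shock_ne(xi) for xi in seq] for cond, seq in SHOCKS.items()}
for cond, values in table.items():
    print(cond, values)

# Eight post-shock periods starting from the pre-shock NE of 60, shock of -9.
print([round(p, 2) for p in naive_path(60, -9)])
```

Under this benchmark the gap to the new equilibrium shrinks by the factor b in every period, so the path is convex and still above the NE after eight periods.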
The design of this experiment closely follows Cooper et al. (2017), with some
noteworthy modifications. First, in their study, groups are composed of 4 players.
Following Hanaki et al. (2019), we increase the number of players to 5 per group.¹⁵
Second, in Cooper et al. (2017) there are three rounds of 20 periods. We decrease the
length of each round to 16 periods to be able to add one additional round without
extending the duration of the experiment excessively. Third, since post-shock phases
are shorter in our study, the post-shock equilibrium in baseline groups is set at a
higher level. Finally, Cooper et al. (2017) elicit expectations of subjects in addition
to their guesses, while we elicit expectations jointly with the guesses. An advantage
of our method compared to the previous study of Cooper et al. (2017) is that it
ensures consistency between guesses and expectations and thus reduces the scope for
decision-making errors.¹⁶
¹⁵ They show that the effect of the strategic environment is statistically significant for groups of five
or more agents.
We implement a fixed matching protocol within each round, and a random re-
matching protocol between rounds. To reduce the scope of session effects due to
random rematching, in each session we divide each group of twenty participants
into two equal and permanent rematching clusters. Random rematching only occurs
within a rematching cluster, which makes observations potentially correlated
within a cluster, but strictly independent between clusters.
1.3.3 Procedures
Experimental sessions were conducted at the GATE-Lab in Lyon by using z-Tree
(Fischbacher, 2007).¹⁷ 120 participants were recruited for 6 sessions in October
2019. Each session comprises 20 subjects, is assigned to one condition (between-subjects
design), and is divided into two separate rematching clusters of ten players.¹⁸ This yields six
independent clusters of observations per condition, and twelve clusters in total.
Subjects are provided with the instructions of the game in paper form, which are
read aloud by the experimenter.¹⁹ These instructions specify all the rules of the
game except the values of shocks. Participants are informed that this value will be
displayed on the decision screen, may be subject to variation during the experiment,
and that in a given period it remains the same for everyone. Once the instructions are
read, subjects are asked to answer nine comprehension questions displayed on their
screens. They are also informed about the correct answers with brief explanations.
In the main part of the experiment, each participant makes a series of 64 guessing
decisions. Each time, subjects first see a decision screen (see Figure 1.1) where they
enter their guesses. For a given guess, the computer automatically provides the
corresponding expectation of the average guess of others in their group. After seeing
these expectations, subjects can either revise or confirm their guesses.²⁰ After each
decision screen, subjects proceed to the feedback screen where they receive feedback on
the realized target number and their own payoffs.
¹⁶ The elicitation method is explained in Section 1.3. Ex ante, the design of our baseline condition
provides more suitable circumstances to observe rapid adjustment than the one in Cooper et al.
(2017), since we have an extra round and a belief elicitation mechanism that emphasizes best
replying to one's expectations. However, the data suggest that the patterns of adjustment are
similar in both experiments.
¹⁷ The experimental procedures have been approved by the GATE-Lab Review Board.
¹⁸ See Appendix A.1 for the experimental materials and Appendix A.2 for a description of the
subject pool.
¹⁹ Before reading the instructions and playing the game, subjects complete a series of questionnaires
which contain the cognitive reflection test (CRT, Frederick 2005), the reading the mind in the eyes test
(RMET, Baron-Cohen et al. 2001), a short-term memory test (STM) adapted from the Wechsler digit
span test, and seven questions designed to measure subjects' propensity to reason in a heuristic
manner. We use these scores to test the sensitivity of our findings with respect to several individual
characteristics. This analysis is reported in Appendix A.3. The content and the measurement
method of each test are provided in Appendix A.1.
Figure 1.1: Decision Screen
Notes: Example of a decision screen used in the experiment.
To keep the game running smoothly, in each period the decision screen has a nonbinding
timer set to 60 seconds (except for the initial period of the game, which has 120
seconds).²¹ Both screens also display the ongoing period and round, and provide a
summary of the previous outcomes: a figure representing the time series of previous
guesses and realized target numbers, and a historical period-by-period table
containing own expectations, the actual average guesses of others, and own
stage game payoffs. To minimize any potential wealth or end-game effects, the final
payoff corresponds to the payoffs accumulated in all periods of a randomly chosen
round of the game. At the end, subjects also reply to a demographic questionnaire.
²⁰ This method provides a consistent way for joint elicitation of guesses and expectations. Some
of the previous studies document systematic inconsistencies between expectations and decisions
that are elicited separately (Costa-Gomes and Weizsäcker, 2008). Moreover, LtFE and AME
studies show that having both forecasting and optimizing tasks may be detrimental to learning
and cause mispricing (Bao et al., 2013; Hanaki et al., 2018).
²¹ Kocher and Sutter (2006) show that average decision time in the first period of guessing games
is around 50 seconds and decreases gradually to the range of 10-15 seconds after 20 periods. Thus,
this feature of our design should not put participants under excessive time pressure.
An experimental session took 150 minutes on average. Subjects were paid 7
euros for their participation and 14.08 euros on average for the experimental game.
1.4 Results
First, we analyze the group-level deviations from the NE and measure the adjust-
ment speed across rounds and experimental conditions. We rely on the statistical
framework previously adopted by Cooper et al. (2017). Second, we investigate the
within-period variation of individual guesses across experimental conditions. Lastly,
we compare the descriptive power of various expectation models by their one-period-
ahead forecast accuracy, as measured through the root-mean-squared-error (RMSE).
We also evaluate the changes in performance of models across rounds by comput-
ing their impact factors in an evolutionary learning model – the heuristic switching
model (HSM). This allows us to investigate whether the observed acceleration is due
to an increase in the share of subjects forming RE, or rather due to the adaptive
dynamics of simple expectation rules.
1.4.1 Adjustment Dynamics
Figure 1.2: Guesses by treatments and rounds
Notes: Dotted lines represent NE. In each round, a dot (triangle) corresponds to the median value of the average
group guesses in the baseline (treatment) condition.
We first turn to the aggregate outcomes and look at the evolution of the average
group guesses over time. Figure 1.2 summarizes the observed median values of
average group guesses (with 12 groups per experimental condition) across rounds
and periods. As expected, round 1 – in which the environment remains strictly
identical in both experimental conditions – generates the same patterns in the data
in baseline and treatment. In particular, in the pre-shock phase of round 1 this
median never fully converges to the NE level.²² The second salient observation is
that the convergence to the NE following shocks systematically exhibits a convex
pattern, but happens at varying speed.²³
For a formal statistical comparison of the patterns of convergence in both experimental
conditions, we use median quantile regression to estimate the following
model:²⁴
\[ \bar{p}_{g,t} - p^{NE}_t = a_0 + a_t\, I[\text{Period} = t] + \epsilon_{g,t}, \tag{1.3} \]
where the dependent variable is the difference between the average group guess
($\bar{p}_{g,t}$) and the NE guess ($p^{NE}_t$) in a given period $t$, while the independent variables
are 63 period indicators ($I[\text{Period} = t]$ equals 1 for period $t$, and 0 otherwise). The
coefficient of the final period is dropped to avoid linear dependencies.²⁵ We run this
regression separately for baseline and treatment. Table 1.3 reports a subset of the
estimated coefficients corresponding to the post-shock periods in all four rounds of
the baseline and in rounds 1 and 4 of the treatment (in which the NE is the same
as in the baseline).
This model allows us to analyze the patterns of adjustment to the NE in two
steps. First, the intercept $a_0$ provides an empirical benchmark for convergence,
i.e., the ability to adjust guesses to NE play, as measured by the degree of
convergence to the NE that can be reached after 63 periods and with full scope for
experience accumulation and learning (which we further investigate in Section 1.4.2)
in the experimental game. Testing whether this empirical benchmark differs from the
formal prediction based on the NE ($H_0: \bar{p}_{64} - p^{NE}_{64} = 0$) boils down to testing for the
statistical significance of $a_0$. For both baseline and treatment, the estimated values
of $a_0$ are similar and very close to zero (-0.38 and -0.42, respectively); in neither case
do we reject the null hypothesis that $a_0 = 0$ ($p = 0.133$ and $p = 0.285$, respectively). Second, we build
on this first step and define the speed of adjustment as the earliest period in which
the outcome attains (in statistical terms) our empirical benchmark of convergence.
This period (denoted $t_c$) indicates the point of reaching the adjustment benchmark:
for each period $t \ge t_c$ of a given phase we fail to reject $H_0: a_t = 0$.²⁶
²² We note that Cooper et al. (2017) observe the same pattern of adjustment under identical
shocks.
²³ The evolution of guesses in the pre-shock periods of rounds 3 and 4 also suggests that the path
of shocks experienced in the past does not affect per se the adjustment to the NE. Despite the
different sizes of shocks in round 2 between baseline and treatment, the adjustment to the NE in
the pre-shock phase of the following round is identical in both conditions. The same holds for the
adjustment to the NE in the pre-shock phase in round 4.
²⁴ This approach closely follows Cooper et al. (2017) and minimizes the role of potential outliers.
²⁵ The estimated standard errors are based on clustering with bootstrap resampling to take into
account the possible correlation between guesses within a rematching cluster. The number of
bootstrap samples follows Davidson and MacKinnon (2000).
²⁶ This definition echoes the definition of convergence proposed by Hyndman et al. (2012).
Table 1.3: Post-shock deviations from NE
Post-shock   Baseline                                       Treatment
period       Round 1    Round 2    Round 3    Round 4      Round 1    Round 4
1            22.76***   22.08***   19.31***   18.38***     24.89***   20.86***
2            17.88***   16.13***   13.19***   9.46***      17.95***   12.32***
3            13.93***   10.13***   8.38***    4.57***      13.15***   5.70***
4            7.39***    6.38***    4.66***    1.51***      9.32***    1.64*
5            5.22**     3.02***    2.38***    -0.22        6.67**     0.17
6            1.64       1.14       0.99       -0.54        3.50       -0.82
7            3.40       1.26       0.37       -0.44*       3.98       -0.80*
8            2.65       0.74       0.15       -0.38        3.18       -0.42
Notes: Coefficients $a_t$ and intercept term $a_0$ (the round 4, period 8 entries; in italics in the original) from a
median quantile regression model specified in (1.3). Standard errors are clustered at the rematching cluster level
(6 clusters per condition) and bootstrapped with 1999 replications. *, **, *** indicate statistical significance at
the 10, 5, 1% level, respectively.
In accordance with Hypothesis 1, we observe gradual adjustment in the first
round of both baseline and treatment conditions. Adjustment accelerates across
rounds, but this acceleration is not necessarily monotonic. The adjustment periods
$t_c$ are respectively $\{6, 6, 6, 5\}$ in baseline and $\{6, 4\}$ in the first and the last rounds of
treatment when a 5% significance level is considered. Initial deviations remain high
in each post-shock phase, indicating that experience does not prevent deviations
despite repetition over four rounds.
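The adjustment periods can be read off Table 1.3 mechanically. The sketch below encodes the number of significance stars per post-shock period (2 or 3 stars means significance at the 5% level) and returns the earliest period from which no later coefficient is significant. The function name and data layout are illustrative. The period 8 entries of round 4 are the intercepts, which the text reports as insignificant.

```python
def adjustment_period(significant):
    """Earliest 1-indexed period t_c such that no coefficient from t_c onward
    is significant; None if the last period is still significant."""
    t_c = None
    for t in range(len(significant), 0, -1):
        if significant[t - 1]:
            break
        t_c = t
    return t_c


# Stars per post-shock period (1..8), transcribed from Table 1.3.
STARS = {
    ("baseline", 1): [3, 3, 3, 3, 2, 0, 0, 0],
    ("baseline", 2): [3, 3, 3, 3, 3, 0, 0, 0],
    ("baseline", 3): [3, 3, 3, 3, 3, 0, 0, 0],
    ("baseline", 4): [3, 3, 3, 3, 0, 0, 1, 0],
    ("treatment", 1): [3, 3, 3, 3, 2, 0, 0, 0],
    ("treatment", 4): [3, 3, 3, 1, 0, 0, 1, 0],
}

results = {
    key: adjustment_period([s >= 2 for s in stars])   # 5% significance level
    for key, stars in STARS.items()
}
print(results)
```

Changing the threshold to `s >= 1` applies the 10% level instead, which shows how sensitive the measured adjustment period is to the chosen significance level.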
Result 1a: In both experimental conditions, after the initial shock guesses grad-
ually adjust to the post-shock NE in a convex manner.
Result 1b: In both experimental conditions, adjustment occurs earlier in re-
sponse to the last shock (round 4) compared to the initial shock (round 1).
Notwithstanding our Hypothesis 2, these results indicate that the number of
periods required for adjustment in round 4 is similar in both conditions. We also
propose another way to test (and ultimately reject) Hypothesis 2. Specifically, we
investigate the within-period variation of guesses between the two conditions. We
estimate the following median quantile regression model separately for each of the
eight post-shock periods of round 4:
\[ p_i = b_0 + b_1\, \mathbf{1}[\text{Treatment}] + \epsilon_i, \tag{1.4} \]
where the independent variable $\mathbf{1}[\text{Treatment}] = 1$ for observations coming from the
treatment sessions (and 0 otherwise), and the dependent variable is the individual
guess ($N = 120$ per regression). As before, we employ bootstrapped standard
errors clustered at the rematching cluster level (999 replications). The coefficients $b_1$
(reported in Table 1.4) remain insignificant at the 5% level in each of the eight models, suggesting
that the evolution of guesses over time (and thus their gradual adjustment to the
NE) in the final periods is statistically indistinguishable in both conditions. Figure
1.3 provides a visual representation of this pattern.
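A regression of this form can be illustrated without specialized software. With a single treatment dummy, the median regression is solved by the group medians: $b_0$ is the baseline median and $b_1$ is the difference between the treatment and baseline medians. The sketch below uses synthetic guesses (not the experimental data) and a cluster bootstrap that resamples whole rematching clusters within each condition. That resampling scheme, the seeds and all names are assumptions of this example, not a description of the estimation code behind Table 1.4.

```python
import random
from statistics import median, stdev


def median_dummy_fit(guesses, treated):
    """Median regression on a treatment dummy; group medians minimize the
    sum of absolute deviations. Returns (b0, b1)."""
    base = median(g for g, d in zip(guesses, treated) if not d)
    treat = median(g for g, d in zip(guesses, treated) if d)
    return base, treat - base


def cluster_bootstrap_se(data, reps=999, seed=1):
    """data maps cluster id -> (treated flag, list of guesses). Whole clusters
    are resampled with replacement within each condition (assumed scheme)."""
    rng = random.Random(seed)
    by_cond = {False: [], True: []}
    for flag, values in data.values():
        by_cond[flag].append(values)
    draws = []
    for _ in range(reps):
        g, d = [], []
        for flag, clusters in by_cond.items():
            for _ in clusters:
                pick = rng.choice(clusters)
                g.extend(pick)
                d.extend([flag] * len(pick))
        draws.append(median_dummy_fit(g, d)[1])
    return stdev(draws)


# Synthetic example: 12 clusters of 10 subjects, the last 6 clusters treated.
gen = random.Random(0)
data = {
    c: (c >= 6, [round(gen.gauss(24 + (1 if c >= 6 else 0), 3), 2) for _ in range(10)])
    for c in range(12)
}
guesses = [g for _, values in data.values() for g in values]
treated = [flag for flag, values in data.values() for _ in values]

b0, b1 = median_dummy_fit(guesses, treated)
se = cluster_bootstrap_se(data)
print(b0, b1, se)
```

Resampling clusters rather than individuals keeps the within-cluster correlation of guesses intact in every bootstrap draw.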
Table 1.4: The effect of heterogeneity in shocks on guesses of last post-shock phase
Period   57         58         59         60         61         62         63         64
$b_0$    39.70***   32.89***   28.40***   25.10***   23.49***   23.07***   23.21***   23.60***
         (1.54)     (1.08)     (0.50)     (0.47)     (0.44)     (0.31)     (0.11)     (0.11)
$b_1$    4.60*      3.21       1.35       -0.10      0.15       -0.67      -0.41      -0.35
         (2.70)     (2.02)     (1.19)     (0.94)     (1.09)     (0.79)     (0.88)     (0.71)
Notes: Coefficients from median quantile regression models specified in (1.4). Each column comes from one regression
($N = 120$ per regression). Standard errors are reported in parentheses below the coefficients. Standard errors are
clustered at the rematching cluster level (6 clusters per condition) and bootstrapped with 999 replications.
*, **, *** indicate statistical significance at the 10, 5, 1% level, respectively.
Figure 1.3: Median individual guesses
Notes: Median individual guess by experimental conditions for periods 55-64. Dots represent the NE. Circles
(triangles) represent guesses in baseline (treatment) condition. Whiskers denote standard deviations. Unit of
observation is the individual guesses.
Result 2: As compared to identical shocks, nonidentical shocks do not cause a
significant slowdown in adjustment.
1.4.2 Expectation Formation
In this section, we exploit the data on expectations retrieved from (and consistent
with) the guesses, as previously explained in Section 1.3.3. We consider a set of
expectation rules to provide a descriptive explanation for the observed aggregate
results.²⁷ These rules, summarized in Table 1.5, are mainly derived from two classes
of learning models: adaptive (rules 1 to 6) and extrapolative (rules 7 to 13). Under
the data generation process described in equation 1.1, adaptive learning can be
represented in a recursive form with the following formula:
\[ p^e_{i,t} = p^e_{i,t-1} + w\,(p_{-i,t-1} - p^e_{i,t-1}), \tag{1.5} \]
where a player expects the weighted average of the most recent outcome and his/her
own previous expectation. Under extrapolative expectations, a player tracks the
most recent change in the realized outcome in the following manner:
\[ p^e_{i,t} = p_{-i,t-1} + \gamma\,(p_{-i,t-1} - p_{-i,t-2}). \tag{1.6} \]
The coefficients $w$ and $\gamma$ are learning parameters. We predetermine these parameters
based on their computational ease in an attempt to imitate different kinds of
boundedly rational reasoning. We also include two models (rules 6 and 13) where the
parameters $w$ and $\gamma$ are estimated from the individual expectations data with fixed
effects regressions and one equilibrium model (rule 14) where a player's expectation
corresponds to the NE.²⁸
The next three rules (15 to 17) reformulate rules 1, 3 and 9 (respectively) with
similarity-based learning. Under rule 15, for instance, a player expects the outcome
of the last period to reoccur, as in the case of naïve expectations, if the parameters in the
target formula did not change (i.e., if there was no shock). Once the target formula
has changed, the player reviews all the past periods and the new expectation
coincides with the outcome of the most recent period involving an analogous change,
$t - m$, such that $\Delta\xi_t = \Delta\xi_{t-m}$, where $m$ refers to the distance between the current period
and its most recent analogue from the past. If there is no such analogue, the player
simply expects the outcome of the previous period. Rules 16 and 17 apply the same
logic to the adaptive (v1.) and weak trend-following rules.²⁹
²⁷ Note that by taking the conditional expectation on both sides, equation 1.1 can be rewritten
as $p_{i,t} = E_{i,t-1}[p_{i,t}] = (a + \xi_t) + b\,E^h_{i,t-1}[p_{-i,t}]$. Thus, subjects choose their guesses $p_{i,t}$ as a best
response to their one-period-ahead expectation about the average guess of other players in the
group, that is, $E^h_{i,t-1}[p_{-i,t}]$. The superscript $h$ is placed to indicate that the process of expectation
formation is more general than RE and may be based on any rule $h$. For the sake of simplicity,
throughout the chapter we use the notation $p^e_{i,t}$ instead of $E^h_{i,t-1}[p_{-i,t}]$.
²⁸ Note that fundamentalism is not equivalent to RE since it ignores the fact that other players
might be nonrational. For these two to be equivalent, one needs to assume homogeneous
expectations and common knowledge of rationality.
²⁹ For the sake of illustration, suppose that a subject uses rule 17. In period 25 of the baseline
game, s/he should extrapolate the change between periods 8 and 9 rather than between 23 and 24
(as would be the case under the normal trend rule). Following this logic, the similarity-based reformulation
changes expectations in periods 25, 33, 41, 49, 57 for identical shocks, and in period 57 for the
nonidentical ones.
Table 1.5: Description of selected expectation rules for comparison
No   Description                    Functional form
1    Naïve exp.                     $p^e_{i,t} = p_{-i,t-1}$
2    Obstinacy                      $p^e_{i,t} = p^e_{i,t-1}$
3    Adaptive exp. v1.              $p^e_{i,t} = 0.75\,p_{-i,t-1} + 0.25\,p^e_{i,t-1}$
4    Adaptive exp. v2.              $p^e_{i,t} = 0.50\,p_{-i,t-1} + 0.50\,p^e_{i,t-1}$
5    Adaptive exp. v3.              $p^e_{i,t} = 0.25\,p_{-i,t-1} + 0.75\,p^e_{i,t-1}$
6    Fitted adaptive exp.           $p^e_{i,t} = 0.89\,p_{-i,t-1} + 0.11\,p^e_{i,t-1}$
7    Strong trend-following exp.    $p^e_{i,t} = p_{-i,t-1} + 0.75\,(p_{-i,t-1} - p_{-i,t-2})$
8    Medium trend-following exp.    $p^e_{i,t} = p_{-i,t-1} + 0.50\,(p_{-i,t-1} - p_{-i,t-2})$
9    Weak trend-following exp.      $p^e_{i,t} = p_{-i,t-1} + 0.25\,(p_{-i,t-1} - p_{-i,t-2})$
10   Strong contrarian exp.         $p^e_{i,t} = p_{-i,t-1} - 0.75\,(p_{-i,t-1} - p_{-i,t-2})$
11   Medium contrarian exp.         $p^e_{i,t} = p_{-i,t-1} - 0.50\,(p_{-i,t-1} - p_{-i,t-2})$
12   Weak contrarian exp.           $p^e_{i,t} = p_{-i,t-1} - 0.25\,(p_{-i,t-1} - p_{-i,t-2})$
13   Fitted extrapolative exp.      $p^e_{i,t} = p_{-i,t-1} + 0.08\,(p_{-i,t-1} - p_{-i,t-2})$
14   Fundamentalist                 $p^e_{i,t} = p^{NE}_t$
15   SBNE                           $p^e_{i,t} = \begin{cases} p_{-i,t-1}, & \text{if } \xi_t = \xi_{t-1} \\ p_{-i,t-m}, & \text{if } \xi_t \neq \xi_{t-1} \end{cases}$
16   SBAE                           $p^e_{i,t} = \begin{cases} 0.75\,p_{-i,t-1} + 0.25\,p^e_{i,t-1}, & \text{if } \xi_t = \xi_{t-1} \\ 0.75\,p_{-i,t-m} + 0.25\,p^e_{i,t-m}, & \text{if } \xi_t \neq \xi_{t-1} \end{cases}$
17   SBTF                           $p^e_{i,t} = \begin{cases} p_{-i,t-1} + 0.25\,\Delta p_{-i,t-1}, & \text{if } \xi_t = \xi_{t-1} \\ p_{-i,t-m} + 0.25\,\Delta p_{-i,t-m}, & \text{if } \xi_t \neq \xi_{t-1} \end{cases}$
Notes: Each description provides a rule for how players expect the average guess of other players one period
ahead.
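To make the functional forms concrete, the following sketch implements rules 1, 3, 9 and 15 for a single player. It is an illustration under stated assumptions: periods are 0-indexed list positions, `p[t]` is the realized average guess of the others in period `t`, `xi[t]` is the shock component in period `t`, and the SBNE search follows the verbal description in the text (most recent past period with the same change in the shock, with a fallback to the previous outcome). The function names and the toy series are hypothetical.

```python
def naive(p, t):
    """Rule 1: expect last period's outcome."""
    return p[t - 1]


def adaptive(p, prev_exp, t, w=0.75):
    """Rule 3 (w = 0.75): move the previous expectation toward the last outcome."""
    return prev_exp + w * (p[t - 1] - prev_exp)


def trend(p, t, gamma=0.25):
    """Rule 9 (gamma = 0.25): extrapolate the most recent change."""
    return p[t - 1] + gamma * (p[t - 1] - p[t - 2])


def sbne(p, xi, t):
    """Rule 15: naive in stable phases; after a change in the shock, recall the
    outcome of the most recent period with the same change, if any."""
    change = xi[t] - xi[t - 1]
    if change == 0:
        return p[t - 1]
    for s in range(t - 1, 0, -1):          # search backwards for an analogue
        if xi[s] - xi[s - 1] == change:
            return p[s]
    return p[t - 1]                        # no analogue: fall back to naive


# Toy series: the shock of -9 hits in period 2 and again in period 6.
xi = [0, 0, -9, -9, 0, 0, -9]
p = [60, 60, 51, 45, 58, 60]               # outcomes observed up to period 5

print(naive(p, 6), sbne(p, xi, 6))         # expectations for period 6
```

In the toy series, the naïve rule carries the pre-shock outcome into the second shock period, whereas SBNE recalls what happened the first time the same shock occurred.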
These rules have been selected for two main reasons. First, they are commonplace
in the literature (see Section 1.2).30 Second, they are based on backward-looking
heuristics (with the exception of the fitted rules and rule 14), so that their functional
forms are easy to compute (e.g., rule 1). Finally, we have excluded level-k-type
expectations in the vein of the rule learning model of Stahl (1996) and the cognitive
hierarchy model of Camerer et al. (2004), since there is no common prior through
which a level-0 type can form imitation after the first period.31
The goodness of fit of a given rule to the experimental data is based on the
30Note that the nomenclature used to describe these rules may vary across fields. For instance,
our rule 1 is equivalent to Cournot play in standard game theory and to random-walk beliefs
in finance.
31Alternatively, we could assume that level-0 type always selects randomly; however, this would
be unreasonable when imitation dynamics and evolution over time are considered.
aggregate one-period-ahead forecast error that is computed as the root mean squared
error (RMSE):
\[
\mathrm{RMSE}(p^h_t) = \sqrt{\frac{\sum_{t=3}^{64} \sum_{g=1}^{G} \left(p^h_{g,t} - p^e_{g,t}\right)^2}{62 \times G}}, \tag{1.7}
\]
where $p^h_{g,t}$ is the prediction of rule $h \in \{1, \dots, 17\}$ for the average expectation of
group $g$ in period $t$ and $p^e_{g,t}$ is the actual average expectation of group $g$ in period $t$.
Here, $G$ is the scale of the RMSE. A lower value of RMSE points to a
better fit. We measure the RMSE in three different ways: at the rematching cluster
level (G= 2), for each experimental condition (G= 12), and for the pooled data
(G= 24). We exclude the data from the first two periods since certain rules require
at least two past observations.
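As a minimal sketch of Eq. (1.7), the function below pools squared forecast errors over groups and periods and takes the root of their mean. The nested-list data layout, the function name, and the `skip` argument (used to drop the first two periods) are assumptions made for this illustration.

```python
import math
from typing import Sequence


def rmse(pred: Sequence[Sequence[float]],
         actual: Sequence[Sequence[float]],
         skip: int = 0) -> float:
    """Root mean squared error over all groups g and periods t.

    pred[g][t] is the rule's prediction and actual[g][t] the observed average
    expectation; the first `skip` periods of every group are excluded.
    """
    squared = [
        (ph - pe) ** 2
        for pred_g, actual_g in zip(pred, actual)
        for ph, pe in zip(pred_g[skip:], actual_g[skip:])
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical example with G = 2 groups and 4 periods, dropping the first two.
pred = [[0.0, 0.0, 1.0, 2.0], [9.0, 9.0, 3.0, 4.0]]
actual = [[5.0, 5.0, 1.0, 2.0], [5.0, 5.0, 3.0, 6.0]]
print(rmse(pred, actual, skip=2))
```

With 64 periods and `skip=2`, the denominator equals 62 × G, as in Eq. (1.7).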
Panel A of Table 1.6 reports the RMSE for each of the seventeen rules. For
baseline as well as pooled data, SBNE (rule 15) achieves the best fit. For the treat-
ment data, it is slightly outperformed by SBTF (rule 17). In line with Hypothesis
4, the rules that are augmented with similarity-based learning (rules 15 to 17) yield
a better fit than the remaining ones, with an RMSE roughly half that of the
worst-performing rule, fundamentalism. The last line in Panel C shows how much
switching to similarity-based learning improves the fit. As can be seen, the
improvements emerge mainly in the baseline condition, in which identical shocks recur
periodically. In the treatment condition, only the first and the last shocks
are identical, which leaves less room for applying similarity-based reasoning.32 At
the cluster level, the rules that are augmented with similarity-based reasoning pro-
vide the best fit for most (10 out of 12) rematching clusters. The data from the
baseline (treatment) rematching clusters are best organized by the rules that are
derived from adaptive (extrapolative) learning. Lastly, in line with Hypothesis 3,
fundamentalism (rule 14) is always the worst fitting model regardless of the data
aggregation level.
Result 3: Backward-looking expectation rules describe the expectations better
than the RE.
Result 4a: Reformulating the rules with similarity-based reasoning improves
their fit to the data, especially when shocks are identical.
Result 4b: Overall, the SBNE rule has the best fit among all homogeneous
expectation rules.
32This is most starkly observable when we compare the improvements in RMSE from refining
rules with similarity-based learning for period 57, i.e. the first post-shock period in round 4. More
details are provided in Appendix A.2.
Note that these comparisons rely on the central assumption that all players
refer to the same expectation rule (i.e., under the homogeneity of expectations).
There is, however, a wide range of evidence indicating heterogeneity in expectations
(Hommes, 2011). Moreover, expectation rules may vary not only across individuals,
but also over time: a given rule may at first perform poorly, but later become
more relevant with experience. For this reason, we first refer to an evolutionary
model of expectations: the HSM of Anufriev and Hommes (2012).33 According to
the HSM, agents choose the expectation rule from a set of heuristics, evaluate the
performance of each heuristic over time and switch to the heuristic that performs
best in terms of the forecasting error. Accordingly, the one-period-ahead expectation
of the HSM for group $g$ is
\[
p^{HSM}_{t+1} = \sum_{h=1}^{H} n_{h,t}\, p^{h}_{t+1}, \tag{1.8}
\]
where $n_{h,t}$ is the impact factor of heuristic $h$ at period $t$. This impact factor can be
interpreted as the weight attributed by agents to each heuristic. The impact
factor depends on the performance of the heuristic, measured with the current and
past squared forecast errors
\[
U_{h,t} = -\left(\bar{p}_{-i,t} - p^{h}_{t}\right)^2 + \eta\, U_{h,t-1}, \tag{1.9}
\]
where η ∈ [0, 1] is a free parameter representing the weight assigned to past
performance relative to current performance; η = 0 implies that only the performance in the
most recent period matters. The impact of each heuristic is updated through a discrete
choice model with asynchronous updating described by
\[
n_{h,t} = \delta\, n_{h,t-1} + (1 - \delta)\, \frac{\exp(\beta U_{h,t-1})}{\sum_{h=1}^{H} \exp(\beta U_{h,t-1})}, \tag{1.10}
\]
where the impact of expectation heuristic $h$ at period $t$ depends on its accumulated
impact and its relative performance normalized by the sum over all competing
heuristics. There are two free parameters in (1.10). The first one, δ ∈ [0, 1],
represents the proportion of agents who do not update their heuristic each period, or
the individual inertia in beliefs. The second parameter, β > 0, represents agents'
sensitivity to differences in performance.34
To compute expectations with HSM and compare fits, one must first determine
which expectation heuristics to include, and then set their initial impacts as well
as assign the values to free parameters η, δ, β. Following a common practice in
33Their model inherits its main features from Brock and Hommes (1997).
34β= 0 would imply equal impacts regardless of the differences in performances.
the literature, we consider four different heuristics. We consider three classes of
expectation rules – adaptive, extrapolative and similarity-based – and, for each
of them, choose the best-fitting specification (rules 3, 8, 15, respectively).35 We
also include an equilibrium-based rule: fundamentalism (rule 14). We assign the
initial impact factors equal to $n_{h,3} = 0.25$ for all $h$ and set the free parameters to
η = 0.1, δ = 0.4, β = 0.1. By trial and error, we find that this combination of
free parameters fits the data best.36
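The updating scheme in Eqs. (1.8) to (1.10) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function names and the one-period toy inputs are hypothetical, and subtracting the maximum performance before exponentiating is a numerical-stability device that leaves the logit shares unchanged. The default parameter values are those reported above (η = 0.1, δ = 0.4, β = 0.1), with equal initial impacts of 0.25.

```python
import math
from typing import List, Sequence


def update_performance(u_prev: Sequence[float], realized: float,
                       preds: Sequence[float], eta: float = 0.1) -> List[float]:
    """Eq. (1.9): negative squared error plus eta times past performance."""
    return [-(realized - p) ** 2 + eta * u for p, u in zip(preds, u_prev)]


def update_impacts(n_prev: Sequence[float], u_prev: Sequence[float],
                   delta: float = 0.4, beta: float = 0.1) -> List[float]:
    """Eq. (1.10): inertia delta plus (1 - delta) times the logit share."""
    u_max = max(u_prev)  # shift for numerical stability; shares are unaffected
    weights = [math.exp(beta * (u - u_max)) for u in u_prev]
    total = sum(weights)
    return [delta * n + (1 - delta) * w / total for n, w in zip(n_prev, weights)]


def hsm_forecast(impacts: Sequence[float], preds_next: Sequence[float]) -> float:
    """Eq. (1.8): impact-weighted average of the heuristics' forecasts."""
    return sum(n * p for n, p in zip(impacts, preds_next))


# Hypothetical one-step example with four heuristics.
n0 = [0.25, 0.25, 0.25, 0.25]
u0 = [0.0, 0.0, 0.0, 0.0]
preds = [50.0, 52.0, 48.0, 60.0]  # forecasts of the four heuristics
u1 = update_performance(u0, realized=50.0, preds=preds)
n1 = update_impacts(n0, u1)
forecast = hsm_forecast(n1, preds)
print(n1, forecast)
```

In this example the heuristic with the largest error (the fourth) loses impact, but because δ = 0.4 it retains part of its previous weight rather than dropping out immediately.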
The second heterogeneous expectation model we consider – MIXEXP – follows
Haltiwanger and Waldman (1985, 1989) and is based on the assumption that each
group is composed of n rational players and 5 − n naïve players. The main reason
for including it in our comparison is that it provides good descriptive fit to the
experimental data in Cooper et al. (2017). Like them, we consider a single rational
player per group (n = 1). The remaining players make naïve forecasts according to
the SBNE. In each round, the subject whose prediction error in the first post-shock
period is the smallest in a given group is considered as the rational player in that
group.37 The rational player forecasts consistently with RE:
\[
p^{RE}_{i,t} = \bar{p}_{-i,t} + \epsilon_{i,t}, \tag{1.11}
\]
where $\epsilon_{i,t} \sim \mathcal{N}(0, 1)$.
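The ingredients of MIXEXP can be sketched as follows. The selection of the rational player and the noisy RE forecast follow the description above; the aggregation of individual forecasts into a group-level expectation by a simple average is an assumption made for this illustration, as are all function names and inputs.

```python
import random
from typing import Sequence


def pick_rational(first_post_shock_errors: Sequence[float]) -> int:
    """Index of the subject with the smallest absolute error in the first post-shock period."""
    abs_errors = [abs(e) for e in first_post_shock_errors]
    return abs_errors.index(min(abs_errors))


def re_forecast(realized_avg_others: float, rng: random.Random) -> float:
    """Eq. (1.11): realized average of the others plus standard normal noise."""
    return realized_avg_others + rng.gauss(0.0, 1.0)


def mixexp_group_mean(re_value: float, naive_value: float,
                      n_rational: int = 1, group_size: int = 5) -> float:
    """Assumption: the group expectation is the simple average over its members."""
    return (n_rational * re_value
            + (group_size - n_rational) * naive_value) / group_size


# Hypothetical example: subject 1 has the smallest absolute error.
rational_index = pick_rational([3.0, -0.5, 2.0, -4.0, 1.0])
rng = random.Random(0)  # fixed seed so the illustration is reproducible
noisy_re = re_forecast(50.0, rng)
print(rational_index, noisy_re, mixexp_group_mean(50.0, 60.0))
```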
Panel B in Table 1.6 reports RMSE for each of the heterogeneous expectation
models.38 These results show that the HSM fits better than all the other rules, in-
cluding MIXEXP, at every data aggregation level. The improvement in fit compared
to the best homogeneous expectation rule ranges between 15% and 53% depending
on the scale and equals 32% at pooled level. Albeit not as well performing as the
HSM, MIXEXP performs better than all the homogeneous expectation rules which
makes it the second-best expectation model in our exercise.
Result 5a: Models with heterogeneous expectations better fit the data than
those with homogeneous expectations.
Result 5b: Overall, the HSM has the best fit among all the compared expecta-
tion models.
35Note that for the class of adaptive expectations, the best-fitting rule is naïve expectations
(rule 1). However, since SBNE (rule 15) is already a refined version of that rule, we include rule 3
instead.
36As a robustness check, in Appendix A.3 we provide RMSE obtained under various combinations
of free parameters. While the outcomes do vary in absolute terms, the relative standing of different
rules for a given period remains fairly stable.
37We are indebted to Michael Waldman for suggesting this strategy to us.
38For the HSM, RMSE is computed for periods 4-64 where expectations are endogenously
determined.
Figure 1.4: Impact of heuristics across periods
(a) Baseline
(b) Treatment
Notes: Impact factors of the HSM across periods and experimental conditions. The symbols circle, cross, square
and triangle represent respectively SBNE, adaptive expectations (v1), weak trend-following and fundamentalist
expectations.
Finally, Figure 1.4 shows the evolution of the impact factors of different heuristics
over time in the baseline and treatment conditions. Notwithstanding the claim that
"experience eliminates naïvety", for both experimental conditions we find that SBNE
attains the highest average impact factor in the final round of the game.39 On the
opposite extreme, fundamentalist expectations attract very low weights during the
first two rounds, but their role increases toward the final round.
39Appendix A.2 reports the average of impact factors separately for pre-shock and post-shock
phases.
Table 1.6: RMSE of expectation rules
Rule      Baseline: cluster-by-cluster               Treatment: cluster-by-cluster              All-cluster
              1      2      3      4      5      6      1      2      3      4      5      6      B      T    B+T
Panel A - Homogeneous expectation rules
1          6.45   6.59   5.71   7.24   5.80   8.18   6.71   6.23   6.53   6.50   7.21   6.56   6.71   6.63   6.67
2          7.55   7.02   6.49   8.14   7.14   8.28   7.55   7.48   7.86   7.56   7.80   7.57   7.47   7.64   7.55
3          6.65   6.58   5.78   7.31   6.03   8.05   6.83   6.42   6.74   6.66   7.12   6.66   6.78   6.74   6.76
4          6.91   6.65   5.93   7.49   6.34   8.02   7.01   6.71   7.04   6.90   7.20   6.87   6.93   6.96   6.94
5          7.21   6.80   6.17   7.77   6.71   8.10   7.26   7.06   7.42   7.20   7.43   7.17   7.16   7.26   7.21
6          6.53   6.58   5.73   7.25   5.89   8.11   6.75   6.30   6.61   6.56   7.15   6.59   6.73   6.66   6.70
7          7.64   8.33   7.43   9.30   6.97   9.85   8.40   7.75   6.84   7.69   9.39   7.78   8.32   8.01   8.17
8          6.68   7.33   6.45   8.24   6.04   8.83   7.32   6.77   6.15   6.74   8.34   6.87   7.33   7.07   7.20
9          6.25   6.71   5.83   7.52   5.62   8.24   6.72   6.23   6.04   6.31   7.58   6.44   6.76   6.57   6.67
10         9.87   9.00   8.07   9.12   9.09  10.85   9.74   9.02  10.32   9.90   8.68   9.59   9.37   9.56   9.47
11         8.41   7.84   6.94   8.10   7.69   9.58   8.36   7.75   8.81   8.45   7.80   8.26   8.13   8.25   8.19
12         7.22   6.99   6.11   7.44   6.55   8.65   7.30   6.77   7.51   7.26   7.29   7.21   7.20   7.23   7.22
13         6.64   6.67   5.79   7.25   5.99   8.28   6.84   6.35   6.80   6.69   7.19   6.72   6.82   6.77   6.79
14        10.02  10.10  11.25  14.29  11.55  13.80  10.68  12.32  12.47  11.07  17.07  13.65  11.95  13.05  12.51
15         4.61   4.58   4.00   6.62   4.08   6.13   6.14   6.23   6.47   6.58   7.26   6.47   5.10   6.54   5.86
16         4.82   4.55   3.99   6.61   4.18   5.99   6.29   6.40   6.66   6.72   7.16   6.57   5.11   6.64   5.93
17         4.91   5.09   4.56   7.13   4.31   6.42   6.11   6.27   6.00   6.48   7.64   6.36   5.50   6.50   6.02
Panel B - Heterogeneous expectation models
HSM        2.45   3.45   3.37   5.46   2.56   5.08   3.02   3.47   4.56   2.95   5.51   4.34   3.90   4.08   3.99
MIXEXP     4.13   3.94   3.44   5.63   3.38   5.15   4.71   5.62   5.09   5.12   6.31   5.34   4.36   5.39   4.90
Panel C - Changes in fit between models
∆ HSM       47%    24%    16%    17%    37%    15%    51%    44%    24%    53%    23%    32%    24%    37%    32%
∆ SB        28%    31%    31%    10%    30%    26%     9%     0%     1%    -3%     0%     1%    24%     1%    12%
Notes: RMSE calculated at different data aggregation levels. The first column indicates the number assigned to the rule
in Table 1.5 for panel A. Columns 2 to 7 are the clusters in the baseline condition and columns 8 to 13 are the
clusters in the treatment sessions. The last three columns indicate the experimental conditions, where "B" refers
to baseline (G= 6), "T" refers to treatment (G= 6) and "B+T" refers to pooled data (G= 12). Bolded values in panel
A indicate the lowest RMSE among homogeneous expectation rules. HSM and MIXEXP in panel B refer respectively to
the RMSE of the HSM and of the model with one rational player. Panel C reports the changes in fit for two comparison
groups. ∆ HSM reports the improvement in fit from using the HSM compared to the best-fitting homogeneous expectation
rule. ∆ SB reports the change in fit between the best rule upgraded with similarity-based learning and the
nonupgraded version of the same rule. A negative value in panel C implies a loss in fit.
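As a worked check of Panel C, the sketch below recomputes the pooled (B+T) entries from the RMSE values in Panels A and B. It assumes that the reported change is the relative reduction in RMSE, 1 − RMSE_new / RMSE_old, rounded to whole percent; this reading reproduces the pooled column, but it is an interpretation of the table notes rather than a formula stated in the text.

```python
def improvement(rmse_new: float, rmse_old: float) -> float:
    """Relative reduction in RMSE (assumed definition of the Panel C entries)."""
    return 1.0 - rmse_new / rmse_old


# Pooled (B+T) RMSE values taken from Table 1.6.
rmse_naive = 6.67  # rule 1, the nonupgraded counterpart of SBNE
rmse_sbne = 5.86   # rule 15, best homogeneous rule in the pooled data
rmse_hsm = 3.99    # heuristic switching model

delta_hsm = round(100 * improvement(rmse_hsm, rmse_sbne))
delta_sb = round(100 * improvement(rmse_sbne, rmse_naive))
print(delta_hsm, delta_sb)
```

Under this reading, the pooled improvements are 32% for the HSM over SBNE and 12% for SBNE over the naïve rule, matching the last column of Panel C.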
1.5 Discussion
The results outlined in Section 1.4.2 reveal that the refinement of expectation rules
with similarity-based learning improves their fit considerably when shocks are
identical. Notably, the SBNE rule performs well under both homogeneous and
heterogeneous expectations. Under this rule, a player expects the last period's outcome
to recur in stationary phases, as is the case under naïve expectations. When the
environment changes, this expectation rule points to the outcome of the most
recent period characterized by the same change.40 This rule is consistent with the
40The model used herein is based on a particular definition of similarity which rules out any dis-
crepancy between similar events. Note that this definition can be relaxed allowing for an arbitrary
degree of discrepancy between similar events.
Gilboa and Schmeidler (1995) case-based decision theory and the similarity-based
learning model of Plonsky et al. (2015), both of which suggest that agents choose
the action which generated the best outcome under similar circumstances that an
agent can recall from the past. Thus, the SBNE rule can be viewed as a combination
of the similarity-based learning process with the naïve heuristic, applied to the
domain of expectation formation.
This framework proposes a potential explanation for why the trend-following
rule with a strong extrapolation parameter fits the data well in LtFEs and poorly in
guessing games like ours. The main difference between LtFEs and guessing games
is that while the LtFEs inform subjects only qualitatively on the data generating
process, guessing games provide quantitative information by disclosing the target
formula. If there are also unexpected large shocks as in the LtFEs of Bao et al.
(2012), the last period is less likely to be perceived as the most similar state. A
stronger extrapolation of recent changes may thus help detect the arrival of a shock.
This strong extrapolation also creates a self-fulfilling prophecy, since it endogenously
generates large oscillations around equilibrium. In guessing games, players may
judge the similarity with certainty so that there is less necessity for extrapolation.
The naïve and weak trend-following rules therefore tend to perform better. For
instance, this may explain why a trend model with extrapolation factor 1 performs
poorly in the AMEs of Marquardt et al. (2019): weak trend-following and naïve
heuristics create slow convergence toward NE, analogously to what we observe in
our experiment.
Lastly, the results in the second part of Section 1.4.2 indicate that allowing
heterogeneity in expectations through the HSM or MIXEXP improves the fit
substantially, in line with the evidence from the LtF literature.41 The evolution of
impact factors computed through the HSM implies that there is more heterogeneity in
the nonidentical-shocks condition. This pattern, if proven to be robust, suggests
that the presence of real-world complications such as nonidentical shocks may make
coordination on an expectation rule more difficult.
1.6 Conclusion
We investigate the evolution of adjustment speed across repeated negative shocks
that can be either identical (baseline) or nonidentical (treatment). We ask whether
adjustment accelerates over repetitions and whether this acceleration varies across
the different types of shocks. For this experimental variation, we find that ad-
justment accelerates thanks to repetition, yet only slightly: despite four repetitions,
41Heterogeneous expectation models of this kind are gaining momentum in policy applications.
See Cornea-Madeira et al. (2019) for an application of HSM and Hommes (2018) for an overview.
convergence speeds up by only two periods at best. Nonidentical shocks do not cause
a significant slowdown in adjustment, and adjustment acceleration remains weak
regardless of the type of shock. A descriptive analysis of the expectation formation
process reveals that the backward-looking rules organize the data well. In particular,
rules refined with similarity-based learning outperform the others in
terms of predictive power.
Our experiment successfully documents the robustness of the finding of Cooper
et al. (2017) from a guessing game with strategic complementarity: a gradual and
convex adjustment in response to identical shocks and its acceleration over repeti-
tion. This evidence strengthens the empirical validity of the strategic environment
effect, in line with an early conjecture by Haltiwanger and Waldman (1985). Further-
more, these patterns of learning to play equilibrium under strategic complementarity
persist in a more complex environment with time-varying shocks. We fit a large set
of expectation rules to provide an individual-based explanation to the observed ag-
gregate dynamics. The SBNE rule, a simple learning rule first proposed by Cooper
et al. (2017) and not yet studied in a comparative analysis, outperforms other rules
in terms of descriptive accuracy.
The main implication of these findings is that inertia in adjustment may rather
persist over time. The fact that the type of shock does not affect behavioral dynam-
ics suggests that sluggishness is an inherent feature of strategic complementarity.
Importantly, our design also does not include any market frictions that are usually
considered as the main drivers of sticky behavior. This, in turn, suggests that cogni-
tive frictions such as nonrational expectations suffice to create stickiness, and poten-
tially opens the door for policy implications. Although an experimental testbed for
policy instruments is beyond the scope of this study, we note that monetary policy
interventions may prove to be effective.42
Despite its virtues, our study may also have certain limitations. Here, we set
the number of shocks to four and one may claim that this is not enough for a
major acceleration. While this might be a limitation of experiments in general, we
reckon that four rounds should be sufficient for observing accelerated adjustment in a
relatively simple environment like ours. Another design choice is to rematch subjects
at the beginning of each new round to control for factors that accumulate
across rounds such as the degree of strategic uncertainty. This random rematching
mechanism might partially be the reason for limited acceleration and it may stand
42Cornand and Heinemann (2019b) show that in a New Keynesian framework, monetary policy
obeying the Taylor principle decreases the degree of complementarity between pricing decisions of
firms and even turns them into strategic substitutes if its effect on aggregate demand is sufficiently
strong. In a similar vein, Assenza et al. (2021) show through their New Keynesian LtFEs that the
Taylor principle with sufficiently strong interest rate rule (ξπ= 1.5 in their experiment) manages
convergence to the forward stable solution.
at odds with certain real-world environments, such as asset markets.43 Lastly, we
only look at negative shocks. Even though the sign of a shock should not matter
in guessing games, it may matter in a pricing context. We believe that varying the
form of the heterogeneity of shocks constitutes a possible agenda for future studies.
43Cooper et al. (2017) test this argument in an auxiliary treatment and find that the main results
are qualitatively unchanged. In the light of this result, the question of matching scheme should be
less of a concern.
Appendix A
Appendix
A.1 Experimental Material
A.1.1 Instructions and Comprehension Questions Translated
to English
General Instructions
The experiment has 4 rounds and each round has 16 periods, a total of 64
periods. At the beginning of the experiment, you will be randomly assigned to
groups of five. You will only interact with other players in your group. At the
beginning of each new round, the groups will be reconstituted in a random manner.
This means that you will play in the same group during a round, and that the
composition of your group will vary randomly from one round to another.
Your Task
At the beginning of each period, each participant will be asked to choose a
number between 0 and 100, inclusive. This number can be up to two decimals, such
as 11.35 or 95.23. No participant will be able to see the number chosen by another
participant.
In each period, each player has a "Target Number". At the end of each period,
the player in your group whose chosen number is closest to his or her target
number will win the prize of 4.40 euros for that period. The other players will earn
0 euros for this period. If several players have the same distance from their target
number, the prize of 4.40 euros will be divided equally between these players, while
the others will win 0 euros.
The target number of each player is calculated using the following formula:
Target number = 0.75 ×(average of the numbers chosen by the other players in
your group) + a constant
Here, "the average of the numbers chosen by the other players in your group"
is equal to the sum of the numbers chosen by the other players in your group
divided by four. This average is calculated in the same way for all participants
in the experiment. All participants will be informed about the constant through
their decision screen. This constant is the same for all participants but may change
during the experiment. When a change occurs, this change will be announced to all
participants on their screen. Please check the formula for each period.
Decision Screen
The target number formula is going to be shown on the screen. On this screen,
you can enter your decision in a cell. When you click on the "OK" button, the
program will show you the "Average of the numbers chosen by the other players
in your group" to which your decision corresponds. After seeing this
information, you can change your decision as many times as you want. Once you
click on the "Confirm" button, your decision for this period will be final.
Note that there is limited time for decisions in each period and you can track
the remaining time on your screen. You will have 120 seconds for your first decision
and 60 seconds for each decision in the remaining periods. A table and a figure also
allow you to follow your previous decisions and the previous average decision
of the other players in your group.
Payment
At the end of the experiment, the computer will randomly select one of the
rounds played, and your final payment will be based on the payoffs that you have
accumulated during this round, plus 5 + 2 = 7 euros for participation and the
questionnaire you answered.
Comprehension Questions
You will now answer several questions designed to check whether you understood
the rules of the game. The button in the middle of the screen will allow you to access
a calculator when you need it.
True or False Questions:
Question 1: There are 4 other players in my group.
Question 2: I play with the same group of players throughout the experiment.
Question 3: The formula for the target number may change during the experi-
ment.
Question 4: All players have their own formula for the target number.
Question 5: I will be paid based on my accumulated winnings during a randomly
chosen round.
Questions Based on an Example:
Imagine that the formula for the target number is equal to
Target number = 0.75 ×(Average of the numbers chosen by the other players)
+ 15
Question 6: If the other players in the group chose 10, 30, 35, 85 as decisions for
the target number, what do you think is the "Average of the numbers chosen by the
other players in your group"?
Question 7: What would your target number be equal to in this situation?
Question 8: Imagine that you chose number 55 as the decision for this period.
What is the distance between the target number and your decision?
Question 9: In this example, the distances between the chosen numbers and the
target numbers for the other players are respectively: 47.18, 17.5, 0.31 and 41.87.
In this example, are you the winner?
Answers and Explanations Provided to the Subjects:
Question 1: True.
Explanation: There are 5 players in each group and 4 others when you are
excluded.
Question 2: False.
Explanation: At the beginning of each new round (17th, 33rd and 49th periods),
the groups will be reconstituted in a random manner. This means that you will play
with same group members during a round, and that the composition of your group
will vary from round to round.
Question 3: True.
Explanation: The formula for the target number may change. Please pay atten-
tion in each period.
Question 4: False.
Explanation: The formula for the target number for a given period is the same
for all players.
Question 5: True.
Explanation: At the end of the experiment, one of the four rounds will be ran-
domly selected and you will get your winnings that are accumulated during this
round.
Question 6: 40.
Explanation: The correct answer is 40. This is the average number of the other
players in the group, in this example: (10 + 30 + 35 + 85) / 4 = 40.
Question 7: 45.
Explanation: The correct answer is 45. The target number is calculated using
the formula for the target number: 0.75 x 40 + 15 = 45.
Question 8: 10.
Explanation: The correct answer is 10. The target number is 45 and you have
chosen 55. The distance between these two numbers is 10.
Question 9: No.
Explanation: Your distance (10) is not the smallest. 0.31 is the smallest distance
in this group.
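The arithmetic of the example questions can be reproduced with a short Python sketch of the payoff rule described in the instructions. The function names are hypothetical, and ties are detected by exact equality of distances, which is a simplification made for this illustration.

```python
from typing import List, Sequence


def target_number(others: Sequence[float], constant: float,
                  slope: float = 0.75) -> float:
    """Target = 0.75 x (average of the other players' numbers) + constant."""
    return slope * (sum(others) / len(others)) + constant


def prize_shares(distances: Sequence[float], prize: float = 4.40) -> List[float]:
    """Split the prize equally among the players closest to their own target."""
    best = min(distances)
    winners = [i for i, d in enumerate(distances) if d == best]
    return [prize / len(winners) if i in winners else 0.0
            for i in range(len(distances))]


# Questions 6 to 9: the others choose 10, 30, 35 and 85; you choose 55.
others = [10, 30, 35, 85]
target = target_number(others, constant=15)
my_distance = abs(55 - target)
shares = prize_shares([my_distance, 47.18, 17.5, 0.31, 41.87])
print(target, my_distance, shares)
```

The target equals 45 and your distance equals 10, so the prize goes to the player whose distance is 0.31.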
A.1.2 Tests Translated to English
Note that subjects solve these tests before the main part of the experiment. So,
instructions presented here are the initial instructions that subjects see.
Initial Instructions
Welcome!
You will participate in an economic experiment. During this experiment, you are
not allowed to communicate with other participants. If you have a cell phone, please
turn it off. If you have a question, press the red button on your left or raise your
hand, and the experimenter will come to you; do not ask your question out loud. If
the question is relevant to all participants, we will repeat it and answer it out loud.
If you do not respect these rules, we will have to exclude you from the experiment
and therefore from the payment.
All the information you provide, as well as the amount of your payoffs during
this experiment, will be kept strictly confidential and anonymous. Participating in
this experiment will earn you money. Your winnings will be paid to you privately
at the end of the experiment. You earn 5 euros for showing up on time, 2 euros
for answering a series of questions and an additional amount that varies between 0
and 70 euros. The additional payment depends on your decisions and may also be
influenced by decisions made by others.
First of all, before starting the actual experiment, we ask you to answer a series
of preliminary questions. You will answer these questions using the interface on
your computer screen.
CRT Questions
1) A notebook and a pencil cost 1.10 Euros. The notebook costs 1 Euro more
than the pencil. How many cents does the pencil cost? (correct answer: 5 cents)
2) Assuming that 5 machines take 5 minutes to make 5 pens, how long would it
take 100 machines to make 100 pens? (correct answer: 5)
3) In a lake, there is a patch of lily pads. Every day, the patch doubles in size.
If it takes 48 days for the patch to cover the entire lake, how long would it take for
the patch to cover half of the lake? (correct answer:47)
Representativeness Heuristic Questions
Please read the descriptions below and answer the questions.
Description 1: Linda is 31 years old, she is single, frank and very bright. She has
a master’s degree in philosophy. As a student, she was very concerned about issues
of discrimination and social justice, she also participated in anti-nuclear demonstra-
tions.
Please rank the following statements based on their likelihood, using 1 for the
most likely and 7 for the least likely:
1) Linda is a primary school teacher.
2) Linda works in a bookstore and takes yoga classes.
3) Linda is active in the feminist movement.
4) Linda is a bank teller.
5) Linda is a social worker in a psychiatric environment.
6) Linda is an insurance salesperson.
7) Linda is a bank teller and is active in the feminist movement.
Correct answer: If option 7 is judged less probable than options 3 and 4, then
the answer is correct.
Description 2: A certain city is served by two hospitals. About 45 babies are
born every day in the big hospital and about 15 babies are born every day in the
small hospital. As you know, about 50% of new born babies are boys. The exact
percentage of baby boys, however, varies from day to day. Sometimes it can be more
than 50%, sometimes less.
For a period of one year, each hospital recorded the days when more than 60%
of the babies born were boys. Which hospital do you think had the highest number
of such days?
A) The big hospital
B) The small hospital
C) Same for the two hospitals.
Correct answer: Option B is the correct answer.
Availability Heuristic Questions
Below, each item includes two possible causes of death. The question you must
answer is the following: of the two possible causes of death, which is
the more frequent, in general, in France? For each pair of possible causes of death,
(a) and (b), we want you to choose the cause that you think is the more common.
Pair 1 (a) Road accidents (b) Diabetes (correct answer: b)
Pair 2 (a) Homicide (b) Suicide (correct answer: b)
Pair 3 (a) Stroke (b) All accidents (correct answer: a)
Pair 4 (a) Falls (b) Drug use (correct answer: a)
Pair 5 (a) Lightning strike (b) Poisoning (correct answer: b)
Reading the Mind in the Eyes Test
You will now see a series of images presenting pairs of eyes, as well as 4 words.
For each pair of eyes, choose the word that best describes what the person in the
image thinks or feels. You may feel that more than one word may apply, but choose
only the word that you consider most appropriate. Before making your choice, make
sure that you have read the 4 propositions correctly.
There will be a total of 36 questions to be answered within 10 minutes. Try to
respond as quickly and accurately as possible. To respond, select one of the choices
displayed below the image and click OK at the bottom. Please note that once you
click OK, you will not be able to return to the previous questions.
Click OK to proceed to the test question.
Training question:
Figure A.1: Test picture in the RMET
Options: 1: jealous, 2: panicked, 3: arrogant, 4: hateful
Correct answer and explanations:
Among the choices "jealous", "panicked", "arrogant" and "hateful", the correct
answer was "panicked". Click OK to start the other questions. From now on,
correct answers will no longer be shown.
As a reminder, there will be a total of 36 questions which you will have to answer
within 10 minutes. Try to respond as quickly and accurately as possible.
Short Term Memory Test
You will now look at several sets of slides. At the start of each series, you will
see the word "Ready?" and then a sequence of numbers appearing one after the other.
At the end of each series, you will enter in the input zone the sequence of numbers
that you observed.
The sequences will become longer as you enter correct answers. Your goal is to
go as far as possible. You will have two tries. For example, if in a series you see 1,
then 3 and then 5, you would enter 135.
First sequence: 69, 929, 1021, 34634, 943453, 7374865, 69358267, 690875725,
6457803021, 26456897198, 601518340985, 1285246589042.
Second sequence: 25, 217, 8618, 48629, 727240, 1203439, 32904142, 750572970,
1720378975, 62617825067, 609295956490, 1678606889148.
A.1.3 Score Measurement, Procedures and References for
Tests
The cognitive reflection task is retrieved from Frederick (2005) and adapted to
French. Each correct answer is considered as one point in the score calculation.
There is a nonbinding time limit of 30 seconds for this part.
The heuristic questions are adapted for the French population from
the studies of Kahneman and Tversky (1972), Tversky and Kahneman (1973) and
Fischhoff et al. (1977). The correct answers are determined from the World
Health Organization’s (WHO) report on global health estimates for 2000-2016
(World Health Organization, 2018). Each wrong answer counts as one point
in the score.
The French version of the Reading the Mind in the Eyes Test (RMET) is taken from
Prevost et al. (2014). Each correct answer counts as one point in the score.
There is a binding time limit of 10 minutes for this part.
The short-term memory test is adapted from the Wechsler digit span test. The
score is the maximum number of digits accurately remembered across the two
sequences. Each number stays on the screen for 2 seconds, and the box where
subjects type the number appears with a 2-second delay after the last number.
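The scoring rule for the memory test can be illustrated with a minimal sketch. It assumes that a try ends at the first incorrectly recalled sequence and that the score is the digit count of the longest correctly recalled sequence over the two tries. The function names and the stopping rule are illustrative assumptions, not the software used in the experiment.

```python
def longest_correct_span(shown, typed):
    """Digits in the longest sequence recalled correctly before the first error."""
    best = 0
    for target, answer in zip(shown, typed):
        if answer != target:
            break  # assumed: a try ends at the first mistake
        best = len(target)
    return best


def memory_score(tries):
    """Maximum span over all tries; each try is a (shown, typed) pair of lists."""
    return max(longest_correct_span(shown, typed) for shown, typed in tries)


# First items of the two sequences listed in the instructions, stored as strings.
first = ["69", "929", "1021", "34634"]
second = ["25", "217", "8618", "48629"]
tries = [
    (first, ["69", "929", "1012"]),            # hypothetical answers: error at 4 digits
    (second, ["25", "217", "8618", "48629"]),  # hypothetical answers: all correct
]
print(memory_score(tries))
```

Storing the sequences as strings keeps the digit count equal to the string length and avoids any ambiguity with leading zeros.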
A.2 Additional Figures and Tables
Table A.1: Description of subject pool
                        Pooled    Baseline  Treatment
Age                     21.43     21.83     21.03
                        (1.91)    (2.32)    (1.26)
Share of women          42%       38%       47%
                        (0.49)    (0.49)    (0.50)
Baccalaureate grade     15.97     15.76     16.18
                        (2.03)    (1.99)    (2.04)
CRT score               1.49      1.4       1.58
                        (1.14)    (1.07)    (1.20)
Eyes score              27.2      26.97     27.43
                        (3.92)    (4.30)    (3.48)
Memory score            7.05      7.02      7.08
                        (1.67)    (1.95)    (1.33)
Represent. score        1.29      1.30      1.28
                        (0.65)    (0.64)    (0.66)
Availability score      2.13      2.00      2.27
                        (1.02)    (1.05)    (0.98)
Notes: Standard deviations are reported in parentheses below the averages.
Table A.2: Improvements from similarity refine-
ment in the initial post-shock period
Baseline Treatment
Periods 25 41 57 57
∆SBNE -0.117 0.383 0.682 0.286
∆SBTF -0.363 0.152 0.451 0.210
∆SBAE 0.000 0.502 0.723 0.335
Notes: Each row reports the percent change in RMSE when
the SBNE, SBTF or SBAE rules are taken instead of their
non-refined equivalents. The columns correspond to the data
from the initial post-shock periods (one per round)
Table A.3: Average of impact factors in the HSM
Panel A - Average impact factors of pre-shock phases
        Baseline                              Treatment
Round   Rule 15  Rule 3   Rule 8   Rule 14    Rule 15  Rule 3   Rule 8   Rule 14
1       0.31     0.37     0.28     0.03       0.32     0.34     0.30     0.04
2       0.29     0.27     0.24     0.21       0.31     0.34     0.26     0.09
3       0.47     0.19     0.21     0.12       0.29     0.26     0.31     0.14
4       0.42     0.17     0.19     0.23       0.23     0.20     0.21     0.35
Panel B - Average impact factors of post-shock phases
        Baseline                              Treatment
Round   Rule 15  Rule 3   Rule 8   Rule 14    Rule 15  Rule 3   Rule 8   Rule 14
1       0.33     0.36     0.30     0.00       0.36     0.31     0.34     0.00
2       0.34     0.29     0.31     0.06       0.27     0.24     0.35     0.13
3       0.45     0.22     0.27     0.06       0.31     0.22     0.43     0.03
4       0.37     0.15     0.32     0.16       0.37     0.16     0.34     0.13
Notes: Each value in panels A and B represents the average impact factor computed for the HSM described in
Section 1.4.2 for pre-shock or post-shock phases. The pre-shock phase of the first round includes periods 4 to 8.
A.3 Robustness Analyses
A.3.1 Calibration of Free Parameters in HSM
The HSM has three free parameters, each of which has its own behavioral implication.
To avoid arbitrariness, we provide a robustness analysis by comparing our
benchmark HSM, denoted as HSM 1, with the benchmark HSM of Bao et al. (2012),
Figure A.2: Individual guesses across periods
Notes: Distribution of individual guesses across periods. 120 subjects per period. Plus signs (cross signs) represent
baseline (treatment) conditions.
Figure A.3: Cluster guesses across periods
Notes: Average guess of clusters by period. The identification number of each cluster is shown above its panel.
Clusters 3, 4, 7, 8, 11, 12 belong to baseline sessions and clusters 1, 2, 5, 6, 9, 10 belong to treatment sessions.
denoted as HSM 4. For analytical tractability, we add two other versions of the HSM
where, in each step, one free parameter moves to its value in Bao et al. (2012).
Table A.4 reports the results and Figure A.4 plots the impact factors as time
series. The results show that the fit worsens when any of the parameters increases,
ceteris paribus, and the RMSE of HSM 4 is almost equal to that of the best homogeneous
expectation rule. Nonetheless, the fits of HSM 2 and 3 are better than that of any
homogeneous rule. The step-wise changes imply that the HSM applied to our data is not
very sensitive to changes in the parameters η and β, but it is somewhat sensitive to
abrupt changes in δ, at least for the treatment condition. This parameter represents
the proportion of agents who do not update their impact factor each period, and a
value of 0.9 is behaviorally hard to justify. In conclusion, our benchmark results are
robust to medium-level changes in the parameters.
Table A.4: RMSE of HSM under different values of free parameters
Comparison Level   HSM 1   HSM 2   HSM 3   HSM 4
Baseline           3.90    4.43    4.54    5.10
Treatment          4.08    5.68    5.68    6.46
Pool               3.99    5.09    5.14    5.82
η                  0.1     0.7     0.7     0.7
δ                  0.4     0.4     0.4     0.9
β                  0.1     0.1     0.4     0.4
Notes: HSM 1 is equivalent to our benchmark HSM reported in Table 1.6.
HSM 4 has the benchmark parameter combination from Bao et al. (2012),
and HSM 2 and 3 provide intermediate changes in parameters.
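To make the roles of the three free parameters concrete, the following is a minimal sketch of one updating step. It assumes the standard discrete-choice form with inertia used in Bao et al. (2012): η weights past performance, β is the intensity of choice, and δ is the share of agents who keep their previous weights. The HSM of Section 1.4.2 may differ in detail, and the forecasts and the realized value below are made-up numbers.

```python
import numpy as np


def update_performance(U_prev, forecasts, realized, eta):
    """Performance = minus squared forecast error plus eta times past performance."""
    return -(np.asarray(forecasts, dtype=float) - realized) ** 2 + eta * np.asarray(U_prev)


def update_impacts(n_prev, U, delta, beta):
    """A share delta keeps the old weights; the rest follows a logit rule with intensity beta."""
    z = beta * np.asarray(U, dtype=float)
    logit = np.exp(z - z.max())  # subtract the max for numerical stability
    logit /= logit.sum()
    return delta * np.asarray(n_prev) + (1.0 - delta) * logit


# Hypothetical one-step illustration with four rules and equal initial weights.
n0 = np.full(4, 0.25)
U0 = np.zeros(4)
forecasts = [50.0, 52.0, 55.0, 60.0]  # made-up rule forecasts
realized = 51.0                       # made-up realized value

# Parameter triples (eta, delta, beta) taken from Table A.4.
for label, (eta, delta, beta) in {"HSM 1": (0.1, 0.4, 0.1),
                                  "HSM 4": (0.7, 0.9, 0.4)}.items():
    U1 = update_performance(U0, forecasts, realized, eta)
    n1 = update_impacts(n0, U1, delta, beta)
    print(label, np.round(n1, 3))
```

With δ = 0.9, each impact factor can move by at most a tenth of the distance to its logit target in a single period, which illustrates why this value makes the model slow to react after a shock.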
A.3.2 Cognitive Skills and Individual Expectations
In this Appendix section, we ask whether individual scores on cognitive skills tests
predict the accuracy of expectations and the best-fitting expectation rule. To this
end, we compute the average relative prediction error (ARPE) in a given round
$$\frac{1}{16}\sum_{t=1}^{T=16}\frac{\left|p^{e}_{i,t}-\bar{p}_{-i,t}\right|}{\bar{p}_{-i,t}}, \tag{A.1}$$
as well as the goodness of fit of the different expectation models per subject, as mea-
sured through RMSE. We then regress each of these measures on the set of subjects’
test scores. These test scores are designed to measure some of the cognitive skills of
subjects. Baccalaureate grade is the score that subjects obtained upon completion
of their secondary education. The CRT score is computed through three items and
is designed to measure one’s ability to reflect on a question and override reporting
the gut response. The RMET score is computed through thirty-six items and is
designed to measure one’s capacity to infer the internal emotional states of others.
The memory task is designed to measure one’s short-term memory capacity through the
number of items that one can remember in the correct order after seeing a
sequence of numbers. The representativeness (availability) score is computed through
Figure A.4: Impact factors from various HSMs
(a) Baseline - HSM 2 (b) Treatment - HSM 2
(c) Baseline - HSM 3 (d) Treatment - HSM 3
(e) Baseline - HSM 4 (f) Treatment - HSM 4
Notes: Impact factors calculated with the HSM 2, 3, 4 across periods and experimental conditions. The symbols
circle, cross, square and triangle represent respectively SBNE, adaptive expectations (v1), weak trend-following and
fundamentalist expectations.
two (five) items and is designed to measure one’s propensity for reasoning according
to the representativeness (availability) heuristic.
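As an illustration of the two dependent variables, the sketch below computes the ARPE of Equation (A.1) and an RMSE for one subject in one round. It reads p^e_{i,t} as subject i's forecast in period t and p̄_{-i,t} as the average that the forecast targets, which is an interpretation of the notation; the data are made up.

```python
import numpy as np


def arpe(own_forecasts, others_mean):
    """Average relative prediction error over the periods of one round (Eq. A.1)."""
    pe = np.asarray(own_forecasts, dtype=float)
    target = np.asarray(others_mean, dtype=float)
    return float(np.mean(np.abs(pe - target) / target))


def rmse(predicted, observed):
    """Root mean squared error between a rule's predictions and observed forecasts."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Made-up data for one subject in one 16-period round.
others = np.full(16, 50.0)
own = others + np.tile([5.0, -5.0], 8)  # always off by 5 in absolute value
print(arpe(own, others), rmse(own, others))
```

Because the ARPE divides each error by the target, it is unit-free and comparable across rounds with different price levels, whereas the RMSE stays in the units of the forecasts.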
All test scores are standardized as $(\mathrm{Score}_i - \min(\mathrm{Score})) / (\max(\mathrm{Score}) - \min(\mathrm{Score}))$.
Round and treatment dummies, as well as their interactions, are also included in the
model. We use a random effects specification (N = 480 per regression).
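The standardization maps every test score to the unit interval, so that each coefficient measures the effect of moving from the lowest to the highest observed score. A minimal sketch with hypothetical CRT scores:

```python
import numpy as np


def min_max(scores):
    """Rescale scores to [0, 1]: (score - min) / (max - min)."""
    s = np.asarray(scores, dtype=float)
    span = s.max() - s.min()
    if span == 0:
        raise ValueError("all scores are identical; min-max scaling is undefined")
    return (s - s.min()) / span


crt = [0, 1, 3, 2, 1]  # hypothetical CRT scores (0 to 3 correct answers)
print(min_max(crt))
```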
Tables A.5 and A.6 report the corresponding estimates. Overall, the included
test scores only weakly explain the variation in either measure of interest. A higher
RMET score predicts a lower forecast error and a better fit for the fundamentalist
rule. By contrast, in groups where subjects are more prone to the representativeness
heuristic, Nash play is less likely to occur.
The round dummies are systematically found to explain the dependent variables.
Compared to round 1, in round 4 the values of ARPE drop and the fits of all the
compared rules improve. Note that even with this improvement, the fundamental-
ist rule still performs worse than the other rules. The treatment dummy has no
significant impact on any variable, but its interaction with round 4 worsens the fit
of all the rules. So, the improvement in the goodness of fit over time is attenuated
under nonidentical shocks. The coefficient of this interaction term is also positive
for ARPE (in line with Hypothesis 2), but not statistically significant.
Table A.5: RMSE of expectation rules and individual characteristics
Variables               SBNE        SBAE        SBTF        Fundamentalism
Intercept               10.52***    10.37***    11.42***    19.14***
                        (3.36)      (3.13)      (3.39)      (2.83)
Treatment               -0.75       -0.58       -0.81       0.24
                        (1.42)      (1.41)      (1.55)      (1.62)
Round 2                 0.74        0.52        0.81        -2.26***
                        (1.19)      (1.22)      (1.26)      (0.99)
Round 3                 -1.69       -1.95*      -1.37       -4.38***
                        (1.14)      (1.13)      (1.23)      (0.95)
Round 4                 -3.67***    -3.81***    -3.43***    -8.05***
                        (1.28)      (1.202)     (1.38)      (0.87)
Round 2 × Treatment     -0.18       -0.06       -0.16       -2.98**
                        (1.31)      (1.34)      (1.38)      (1.16)
Round 3 × Treatment     2.28        2.51        1.74        4.38**
                        (1.55)      (1.54)      (1.62)      (1.71)
Round 4 × Treatment     6.47***     6.43***     6.25***     3.24***
                        (1.40)      (1.34)      (1.48)      (1.24)
Baccalaureate grade     2.99**      2.66**      2.66**      2.35*
                        (1.22)      (1.24)      (1.11)      (1.32)
CRT score               -1.53       -1.54*      -1.65*      -0.81
                        (0.95)      (0.88)      (0.97)      (0.74)
RMET score              -2.56       -1.88       -3.05       -5.41***
                        (1.94)      (1.73)      (1.97)      (1.99)
Memory score            -2.95       -2.43       -3.13       -2.55
                        (3.44)      (3.33)      (3.51)      (3.52)
Represent. score        0.81        0.61        1.16        2.89***
                        (1.23)      (1.15)      (1.24)      (0.89)
Availability score      -0.75       -0.56       -0.93       -0.49
                        (2.07)      (1.95)      (1.99)      (1.36)
R² overall              0.08        0.08        0.08        0.22
Notes: Coefficients from random effect regression models. The dependent variables are the RMSE of four
expectation rules per subject computed for each round (N = 480 per regression). The independent variables
are standardized test scores, round and treatment dummy variables and their interactions. Robust standard
errors clustered at the rematching cluster level (12 clusters per condition) are reported. *, **, *** indicate
statistical significance at the 10, 5, 1% level, respectively.
Table A.6: Prediction errors and individual test scores
Variables            (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
Intercept            0.174***   0.176***   0.186***   0.219***   0.208***   0.154***   0.176***   0.224***
                     (0.022)    (0.027)    (0.023)    (0.027)    (0.034)    (0.025)    (0.024)    (0.041)
Treatment            -0.001     -0.001     -0.000     -0.000     -0.001     -0.001     -0.001     0.001
                     (0.024)    (0.024)    (0.025)    (0.024)    (0.025)    (0.024)    (0.024)    (0.024)
Round 2              -0.043**   -0.043**   -0.043**   -0.043**   -0.043**   -0.043**   -0.043**   -0.043
                     (0.019)    (0.019)    (0.019)    (0.019)    (0.019)    (0.019)    (0.019)    (0.019)
Round 3              -0.066***  -0.066***  -0.066***  -0.066***  -0.066***  -0.066***  -0.066***  -0.066***
                     (0.025)    (0.025)    (0.025)    (0.025)    (0.025)    (0.025)    (0.025)    (0.025)
Round 4              -0.107***  -0.107***  -0.107***  -0.107***  -0.107***  -0.107***  -0.107***  -0.107***
                     (0.021)    (0.021)    (0.021)    (0.021)    (0.021)    (0.021)    (0.021)    (0.021)
Round 2 × Treatment  -0.023     -0.023     -0.023     -0.023     -0.023     -0.023     -0.023     -0.023
                     (0.020)    (0.020)    (0.020)    (0.020)    (0.020)    (0.020)    (0.020)    (0.020)
Round 3 × Treatment  0.065*     0.065*     0.065*     0.065*     0.065*     0.065*     0.065*     0.065*
                     (0.035)    (0.035)    (0.035)    (0.035)    (0.035)    (0.035)    (0.035)    (0.035)
Round 4 × Treatment  0.039      0.039      0.039      0.039      0.039      0.039      0.039      0.039
                     (0.024)    (0.024)    (0.024)    (0.024)    (0.024)    (0.024)    (0.024)    (0.025)
Baccalaureate grade             -0.002                                                            0.016
                                (0.021)                                                           (0.019)
CRT score                                  -0.024*                                                -0.020
                                           (0.014)                                                (0.014)
RMET score                                            -0.069**                                    -0.070**
                                                      (0.029)                                     (0.031)
Memory score                                                     -0.053                           -0.021
                                                                 (0.042)                          (0.037)
Represent. score                                                            0.030*                0.030
                                                                            (0.017)               (0.016)
Availability score                                                                     -0.003     -0.018
                                                                                       (0.026)    (0.022)
R² overall           0.138      0.138      0.147      0.152      0.145      0.148      0.138      0.170
Notes: Coefficients from random effect regression models. The dependent variable is the individual ARPE computed for each round (N = 480 per regression). The independent
variables are standardized test scores, round and treatment dummy variables and their interactions. Robust standard errors clustered at rematching cluster level (12 clusters per
condition) are reported. *, **, *** indicate statistical significance at the 10, 5, 1% level, respectively.
Chapter 2
Imperfect Tacit Collusion and
Asymmetric Price Transmission
An earlier version of this chapter is published as: Bulutay, M., Hales, D., Julius, P.
and Tasch, W., 2021. Imperfect tacit collusion and asymmetric price transmission.
Journal of Economic Behavior & Organization, 192, pp.584-599. This article is
licensed under a Creative Commons license (CC-BY 4.0).
2.1 Introduction
The phenomenon of Asymmetric Price Transmission (APT), that is, that supplier
prices rise quickly after positive input cost shocks but fall relatively more slowly after
similarly-sized negative shocks, has been documented so often in the literature
that we can rightly describe it as a stylized fact.1 However, while empirical
evidence for the existence of APT is ample, its causal forces have not been
conclusively identified. Many theoretical explanations have been proposed, but the
empirical literature has yet to determine conclusively which of these are valid or
most influential.
Empirical studies of APT predominantly examine aggregate-level variables (e.g.
inflation, concentration) proposed to be relevant in the theory literature. The fo-
cus on such variables occurs because firm-level determinants are either not directly
observable, or are not adequately measurable in panel data form. This approach
yields helpful correlations between such variables, but the effort to identify causal
relationships has met with only limited success, most notably in the context of firm-
level underpinnings of the phenomenon. While the search for accurate firm-level
1 This is also sometimes termed “positive APT” to distinguish it from the opposite
phenomenon of “negative APT.” In this chapter, we will simply refer to “APT” when we mean
positive APT, except where doing so would create ambiguity. See Section 2.2.1 for an overview of
the evidence.
data should certainly be continued, and such data, where discovered, should be used to
further inform our understanding of pricing behavior, experimental methods offer a
comparative advantage: testing theories that involve variables which are unobservable
in the field (e.g. agents’ information sets) lies outside the reach of empirical methods;2 if,
however, these same variables can be controlled through experimental design, we can
overcome this obstacle to testing theory.
A question of primary interest is whether tacit collusion drives APT-like pricing
behavior.3 The field data does not convincingly exclude the possibility that market
competitors secretly communicate, given the strong legal and even criminal incen-
tives for firms to conceal – or avoid engaging in – such activities. This provides an
obvious challenge for identification and motivates turning to the controlled setting
of the laboratory, where we can directly observe competitor behavior and credibly
prevent communication between sellers.4
An argument put forth by Borenstein et al. (1997) is that a variation of the
“trigger price” model of oligopolistic coordination originally introduced by Green
and Porter (1984) may explain the emergence of APT-type dynamics through tacit
collusion. In their model, when positive shocks occur firms immediately raise prices
in order to preserve profit margins; however, when negative shocks occur firms react
adaptively, holding prices at pre-shock levels until they see convincing evidence that
a rival has cut their prices. Rapidly lowering prices in response to a downward cost
shock could be perceived as defection from a mutually beneficial regime of tacit
collusion, thus inviting retaliation from other firms. In contrast, rapidly raising
prices in response to an upward cost shock poses no such threat to one’s competitors,
and therefore incurs no corresponding risk of retaliation. Although their arguments
are sound and are consistent with a deep empirical literature finding correlations
suggestive of tacit collusion, Borenstein et al. (1997) conclude that they are unable to
conclusively draw support for this hypothesis from their data. As no other empirical
study of which we are aware has accomplished this either, we thus find motivation
to turn to the laboratory to examine the role of tacit collusion in driving the APT
dynamic.5
2 Meyer and von Cramon-Taubadel (2004) and Frey and Manera (2007) provide extensive
discussions of methodological issues in econometric tests of APT.
3 In this chapter we will use the term “tacit collusion” to mean the phenomenon in which
suppliers coordinate on prices above the competitive equilibrium level, through the channel of
publicly visible pricing alone. Tacit collusion can also take the form of coordination on quantities
below competitive equilibrium levels, but in this chapter we will focus strictly on the role of
coordination on prices.
4 Furthermore, the laboratory may be the only environment in which we can reliably detect
collusion, since the non-collusive prices or profits are unavailable without imposition of strong
structural assumptions.
5 There are some studies that regress the estimated asymmetry on measures of market
concentration, e.g. Loy et al. (2016). Counter-intuitively, the authors find that asymmetry decreases with
higher concentration in German milk markets. However, it is difficult to associate this estimate
A second question of interest is whether the number of competing sellers in a
given market plays a significant role in the realization of the APT phenomenon. No-
tably, in his broad study of U.S. wholesale and retail markets, Peltzman (2000) finds
a negative relation between the number of competitors in a market and the magni-
tude of APT observed. As with any empirical study, however, this study does not
exclude the possibility that explicit (but unobserved) communication between firms
lies behind this result. Several non-APT focused studies of experimental oligopoly
markets find that there is an inverse relation between the number of sellers in a
market and the size of deviations from the Nash equilibrium (NE) outcomes (for
example, see Huck et al. (2004), Dufwenberg and Gneezy (2000), and Fonseca and
Normann (2012)). However, we are unaware of any experimental study that specif-
ically studies the role of the number of sellers in driving the APT phenomenon. We
therefore incorporate the number of sellers in our markets as a treatment variable
in our experimental design.
To our knowledge, Bayer and Ke (2018) is the only experimental study that
directly targets the topic of APT. The authors’ study employs a Bertrand duopoly
setting in which sellers’ costs either increase, decrease or stay constant at the halfway
point of the experiment. With two extensions of this baseline condition, they further
test the impacts of search costs and asymmetric information on APT. They find APT
across all treatments, even in the absence of search and information frictions. They
argue that the asymmetry can be explained with a backward-looking learning model:
If a seller fails (manages) to sell the good in the period prior to the shock, it is more
(less) likely that she will adjust her price downwards (upwards) in the following
period. The authors’ results support this regularity when the shock is negative, but
not when it is positive. Hence, although this learning model may account for the
downward rigidity, it falls short of explaining the asymmetry.6
While Bayer and Ke (2018)’s study provides a useful benchmark to our own,
our design choices differ substantially from theirs, as we pursue different research
questions. Whereas we aim to assess the roles of cooperative behavior and tacit
collusion on pricing asymmetries, they deliberately try to attenuate their impacts to
isolate the role of learning.7 In particular, in their experiment sellers whose stores
with the causal impact of collusion on APT as their observed higher concentration index may stem
from higher efficiency or product differentiation rather than from competitor conduct.
6 Bayer and Ke (2018) also argue that, following positive cost shocks, sellers expect all other
sellers to raise their prices immediately, and so they do the same, while following a negative
cost shock sellers do not see any reason to cut their prices unless and until they subsequently lose
sales. They cite factors such as bounded rationality as explanations for this behavior, but do not
offer a more precise explanation of the channels through which the observed behavior emerges.
7 Although Bayer and Ke (2018) exert effort to minimize the role played by tacit collusion in
their study, their typed-stranger matching protocol significantly reduces but does not completely
eliminate the possibility that subjects might repeatedly interact, and thus have the opportunity
to establish reputation over time. By contrast, the perfect-stranger matching protocol, in which
are not visited by a buyer receive only limited information on the market price,
due to the feedback structure. In our experiment, we inform sellers of the average
market price of the other sellers, as we want to create the conditions in which price
signalling can be studied more explicitly.
In our experimental setting, subjects play the role of sellers and a computer plays
the role of buyers. Each seller faces demand that linearly decreases with one’s own
price and linearly increases with the average price of others. We vary the size of
groups across sessions as 2, 3, 4, 6, and 10, while calibrating the demand function to
hold the best-response functions of each seller identical, across all group sizes. Thus,
we isolate and study the impact of group size on behavior, while holding the price-based
incentives facing individual sellers constant across markets.8 Throughout
our experiment, sellers experience a series of input price shocks – either large or
small – that shift the NE price either up or down. Through this design, we are
able (i) to test whether APT emerges despite the absence of market frictions and
information asymmetries that are often theorized to be the causal forces behind
pricing asymmetries; and, (ii) if APT does occur, to assess the impact of number
of sellers on the magnitude of the resulting asymmetries. To our knowledge, ours is
not only the first experiment that studies the role of the number of sellers in shaping
APT, but also the first to study the pure number effect in a price competition setting.
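To illustrate why group size can be varied without changing individual incentives, consider a hedged sketch with linear demand q_i = a − b·p_i + c·p̄_{-i} and a constant unit cost k. The functional form follows the verbal description above, but the symbols a, b, c, k and all numerical values are hypothetical and are not the calibration used in the experiment.

```python
def best_response(avg_other_price, cost, a=100.0, b=2.0, c=1.0):
    """Profit-maximizing price given the others' average price (hypothetical a, b, c)."""
    return (a + b * cost + c * avg_other_price) / (2.0 * b)


def nash_price(cost, a=100.0, b=2.0, c=1.0):
    """Symmetric NE price: the fixed point of the best response."""
    return (a + b * cost) / (2.0 * b - c)


def collusive_price(cost, a=100.0, b=2.0, c=1.0):
    """Price that maximizes joint profit when all sellers charge the same price."""
    return (a + (b - c) * cost) / (2.0 * (b - c))


for cost in (10.0, 20.0):  # a hypothetical positive input cost shock of 10
    p_ne = nash_price(cost)
    # Only the average of the others' prices enters demand, so this fixed
    # point does not depend on the number of sellers in the market.
    assert abs(best_response(p_ne, cost) - p_ne) < 1e-12
    print(cost, p_ne, collusive_price(cost))
```

In this sketch a cost shock shifts both the NE price and the joint-profit-maximizing price, and the gap between the two is the room that sellers have for coordinating on prices above the NE.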
Our contributions to the literature are two-fold: First, we document the preva-
lence of the APT phenomenon through experiments in which we possess strict control
over the environment. In particular, our results indicate that APT may emerge
even in the absence of market frictions and information asymmetries that are often
theorized to be the causes of pricing asymmetries. This suggests that in markets
with three or more sellers, the presence of agents who attempt to coordinate on
prices via price signaling may suffice for APT pricing dynamics to emerge. In our
duopoly markets, however, our results suggest that coordination can be so success-
ful that rather than the (positive) APT phenomenon, persistent pricing at collusive
price levels, or even negative APT, may instead result.
Second, keeping incentives the same across different group sizes, we are able to
isolate and perform hypothesis tests on the pure effect of increased group size on
APT. For markets with three or more sellers, we do not find significant differences in
the magnitude of observed APT. Together, the results of our study support theories
that highlight the role of tacit collusion on APT. We conclude that APT may be the
a subject is assured they will be matched with another only once in a session, does more credibly
eliminate this possibility. Moreover, the duopoly setting of their study makes collusion presumably
more achievable, since coordination is easier when there is only one other market participant. As
a result, it is hard to assess the extent of the role that cooperative behavior played in their
study.
8 This is akin to the “pure number effect” studied by Isaac and Walker (1988) in the context of
public goods games. See Hanaki and Masiliūnas (2021) for a similar approach in Cournot oligopolies.
product of environments in which collusion is significant, but imperfect (i.e., unstable).
2.2 Related Literature
2.2.1 Field Evidence
Bacon (1991) provides an early empirical study suggesting that retail gasoline prices
in the United Kingdom experience faster and more concentrated responses to crude
oil price increases than they do to similar crude oil price decreases. Bacon termed
this phenomenon “Rockets and Feathers,” and since this paper was published dozens
of other researchers have detected the presence of this sort of asymmetry in a variety
of consumer and intermediate goods markets.
Peltzman (2000) provides one of the most comprehensive empirical examinations
of APT. He conducts a broad study of pricing behavior of 77 consumer and 165
producer goods markets in the U.S., and he concludes that in more than two-thirds
of these markets prices rise faster than they fall, in response to input cost changes.
Peltzman also seeks correlations between various features of markets and industries,
and the degree to which evidence of APT is present. Most notably, he finds that
markets with fewer competitors tend to exhibit more pricing asymmetry, while on
the other hand markets with higher levels of concentration tend to be less likely to
exhibit pricing asymmetry, as in Loy et al. (2016). Peltzman’s study, however, does
not provide an explanation for these correlations.
In an early survey of field evidence, Meyer and von Cramon-Taubadel (2004) find
that (excluding Peltzman (2000)’s study), symmetry in price response is rejected in
almost one-half of all cases in the literature. Their survey also shows that different
test methods yield highly varying rejection rates (between 6% and 80%). Frey
and Manera (2007) and Perdiguero-García (2013) provide meta-regression analyses
with more comprehensive and recent data sets. Both studies confirm that APT is
very likely to occur but also emphasize the variation of reported outcomes. Their
results show that this heterogeneity can be explained by certain characteristics of the
data (e.g., data frequency) and of the employed econometric model. Most notably,
Perdiguero-García (2013) reports that the asymmetry tends to decrease in more
competitive segments of the industry.
2.2.2 Theoretical Explanations
There is a growing body of literature on the theoretical accounts of APT, an un-
surprising fact given that pricing asymmetries are not predicted by standard price
competition models.9 These studies propose explanations of APT mainly by
introducing market frictions, information asymmetries or boundedly rational agents
into the underlying models. One reason for the variety of explanations is that
these studies typically focus on specific market structures (e.g., wholesale
petroleum markets) and their idiosyncrasies. In this subsection, we review some of
these studies in an attempt to categorize them as well as to highlight discrepancies.10
Borenstein et al. (1997) consider the role of search costs in facilitating APT.
They hypothesize that negative cost shocks in the presence of costly search provide
firms temporary pricing power, which they then use to delay reductions in prices,
yielding temporarily superior profits. Benabou and Gertner (1993) and Yang and
Ye (2008) develop explanations based not only on consumer search costs but also on the
volatility of input costs. They reason that volatility should reduce search incentives
for consumers; producers, realizing that demand is temporarily more inelastic and
that their short-term pricing power is therefore greater, respond by
reducing prices more slowly. Reagan and Weitzman (1982) and Borenstein and
Shepard (1996) propose explanations based on inventory costs, reasoning that it is
relatively more costly for manufacturers and suppliers facing capacity constraints or
sharply rising short-term production costs to deal with unanticipated increases in
demand resulting from price drops, than it is to respond to corresponding drops in
demand due to price increases. Ball and Mankiw (1994) consider a menu-cost model
in conjunction with positive trend inflation as an explanation of APT. In another
study, Ahrens et al. (2017) show that the presence of consumers with loss aversion
may explain why prices are more sluggish to adjust downwards than upwards in
response to permanent demand shocks.
The various explanations and models described above provide differing implica-
tions for government policy: if APT occurs due to collusion, there may be room for
regulation to improve economic efficiency; if however APT is primarily caused by
9 A notable exception is the case of Markov-perfect equilibria, and in particular the case of the
Edgeworth cycle. In this phenomenon, firms undercut each other’s prices successively until prices
approach marginal cost; at this point, one of the firms decides with some positive probability to
spike its price, and once this occurs the cycle is repeated, yielding each firm positive economic
profits. Maskin and Tirole (1988) further show that these cycles provide a case where asymmetric
pricing can be sustained in equilibrium. However, the Edgeworth cycle model requires that firms
make price decisions alternately; the model does not support an equilibrium when price decisions
are made simultaneously or continuously. Moreover, the emergence of the phenomenon seems in
practice to be limited to environments in which competitors rapidly and publicly change prices (see
for example Byrne and De Roos (2019) for an interesting case in Perth, Australia petrol markets,
in which a government mandate for retail suppliers to publish their prices daily seems to have
facilitated the emergence of a weekly cycle of Edgeworth-like pricing dynamics that persisted for
many years.). Thus the Edgeworth cycle model arguably applies only to a relatively narrow range
of market contexts.
10 For more exhaustive surveys of theoretical explanations, see Meyer and von Cramon-Taubadel
(2004) and Brown and Yucel (2000).
the presence of inventory costs, asymmetric menu costs, or search costs, then reg-
ulation that controls pricing behavior may actually induce inefficiency rather than
attenuate it. Given the robust evidence of the widespread existence of APT and its
non-trivial magnitude and impact on consumer outcomes, identifying which theories
best describe the asymmetric pricing behavior is key to determining effective public
policy.
2.2.3 APT and Experiments
Despite the many possible explanations that have been proposed, the empirical lit-
erature yields only mixed evidence that is often inconclusive due to identification
issues. This suggests there is room for further research to shed light on the phe-
nomenon. We consider the advantages of experimental methods in isolating and
studying causal determinants of APT.11 In this subsection, we summarize the most
relevant literature to our study.
There are two studies of which we are aware – in addition to Bayer and Ke
(2018) – that conduct market experiments with APT-related results. Deck and
Wilson (2008) investigate gasoline markets and find that retail prices adjust asym-
metrically to changes in station costs in zones with clustered stations, but not in
zones with stations that are relatively isolated from competitors. Cason and Fried-
man (2002) find weak evidence of APT in posted offer markets where customers
incur switching costs. While these studies report findings on APT, their
experimental designs are optimized to investigate questions regarding the structure
of gasoline markets (e.g., zone pricing, divorcement) and of consumer markets (e.g.,
switching costs), not to identify causes of APT. In particular, sellers’ costs in both
experiments follow random-walk shocks, which may not be salient enough to detect
APT. Our study distinguishes itself from this strand of the literature by examining APT
with larger, persistent shocks.12
Apart from studies that directly target APT, price competition experiments
that study the impact of group size on tacit collusion are also relevant to this chapter.
Dufwenberg and Gneezy (2000) provide early evidence for such a relation
through an oligopoly game that corresponds to a discrete version of the Bertrand
model. They find that winning prices tend to converge to NE levels in groups of
11The usage of experimental methods in macroeconomic research is becoming more and more
prevalent. See Duffy (2017) and Cornand and Heinemann (2019a) for recent surveys.
12Fehr and Tyran (2001) also employ large positive and negative shocks and report APT-like
behavior in a price-setting game. However, the authors do not analyze the phenomenon, nor do they
probe its implications. In another related experimental study, Duersch and Eife (2019) consider
Bertrand duopolies with zero marginal cost in either inflationary, deflationary or constant price
environments. They find that real prices are significantly lower in the inflationary environment
compared to non-inflationary environments.
three or four competitors, but stay consistently high in duopolies. Morgan et al.
(2006) find that increasing the number of sellers from 2 to 4 decreases the prices
paid by some consumers (the ones informed about the entire distribution of prices)
but not for others (the ones who buy with motives other than prices). Abbink and
Brandts (2008) also find that there is a negative relationship between the number
of competing firms and price levels.13 Nevertheless, as in Dufwenberg and Gneezy
(2000), they find that collusive pricing is the modal outcome in duopolies. Fonseca
and Normann (2012), Orzen (2008), Davis (2009) and Horstmann et al. (2018) pro-
vide further evidence that collusive prices are very likely to be observed in duopolies.
Average prices come considerably closer to the NE in the baseline conditions of
these studies (fixed matching, no communication, symmetric sellers, etc.) when the
number of sellers is 3 or greater.
A finding common to each of these studies is that persistent coordination over
collusive prices is unlikely in markets other than duopolies. This, however, does
not preclude the possibility that players might manage to coordinate temporarily
on high prices following negative shocks. Experiments also indicate that increasing
the number of sellers often leads to more competitive outcomes (in terms of price
and output), which in turn should make APT less likely. However, the meta-analyses
of Fiala and Suetens (2017) and Horstmann et al. (2018) on oligopoly
experiments indicate that there may not be a linear relationship between the number
of competing firms and the degree of tacit collusion. On the one hand, Horstmann
et al. (2018) argue that this result stems from the relatively small number of studies
that provide pairwise comparisons and the lack of statistical power in these studies.
On the other hand, Hanaki and Masiliūnas (2021) consider, as we do, the fact that
a change in group size simultaneously influences the difficulty of coordination (the
“pure number effect”) and the incentives provided to collude. They find that
the pure effect of group size is small, if it exists at all. Our study contributes to
the literature along this axis, in particular by changing the
group size without changing the incentives provided to individual sellers in different
markets in a price competition game.
13Their results are particularly interesting since in their price competition setting, there exist
multiple equilibria.
2.3 Method
2.3.1 Pricing Game
We develop a price competition game with a linear demand model and employ this
in our experimental markets.14 In this setting, the demand facing seller $i \in N$ in
period $t \in T$ is equal to
\[
q_{i,t}(p_{i,t}, p_{-i,t}; \delta, \gamma) =
\begin{cases}
\delta - \gamma \cdot (p_{i,t} - p_{-i,t}), & p_{i,t} \in [p_{min}, \bar{p}] \\
0, & \text{otherwise}
\end{cases}
\tag{2.1}
\]
where $\delta$ and $\gamma$ are parameters of demand, $p_{i,t}$ is the price set by seller $i$, $p_{-i,t}$
is the average price chosen by the rival sellers in the same market at period $t$ (i.e.,
$p_{-i,t} \equiv \frac{1}{N-1} \sum_{j \neq i} p_{j,t}$), and $\bar{p} = \min\{p_{max},\; p_{-i,t} + \delta/\gamma\}$. Parameters $p_{min}$ and
$p_{max}$ refer to the price floor and the reservation price of the representative consumer,
respectively. Variable $\bar{p}$ regulates the maximum price level below which (conditional
on the average price of other sellers) $q_{i,t}$ takes on strictly positive values.15
The model of linear demand that we use, with own- and cross-price parameters
equal in magnitude, is micro-founded by the Spokes models of Chen and Riordan
(2007) and Bos and Vermeulen (2022), which are themselves adaptations of Salop’s
canonical Circular City address model (Salop, 1979). In addition, this specification
of demand can also be motivated by a limiting case of quasi-linear quadratic utility
(see Appendix B.1 for an exposition). Note that any demand specification with linear
parameters on the price of each competing good can be equivalently represented in
terms of own-price and the (weighted) average price of other sellers, as in (2.1).
Thus the presence of the average price $p_{-i,t}$ in this equation does not imply that
consumers, when contemplating their demand for good $i$, consider the average
price set by firms competing with firm $i$; rather, it implies that the aggregate effect
of all consumers’ individual demands for good $i$ is based on the entire vector of prices,
but can nevertheless be represented concisely in terms of the own price
$p_i$ and the average price of all other sellers, $p_{-i}$. This fact will be helpful in
keeping both our analysis and our experimental design tractable, as we will shortly
see.
14Linear demand systems have a number of advantages over alternatives and have long been applied
in the industrial organization literature. Such systems lead to closed-form best-response functions
and Nash Equilibrium specifications, greatly facilitating interpretation of empirical or experimental
results. To the extent that non-linear systems of demand (e.g. the Almost Ideal Demand System
of Deaton and Muellbauer (1980) or the Relative Love of Variation model of Zhelobodko et al.
(2012)) may be preferred on theoretic or other grounds, linear systems provide approximations
with first-order accuracy.
15In practice subjects chose prices that were revealed to be above this value in fewer than 0.2%
of all cases.
Given the own-demand specification in (2.1), seller profits are calculated as
\[
\pi_{i,t} = (p_{i,t} - mc_t) \cdot q_{i,t} - f, \tag{2.2}
\]
where $q_{i,t}$ is the quantity demanded from seller $i$ as defined in (2.1), $mc_t$ is the marginal cost,
which shifts every $T$ periods that comprise a round (denoted $r \in R$), and $f$ is the fixed
cost. Sellers set their prices in each period simultaneously from a discrete set that
is bounded as $p_{i,t} \in [mc_t, p_{max}]$, such that the price floor is equal to the marginal
cost of that round.16
In the described finitely repeated game, we can express the maximization problem
as
\[
\max_{\mathbf{p}_i} \sum_{h=0}^{T} \beta^{h} E_{i,t-1}\, \pi_{i,t+h} \quad \text{subject to} \quad p_{i,t+h} \in [mc_{t+h}, p_{max}]. \tag{2.3}
\]
Sellers thus maximize the expected discounted sum of profits over $T$ periods by
choosing a vector of prices $\mathbf{p}_i$.17 This in turn leads to the best-response function
\[
p^{BR}_{i,t} = \frac{1}{2} \left( mc_t + \frac{\delta}{\gamma} + E_{i,t-1}[p_{-i,t}] \right). \tag{2.4}
\]
The current period’s marginal cost $mc_t$ is revealed to the sellers prior to their pricing
decisions and is thus outside the expectation operator. Sellers form expectations of the
prices their competitors will set during the current period ($E_{i,t-1}[p_{-i,t}]$) by conditioning
on all available information. The system of best responses for all sellers
implied by (2.4) solves for steady-state prices as:
\[
p^{*}_{i,t} = p^{BR}_{i,t}\left( E_{i,t-1}[p^{*}_{-i,t}] \right) = mc_t + \frac{\delta}{\gamma} \equiv p^{NE}_t. \tag{2.5}
\]
This price level also corresponds to the unique stage-game NE (pNE
t) and the unique
subgame perfect Nash equilibrium (SPNE). Sellers may achieve the joint profit max-
imum (JPM) if they each set their prices to the maximum price pmax.
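The pricing game in (2.1) to (2.5) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the parameter values are those of the calibration reported in Table 2.1, and the rivals' average price of 2.50 used in the grid check is an arbitrary example value rather than experimental data.

```python
DELTA, GAMMA = 8.50, 7.275    # demand parameters (calibration of Table 2.1)
FIXED_COST, P_MAX = 1.0, 3.0  # fixed cost and reservation price (Table 2.1)


def demand(p_own, p_rivals, p_min):
    """Demand (2.1) facing one seller, given the rivals' average price."""
    p_bar = min(P_MAX, p_rivals + DELTA / GAMMA)  # highest price in the demand range
    if p_min <= p_own <= p_bar:
        return DELTA - GAMMA * (p_own - p_rivals)
    return 0.0


def profit(p_own, p_rivals, mc):
    """Period profit (2.2); the price floor is the marginal cost of the round."""
    return (p_own - mc) * demand(p_own, p_rivals, p_min=mc) - FIXED_COST


def best_response(mc, expected_rival_price):
    """Best response (2.4) to the expected average rival price."""
    return 0.5 * (mc + DELTA / GAMMA + expected_rival_price)


def nash_price(mc):
    """Stage-game NE price (2.5), the fixed point of the best response."""
    return mc + DELTA / GAMMA


mc = 0.90
p_ne = nash_price(mc)
# Brute-force check on the one-cent price grid: the profit-maximizing price
# against rivals averaging 2.50 (an arbitrary example) should be the grid
# point closest to the analytical best response.
grid = [cents / 100 for cents in range(90, 301)]
best_on_grid = max(grid, key=lambda p: profit(p, 2.50, mc))
print(round(p_ne, 2), round(best_response(mc, 2.50), 4), best_on_grid)
```

Because none of these functions takes the number of sellers as an argument, the sketch also makes the independence from group size discussed next easy to see.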
In this pricing game, neither own-demand nor own-profit depends on the number
of sellers. These only depend on own-price and the average price of rival sellers.
The best-response action is also independent of $N$ for a wide range of expectation
models, including rational expectations and adaptive expectations (Evans and
Honkapohja, 2001). This feature ensures that the incentives given to the sellers of
different group sizes are matched and the market power of each seller, measured by
the size of markup over marginal cost, is ex-ante equal. We consider this as nec-
16As a practical matter, we needed to set a price floor of $p_{i,t} \geq mc_t$ to ensure subjects would not
complete the experiment with a negative payoff.
17Although the model assumes that sellers choose a vector of prices for all periods, subjects only
submit a price decision for the current period in the experiment.
essary for ensuring a ceteris paribus comparison between the treatment conditions
and to capture the pure number effect.
2.3.2 Experimental Design
Sellers interact repeatedly in the described pricing game for $R$ rounds, which are each
composed of $T$ periods. Marginal cost $mc_t$ fluctuates across rounds, modeling large
exogenous cost shocks, but remains invariant during a round. Our experimental
manipulations consist of varying the size of markets across sessions in a between-
subjects design, and of varying the size and direction of shocks across rounds in a
within-subjects design. We implement a fixed-matching protocol during a session.
The calibration of the experimental game is summarized in Table 2.1. The exper-
iment consists of 5 rounds of 15 periods each, with a new marginal cost announced
at the beginning of each round. The sequence of shocks is identical across all treat-
ments: marginal cost starts at $0.90 in Round 1, drops to $0.50 in Round 2, rises to
$1.30 in Round 3, falls again to $0.50 in Round 4, then rises to $0.90 for Round 5.
Table 2.1: Experimental design parameters

General parameters
  Number of periods per round        T = 15
  Number of rounds per session       R = 5
  Demand parameters                  δ = 8.50, γ = 7.275
  Fixed cost                         f = 1
  Maximal/reservation price          p_max = 3

Varying parameters
  Group size across treatments       N ∈ {2, 3, 4, 6, 10}
  Marginal cost across rounds        mc: (0.90, 0.50, 1.30, 0.50, 0.90)
  Cost shock sequence                ∆mc ≡ η: (−0.40, +0.80, −0.80, +0.40)
  NE price across rounds             p^NE: (2.07, 1.67, 2.47, 1.67, 2.07)
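The derived rows of Table 2.1 follow mechanically from the calibration. A minimal Python sketch, assuming the NE price formula in (2.5) and periods numbered 1 to 75 (the variable names are illustrative), reproduces the NE prices, the shock sequence, and the periods in which shocks occur:

```python
DELTA, GAMMA = 8.50, 7.275                        # demand parameters
PERIODS_PER_ROUND = 15
marginal_costs = [0.90, 0.50, 1.30, 0.50, 0.90]   # one value per round

# NE price per round from p_NE = mc + delta/gamma, rounded to cents.
ne_prices = [round(mc + DELTA / GAMMA, 2) for mc in marginal_costs]

# Cost shock announced at the start of rounds 2 to 5.
shocks = [round(new - old, 2) for old, new in zip(marginal_costs, marginal_costs[1:])]

# First period of rounds 2 to 5.
shock_periods = [r * PERIODS_PER_ROUND + 1 for r in range(1, len(marginal_costs))]

print(ne_prices)      # [2.07, 1.67, 2.47, 1.67, 2.07]
print(shocks)         # [-0.4, 0.8, -0.8, 0.4]
print(shock_periods)  # [16, 31, 46, 61]
```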
2.3.3 Procedures
Experimental sessions were conducted at the University of California, Santa Bar-
bara’s Experimental and Behavioral Economics Laboratory (EBEL) using the z-Tree
platform (Fischbacher, 2007), between September and December of 2018. A total
of 245 subjects were recruited from the experimental economics subject pool of the
same university, using the ORSEE tool (Greiner, 2015). Subjects were allocated
to markets of size 2, 3, 4, 6 and 10, with a total of 36, 39, 52, 48 and 70 subjects
assigned to each group size condition, respectively. This setup yields 59 independent
markets for the analysis.18
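The count of independent markets can be checked directly from the subject numbers above. The short Python sketch below assumes only that every market of size N is filled with exactly N subjects; the variable names are illustrative.

```python
# Subjects per group-size condition, as reported in the text.
subjects_by_group_size = {2: 36, 3: 39, 4: 52, 6: 48, 10: 70}

# Each market of size N uses exactly N subjects.
markets_by_group_size = {n: s // n for n, s in subjects_by_group_size.items()}

total_subjects = sum(subjects_by_group_size.values())
total_markets = sum(markets_by_group_size.values())

print(markets_by_group_size)          # {2: 18, 3: 13, 4: 13, 6: 8, 10: 7}
print(total_subjects, total_markets)  # 245 59
```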
At the beginning of each experiment, subjects are provided written instructions
which are also read to them aloud by an experimenter. Subjects then proceed to
take a short comprehension quiz.19 In the main part of the experiment, each subject
plays the role of a seller and makes a series of 75 pricing decisions, whereas consumer
behavior is simulated by computer. We also elicit subjects’ one-period-ahead expectations
about the average price chosen by rival sellers (i.e., $E_{i,t-1}[p_{-i,t}]$). These
expectations are not rewarded separately, to avoid creating hedging issues. Subjects
are able to set a price between the marginal cost and the maximum price (of $3.00),
in increments of $0.01. Once all subjects set their prices and expectations, they are
individually notified by the computer of the average price established by the others
in their market, reminded of their own price, and shown their own resulting payoff
for that period. Subjects are able to track the previous values of these outcomes
through a history box that is available on their screen (see Appendix B.2).
We notify subjects that a new cost shock will occur at the beginning of each
new round, either an increase or decrease, of either $0.40 or $0.80. We reveal the
magnitude and direction of each shock immediately prior to the first period of each
respective round. At that time, we also hand out copies of a printed payoff table
corresponding to the new marginal cost. These tables assist subjects in estimating
the profits they will receive, conditional on the hypothetical prices they and others
may set in each period of that round (see Appendix B.2).
Sessions lasted a total of 90 to 125 minutes. Subjects were paid $18.66 on average
(a minimum of $10.89 and a maximum of $28.50), which includes the $5.00 show-up
fee and $3.00 for the completion of the optional survey (no subject declined this
offer). The remaining payoff is determined as the average payoff of a randomly
chosen round of the game.
2.4 Hypotheses
As the experimental design specifically avoids any of the features outlined in Section
2.2.2 (e.g., frictions, information asymmetries), standard theory suggests that prices
react symmetrically to shocks. However, we may observe asymmetry if certain
conditions are satisfied even in the absence of these frictions. To unravel these
conditions, we need to investigate the strategic tension underlying our pricing game.
18In one session (20 subjects), the data from the final period is lost due to technical reasons. All
the analysis in the results section is performed based on all the available data.
19We reviewed answers for each subject and provided explanations where needed. See Appendix
B.2 for all experimental material.
On the one hand, individual incentives promote undercutting opponents’ prices until
the NE price is reached. On the other hand, sellers may generate higher profits if
they successfully sustain coordination at a price level above NE.
We first focus on the incentive each individual seller has to undercut other sellers.
Consider the following variable:
\[
\mathit{Inc2Dev}(p_{-i,t}) \equiv \pi_{i,t}\left( p^{BR}_{i,t}(p_{-i,t-1}),\, p_{-i,t-1} \right) - \pi_{i,t}\left( p_{-i,t-1},\, p_{-i,t-1} \right), \tag{2.6}
\]
which expresses the incentive of a myopic seller to deviate from the coordinated
price level from the previous period, $p_{-i,t-1}$.20 This seller is myopic as s/he ignores
the possibility that the competitors might also engage in similar reasoning. By
substitution of (2.1), (2.2), and (2.4) we can rewrite (2.6) as
\[
\mathit{Inc2Dev}(p_{-i,t}) = \frac{\gamma}{4} \left( p_{-i,t-1} - \delta/\gamma - mc_t \right)^2 = \frac{\gamma}{4} \left( p_{-i,t-1} - p^{NE}_t \right)^2. \tag{2.7}
\]
The incentive to deviate from price $p_{-i,t-1}$ during period $t$ thus increases quadratically
with the difference between the previous period market price and the current
period NE price. Thus, individual incentives promote convergence to the NE.
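The step from (2.6) to (2.7) can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the calibration of Table 2.1, hypothetical function names, and interior prices only (the demand bounds in (2.1) are ignored), and the previous-period prices it loops over are arbitrary example values.

```python
DELTA, GAMMA, FIXED_COST = 8.50, 7.275, 1.0  # calibration of Table 2.1


def profit(p_own, p_rivals, mc):
    """Profit (2.2) with interior demand from (2.1); bounds are ignored here."""
    return (p_own - mc) * (DELTA - GAMMA * (p_own - p_rivals)) - FIXED_COST


def inc2dev_direct(p_prev, mc):
    """Definition (2.6): gain from best-responding to last period's price."""
    p_br = 0.5 * (mc + DELTA / GAMMA + p_prev)
    return profit(p_br, p_prev, mc) - profit(p_prev, p_prev, mc)


def inc2dev_closed_form(p_prev, mc):
    """Closed form (2.7): quadratic in the distance to the current NE price."""
    p_ne = mc + DELTA / GAMMA
    return GAMMA / 4 * (p_prev - p_ne) ** 2


for p_prev in (1.80, 2.07, 2.50, 3.00):  # arbitrary example prices
    print(p_prev,
          round(inc2dev_direct(p_prev, 0.90), 6),
          round(inc2dev_closed_form(p_prev, 0.90), 6))
```

Both columns should agree up to floating-point error, and the fixed cost drops out because it enters both profit terms in (2.6).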
We now separately analyze Inc2Dev in reaction to positive and negative shocks.
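Consider the case where a shock of magnitude $\eta \equiv |mc_t - mc_{t-1}|$ occurs in period $t$.
Then, application of (2.5) allows us to express $p^{NE}_t = p^{NE}_{t-1} + \eta$ and $p^{NE}_t = p^{NE}_{t-1} - \eta$
for positive and negative shocks, respectively, and we thus define:
\[
\begin{aligned}
\mathit{Inc2Dev}^{+}_{i,t} &\equiv \frac{\gamma}{4} \left( p_{-i,t-1} - (p^{NE}_{t-1} + \eta) \right)^2 \\
\mathit{Inc2Dev}^{-}_{i,t} &\equiv \frac{\gamma}{4} \left( p_{-i,t-1} - (p^{NE}_{t-1} - \eta) \right)^2.
\end{aligned}
\tag{2.8}
\]
Straightforward manipulation of (2.8) allows us to express the difference in the
incentives to deviate as:
\[
\Delta \mathit{Inc2Dev}_{i,t} \equiv \mathit{Inc2Dev}^{+}_{i,t} - \mathit{Inc2Dev}^{-}_{i,t} = -\gamma \eta \cdot (p_{-i,t-1} - p^{NE}_{t-1}). \tag{2.9}
\]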
This equation implies that when market prices are above (below) NE immedi-
ately prior to a cost shock, the incentive to deviate following the shock will be greater
(lesser) if the shock is negative than if it is positive. When, however, $p_{-i,t-1} \to p^{NE}_{t-1}$,
or when sellers are not myopic, this difference in the incentive to deviate converges to zero. We
consequently consider the absence of APT as our first null hypothesis:
Hypothesis 1: Prices respond symmetrically to (equally sized) positive and
negative shocks.
20We are indebted to an anonymous reviewer for proposing analysis using this approach.
We can test this hypothesis by exploiting the exogenous within-subjects treatment
variations in marginal cost.
An interesting implication of equation (2.9) is that if prices are already collusive
prior to the shock, the relatively greater temptation to deviate from the coordinated
price following negative shocks suggests we should observe negative APT as opposed
to the positive APT that Borenstein et al. (1997) and many others have observed.
The arguments of myopic best-response and of tacit collusion – explained in the
next hypothesis – therefore point in opposite directions.
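A small numerical illustration of (2.9) may help fix the sign of this effect. The Python sketch below assumes the calibration γ = 7.275, the Round 1 NE price of 2.07, and a large shock of 0.80; the three pre-shock prices are hypothetical values chosen to lie above, at, and below the NE, and the function names are illustrative.

```python
GAMMA = 7.275  # demand slope (calibration of Table 2.1)


def inc2dev(p_prev, p_ne_now):
    """Myopic incentive to deviate, equation (2.7)."""
    return GAMMA / 4 * (p_prev - p_ne_now) ** 2


def delta_inc2dev(p_prev, p_ne_prev, eta):
    """Incentive after a positive shock minus incentive after a negative one (2.8)."""
    return inc2dev(p_prev, p_ne_prev + eta) - inc2dev(p_prev, p_ne_prev - eta)


p_ne_prev, eta = 2.07, 0.80
for p_prev in (2.80, 2.07, 1.90):  # hypothetical pre-shock prices
    direct = delta_inc2dev(p_prev, p_ne_prev, eta)
    closed_form = -GAMMA * eta * (p_prev - p_ne_prev)  # equation (2.9)
    print(p_prev, round(direct, 4), round(closed_form, 4))
```

For a pre-shock price above the NE the difference is negative, meaning that the myopic incentive to deviate is larger after the negative shock, which is the mechanism behind the negative APT conjecture above.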
Our second hypothesis concerns the second force underlying this strategic ten-
sion: the prospect of sustained higher profits through tacit collusion. We use market
power, measured through the size of markups over marginal cost, to study collusion
(see Section 2.5.2 for the description of the exact measure). Markups should be
invariant to the number of rival sellers, given that both the profit and best-response
functions are independent of group size. Moreover, in the absence of frictions and
of any ability of competitors to communicate, the theory predicts a constant markup
for all levels of marginal cost. However, if tacit collusion occurs, we expect to ob-
serve higher market power (i) in markets with fewer sellers, and (ii) in the periods
occurring soon after negative shocks. For (i), we expect to observe persistent coor-
dination more often where there are fewer sellers to dampen the strength of price
signals. Moreover, it is arguably easier to sustain coordination above NE pricing
when one has fewer competitors, as any one of them can undermine joint coordi-
nation if they individually fail to cooperate. For (ii), we conjecture that negative
shocks boost the market power of sellers (at least temporarily), as such shocks may
play the role of a coordination device for attempts at collusion. We can test this
hypothesis by using the between-subjects treatment variations in group size, and
within-subjects treatment variations in marginal cost.
Hypothesis 2: Sellers’ market power is invariant to the number of sellers in the
market, and is unaffected by the existence of periodic shocks.
Finally, our third hypothesis concerns individual pricing strategies. The Rational
Expectations Hypothesis (REH) of Muth (1961) admits the possibility of expectation
errors at the individual level, although these should tend to cancel out in aggregate.
Also, after observing t−1 periods of price history, a seller may learn that the others
do not behave consistently with the predictions of REH. Nevertheless, conditional on
expectations, sellers should select the best-response action as this maximizes their
profit. As we elicit subjects’ guesses on the average price set by others, we can test
this hypothesis without assuming a specific expectation model. We can interpret
any intentional deviation from the profit-maximizing action as an attempt to reach
collusive prices.
Hypothesis 3: Conditional on expectations, pricing behavior follows the best-
response function.
2.5 Results
Figure 2.1: Average pricing behavior across periods and group sizes.
(a) Average of all market prices [plot of prices against periods 0 to 75; series: Average, JPM, MC, NE]
(b) Average market prices by group size [plot of prices against periods 0 to 75; series: N=2, N=3, N=4, N=6, N=10, JPM, NE]
Figure 2.1 provides a depiction of the average price per period, as the average
of all market prices and as broken out by group size. Here, market price refers to
the average of all prices in market $m$ (i.e., $p_{m,t} = \frac{1}{N} \sum_{i=1}^{N} p_{i,t}$). The reader can
readily discern that for groups of size 3 and greater, average prices rise rapidly after
positive cost shocks, while they fall more slowly after negative cost shocks. By
contrast, for groups of size 2, it is not immediately obvious whether average pricing
behavior is affected by cost shocks. A second observation that is immediately clear
is that average prices are generally above the NE price, with deviations being higher
following negative shocks than when following positive shocks. Overall, the visual
inspection of the data suggests the presence of market behavior consistent with
APT.
2.5.1 Estimation of the Asymmetry
We follow Peltzman (2000) and estimate the coefficients of the distributed lag model
(DLM) to measure the magnitude of APT. This model can be expressed as:
\[
\Delta p_{i,t} = \sum_{k=0}^{K} b_{t-k} \cdot \Delta mc_{t-k} + \sum_{k=0}^{K} c_{t-k} \cdot \left( I[\Delta mc_{t-k} > 0] \cdot \Delta mc_{t-k} \right) + \epsilon_{i,t} \tag{2.10}
\]
where the change in output price (i.e., $\Delta p_{i,t} = p_{i,t} - p_{i,t-1}$) is modelled as a function
of the lagged changes in marginal cost (i.e., $\Delta mc_{t-k}$). The indicator variable
$I[\Delta mc_{t-k} > 0]$ takes the value 1 if the change in marginal cost in period $t-k$ is
positive and 0 otherwise. The sum of interaction coefficients $\sum_{k=0}^{K} c_{t-k}$
reflects the magnitude of asymmetry and its persistence over $K$ periods.
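To make the structure of (2.10) concrete, the following Python sketch builds the row of regressors for a given period from the cost sequence of the experiment. It is an illustration only: the helper names are hypothetical, periods are assumed to be numbered 1 to 75, cost changes before the first period are set to zero, and the estimation step itself (OLS with clustered standard errors) is not shown.

```python
K = 4                                             # number of lags
PERIODS_PER_ROUND = 15
marginal_costs = [0.90, 0.50, 1.30, 0.50, 0.90]   # one value per round

# mc[t] for t = 1..75; index 0 is a placeholder so that indices match periods.
mc = [None] + [c for c in marginal_costs for _ in range(PERIODS_PER_ROUND)]


def d_mc(t):
    """Change in marginal cost between periods t-1 and t (zero before period 2)."""
    if t < 2:
        return 0.0
    return round(mc[t] - mc[t - 1], 2)


def dlm_row(t):
    """Regressors of (2.10) for period t.

    The first K+1 entries are the cost changes at lags 0..K; the last K+1
    entries are the same changes interacted with the positive-shock indicator.
    """
    lags = [d_mc(t - k) for k in range(K + 1)]
    positive = [x if x > 0 else 0.0 for x in lags]
    return lags + positive


print(dlm_row(31))  # period of the large positive shock
print(dlm_row(18))  # two periods after the small negative shock
```

In such a design matrix, the coefficients on the last K+1 columns correspond to the interaction coefficients whose running sum measures the cumulative asymmetry.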
Figure 2.2: Cumulative response to shocks
[plot of cumulative asymmetry since the shock (0 to 1) against periods after shock (0 to 4)]
Notes: Cumulative response after $K$ periods. Dots refer to $\sum_{k=0}^{K} c_{t-k}$. Lines represent 95% confidence intervals.
We estimate model (2.10) with Ordinary Least Squares (OLS) regressions in
a step-wise manner. Figure 2.2 reports the estimated asymmetry for K= 4.21
21We report the full set of results in Appendix B.3. All estimations employ robust standard
errors that are clustered at the market level. We also include a set of indicator variables that are
specific to each group size (i.e., $I[N = s]$), the lagged change in the average price of rival sellers
(i.e., $\Delta p_{-i,t-1}$), a three-way interaction term (between $I[\Delta mc_{t-k} > 0]$, $\Delta mc_{t-k}$ and group size
specific indicator variables) and autoregressive terms amongst the set of regressors to check the
robustness of estimates. The significance of asymmetry coefficients as well as their magnitude are
Estimates indicate that the APT is both strong and persistent. Immediate price
reactions are 32.9 cents greater in magnitude for positive than for negative 80-cent
shocks.
We now assess the reaction of prices to equally sized shocks between our treatment
groups with non-parametric tests. We compare the immediate pass-through rates
of shocks, $\beta^{+}_{0}$ and $\beta^{-}_{0}$, calculated from:
\[
\begin{aligned}
p^{+}_{i,t+\tau} &= p^{+}_{i,t-1} + \beta^{+}_{\tau} \eta^{+} \\
p^{-}_{i,t+\tau} &= p^{-}_{i,t-1} + \beta^{-}_{\tau} \eta^{-}
\end{aligned}
\tag{2.11}
\]
where $\eta^{+}$ ($\eta^{-}$) reflects either the large or small positive (negative) shock and $t-1$
corresponds to the period just before the shock. Note that aggregate market demand
implied by (2.1) is perfectly inelastic, so that shifts in the NE price following cost
shocks are exactly equal to the magnitude of the cost shock itself. Accordingly, we
can denote the cases $\beta^{+}_{0} = 1$ and/or $\beta^{-}_{0} = 1$ as incidences of “full pass-through” of
cost shocks. The ratio of $\beta^{+}_{0}$ and $\beta^{-}_{0}$ thus conveys information on the degree of APT
in immediate cost-shock responses. A ratio of 1 would indicate the absence of APT.
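Solving (2.11) for the pass-through rate gives the price change divided by the cost shock. The Python sketch below illustrates this computation under one assumption that should be flagged: the negative shock enters as a signed (negative) number, so that a price decrease after a cost decrease yields a positive rate. The prices are hypothetical and are not taken from the experimental data; the function name is illustrative.

```python
def pass_through(price_before, price_after, shock):
    """Pass-through rate implied by (2.11): price change per unit of signed shock."""
    return (price_after - price_before) / shock


# Hypothetical prices of one seller around two large shocks.
beta_minus = pass_through(price_before=2.40, price_after=2.15, shock=-0.80)
beta_plus = pass_through(price_before=2.00, price_after=2.60, shock=+0.80)

# A ratio above 1 indicates a stronger immediate response to the positive shock.
asymmetry_ratio = beta_plus / beta_minus
print(round(beta_minus, 4), round(beta_plus, 4), round(asymmetry_ratio, 2))
```

A rate of 1 in either direction corresponds to full pass-through, that is, a price change equal to the shift in the NE price.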
Table 2.2 provides the average values of pass-through rates for different aggregation
levels.22 First, we note that the hypothesis of full pass-through can generally be
rejected. Exceptions consist of the small positive shock (i.e., $\eta^{+} = 0.40$) and $N = 6$
for the large positive shock. Second, we test APT in the immediate post-shock
responses by testing the equality of immediate pass-through rates for equally sized
shocks, $H_0: \beta^{+}_{0} = \beta^{-}_{0}$, via Wilcoxon signed-rank tests. The pooled data and the
data for groups of size greater than 2 suggest rejecting the null. For groups of size
3, we reject symmetry for the smaller but not for the larger shock.
robust to the inclusion of these variables.
22We report the results of an identical analysis for $\tau = 14$ in Appendix B.3. There, the
asymmetry in pass-through rates diminishes but does not disappear entirely even 14 periods after the
shock.
Table 2.2: Asymmetry in the immediate pass-through rates

                All        N>2        N=2        N=3        N=4        N=6        N=10
Small shocks
  β−_0          0.159      0.115      0.411      0.558      0.158      -0.144     0.0154
                (0.0676)   (0.0739)   (0.162)    (0.247)    (0.138)    (0.130)    (0.0985)
  β+_0          1.119      1.270      0.244      1.305      1.209      1.233      1.322
                (0.0672)   (0.0659)   (0.196)    (0.172)    (0.121)    (0.121)    (0.123)
  p-value       0.000      0.000      0.411      0.028      0.000      0.000      0.000
Large shocks
  β−_0          0.324      0.313      0.391      0.303      0.432      0.206      0.304
                (0.0362)   (0.0405)   (0.0734)   (0.107)    (0.0811)   (0.0706)   (0.0712)
  β+_0          0.639      0.718      0.181      0.537      0.779      0.945      0.619
                (0.0396)   (0.0400)   (0.111)    (0.116)    (0.0636)   (0.0796)   (0.0643)
  p-value       0.000      0.000      0.035      0.202      0.013      0.000      0.006
Observations    245        209        36         39         52         48         70

Notes: The averages of pass-through rates by differing group sizes are reported. Below averages, standard errors
are reported in parentheses. p-values correspond to the result of the Wilcoxon signed-rank test on the equality of
pass-through rates for small or large shocks (i.e., $H_0: \beta^{+}_{0} = \beta^{-}_{0}$).
For duopolies, we see that the asymmetry is reversed; the average price response
following the larger cost shock is significantly greater for the negative than for the
positive cost shock. Reflecting on the differences in incentives to deviate calculated
in (2.9) from Section 2.4, and noting from Figure 2.1 that pre-shock prices were
generally much higher than the NE price immediately prior to shocks in duopoly
markets, we see that the relative incentive to deviate after negative shocks was much
stronger in duopoly markets than in other markets. We explore the possibility that
this relatively higher incentive to deviate in duopoly markets may have led to this
divergent outcome.
Taken together with the estimates of the DLM, we reach the first two results of
this chapter:
Result 1.1: Prices do not react symmetrically to equally sized positive and
negative shocks.
Result 1.2: Price reactions in triopoly and larger markets are consistent with
positive APT. Price reactions in duopoly markets are consistent with negative APT.
2.5.2 Market Power
We now turn to our second hypothesis. We follow the literature in applying the
Lerner index as the relevant measure of market power: $L_{i,t} = \frac{p_{i,t} - mc_t}{p_{i,t}}$ (Lerner, 1934).
We propose that the difference between the observed Lerner index (i.e., $L_{i,t}$) and the
“theoretical” Lerner index, that is the index that would be relevant if behavior was
consistent with NE predictions (i.e., $L^{NE}_t = \frac{p^{NE}_t - mc_t}{p^{NE}_t}$), provides a measure of “excess”
market power due to collusion. We further propose this as an appropriate measure of
tacit collusion, as our price competition structure incorporates homogeneous goods
and we control marginal costs. Thus, we do not suffer the identification problem of
observational studies. Our measure of excess market power can then be expressed
as:
\[
L^{x}_{i,t} = L_{i,t} - L^{NE}_t = mc_t \left( \frac{1}{p^{NE}_t} - \frac{1}{p_{i,t}} \right). \tag{2.12}
\]
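The measure in (2.12) is straightforward to compute. The Python sketch below assumes the calibration of Table 2.1 and the NE price from (2.5); the function names and the example price of 2.40 are illustrative and not taken from the data.

```python
DELTA, GAMMA, P_MAX = 8.50, 7.275, 3.0  # calibration of Table 2.1


def lerner(price, mc):
    """Lerner index: markup over marginal cost as a share of price."""
    return (price - mc) / price


def excess_market_power(price, mc):
    """Equation (2.12): observed Lerner index minus its NE counterpart."""
    p_ne = mc + DELTA / GAMMA
    return lerner(price, mc) - lerner(p_ne, mc)


mc = 0.90
p_ne = mc + DELTA / GAMMA
print(round(excess_market_power(2.40, mc), 4))   # a price above the NE
print(round(excess_market_power(p_ne, mc), 4))   # zero at the NE price
print(round(excess_market_power(P_MAX, mc), 4))  # upper bound for this round
```

The measure is zero at the NE price, positive above it, and bounded from above by its value at the maximum price.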
Figure 2.3 depicts the average of our measure of excess market power, by pe-
riod and treatment. Upon visual examination one can immediately see that excess
market power generally lies above the theoretical ”Nash” level, consistent with an
environment in which tacit collusion exists. Also, this average measure reaches its
highest levels during the second and fourth rounds, the two rounds that immediately
follow negative shocks. Following the large positive shock at the beginning of the
third round, excess market power falls so much that it turns negative for several pe-
riods. Following the small positive shock at the beginning of the fifth round, excess
market power does not notably react.
We test the veracity of these observations by performing OLS regressions.23 We
consider the following specification:
\[
L^{x}_{i,t} = \alpha + \sum_{s \neq 2} \delta_s \cdot I[N = s] + \sum_{e \neq 1} \gamma_e \cdot I[r = e] + \epsilon_{i,t}, \tag{2.13}
\]
where the excess market power of seller iin period tis modeled as a function of
group size- and round-specific indicator variables. According to our null hypothesis,
the model with no independent variables should fit the data as well as this model.
23The non-parametric counterpart of this test is reported in Appendix B.3.
Figure 2.3: Excess market power across periods and group size
(a) Average of excess market power [plot of excess market power against periods 0 to 75; series: Excess, Max, Nash]
(b) Excess market power by group size [plot of excess market power against periods 0 to 75; series: N=2, N=3, N=4, N=6, N=10, Max, Nash]
Notes: Excess market power across periods and group sizes. In both subfigures, “Max” refers to the maximum
excess market power that can be observed (i.e., when $L_{i,t} = L^{max}_t = \frac{p_{max} - MC_r}{p_{max}}$) and “Nash” refers to the case
$L_{i,t} = L^{NE}_t$.
Table 2.3: Excess market power

                (1)         (2)         (3)         (4)         (5)
Constant        0.032***    0.060***    0.025***    0.054***    0.004
                (0.004)     (0.011)     (0.004)     (0.011)     (0.007)
N = 3                       -0.039*                 -0.039*     0.018
                            (0.015)                 (0.015)     (0.020)
N = 4                       -0.031*                 -0.031*     0.028**
                            (0.014)                 (0.014)     (0.010)
N = 6                       -0.026                  -0.026      0.047***
                            (0.014)                 (0.014)     (0.010)
N = 10                      -0.038**                -0.038**    0.029*
                            (0.012)                 (0.012)     (0.011)
r = 2                                   0.017***    0.017***
                                        (0.003)     (0.003)
r = 3                                   -0.014***   -0.014***   -0.084***
                                        (0.004)     (0.004)     (0.009)
r = 4                                   0.023***    0.023***    0.028***
                                        (0.004)     (0.004)     (0.006)
r = 5                                   0.005       0.005       -0.020**
                                        (0.004)     (0.004)     (0.007)
Observations    18355       18355       18355       18355       980
Adjusted R2     -           0.052       0.054       0.106       0.225
F-statistic     -           2.607       31.289      19.288      27.003

Notes: Results of OLS regressions on specification 2.13 are reported. In model (5), the dependent
variable is the change in excess market power, $\Delta L^{x}_{i,t}$. Below estimates, robust standard errors
that are clustered at the market level are reported in parentheses. * p < 0.05, ** p < 0.01, ***
p < 0.001
Table 2.3 reports the estimates in a step-wise manner. In model (5), we truncate
the data to the periods where shocks shift the marginal cost (i.e., periods 16, 31,
46 and 61) and replace the dependent variable with the change in excess market
power, $\Delta L^{x}_{i,t}$. This allows us to interpret the estimates of round-specific indicator
variables as the immediate effect of cost shocks on tacit collusion in model (5).
First, we reject the null hypothesis in all specifications except (2) at a significance
level of 0.01 with the F-test. The fact that the constant $\alpha$ is positive and
significant in model (1) indicates the overall presence of tacit collusion. Second, the
coefficients of round-specific indicator variables in model (3) indicate that tacit collusion
is higher during the second and fourth rounds, and lower during the third
round relative to the first round. In model (5), where we truncate the data, the
coefficients of rounds 3 and 5 are negative and that of round 4 is positive. Furthermore,
we reject the hypothesis $H_0: \alpha + \delta_s + \gamma_e = 0$ at a significance level of 0.05
(i.e., $\alpha + \delta_{\{N=3,4,10\}} + \gamma_5 = 0$). We can thus say that immediately after a negative
(the large positive) shock, the excess market power increases (decreases). Third,
coefficients of group size specific indicator variables are negative and significant, although
only at the marginal level for $N = 6$ (p-value = 0.071) in model (2). Here, we also
reject the hypothesis $H_0: \alpha + \delta_s = 0$ for all $s$ (p-value < 0.01). This suggests that
tacit collusion is present in all markets but its magnitude is smaller when $N > 2$.
The sign of these coefficients in model (5) suggests that markets larger than size
3 increase their market power in response to the first negative shock. Lastly, we
generally reject the hypothesis $H_0: \alpha + \delta_s + \gamma_e = 0$ in the most unrestricted model
(4) (11 times out of 15 tests at p-value < 0.05). The overall interpretation of these
tests provides the basis of our second result:
Result 2: Excess market power (i.e., tacit collusion) is not invariant to shock
direction and group size. It is persistently higher in duopolies, and in larger-sized
markets it rises following negative cost shocks.
2.5.3 Deviations from Best-Response
Finally, we assess the deviation of subjects’ prices from the profit-maximizing best-
response action that is computed by using the submitted expectations/guesses (i.e.,
pi,t −pBR|E
i,t ). Conditional on the subjects accurately reporting their expectations,
they risk lower profits if they fail to best-respond to these reported expectations.
Consequently, the observed deviations can be attributed either to error or alterna-
tively to strategic motives (i.e. attempts to collude). To argue that the deviations
we observe in our experiments are not entirely due to erroneous behavior, we com-
pare the magnitudes and directions of such deviations to the average magnitude of
expectation errors (i.e., Ei,t−1[p−i,t]−p−i,t) and the average of absolute expectation
errors.24 If subjects deviate from their best-response action in a way that is differ-
ent from their expectation errors, this suggests that deviations reflect intentional
behavior.
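To make the two measures concrete, the following minimal sketch computes both for a single observation. It assumes a linear demand of the form q_i = δ − γ(p_i − p_{−i}) and a constant marginal cost, so that profit is (p_i − cost)·q_i and the best response to a guess about the rivals' average price has a closed form. The parameter values and function names are hypothetical, chosen only for illustration, and are not the experimental parameters; price bounds are ignored.

```python
def best_response(expected_rival_price, cost, delta, gamma):
    """Profit-maximizing price given a guess about the rivals' average price.

    Assumes profit (p - cost) * (delta - gamma * (p - expected_rival_price));
    setting the derivative with respect to p to zero gives the formula below.
    """
    return 0.5 * (delta / gamma + cost + expected_rival_price)


def deviation_from_best_response(price, expected_rival_price, cost, delta, gamma):
    """p_{i,t} - p^{BR|E}_{i,t}: positive values mean pricing above the best response."""
    return price - best_response(expected_rival_price, cost, delta, gamma)


def expectation_error(expected_rival_price, realized_rival_price):
    """E_{i,t-1}[p_{-i,t}] - p_{-i,t}: positive values mean the guess was too high."""
    return expected_rival_price - realized_rival_price


# Hypothetical illustration (not the experimental parameters).
delta, gamma, cost = 8.0, 8.0, 0.90
guess, realized, price = 2.00, 1.90, 2.10

br = best_response(guess, cost, delta, gamma)
dev = deviation_from_best_response(price, guess, cost, delta, gamma)
err = expectation_error(guess, realized)
print(f"best response = {br:.3f}, deviation = {dev:.3f}, expectation error = {err:.3f}")
```

In this illustration the seller prices above the best response to her own guess even though the guess itself is too high, which is the kind of pattern that the comparison of deviations and expectation errors is meant to separate.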
Figure 2.4(a) depicts the average value of these deviations over time. The average
expectation errors are remarkably close to zero, with no obvious trend across periods.
Although this suggests that beliefs are on average correct, it does not imply the
complete absence of errors: the average measure of absolute expectation errors lies
well above zero throughout the experiment. The latter peaks following both positive
24We label these latter two as "errors" rather than as "deviations" as there is no strategic benefit
to knowingly submitting inaccurate guesses/expectations in our experiment.
Figure 2.4: Deviations from best-response action and errors in expectations
(a) Average of different deviations across all markets (horizontal axis: Periods; vertical axis: Deviation; series: Deviation from best-response, Exp. Errors, Absolute Exp. Errors)
(b) The deviations from best-response action by group size (horizontal axis: Periods; vertical axis: Deviation from best-response; series: N=2, N=3, N=4, N=6, N=10)
and negative cost shocks but subsequently trends downward. The patterns of high
initial absolute expectation errors and slow convergence are consistent with those
of prior experiments in which prices are strategic complements (e.g., Hommes et al.
2005; Cooper et al. 2021). However, deviations from the best-response action reveal
a different and rather interesting pattern: they peak sharply following negative
shocks and remain high during these rounds, but do not peak similarly following
positive shocks. The second graph in Figure 2.4 depicts the average of deviations
from best-response action by group size. The same pattern can be traced across our
treatment groups.
We perform OLS regressions to study deviations from best-response. Consider
the following specification:
$$p_{i,t} - p^{BR|E}_{i,t} = \alpha + \sum_{s \neq 2} \delta_s \cdot I[N = s] + \sum_{e \neq 1} \gamma_e \cdot I[r = e] + \epsilon_{i,t} \tag{2.14}$$
where the deviation of subject i’s price from the best-response action conditional
on the submitted guess is modeled as a function of group size- and round-specific
indicator variables. Our null hypothesis concerns the overall significance of this
model.
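A specification of this kind, with duopolies and round 1 as the omitted categories (as in Table 2.4), can be estimated by ordinary least squares on a matrix of indicator variables. The sketch below is a minimal numpy illustration on simulated placeholder data, not the experimental data. The variable names, the data-generating process, and the omission of small-sample corrections in the cluster-robust covariance are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated placeholder panel: 20 hypothetical markets, 5 rounds, 3 observations per round.
group_sizes = [2, 3, 4, 6, 10]
market, size, rnd = [], [], []
for m in range(20):
    for r in range(1, 6):
        for _ in range(3):
            market.append(m)
            size.append(group_sizes[m % 5])
            rnd.append(r)
market, size, rnd = np.array(market), np.array(size), np.array(rnd)

# Hypothetical data-generating process with a market-level shock (hence clustering).
market_shock = rng.normal(0.0, 0.05, 20)[market]
y = (0.25 - 0.15 * (size > 2) + 0.10 * (rnd == 4)
     + market_shock + rng.normal(0.0, 0.10, market.size))

# Design matrix: constant, I[N = s] for s != 2, I[r = e] for e != 1.
names = ["const"] + [f"N={s}" for s in (3, 4, 6, 10)] + [f"r={e}" for e in (2, 3, 4, 5)]
X = np.column_stack(
    [np.ones(y.size)]
    + [(size == s).astype(float) for s in (3, 4, 6, 10)]
    + [(rnd == e).astype(float) for e in (2, 3, 4, 5)]
)

# OLS point estimates and residuals.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Cluster-robust sandwich covariance (clusters = markets, no small-sample correction).
bread = np.linalg.inv(X.T @ X)
meat = np.zeros((X.shape[1], X.shape[1]))
for m in np.unique(market):
    score = X[market == m].T @ resid[market == m]
    meat += np.outer(score, score)
cov = bread @ meat @ bread
se = np.sqrt(np.diag(cov))

for name, b, s in zip(names, beta, se):
    print(f"{name:>6}: {b: .3f} ({s:.3f})")
```

Statistical packages typically apply additional degrees-of-freedom adjustments to the clustered covariance, so standard errors from this sketch would differ slightly from packaged output.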
Table 2.4 reports the estimates in a step-wise manner. In model (5), we truncate
the data to the periods where shocks shift marginal cost the same way as we did in
the previous section, and replace the dependent variable with the change in deviation
from best-response action following a cost shock.
We reject the null hypothesis in all specifications. The hypotheses $H_0: \alpha + \gamma_e = 0$
in model (2) and $H_0: \alpha + \delta_s = 0$ in model (3) can be rejected at the 1% significance
level (p < 0.01). This points to the following two results: (i) sellers deviate more (less)
from the associated best-response action following negative (positive) shocks and
(ii) deviations are lower when N > 2. The signs of the group size indicator
coefficients are flipped between models (4) and (5). In model (4), they reflect the fact
that groups of size 3 and larger deviate less, on average, relative to duopolies. In
model (5), they correspond to the immediate reaction of these groups to the first
negative shock. These deviations rise further when a large negative shock shifts
the marginal cost down ($\hat{\gamma}_4 = 0.131$), while they drop significantly in response to the
large positive shock ($\hat{\gamma}_3 = -0.260$). Consequently, we reach the following results:
Result 3.1: Sellers deviate on average above their best-response action.
Result 3.2: Deviations from the best-response action grow (shrink) following
negative (positive) shocks.
Table 2.4: Deviations from best-response
                (1)         (2)         (3)         (4)         (5)
Constant         0.120***    0.083***    0.248***    0.212***    0.044
                (0.013)     (0.012)     (0.041)     (0.040)     (0.028)
N = 3                                   -0.159**    -0.159**     0.122*
                                        (0.049)     (0.049)     (0.055)
N = 4                                   -0.138**    -0.138**     0.163**
                                        (0.050)     (0.050)     (0.051)
N = 6                                   -0.128*     -0.128*      0.210***
                                        (0.051)     (0.051)     (0.041)
N = 10                                  -0.170***   -0.170***    0.144**
                                        (0.044)     (0.044)     (0.046)
r = 2                        0.080***                0.080***
                            (0.010)                 (0.010)
r = 3                       -0.028**                -0.028**    -0.260***
                            (0.010)                 (0.010)     (0.031)
r = 4                        0.112***                0.112***    0.131***
                            (0.015)                 (0.015)     (0.025)
r = 5                        0.018                   0.018      -0.118***
                            (0.013)                 (0.013)     (0.024)
Observations     18355       18355       18355       18355       980
Adjusted R²      -           0.044       0.051       0.095       0.170
F-statistic      -           37.840      4.013       19.546      40.442
Notes: Results of OLS regressions on specification 2.14 are reported. In model (5), the dependent
variable is the change in deviation from the best-response action following a cost shock,
$\Delta(p_{i,t} - p^{BR|E}_{i,t})$. Robust standard errors are clustered at the market level and are reported in parentheses.
* p < 0.05, ** p < 0.01, *** p < 0.001
2.6 Discussion
Our results point to the co-appearance of asymmetric price transmission and tacit
collusion. The latter seems to be the result of strategic behavior, as our analysis of
deviations from best-response action reveals. These findings are consistent with the-
ories that cast tacit collusion as having a significant role in the emergence of APT,
such as the trigger price model in Borenstein et al. (1997). Most of the other theo-
retical explanations of APT in the literature cannot account for the pricing behavior
observed in our results. We can reasonably exclude, for example, the influence of
explicit collusion (i.e., involving direct communication), capacity constraints, inven-
tory limitations, (a)symmetric menu costs, consumer loss aversion, (a)symmetric
search costs, contexts of alternating price moves and price lockup periods, and so
forth, as being necessary conditions for APT, since these features are excluded by
our design.
We cannot, however, claim a monotonic relation between the magnitude of APT
and tacit collusion: the pricing behavior of duopolies in our experiment is revealed
to be consistent with 1) a negative APT response in the immediate aftermath of
shocks, and 2) elevated and relatively stable pricing behavior between shocks. Our
proposed explanation for the exceptionality of the duopoly result follows, tackling
the latter result first: in duopolies, collusion is so strong that sellers are, by and large,
able to maintain cooperative (tacitly collusive) pricing over a sustained period of
time, with pricing showing no reversion to Nash. We therefore argue that APT
requires significant, but imperfect, collusion.25
The success of duopoly markets in maintaining average prices well above equi-
librium levels between shocks, and well above average price levels of triopoly and
larger markets, provides a plausible explanation for the negative APT result we ob-
served with duopolies for large shocks: as is apparent in an examination of (2.9), the
high price deviations above NE prices in the lead-up to shocks in duopoly markets
imply a correspondingly greater incentive for firms to deviate downward following
negative than following positive shocks. Thus we have the interesting result that while the
dynamics of our duopoly markets appear to have enabled more success in attempts
to tacitly collude between shocks, this success (in the form of elevated prices) also
made these markets relatively more prone to respond strongly to downward than
upward cost shocks.26
If tacit collusion is indeed a significant causal force behind APT, then our work
has important implications for antitrust enforcement policy against collusion and
price-fixing. In particular, regulators may consider APT in a market as a signal for
25The fact that our duopolies reached almost stable collusion, while larger markets did not, is
consistent with the literature we review in Section 2.2. This can be attributed to a combination
of two factors: first, coordination between market participants becomes increasingly difficult with
each new seller, and three may well be the number from which the difficulties and costs involved in
maintaining coordination start to exceed the marginal benefits; second, our duopolies are unique in
that each participant can deduce the choices made by the other participant by observing aggregate
market outcomes. In a triopoly or larger market, by contrast, it is not possible for sellers to detect
whether an aggregate market outcome is due to the defection of a single competitor or to a
broader but shallower defection by multiple competitors.
26As we noted in the Results section, our duopoly markets exhibit pricing behavior consistent
with negative APT, while larger markets’ behavior was consistent with positive APT. We see these
apparently contradictory results as indicating a tension between two phenomena operating in op-
posite directions: on the one hand, collusive dynamics encourage competitors to respond more
sharply to upward than to downward cost shocks, as we have argued throughout the chapter; how-
ever, as prices rise above NE levels the differential incentives to deviate become stronger following
negative shocks than positive shocks. At some point the success of tacit collusion may be so great
as to reverse the direction of the resulting APT.
the presence of collusion between firms in that market. Since many real-world inter-
actions between competitive firms are repeated indefinitely, such collusion may even
be sustainable as a NE. Further research is needed to determine whether collusion
is an important cause of APT behavior in field settings, and if so, whether suit-
able forms of regulatory intervention might exist to reduce such collusion without
increases in inefficiency.
We propose that follow-on research may yield further insights into the mecha-
nisms through which tacit collusion leads to APT, as well as potential policy re-
sponses that might diminish its frequency and magnitude. In particular, future ex-
periments should address the impact of different levels of information transparency.
Most notably, testing the effects of providing feedback on individual prices and/or
payoffs of rivals on APT may provide particularly helpful insights. The latter is
shown to lead to more rivalistic outcomes in experimental oligopoly studies, as it
initiates imitation dynamics (Fiala and Suetens, 2017), while the former can lead
to more collusive levels. Nevertheless, both may reduce the degree of asymmetry
in price transmission. Another area of needed research is to explore the roles of
market power and market concentration in shaping APT pricing behavior. In our
experimental design we explicitly kept the incentives provided to sellers the same
across markets of varying sizes to study the pure number effect, similar to Hanaki and
Masiliūnas (2021). Finally, future studies may benefit from testing the robustness
of our findings to alternative demand specifications. The parameters of demand in
our experimental markets are consistent with goods such as retail gasoline, which
are demanded inelastically but for which suppliers face elastic demand. Our results
are also more relevant to markets in which there are close substitutes.
Appendix B
Appendix
B.1 Quadratic Utility and Linear Demand
Here we derive a linear demand system from an assumption of quasi-linear quadratic
utility.1 Following the notation and proof in Amir et al. (2017), we have:
$$U(x, y) = a'x - \frac{1}{2} x'Bx + y,$$
where $a$ is a strictly positive vector of size $n$, $B$ is a positive-definite $n \times n$ matrix, $x$ is
an $n$-vector representing quantities of goods, and $y$ is the quantity of the numeraire
good with price $p_y = 1$.
Because the matrix $B$ is positive definite, $B^{-1}$ exists and is also positive definite.
Then, imposing the standard budget restriction $p'x + y \le m$ with exogenous price
vector $p$ and budget $m$, and assuming the interiority condition $B^{-1}(a - p) > 0$ and the feasibility
condition $p'B^{-1}(a - p) \le m$, we arrive at a system of linear demand functions:
$$q(p) = B^{-1}(a - p). \tag{B.1}$$
Now, to match the market environment we model in our experiment, we impose
restrictions on $a$ and $B$: we assume $a_i = a$, $b_{ii} = b$, and $b_{ij} = d$ $\forall i, j \in [1, n]$, $i \neq j$,
for some strictly positive constants $a$, $b$, and $d$, with $b > d$ to ensure $B$ is positive
definite. These assumptions are equivalent to assuming that the utility derived from
consumption of each good $x$ is symmetric both in terms of own- and cross-product
parameter values. Intuitively and as we will see, this leads to symmetric (linear)
demand functions for each good $x$.2
1We thank an anonymous reviewer for comments that inspired the approach followed in this
appendix.
2As Amir et al. (2017) point out, our use of the term "linear demand function" is a slight abuse
of terminology. More correctly we have an affine function whenever the implied result is positive
and zero otherwise.
To explore this further and apply it to our specific demand specification, we make
several definitions and impose the following additional restrictions on $a$, $b$, and
$d$:
• Define parameters $\delta$ and $\gamma$: $\delta, \gamma \in \mathbb{R}_{++}$
• Define $p_i$ as the $i$th element of price vector $p$ and $p_{-i} \equiv \frac{1}{n-1} \sum_{j \neq i}^{n} p_j$ as the
average of the other $n - 1$ elements in the price vector
• Assume $n \in \mathbb{Z}$, $n \ge 2$
• Restrict $a = \delta n d$
• Restrict $b = d + \frac{n-1}{n\gamma}$
• For compactness of notation and clarity, define $\Delta \equiv \frac{n-1}{n\gamma} \cdot \frac{1}{d} \Rightarrow b = d(1 + \Delta)$
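Next, we impose these restrictions on (B.1) in order to show that:
$$\lim_{d \to \infty} q_i(p; d, n, \delta, \gamma) = \delta - \gamma (p_i - p_{-i}).$$
To do so, we first need an explicit expression for $B^{-1}$. Noting that $b = d(1 + \Delta)$, we can rewrite $B$ as
$d \cdot \tilde{B}$, where the diagonal elements of $\tilde{B}$ are all $1 + \Delta$ and the off-diagonal elements
are all 1. For example, to illustrate with $n = 4$:
$$\tilde{B} = \begin{pmatrix}
1 + \Delta & 1 & 1 & 1 \\
1 & 1 + \Delta & 1 & 1 \\
1 & 1 & 1 + \Delta & 1 \\
1 & 1 & 1 & 1 + \Delta
\end{pmatrix}$$
This leads to a specification of $B^{-1} = d^{-1} \cdot \tilde{B}^{-1}$, with:
$$\tilde{B}^{-1} = \frac{1}{\Delta(\Delta + n)} \begin{pmatrix}
n - 1 + \Delta & -1 & -1 & -1 \\
-1 & n - 1 + \Delta & -1 & -1 \\
-1 & -1 & n - 1 + \Delta & -1 \\
-1 & -1 & -1 & n - 1 + \Delta
\end{pmatrix}.$$
We then apply these restrictions to (B.1) and have:
$$q(p) = B^{-1}(a - p) = \frac{1}{d} \tilde{B}^{-1}(a - p).$$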
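Then, the demand for an arbitrary good $i$ can be represented as:
$$
\begin{aligned}
q_i(p; d, n, \delta, \gamma) &= \frac{1}{d} \cdot \frac{1}{\Delta(\Delta + n)} \left[ (n - 1 + \Delta)(a - p_i) + (-1) \sum_{j \neq i}^{n} (a - p_j) \right] \\
&= \frac{1}{d\Delta(\Delta + n)} \left[ \Delta(a - p_i) + (n - 1)(a - p_i) - (n - 1)(a - p_{-i}) \right] \\
&= \frac{a - p_i}{d(\Delta + n)} - \frac{(n - 1)(p_i - p_{-i})}{d\Delta(\Delta + n)}
\end{aligned}
$$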
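Substituting in $a = \delta n d$ and $\Delta = \frac{n-1}{n\gamma d}$, we have:
$$q_i(p; d, n, \delta, \gamma) = \frac{\delta n d - p_i}{d\left(\frac{n-1}{n\gamma d} + n\right)} - \frac{(n - 1)(p_i - p_{-i})}{d \, \frac{n-1}{n\gamma d} \left(\frac{n-1}{n\gamma d} + n\right)}$$
$$\Rightarrow q_i(p; d, n, \delta, \gamma) = \frac{\delta - p_i/(nd) - \gamma(p_i - p_{-i})}{\frac{n-1}{n^2\gamma d} + 1} \tag{B.2}$$
Now, evaluating this expression as $d \to \infty$, we see that $\lim_{d \to \infty} \left(\frac{n-1}{n^2\gamma d} + 1\right) = 1$
and $\lim_{d \to \infty} (p_i/(nd)) = 0$, which implies:
$$\Rightarrow \lim_{d \to \infty} q_i(p; d, n, \delta, \gamma) = \delta - \gamma(p_i - p_{-i}), \quad \text{QED.} \tag{B.3}$$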
We thus see that our model of linear demand, with own- and cross-price parameters
equal in magnitude, is consistent with an assumption of quadratic, quasi-linear
utility, albeit as a special case taken in the limit as parameters $a$, $b$, and $d$ tend towards
infinity along the path described above.
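The limiting argument can also be checked numerically. The sketch below builds the matrix $B$ under the restrictions $a = \delta n d$ and $b = d + \frac{n-1}{n\gamma}$, solves the demand system (B.1) directly, and compares the result with the limit in (B.3) for increasing $d$. The function names, the price vector, and the values of $\delta$ and $\gamma$ are hypothetical and chosen only for illustration.

```python
import numpy as np


def demand_quadratic_utility(p, d, delta, gamma):
    """q(p) = B^{-1}(a - p) with a = delta*n*d, b = d + (n-1)/(n*gamma), b_ij = d."""
    p = np.asarray(p, dtype=float)
    n = p.size
    a = delta * n * d
    b = d + (n - 1) / (n * gamma)
    B = np.full((n, n), d) + (b - d) * np.eye(n)  # diagonal b, off-diagonal d
    return np.linalg.solve(B, a - p)


def demand_limit(p, delta, gamma):
    """Limiting demand delta - gamma * (p_i - p_{-i}) from (B.3)."""
    p = np.asarray(p, dtype=float)
    p_others = (p.sum() - p) / (p.size - 1)  # average price of the other goods
    return delta - gamma * (p - p_others)


# Hypothetical prices and parameters, for illustration only.
prices = np.array([2.0, 1.8, 2.2, 1.9])
delta, gamma = 8.0, 8.0

for d in (1e1, 1e2, 1e4, 1e6):
    gap = np.max(np.abs(demand_quadratic_utility(prices, d, delta, gamma)
                        - demand_limit(prices, delta, gamma)))
    print(f"d = {d:>9.0f}: max |q - limit| = {gap:.2e}")
```

The printed gap should shrink as $d$ grows, in line with the two limits used in the last step of the derivation. For very large $d$ the matrix $B$ becomes ill-conditioned, so the check is only meaningful up to floating-point accuracy.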
B.2 Experimental Material
B.2.1 Instructions
Good morning, and thank you for agreeing to participate in this economics experi-
ment.
Earnings
As compensation, you will be paid a show-up fee of $5. In addition to the show-up
fee, you will have the opportunity to earn additional money. We anticipate that this
experiment will run around 90-100 minutes. The experiment consists of 5 rounds of
15 periods each, a total of 75 rounds. The computer will randomly select a number
between 1 and 5, corresponding to one of the 5 rounds of the experiment. At the
end of the experiment, your additional fee will be equal to the average payout during
that particular round. You will then be paid the added total of the show-up fee plus
the additional fee. You are free to leave at any time; if you do so you will still receive
the $5 show-up fee, but if you leave before the experiment is complete you will not
receive the additional fee. In all cases, your earnings will be paid individually and
anonymously.
Market Setup
In this experiment we will simulate markets, in which you and the other partic-
ipants each play the role of the CEO of a company that produces and sells a single
product in your particular market. You will be randomly grouped with X (n−1)
other companies (participants), and together the Y (n) of you will form this market.
You will stay matched with the same participants in your market for the duration
of the entire experiment. Each company is largely identical, faces the same identical
costs to produce each unit, and has the same profit function. The only thing that
differs between companies is the price they set for the product. Demand for your
product will be simulated by the computer, according to a formula shown on the
payoff sheet we have left at your workstation. The higher your price, the fewer units
you will sell. The higher the average price of the other participants in your market,
the more units you will sell.
During each period of the experiment, all Y (n) of you will be asked to set a
price at which you will each sell the product. You will not know anything about
the price the other participants set, until after you have set your own price. We will
then ask you to guess the average price the other participants set during that same
round. Finally, after all Y participants have set their own prices, we will show you
the average price set by the other X participants, and calculate and show you your
payoff for that particular round. You will be able to see the history of your prices,
the average prices of the other participants, and your resulting payoff for each of the
previous periods within each round, to help you make future pricing decisions.
How to Set Your Price and Predict Your Payout each Period
In the first round of 15 periods, you will face an input cost of $0.90 per unit
produced. “Input cost” is shorthand for the total costs of raw materials, labor,
etc., required to produce one unit of product. The payoff sheet at your workstation
corresponds to this particular input cost. Based on what you guess the average of
others’ costs to be, shown along the columns, you can see how much you will earn
for each potential price you would set, shown along the rows. For example, if the
average price other participants set is $1.50, and you set your price at $2.00, your
payoff would be $4.35. As another example, if the average other participants set
is $2.90 and yours is $2.30, your payoff would be $17.01. Note that the maximum
price you can set for your product is $3.00, because consumers in this market are
not willing to pay more than $3.00 for the product. We only show prices that are
multiples of $0.10, because otherwise the payoff sheet would be too large to print on
a single piece of paper. The price you set can be anywhere between the input price
and $3.00, in increments of $0.01. Finally, note that negative numbers are shown in
the payoff sheet in parentheses, so for example ($1.00) means minus one dollar.
Changes in Input Cost
In each round, there will be a change to the input costs each company faces.
Costs will increase or decrease by either $0.40 or $0.80. You will not know the size
or direction of the change until it is announced at the beginning of each round. At
that time, we will hand out a new payoff sheet that corresponds to the new input
cost. Make sure you use only the payoff sheet that corresponds to the current input
cost. Again, input costs will remain the same for all 15 periods of each round, but
will change at the beginning of each new round.
Please raise your hand if you have a question, and one of us will come to you
at your workstation. Please do not talk or discuss the experiment or anything else
with your neighbors, until after the experiment is complete.
B.2.2 Comprehension Questions
Before we begin the experiment, please complete the following questions. Just write
your answer on this sheet of paper. We will walk by each workstation and look at
your answers, to ensure you understand. If you have questions please ask us when
we come by, but DO NOT discuss or ask questions of your neighbors. If there is
anything needing clarification, we will announce it to everyone in the group at the
same time.
1) True/False: I will be rematched with different participants at the beginning
of each round:
2) If I keep my price the same from one period to the next, but the average price
others in my market set falls, my payout in that period will: (hint – look at the
payout chart and see what happens as you go from right to left on any given row)
a. Rise
b. Fall
c. Stay the Same
d. May Rise or Fall, or Stay the Same
3) If the average price of others in my market stays the same between periods,
but if I increase my price, my payout that period will: (hint – look at the payout
chart, but this time see what happens as you go from top to bottom on any given
column)
a. Rise
b. Fall
c. Stay the Same
d. May Rise or Fall, or Stay the Same
4) Use the Payoff Chart to answer the following questions:
a. If the average price others set in my market is $2.80, what is the price I could
set that would maximize my own payout that period?
$
b. What would be the amount of that payout?
$
B.2.3 Payoff Table and Experimental Interface
Figure B.1: Payoff matrix
Notes: Exemplary payoff matrix/table. Provided to subjects when marginal cost is equal to $0.90.
Figure B.2: Screen: Price setting
Notes: An example of the price-setting screen.
Figure B.3: Screen: Guess setting
Notes: An example of the guess-decision screen.
Figure B.4: Screen: Feedback
Notes: An example of the feedback screen.
B.3 Additional Analysis
B.3.1 Regression Results of Asymmetry
The regression estimates used in Figure 2.2 of the main text are reported in Table
B.1 under column (1). The remaining columns introduce control variables in a step-
wise manner. In all models, the dependent variable is the change in output price
(i.e., ∆pi,t =pi,t −pi,t−1).
Table B.1: Estimation of asymmetry
                    (1)         (2)         (3)         (4)         (5)
Δmc_t                0.497***    0.487***    0.227**     0.224**     0.769***
                    (0.050)     (0.051)     (0.066)     (0.066)     (0.070)
Δmc_{t-1}           -0.540***   -0.520***   -0.169      -0.165      -1.265***
                    (0.076)     (0.076)     (0.088)     (0.088)     (0.125)
Δmc_{t-2}            0.566***    0.527***    0.362***    0.352**     1.257***
                    (0.099)     (0.099)     (0.104)     (0.104)     (0.134)
Δmc_{t-3}           -0.295***   -0.265***   -0.172*     -0.164*     -0.582***
                    (0.063)     (0.063)     (0.065)     (0.064)     (0.077)
Δmc_{t-4}            0.063***    0.053**     0.030       0.028       0.102***
                    (0.016)     (0.016)     (0.016)     (0.016)     (0.018)
k = 0 (c_t)          0.412***    0.432***    0.583***    0.043       0.478***
                    (0.106)     (0.109)     (0.140)     (0.144)     (0.118)
k = 1 (c_{t-1})      0.152       0.113      -0.026      -0.036       0.004
                    (0.098)     (0.100)     (0.124)     (0.123)     (0.118)
k = 2 (c_{t-2})     -0.230      -0.152      -0.134      -0.113      -0.084
                    (0.122)     (0.125)     (0.141)     (0.140)     (0.140)
k = 3 (c_{t-3})      0.161*      0.102       0.082       0.067       0.067
                    (0.077)     (0.078)     (0.082)     (0.081)     (0.087)
k = 4 (c_{t-4})     -0.051*     -0.031      -0.024      -0.018      -0.008
                    (0.021)     (0.021)     (0.020)     (0.020)     (0.022)
N = 3                           -0.007***   -0.005***   -0.005      -0.010***
                                (0.002)     (0.001)     (0.003)     (0.003)
N = 4                           -0.007***   -0.005***   -0.008***   -0.009***
                                (0.001)     (0.001)     (0.002)     (0.002)
N = 6                           -0.009***   -0.007***   -0.012***   -0.013***
                                (0.002)     (0.001)     (0.002)     (0.003)
N = 10                          -0.008***   -0.006***   -0.007**    -0.011***
                                (0.002)     (0.001)     (0.002)     (0.003)
Δp_{-i,t-1}                                  0.287***    0.287***    0.347***
                                            (0.022)     (0.021)     (0.024)
Three-way interaction terms with immediate asymmetry
(N = 3) × c_t                                            0.492*
                                                        (0.192)
(N = 4) × c_t                                            0.680***
                                                        (0.142)
(N = 6) × c_t                                            0.828***
                                                        (0.154)
(N = 10) × c_t                                           0.561***
                                                        (0.143)
AR(4) included?      No          No          No          No          Yes
N                    17130       17130       17130       17130       17130
adj. R²              0.132       0.133       0.165       0.178       0.301
F-statistic          92.240      70.604      160.112     136.818     135.597
Notes: Robust standard errors are clustered at the market level and are reported in parentheses.
* p < 0.05, ** p < 0.01, *** p < 0.001
B.3.2 Pass-through Rates
Table B.2 reports the pass-through rates when τ= 14 for different aggregation levels
of data.
Table B.2: Asymmetry in the pass-through rates after 14 periods
                  All        N > 2      N = 2      N = 3      N = 4      N = 6      N = 10
Small shocks
$\beta^-_{14}$     0.881      0.957      0.483      1.004      1.002      1.088      0.749
                  (0.0543)   (0.0518)   (0.192)    (0.145)    (0.0764)   (0.114)    (0.0816)
$\beta^+_{14}$     0.599      0.729     -0.0806     0.783      0.815      0.479      0.836
                  (0.0518)   (0.0432)   (0.197)    (0.0809)   (0.0818)   (0.110)    (0.0564)
p-value            0.000      0.001      0.108      0.145      0.066      0.000      0.475
Large shocks
$\beta^-_{14}$     0.821      0.902      0.398      0.811      0.961      0.878      0.936
                  (0.0251)   (0.0205)   (0.0849)   (0.0598)   (0.0393)   (0.0417)   (0.0205)
$\beta^+_{14}$     0.802      0.866      0.465      0.637      0.978      1.002      0.796
                  (0.0297)   (0.0270)   (0.105)    (0.0765)   (0.0450)   (0.0430)   (0.0399)
p-value            0.064      0.042      0.955      0.021      0.331      0.156      0.000
Observations       245        209        36         39         52         48         70
Notes: The averages of pass-through rates by differing group sizes are reported. Below averages, standard errors are
reported in parentheses. p-values correspond to the result of the Wilcoxon signed-rank test on the equality of pass-
through rates for small or large shocks (i.e., $H_0: \beta^+_{14} = \beta^-_{14}$).
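For readers unfamiliar with the Wilcoxon signed-rank test, the following is a minimal pure-Python sketch of its two-sided version for paired observations, using the normal approximation with a tie correction. Whether the p-values reported here rely on an exact or an approximate distribution is not stated in the text, so this is an illustration of the technique rather than a replication. The function name and the paired pass-through rates in the usage example are hypothetical.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test for paired samples (normal approximation).

    Zero differences are dropped and tied absolute differences receive average ranks.
    Returns (W_plus, z, p_value), where W_plus is the sum of ranks of positive differences.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank absolute differences, averaging ranks within groups of ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_correction = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of ranks i+1, ..., j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_correction += t ** 3 - t
        i = j + 1

    w_plus = sum(r for r, diff in zip(ranks, diffs) if diff > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_correction / 48
    z = (w_plus - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w_plus, z, p_value


# Hypothetical market-level pass-through rates after negative and positive shocks.
beta_minus = [0.95, 1.02, 0.88, 1.10, 0.76, 0.99]
beta_plus = [0.70, 0.85, 0.91, 0.60, 0.80, 0.75]
w, z, p = wilcoxon_signed_rank(beta_minus, beta_plus)
print(f"W+ = {w:.1f}, z = {z:.3f}, p = {p:.3f}")
```

With samples as small as in this illustration the normal approximation is rough, and an exact permutation distribution would usually be preferred.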
B.3.3 Non-parametric Test on Excess Market Power
Table B.3 reports the average excess market power broken down by round and by group
size. Rows 7 and 8 report the p-values resulting from Wilcoxon signed-rank tests
on the average excess market power for small and large shocks, respectively. The
null hypothesis is that the averages of excess market power during rounds 2 and 5
(or 3 and 4) are equal.
Table B.3: Non-parametric test on excess market power
Rounds              All Markets   N > 2     N = 2     N = 3     N = 4     N = 6     N = 10
1                    0.025         0.021     0.051     0.019     0.020     0.029     0.017
2                    0.043         0.037     0.077     0.034     0.033     0.046     0.034
3                    0.011         0.008     0.029    -0.012     0.019     0.019     0.004
4                    0.048         0.042     0.083     0.038     0.044     0.052     0.038
5                    0.030         0.025     0.061     0.026     0.029     0.027     0.020
All Rounds           0.032         0.027     0.060     0.021     0.029     0.034     0.023
Small η (2 vs 5)     0.000         0.000     0.499     0.115     0.109     0.000     0.000
Large η (3 vs 4)     0.000         0.000     0.000     0.000     0.000     0.000     0.000
Observations         245           209       36        39        52        48        70
Chapter 3
Measuring Strategic-Uncertainty
Attitudes
An earlier version of this chapter is published as: Bruttel, L., Bulutay, M., Cornand,
C., Heinemann, F. and Zylbersztejn, A., 2023. Measuring strategic-uncertainty
attitudes. Experimental Economics, 26(3), pp.522-549.
3.1 Introduction
Strategic uncertainty is the uncertainty that players face with respect to the pur-
poseful behavior of other players in an interactive decision situation. While economic
theory mostly applies equilibrium concepts like Nash or rational expectations equi-
libria that are based on the absence of strategic uncertainty, experiments show that
real decision makers are sensitive to strategic uncertainty. Laboratory experiments
have indicated that many humans exhibit strategic uncertainty aversion: they are
ready to waive a part of their expected payoff in order to avoid having their payoff
depend on the decisions made by others.1 This behavioral phenomenon has
far-reaching consequences for economic efficiency, because it implies coordination
failures and suboptimal levels of investment and risk taking in markets.
From early experiments, we know that humans tend to prefer situations with
known probabilities of outcomes to “ambiguous” situations in which these probabilities
are unknown (Camerer and Weber, 1992). This attitude is called ambiguity
aversion. Tests of ambiguity aversion traditionally compare choices between lotter-
ies with given probabilities and lotteries for which the probabilities are exogenously
given but unknown to subjects. Ambiguity aversion might also apply to strategic
interaction. However, the beliefs about the strategic behavior of other humans are
also affected by the theory of mind: agents may put themselves in the shoes of other
1See, for example, Greiner (2016).
decision makers and form beliefs about their reasoning processes. This idea has been
taken to the extreme by the Nash equilibrium concept in which each player’s strat-
egy is a best response to the other players’ strategies. As a descriptive theory, Nash
equilibrium assumes that players are able to guess the strategies of others either by
simultaneously solving the others’ decision problems or by relying on experience (as
in repeated games). Such reasoning processes may reduce perceived strategic un-
certainty, so that strategic uncertainty aversion may have lower effects on behavior
than ambiguity aversion in lotteries with completely unknown probabilities. On the
other hand, strategic interactions are also more complex to analyze than lotteries.
Humans try to avoid complexity and may doubt the logical consistency of their own
reasoning processes or the logical consistency of other players’ reasoning processes
or decisions.
This chapter develops a method for measuring strategic-uncertainty attitudes
and distinguishing them from risk and ambiguity attitudes. The main idea is to
elicit and exploit the information contained in certainty equivalents (willingness to
accept) for lotteries under three different sources of uncertainty: strategic uncer-
tainty, risk and ambiguity. We provide a structural model of uncertainty attitudes
that allows us to measure two dimensions of uncertainty attitudes: a preference for,
or aversion against, the source of uncertainty, modelled by an additional [dis]utility
depending on the source, and optimism or pessimism2 regarding the outcome, which
we formalize as a shift of the subjective weight that is put on the higher outcome.
We conduct an experiment with interactive games and interaction-free lottery
tasks. Unlike previous experiments, our novel methodology allows for a variation
of the source of uncertainty (whether strategic or not) across conditions in a ceteris
paribus manner. This means that we keep the potential payoffs constant but consider
different mechanisms (random or strategic) that determine the realized payoff. Since
strategic uncertainty typically characterizes coordination problems, we focus on two
coordination games: one with strategic complementarities in agents’ actions and
one with strategic substitutability (anti-coordination). Following the literature on
strategic uncertainty (see below), we apply our methodology to two classic 2x2
games: stag-hunt and market-entry games.3
For the different sources of uncertainty – each of the two games, as well as the
corresponding ambiguous lottery environments – we identify two subject-specific
parameters of a model of uncertainty attitudes. We investigate two ways in which
2One might also interpret these as excitement or fear about the other player’s behavior.
3Stag-hunt games provide a useful paradigm to analyze a wide range of economic phenomena,
such as macroeconomic fluctuations (Cooper and John, 1988), bank runs, debt and liquidity crises,
speculative attacks (Morris and Shin, 2003; Heinemann, 2012), and commercial production processes
(Brandts et al., 2015). Market-entry games describe the prototypical situation of conflicting
interests, such as Cournot competition or location choice.
strategic uncertainty may affect behavior in a game. First, following Baillon et al.
(2017), we define ambiguity as a situation where subjects have information about
possible outcomes of a lottery but not about probabilities. Whether these unknown
probabilities, which are exogenous to the decision maker, result from human
decisions or from nature does not affect this definition of ambiguity. We investigate
whether, all other things being equal, attitudes towards uncertainty differ between
strategic uncertainty and ambiguity conditions. Second, strategic uncertainty is
related to conscious behavior of human players whose interaction exhibits common
or opposite interests, and as such involves decisions based on strategic thinking. We
study how the nature of the game (strategic complements versus substitutes) affects
these uncertainty attitudes.
We document systematic attitudes toward uncertainty. These attitudes vary
across contexts and across subjects. The median participant exhibits neither a
preference for, nor an aversion against ambiguity or strategic uncertainty. In the
game with strategic complements [substitutes], the median participant is found to be
pessimistic [optimistic] regarding the outcome that leads to a higher payoff given the
player’s own choice. Comparing uncertainty attitudes across treatments, we observe
more optimism in the entry game than in the stag-hunt game or under ambiguity
(both of which, in turn, generate similar results).
The next section describes our contribution to the literature. Section 3.3 presents
the experimental design and procedures. Section 3.4 lays out the theoretical under-
pinnings of our design. Section 3.5 shows the results and Section 3.6 concludes the
chapter.
3.2 Related Literature
Brandenburger (1996) defines strategic uncertainty as uncertainty about the pur-
poseful behavior of players in an interactive decision situation. Experimental evi-
dence reported in Beard and Beil (1994) can hardly be explained without assuming
that players dislike situations in which their payoffs depend on the decisions made
by other players. Camerer and Karjalainen (1994) attribute this behavior to ambi-
guity aversion, because there are no given probabilities for other players’ strategies.
They use non-additive probabilities as in Gilboa and Schmeidler (1989) to model
ambiguity aversion and argue that ambiguity aversion may be responsible for players
not reaching an efficient equilibrium in coordination games with strategic comple-
ments (like the median effort game). Camerer and Karjalainen (1994) conduct an
experiment on the median effort game, in which they elicit bounds on subjective
probabilities for complementary and exhaustive events defined on the outcomes of
the game. If the sum of these probabilities is smaller than one, a subject can be
said to be ambiguity averse. Unfortunately, their method of eliciting subjective
probabilities seems rather fragile as it may produce contradictory results and does
not allow a clear distinction between subjective beliefs about others’ behavior and
aversion against strategic uncertainty.
Greiner (2016) is the first to clearly identify aversion against strategic uncer-
tainty by comparing behavior in dictator, ultimatum, and impunity games. He
shows that behavior in these games indicates a substantial aversion against strate-
gic uncertainty that may be higher than ambiguity aversion. Subjects pay high
prices for avoiding that their payoff depends on the decisions of their partners, even
though they attribute high subjective probabilities to their partners’ decisions being
favorable for them.
Bohnet and Zeckhauser (2004) find similar evidence in a trust game, where the
second mover could either be another subject or a lottery. They attribute subjects’
reluctance to depend on human second movers to betrayal aversion, but strategic
uncertainty aversion might have played some role. Li et al. (2020) find that ambi-
guity preference affects the decision to trust a trustee. Note that the games used
by Greiner (2016) and Bohnet and Zeckhauser (2004) all have a unique equilibrium
and equilibrium choices of the second movers can be derived simply by eliminating
dominated strategies.
Kelsey and le Roux (2015) analyze behavior in an extended battle of the sexes
game and find further evidence indicating that strategic uncertainty aversion may
exceed ambiguity aversion in non-strategic games. They also conjecture that not
only strategic uncertainty, but also strategic uncertainty aversion may depend on
the nature of the game. However, they have no means to test this hypothesis.
Nevertheless, this conjecture has to be taken seriously, because Ivanov (2011) finds
that in a game that is solvable by iterative elimination of dominated strategies, 32
percent of subjects are strategic uncertainty loving, while only 22 percent are averse
to strategic uncertainty.
Nagel (1995) provides an experimental test of a game with strategic complements
and shows that behavior can be described by assuming that subjects follow distinct
levels of reasoning, where Level zero is defined as random choice of a strategy and
Level k is defined as best response to Level k-1. Camerer et al. (2004) develop a
cognitive hierarchy model based on levels of reasoning. Uncertainty about other
players’ strategies can be modelled as uncertainty about the levels of reasoning
applied by other players. In games with strategic complements, the number of
levels of reasoning is in a monotone relationship with actions and, thus, experiments
on such games can be used to measure the distribution of levels among players, but
also the beliefs about others’ levels of reasoning. In games with strategic substitutes,
however, the optimal strategy for a given number of levels of reasoning is non-
monotonic. In entry games, for example, the optimal decision is to enter for any odd
number of levels and to stay out for any even number of levels (or vice versa). This
raises the question whether perceived strategic uncertainty or strategic uncertainty
aversion differ between games with strategic complements and substitutes.
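The alternation described above can be illustrated with a minimal sketch. It assumes the row player's payoffs from Tables 3.1 and 3.2 of this chapter and a Level-0 player who mixes 50/50; the function names are hypothetical and the code is not part of the experimental software.

```python
# Row player's payoffs, indexed by (own action, other's action).
# Values are taken from Tables 3.1 (stag hunt) and 3.2 (entry game).
STAG_HUNT = {("L", "L"): 20, ("L", "R"): 15, ("R", "L"): 5, ("R", "R"): 25}
ENTRY = {("L", "L"): 5, ("L", "R"): 25, ("R", "L"): 20, ("R", "R"): 15}


def best_response(payoffs, prob_other_r):
    """Return the action with the higher expected payoff given P(other plays R)."""
    expected = {
        own: (1 - prob_other_r) * payoffs[(own, "L")]
        + prob_other_r * payoffs[(own, "R")]
        for own in ("L", "R")
    }
    return max(expected, key=expected.get)


def level_k_actions(payoffs, max_level):
    """Actions of Levels 1..max_level when Level 0 mixes 50/50 (an assumption)."""
    actions = []
    prob_other_r = 0.5  # Level 0: uniform random choice
    for _ in range(max_level):
        action = best_response(payoffs, prob_other_r)
        actions.append(action)
        # The next level best-responds to a player who surely plays `action`.
        prob_other_r = 1.0 if action == "R" else 0.0
    return actions


stag_levels = level_k_actions(STAG_HUNT, 6)
entry_levels = level_k_actions(ENTRY, 6)
print("stag hunt:", stag_levels)
print("entry    :", entry_levels)
```

Under these assumptions, every level chooses the same action in the stag-hunt game, whereas the chosen action flips with each additional level in the entry game.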
Heinemann et al. (2009) propose a method to measure strategic uncertainty in
coordination games with strategic complements. They let subjects play a variety
of games, each consisting of a choice between two options A and B. Option A is
associated with a safe payoff X, while Option B pays 15€ if at least a fraction k
of the other subjects choose B in the same game and zero otherwise. The
safe payoff was varied from 1.50€ to 15€ and subjects typically switched from B to
A at some value of the safe payoff. The safe payoff at the switching point can be
interpreted as certainty equivalent for the uncertain option in this game and, thus,
be used as a measure for strategic uncertainty. Subjective probabilities for success of
Option B can be elicited directly or derived from comparing the certainty equivalent
of a strategic game with certainty equivalents of lotteries with given probabilities.
As the safe payoffs are part of the game and any pair A-B is a different game,
switching points only provide precise measures of strategic uncertainty for games in
which subjects are indifferent between A and B. Thus, this method can only give
upper or lower bounds for strategic uncertainty in games in which subjects reveal
their preference for one or the other option by choosing it.
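As a rough illustration of how a switching point bounds the certainty equivalent, the following sketch scans a price list of A/B choices. The step size of 1.50€ (amounts in cents), the two example choice vectors, and all function names are assumptions made for illustration only.

```python
def switch_indices(choices):
    """Rows at which the choice differs from the one in the previous row."""
    return [i for i in range(1, len(choices)) if choices[i] != choices[i - 1]]


def certainty_equivalent_bounds(safe_payoffs, choices):
    """Bounds on the certainty equivalent of option B.

    Returns (lower, upper) if the subject switches exactly once from B to A,
    and None otherwise (no switch or multiple switching points).
    """
    switches = switch_indices(choices)
    if len(switches) != 1 or choices[0] != "B":
        return None
    i = switches[0]
    return safe_payoffs[i - 1], safe_payoffs[i]


# Safe payoffs from 1.50 to 15 euros in cents; the 1.50 step is an assumption.
safe_cents = list(range(150, 1501, 150))

threshold_subject = ["B"] * 6 + ["A"] * 4
multiple_switcher = ["B", "B", "A", "B", "B", "A", "A", "A", "A", "A"]

print(certainty_equivalent_bounds(safe_cents, threshold_subject))
print(certainty_equivalent_bounds(safe_cents, multiple_switcher))
```

The first subject's certainty equivalent lies between the last safe payoff she rejected and the first one she accepted; for the second subject no single interval can be stated, which is the case the text associates with multiple switching points.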
Following the same method as Heinemann et al. (2009), recent work by Chier-
chia et al. (2018) elicits certainty equivalents for choosing the uncertain option in
coordination games with strategic complements (stag-hunt games) and substitutes
(entry games). They find that most subjects have a unique switching point in stag-
hunt games, but multiple switching points for entry games, which is in line with
higher levels of reasoning.⁴ The observed multiple switching points in entry games
indicate, however, that levels of reasoning and strategic uncertainty may be related,
for which reason we focused on games with strategic complements and substitutes
to measure strategic uncertainty aversion. In addition, many simultaneous-move
games are characterized by strategies being either complements or substitutes, and
games with these characteristics are applied in many domains of economics to model
competition, monetary policy, financial crises, network externalities in growth, and
political economy issues, to name just a few.
⁴ Nagel et al. (2018) explain multiple switching points in entry games by the higher demand
for strategic reasoning compared to a stag-hunt game. They analyze the brain activity of subjects
during decision-making in an fMRI scanner. They show that strategic games activate the brain
network that also mediates risk during lottery decisions (anterior insula, dorsomedial prefrontal
cortex and parietal cortex), which indicates that strategic uncertainty is treated in a similar way as
other forms of uncertainty. The activation of the risk-mediating network is highest when subjects
choose the risky action in the entry game, which indicates that entry games are associated with a
higher perceived strategic uncertainty. The level of strategic thinking is reflected in the activity of
the dorsomedial and dorsolateral prefrontal cortex. These regions are more active among players
with non-threshold strategies in the entry game, indicating higher levels of reasoning.
While multiple price lists used by Heinemann et al. (2009) and others allow for
measuring strategic uncertainty, the authors do not clearly distinguish strategic-
uncertainty attitudes from ambiguity attitudes.⁵,⁶ At best, the existing methods
suffice to distinguish whether a subject likes or dislikes strategic uncertainty. While
the general conclusion is that subjects dislike strategic uncertainty, Ivanov (2011)
provides evidence that strategic uncertainty may be preferred to risk. We thus
reckon that the literature lacks a clear methodology to measure strategic-uncertainty
attitudes. We fill this void by developing a method that can be used to measure
strategic-uncertainty attitudes for any strategic binary-choice game and distinguish
optimism or pessimism regarding the outcome of the game from a preference for or
aversion against the source of uncertainty.
3.3 Experimental Design and Procedures
We develop a method for measuring attitudes towards strategic uncertainty. We use
a within-subject design based on three distinct experimental conditions. The main
condition of interest is STRATEGICUNCERTAINTY, in which the uncertainty that
players face in the game stems from other players’ behavior. We also include two
control conditions: RISK (the aim of which is to establish a behavioral benchmark
for a pre-determined structure of uncertainty, where possible outcomes and associ-
ated probabilities are known) and AMBIGUITY (which captures behavior under
uncertainty, where possible outcomes are known but associated probabilities are
unknown).
Each subject acts in all three decision-making environments in the following
order: RISK, AMBIGUITY, and finally STRATEGICUNCERTAINTY. The
STRATEGICUNCERTAINTY treatment is played for two distinct 2-player, 2-strategy
settings: one with strategic complements, the stag-hunt game (see Game 1
in Table 3.1 below), and one with strategic substitutes, the entry game (see Game
2 in Table 3.2 below). The order in which subjects face the two games varies. In
half of the sessions, the STRATEGICUNCERTAINTY treatment starts with subjects
facing Game 1 before Game 2, and conversely in the other half of the sessions.
⁵ Heinemann et al. (2009) compare strategic uncertainty to risk. Apart from the research question
itself, many design features of our experiment differ from theirs (e.g. elicitation of certainty
equivalent and subjective beliefs). In their experiment, subjects choose between a safe payoff and
a risky payoff that they get if and only if a sufficient number of subjects chooses the risky option.
Thus, the safe payoff was not a certainty equivalent for the game, but part of the game itself.
Hence, the method employed by Heinemann et al. (2009) cannot identify any attitudes towards or
against strategic uncertainty. In contrast, we elicit certainty equivalents for each potential action
in the game without the stated certainty equivalents affecting payoffs in the game.
⁶ A comparison of risk and ambiguity driven either by human behavior or computer is proposed
by Farjam (2019). However, he focuses on non-strategic human-driven uncertainty and shows that
computerized uncertainty is preferred.
The payoff structure in Tables 3.1 and 3.2 is such that in each game each player
decides between two “lotteries” (one lottery pays either 20€ or 15€, the other either
5€ or 25€) in which the outcome depends on the other player’s decision. We
elicit the certainty equivalents for both of these “lotteries” along with subjective
beliefs, and compare them with certainty equivalents of analogous binary lotteries
with exogenously given probabilities.
Table 3.1: Game 1 and associated payoffs

                    The other player
                    L             R
You       L         20€, 20€      15€, 5€
          R         5€, 15€       25€, 25€

Table 3.2: Game 2 and associated payoffs

                    The other player
                    L             R
You       L         5€, 5€        25€, 20€
          R         20€, 25€      15€, 15€
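The payoff tables can be encoded directly to check the structure described in the text. The sketch below is illustrative only (the helper names are hypothetical): it lists the pure-strategy Nash equilibria of each game and the pair of own payoffs, i.e. the "lottery", attached to each action.

```python
from itertools import product

ACTIONS = ("L", "R")

# (your payoff, other player's payoff), indexed by (your action, other's action).
GAME_1 = {("L", "L"): (20, 20), ("L", "R"): (15, 5),
          ("R", "L"): (5, 15), ("R", "R"): (25, 25)}
GAME_2 = {("L", "L"): (5, 5), ("L", "R"): (25, 20),
          ("R", "L"): (20, 25), ("R", "R"): (15, 15)}


def pure_nash_equilibria(game):
    """Action profiles from which neither player gains by deviating alone."""
    equilibria = []
    for own, other in product(ACTIONS, repeat=2):
        own_ok = all(game[(own, other)][0] >= game[(d, other)][0] for d in ACTIONS)
        other_ok = all(game[(own, other)][1] >= game[(own, d)][1] for d in ACTIONS)
        if own_ok and other_ok:
            equilibria.append((own, other))
    return equilibria


def lottery(game, own):
    """Sorted pair of own payoffs that can result from playing `own`."""
    return tuple(sorted(game[(own, other)][0] for other in ACTIONS))


for name, game in (("Game 1", GAME_1), ("Game 2", GAME_2)):
    print(name, pure_nash_equilibria(game), lottery(game, "L"), lottery(game, "R"))
```

In Game 1 the pure equilibria lie on the diagonal (players match), while in Game 2 they lie off the diagonal (players mismatch), which is the complements versus substitutes distinction used throughout the chapter.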
Prior to the RISK treatment, subjects take part in five unpaid lotteries under
the same design as the RISK treatment. The goal of this training part is to accustom
subjects to the basic mechanisms at play, and especially to let them gain
experience with the Becker-DeGroot-Marschak procedure (Becker et al., 1964). Unlike
the main part of the experiment that follows, in this preliminary part subjects
receive feedback after each lottery.
The three treatments are summarized in Section 3.3.1. The key feature of our
experimental design is that it varies the source of uncertainty, keeping the remain-
ing aspects of the decision-making process as identical as possible across treatments.
This, in turn, allows for isolating and measuring the behavioral effect of strategic un-
certainty as compared to other sources of uncertainty. The experimental procedure
is outlined in Section 3.3.2.
3.3.1 Treatments
While the STRATEGICUNCERTAINTY treatment is played last in our experiment, we present
it first because it is our main treatment. We then present the two control treatments,
which are played first.
Main treatment: STRATEGICUNCERTAINTY
This treatment consists of two consecutive parts, each involving a different game
(either Game 1 or Game 2). The order of games is balanced across sessions. Subjects
are randomly and anonymously matched into pairs for each game.
In each session there are 12 subjects. This allows us to consistently use frequency-
based framing (“how many times out of 10”) when eliciting beliefs about others’
behavior.
In the STRATEGICUNCERTAINTY treatment, each subject makes 4 decisions:
– Decision 1: The choice between L and R in the game.
– Decision 2: Subjective beliefs about the behavior of the other subjects. We
ask the following question: Out of the 10 other participants (not including the subject's
own counterpart) in this session, how many would choose R? Beliefs are incentivized
using a binarized quadratic scoring rule.⁷
– Decision 3: The certainty equivalent (WTA) for not playing the game if Decision 1
is implemented.
– Decision 4: The certainty equivalent (WTA) for not playing the game if the
alternative of Decision 1 is implemented.
⁷ Note that quadratic scoring rules are incentive compatible only for expected-payoff maximizers.
Biases may occur for non-risk-neutral subjects (Offerman et al., 2009). Schotter and Trevino (2014)
provide a survey on experimental belief elicitation. The binarized quadratic scoring rule (BSR; Hossain
and Okui, 2013) incentivizes truthful reporting of beliefs independently of risk preferences
and the (non-linear) form of probability weighting. Danz et al. (2020) have recently shown that
in practice subjects misreport their beliefs even with the BSR. However, they also show that
“false reporting and pull-to-center effects arise only when participants are informed of the BSR’s
quantitative incentives” (Danz et al., 2020, p. 2). For this reason, we apply the binarized quadratic
scoring rule, but in the instructions, we present the details only on demand and solely tell subjects
the principle of this mechanism and that it is in their own interest to state their true beliefs.
Alternatively, we could correct the stated beliefs from a standard quadratic scoring rule using the
estimated relative risk aversion along the lines laid out in Offerman et al. (2009). However, this
exercise also requires structural assumptions that, if misspecified, may bias the findings even more
than using the stated beliefs without correction. See the experimental instructions in Appendix
C.1 for implementation details of the BSR in our study.
We allow WTAs in Decisions 3 and 4 to be stated on [0, 30€]. This exceeds the
range of potential payoffs so as to detect strong aversion against or strong preference
for strategic uncertainty. Payoffs are determined as follows:
A. With 1/3 probability, the game is played and payoffs are determined by both
subjects’ Decision 1.
B. With 1/3 probability, subjects are paid for their stated beliefs.
C. With 1/3 probability, a subject’s own payoff depends on her own stated WTAs
and on the other subject’s Decision 1. Here, each subject’s payoffs are deter-
mined as follows:
1. One of two possible actions – either L or R – is drawn at random (with
50% chance for the own preferred action) and replaces the subject’s own
Decision 1.
2. For that action, the BDM procedure takes place. The computer draws
a random integer amount from 1€ to 30€. All integers are equally likely. If the
drawn integer is larger than or equal to the stated WTA for that action,
then the subject’s payoff equals the randomly drawn number.
3. If the drawn integer is smaller than the stated WTA for that action, the
subject’s payoffs are determined by that action and by Decision 1 of the
other subject.
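The payoff rule of Situation C can be sketched as follows. This is a minimal illustration, not the z-Tree implementation: the function names are hypothetical, and the value of 17.50€ that a risk-neutral subject is assumed to attach to playing the game is chosen only to show why the BDM procedure rewards stating one's true valuation.

```python
import random


def bdm_payoff(stated_wta, drawn_amount, game_payoff):
    """Pay the drawn amount if it is at least the stated WTA, else the game payoff."""
    if drawn_amount >= stated_wta:
        return drawn_amount
    return game_payoff


def situation_c_payoff(wtas, own_payoffs, other_decision, rng):
    """One realization of Situation C for a single subject.

    wtas: stated WTA per action; own_payoffs: own payoff per (own, other) profile.
    """
    action = rng.choice(["L", "R"])  # replaces the subject's own Decision 1
    drawn = rng.randint(1, 30)       # integer from 1 to 30, equally likely
    return bdm_payoff(wtas[action], drawn, own_payoffs[(action, other_decision)])


def expected_bdm_payoff(stated_wta, game_value):
    """Expected payoff of a risk-neutral subject who values the game at game_value."""
    draws = range(1, 31)
    return sum(d if d >= stated_wta else game_value for d in draws) / len(draws)


# Hypothetical subject who values playing the game at 17.50 euros.
best_wta = max(range(0, 31), key=lambda w: expected_bdm_payoff(w, 17.5))
print(best_wta)

game_1 = {("L", "L"): 20, ("L", "R"): 15, ("R", "L"): 5, ("R", "R"): 25}
print(situation_c_payoff({"L": 17, "R": 16}, game_1, "R", random.Random(0)))
```

Among integer statements, the expected payoff of this hypothetical subject is highest at the smallest integer above her valuation: overstating forgoes draws she would prefer to the game, and understating accepts draws she values less than the game.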
With this design, a subject’s own Decision 1 is only payoff-relevant for her if
the game is actually played (Situation A). Thus, each subject’s Decision 1 is not
affected by her choice of WTAs. Hence, beliefs about the other’s Decision 1 are not
affected by beliefs about the other’s WTA either. Thereby, we provide the highest
incentive for subjects to activate their theory of mind as intended for Game 1 or
Game 2. The decision on WTAs depends solely on beliefs about the Decision 1 of
the other subject and it requires the same considerations. Our procedure elicits the
WTAs for the action that the subject would have chosen herself and also for the
counterfactual non-preferred decision. This allows us to identify two parameters of
a model of strategic uncertainty that can be interpreted as uncertainty aversion and
optimism (see Section 3.5). Theoretically, the higher of the two stated WTAs is the
WTA for the entire game.
For comparability purposes, we design the two control treatments in a similar
frame as the STRATEGICUNCERTAINTY treatment. These two treatments vary
the source of uncertainty. In the RISK treatment, uncertainty is generated by a
random process with known probabilities. In the AMBIGUITY treatment, the out-
come is determined by an unknown probability distribution.
Control treatment 1: RISK
In this treatment, each subject is faced with 11 pairs of lotteries (lotteries
15€/20€ or 5€/25€ associated with 11 given probabilities p). Here, we only ask
for 22 WTAs for the respective 22 lotteries.
A subject’s own payoff depends on her own stated WTAs and is determined as
follows. The computer determines which of the 22 lotteries is carried out. Each
lottery is equally likely to be selected. Then, the BDM procedure takes place. The
computer draws a random amount from 0 to 30€ with 2 decimals. If the drawn
amount is larger than or equal to the stated WTA for the selected lottery, then the
subject’s payoff is equal to the randomly drawn amount. Otherwise, the lottery
is played. Altogether, each subject makes 22 decisions using a table of contingent
choices similar to Table 3.3.
Table 3.3: Decision table in the RISK treatment

Probability with which the     WTA for lottery that pays     WTA for lottery that pays
computer selects the higher    either 15€ or 20€             either 5€ or 25€
payoff
  0%
 10%
 20%
 30%
 40%
 50%
 60%
 70%
 80%
 90%
100%
The 11 lotteries on the left-hand side of the table pay either 15€ or 20€, the 11
lotteries on the right-hand side pay either 5€ or 25€. In any lottery, the computer
determines randomly which of the two possible payments is made. Subjects receive
information about which part of the experiment and eventually which lottery is se-
lected for payoffs only at the end of the experiment after all decisions are completed.
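For orientation, the 22 lotteries of the RISK treatment and their expected monetary payments can be enumerated as in the sketch below. The expected values are the statements a risk-neutral subject would make; this benchmark and the variable names are illustrative assumptions rather than part of the design.

```python
from fractions import Fraction

# The 11 probabilities of the higher payoff: 0%, 10%, ..., 100%.
PROBABILITIES = [Fraction(k, 10) for k in range(11)]

# (higher payoff, lower payoff) in euros for the two lottery types.
LOTTERIES = {"15/20": (20, 15), "5/25": (25, 5)}


def expected_value(high, low, p_high):
    """Expected monetary payment when the higher payoff has probability p_high."""
    return p_high * high + (1 - p_high) * low


risk_neutral_wta = {
    (name, p): expected_value(high, low, p)
    for name, (high, low) in LOTTERIES.items()
    for p in PROBABILITIES
}

for (name, p), value in risk_neutral_wta.items():
    print(f"{name:>5}  p={float(p):.1f}  EV={float(value):.2f}")
```

Exact fractions are used here so that the grid of probabilities is represented without floating-point rounding.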
Control treatment 2: AMBIGUITY
In this treatment, each subject is faced with one pair of lotteries that are pre-
sented in the same way as potential payoffs in the previous treatment, but this time,
subjects are not told the likelihood that the higher payoff is chosen. Subjects are
informed that the computer selects one of the 11 distributions from the RISK treat-
ment before their own decision. We inform them that the 11 distributions are not
equally likely to be selected.⁸ As in the RISK treatment, each subject states WTAs.
Here, we ask for two WTAs, one for each lottery. In addition, each subject states her
belief about the selected probability distribution. The computer randomly decides
whether subjects get paid according to the BDM procedure, or according to their
stated beliefs (with 1/2 probability each). The computer selects the probability for
the higher payoff and, if the BDM procedure is payoff-relevant, one of the lotteries
(L/R) is selected with 50% chance. As a next step of the BDM procedure, the computer
draws a random amount from 0 to 30€. If the random amount is larger than
or equal to the stated WTA for that lottery, then the subject’s payoff is equal to
the randomly drawn amount. Otherwise, the lottery is played with the probability
distribution selected by the computer.
3.3.2 Implementation Details
The design of the experiment was approved by the local GATE-Lab (Lyon) ethics
committee. We ran 19 sessions (including the pilot session) with 12 participants each
(maximal capacity during the COVID pandemic) at the Experimental Economics
Laboratory of the Technische Universität Berlin, Germany, between September and
October 2021.⁹,¹⁰ Participants were recruited through ORSEE (Greiner, 2015) and
95% of them were students from various disciplines, with engineering (41.7%), economics
(8.8%), and business administration (6.6%) representing the largest groups. The
experiment was programmed using z-Tree (Fischbacher, 2007).
⁸ For the sake of implementation, the random process generating probability distributions in
lotteries played under ambiguity is based on 2019 weather data from Berlin.
⁹ In the pre-results reviewed report, we planned to run sessions with a minimum of 14 participants
at the GATE-Lab, Lyon, France. This initial plan could not be implemented due to the
pandemic conditions.
¹⁰ The minimal sample size determined by the power analysis is N = 208. Our power calculations
(G*Power software, version 3.1) are based on the nonparametric two-sided Wilcoxon signed rank
test. We assume a normal parent distribution. We apply the following criteria. First, a test attains
the statistical power of at least 0.8 (which is a common-place reference value in the literature) with
the conventional threshold for rejecting a null hypothesis of 5%. Second, the minimal effect size
(as measured by Cohen’s d) a test can pick up on is small (d = 0.2). The resulting actual power
equals 0.801. Given our initial sample of N = 228 (i.e., prior to applying both the pre-registered
and the ex post data selection criteria, as explained in Section 3.5.1) and d = 0.2, the resulting
statistical power is even higher and equals 0.836 at the 5% significance level. Conversely, with a
reference minimal power of 0.8 (the actual one being 0.801), this sample size is enough to pick up
on a treatment effect of magnitude d = 0.191.
Participants were randomly seated in front of PCs. Throughout the sessions,
they were not allowed to communicate with one another and could not see each
other’s screens. All questions were answered in private.
Only one of the four parts (risk, ambiguity, stag-hunt game, entry game) was
chosen for final payoffs. The probability was 0.25 for each part. Within the selected
part, the payoff was determined as specified in Section 3.3.1. This means that
only one decision of a player was payoff relevant, but each decision could be the one.
This procedure rules out incentives for hedging and provides the highest incentive to
consider the uncertainty of the outcome associated with each decision. The average
payoff was about 21.80€ (minimum 6.60€, maximum 34.80€) including the fixed
show-up fee of 5€. Sessions lasted for around 70 minutes on average. Examples of
instructions, questionnaires, and screens are given in Appendix C.1.
3.4 Theoretical Framework
Let us start our theoretical considerations by observing that any choice in a simultaneous-
move game may be interpreted as a choice between lotteries whose outcomes depend
on the choices of other players. Our 2x2 games involve the choice between a lottery
L with payoff 20€ or 15€ and lottery R with payoff 5€ or 25€. Note that the probability
of receiving 15€ after choosing L is the same as the probability of receiving
25€ when choosing R. It is the probability that the other player chooses R. In the
stag-hunt game (Game 1), a player gets the higher payoff of her chosen lottery, if
her partner chooses the same lottery. In the entry game (Game 2), a player gets the
higher payoff, if her partner chooses the other lottery.
The value of a lottery $k$ for a subject $i$ can be written as
\[
W_i^k(\bar{x} \mid \pi_i) = E(u_i(x) \mid \pi_i) + \Delta_i^k(\bar{x} \mid \pi_i),
\]
where $u_i(x)$ is subject $i$'s utility function, $\bar{x}$ is the vector of potential monetary
payoffs, and $\pi_i$ is the subject's probability distribution over outcomes. For an expected-
utility maximizer, $\Delta_i^k(x \mid \cdot) = 0$ for all $x$. If we assume that subjects evaluate
lotteries with exogenously given probabilities by expected utility, the attitude towards or
against ambiguity or strategic uncertainty can be written as a deviation of the evaluation
from expected utility, denoted by $\Delta_i^k(x \mid \cdot)$. A theory of ambiguity attitudes
specifies this deviation.
We propose to model ambiguity attitudes for binary lotteries and strategic-uncertainty
attitudes for a 2x2 game by two parameters $\alpha_i^k$ and $\delta_i^k$ such that the
utility value that subject $i$ attaches to the possible outcomes from her own choice is
\[
W_i^k(x_1, x_2, \pi_i) = (\pi_i + \alpha_i^k)\, u_i(x_1) + (1 - \pi_i - \alpha_i^k)\, u_i(x_2) - \delta_i^k, \tag{3.1}
\]
where $x_1 \geq x_2$ are the potential monetary payoffs and $\pi_i$ is the subjective
probability for receiving $x_1$. The parameter $\delta_i^k$ may be interpreted as an aversion against
strategic uncertainty if it is positive, or as a preference for strategic uncertainty if
it is negative. The higher $\delta_i^k$, the lower is the value of the lottery, in line with the
interpretation of an increasing aversion against uncertainty. The parameter $\delta_i^k$
affects the value of a lottery independent of the perceived risk that is associated with
it. The second parameter, $\alpha_i^k$, establishes the weight that the subject puts on the
higher outcome given her own choice. If $\alpha_i^k > 0$, the subject puts a weight on the
higher payoff that exceeds her subjective probability for this outcome. If $\alpha_i^k < 0$,
the subject puts a weight on the lower payoff that exceeds her subjective probability
for this outcome. We may interpret this parameter as optimism, where $\alpha_i^k = 0$ is the
unemotional Bayesian view on the lottery, while subjects with $\alpha_i^k > 0$ may be called
optimists and subjects with $\alpha_i^k < 0$ pessimists. Optimism [pessimism] may arise
from the excitement [fear] about the prospect of getting the high [low] amount when
it is determined by another human playing strategically (strategic uncertainty) or
by an unknown process (ambiguity). Note that the value of the lottery rises with
increasing optimism. Thereby, our model allows for a clear interpretation of both
parameters.
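A minimal numerical sketch of Equation 3.1 may help to see how the two parameters move the value of a lottery. Linear utility and the parameter values below are assumptions chosen purely for illustration; the function name is hypothetical.

```python
def lottery_value(u_high, u_low, pi, alpha, delta):
    """Value of a binary lottery as in Equation 3.1.

    u_high, u_low: utilities of the higher and lower payoff;
    pi: subjective probability of the higher payoff;
    alpha: optimism (weight shifted to the higher payoff);
    delta: aversion against the source of uncertainty.
    """
    return (pi + alpha) * u_high + (1 - pi - alpha) * u_low - delta


# Lottery paying 25 or 5 with pi = 0.5 and linear utility (an assumption).
bayesian = lottery_value(25, 5, 0.5, alpha=0.0, delta=0.0)
optimist = lottery_value(25, 5, 0.5, alpha=0.1, delta=0.0)
pessimist = lottery_value(25, 5, 0.5, alpha=-0.1, delta=0.0)
averse = lottery_value(25, 5, 0.5, alpha=0.0, delta=1.0)

print(bayesian, optimist, pessimist, averse)
```

In this example optimism raises the value above the expected utility and pessimism lowers it, while a positive delta shifts the value down by the same amount regardless of the spread between the two payoffs.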
The value of an ambiguous lottery or a game may be higher [lower] than the
value of the highest [lowest] possible realization under certainty. From the standard
economic perspective, it may seem odd that the value of an uncertain situation could
be higher than the highest possible payoff or lower than the lowest one. However,
this may reflect particular attitudes towards strategic interactions with other human
players: a person may be generally uncomfortable with depending on other humans,
or may derive utility from playing a game with somebody else on top of the utility
generated by the monetary payoffs in this game.
In our experiment, we also elicit the certainty equivalent of participating in the
game, if the player’s chosen action is replaced by the opposing action. If the subject
is optimistic about getting $x_1 = 25$€ in the game with her chosen action, she must
be pessimistic about receiving $\bar{x}_1 = 20$€ under the replaced choice. Thus, for this
counterfactual choice, the value of the implied lottery is
\[
\bar{W}_i^k(\bar{x}_1, \bar{x}_2, 1 - \pi_i) = (\pi_i + \alpha_i^k)\, u_i(\bar{x}_2) + (1 - \pi_i - \alpha_i^k)\, u_i(\bar{x}_1) - \delta_i^k, \tag{3.2}
\]
where $\bar{x}_1 > \bar{x}_2$ are the payoffs implied by the counterfactual choice.
An alternative theory of ambiguity attitudes is the Choquet-expected utility with
neo-additive capacities that specifies a value function (Chateauneuf et al., 2007)
\[
V_i(\bar{x}, \pi_i) = \sum_{x} (1 - \delta_i)\, \pi_i(x)\, u_i(x) + \delta_i \left[ \alpha_i u_i(x_{\max}) + (1 - \alpha_i)\, u_i(x_{\min}) \right].
\]
For a lottery with only two possible outcomes $x_1 \geq x_2$,
\begin{align*}
V_i(x_1, x_2, \pi_i) &= (\pi_i + \delta_i(\alpha_i - \pi_i))\, u_i(x_1) + ((1 - \pi_i) - \delta_i(\alpha_i - \pi_i))\, u_i(x_2) \\
&= E(u_i(x) \mid \pi_i) + \delta_i(\alpha_i - \pi_i)\, [u_i(x_1) - u_i(x_2)].
\end{align*}
The interpretation given in the literature (see e.g., Greiner 2016) is that $\delta_i$ is the
ambiguity of a player ($1 - \delta_i$ is her trust in her own beliefs) and $\alpha_i$ is optimism.
By this interpretation, an increasing ambiguity may raise or lower the value of
the lottery, depending on whether optimism exceeds or stays below the subjective
probability for the higher payoff. The interpretation of $\alpha_i$ may also cause a problem.
For $\delta_i > 0$, the evaluation rises in $\alpha_i$, but for $\delta_i < 0$, increasing “optimism” reduces
the value of the lottery. Restricting $\alpha_i$ and $\delta_i$ to be in $[0, 1]$ avoids this, but may be
inconsistent with large deviations of the value of a lottery from the expected utility
that it implies. Finally, the parameters are not identified from the evaluations of the
two lotteries that a subject can choose in a 2x2 game, if she assigns $\pi_i = 0.5$ to the
other player’s choices. For these reasons, we use the model described by Equations
3.1 and 3.2 for further analysis.
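The two concerns raised about the neo-additive specification can be checked numerically. The sketch below assumes linear utility and hypothetical parameter pairs; it shows that at a belief of 0.5 two different parameter pairs produce identical values for both lotteries of the 2x2 game, and that the effect of "optimism" changes sign with delta.

```python
import math


def neo_additive_value(u_high, u_low, pi, alpha, delta):
    """Two-outcome neo-additive value: E(u) + delta * (alpha - pi) * (u_high - u_low)."""
    expected = pi * u_high + (1 - pi) * u_low
    return expected + delta * (alpha - pi) * (u_high - u_low)


# With pi = 0.5, both lotteries of the game carry probability 0.5 on the high payoff.
PARAMS_A = (0.7, 0.5)  # (alpha, delta), hypothetical
PARAMS_B = (0.6, 1.0)  # a different pair with the same delta * (alpha - 0.5)

values_a = [neo_additive_value(h, l, 0.5, *PARAMS_A) for h, l in ((25, 5), (20, 15))]
values_b = [neo_additive_value(h, l, 0.5, *PARAMS_B) for h, l in ((25, 5), (20, 15))]
print(values_a, values_b)

# Sign issue: the value rises in alpha for delta > 0 but falls in alpha for delta < 0.
rises = neo_additive_value(25, 5, 0.5, 0.8, 0.5) > neo_additive_value(25, 5, 0.5, 0.6, 0.5)
falls = neo_additive_value(25, 5, 0.5, 0.8, -0.5) < neo_additive_value(25, 5, 0.5, 0.6, -0.5)
print(rises, falls)
```

Because only the product of delta and the gap between alpha and the belief enters both evaluations when the belief equals 0.5, the two observed values cannot separate the parameters in that case.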
3.4.1 Identification and Uncertainty Attitudes
For identification, we assume that $\alpha_i^k$ and $\delta_i^k$ are the same for all lotteries with the
same source of uncertainty. With the data from our experiment, we compare these
parameters for three sources of uncertainty: we denote $k = A$ in the AMBIGUITY
treatment, $k = S$ in the stag-hunt game, and $k = E$ in the entry game.
Utility function and risk aversion
In order to estimate uncertainty attitudes, we assume that subjects have CRRA
utility functions, $u_i(x) = x^{1-r_i}/(1-r_i)$ for $r_i \neq 1$ and $u_i(x) = \ln(x)$ for $r_i = 1$,
where $r_i$ is the Arrow-Pratt measure of relative risk aversion (RRA). We use the
22 stated WTAs in the RISK treatment to estimate $r_i$ for each subject $i$. If all 22
WTAs are equal to the expected monetary payments of the respective lotteries, we
set $r_i = 0$. For further details, see Section 3.5.3.
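The CRRA specification and the certainty equivalent it implies for a binary lottery can be sketched as follows. The function names and the example values of the risk-aversion coefficient are hypothetical; the estimation procedure itself is described in Section 3.5.3 and is not reproduced here.

```python
import math


def crra_utility(x, r):
    """CRRA utility of a positive amount x; the log form applies when r equals 1."""
    if math.isclose(r, 1.0):
        return math.log(x)
    return x ** (1 - r) / (1 - r)


def crra_inverse(u, r):
    """Positive amount whose CRRA utility equals u."""
    if math.isclose(r, 1.0):
        return math.exp(u)
    return ((1 - r) * u) ** (1 / (1 - r))


def certainty_equivalent(high, low, p_high, r):
    """Sure amount with the same utility as the lottery's expected utility."""
    expected_utility = p_high * crra_utility(high, r) + (1 - p_high) * crra_utility(low, r)
    return crra_inverse(expected_utility, r)


# Lottery paying 25 or 5 with equal probability, for several values of r.
for r in (0.0, 0.5, 1.0, 2.0):
    print(r, round(certainty_equivalent(25, 5, 0.5, r), 3))
```

With r equal to zero the certainty equivalent coincides with the expected payment, which corresponds to the case in which the text sets the coefficient to zero; larger values of r lower the certainty equivalent.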
Identification of parameters
Let $\pi_i$ be subject $i$’s probability of receiving $x_1$ in a binary lottery $k$ with payoffs
$x_1 > x_2$. Then, the subject’s WTA for an ambiguous lottery or for the chosen lottery
in a game is given by the value $W_i^k(x_1, x_2, \pi_i)$.
Our 2x2 games involve the choice between a lottery L with payoff 20€ or 15€
and lottery R with payoff 5€ or 25€. In the stag-hunt game (Game 1), a player
gets the higher payoff of her chosen lottery if her partner chooses the same lottery. In
the entry game (Game 2), a player gets the higher payoff if her partner chooses the
other lottery. Thus, in both games, we observe the values of two lotteries where the
probability $\pi_i$ to win the higher payoff in the chosen lottery equals the probability
of getting the lower payoff in the counterfactual lottery. In the stag-hunt game, $\pi_i$
is the subject’s probability that her partner chooses the same action. In the entry
game, $\pi_i$ is the subject’s probability that her partner chooses the opposite action.
Setting the utility of the stated WTA for the chosen strategy in game $k$ equal to
$W_i^k(x_1, x_2, \pi_i)$ and the utility of the stated WTA for the opposing strategy in game $k$
equal to $\bar{W}_i^k(\bar{x}_1, \bar{x}_2, 1 - \pi_i)$, while assuming a CRRA utility function with RRA $r_i$ as
estimated from decisions in the RISK treatment, yields two equations that identify
$\alpha_i^k$ and $\delta_i^k$.
As we assumed that subjects evaluate lotteries as expected-utility maximizers,
the WTA for a lottery with payoffs 20€ or 15€ and a probability $p$ for the higher
payoff should yield a utility that equals $E u_i(20, 15 \mid p)$.
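If a subject chooses the strategy that leads to potential payoffs $(x_1, x_2)$ in a game
with $x_1 > x_2$, and for a subjective probability $\pi_i$ of getting $x_1$, the value of this game
is given by Equation 3.1. Using this,
\begin{align*}
W_i^k(x_1, x_2, \pi_i) &= (\alpha_i^k + \pi_i)\, u_i(x_1) + (1 - \alpha_i^k - \pi_i)\, u_i(x_2) - \delta_i^k \\
&= E(u_i(x_1, x_2 \mid \pi_i)) + \alpha_i^k (u_i(x_1) - u_i(x_2)) - \delta_i^k,
\end{align*}
and replacing expected utility by utility from stated WTA in the risky lottery
($\mathrm{WTA}_i^R$), we get¹¹
\[
\frac{(\mathrm{WTA}_i^k(x_1, x_2, \pi_i))^{1-r_i} - (\mathrm{WTA}_i^R(x_1, x_2, p = \pi_i))^{1-r_i}}{1 - r_i} = \alpha_i^k [u_i(x_1) - u_i(x_2)] - \delta_i^k,
\]
if and only if
\[
\delta_i^k = \frac{(\mathrm{WTA}_i^R(x_1, x_2, p = \pi_i))^{1-r_i} - (\mathrm{WTA}_i^k(x_1, x_2, \pi_i))^{1-r_i} + \alpha_i^k [x_1^{1-r_i} - x_2^{1-r_i}]}{1 - r_i}. \tag{3.3}
\]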
$^{11}$ Note that, alternatively, we could calculate the expected utility of this lottery by inserting
monetary payments in the estimated CRRA utility function. We prefer the more direct comparison
between stated WTAs, because this is less affected by assumptions on the utility function.
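For the lottery with the alternative payoffs $(\bar{x}_1, \bar{x}_2)$, the probability of achieving
the higher payoff is $1 - \pi_i$. Thus,
\[
\begin{aligned}
\bar{W}_i^k(\bar{x}_1, \bar{x}_2, 1 - \pi_i) &= (1 - \pi_i - \alpha_i^k)\, u_i(\bar{x}_1) + (\pi_i + \alpha_i^k)\, u_i(\bar{x}_2) - \delta_i^k \\
&= Eu_i(\bar{x}_1, \bar{x}_2 \mid 1 - \pi_i) - \alpha_i^k \big(u_i(\bar{x}_1) - u_i(\bar{x}_2)\big) - \delta_i^k.
\end{aligned}
\]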
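Replacing the value of the lottery by the utility of the certainty equivalent, $\mathrm{WTA}_i^k$,
and EU by WTA under risk, we get:
\[
\frac{\big(\mathrm{WTA}_i^k(\bar{x}_1, \bar{x}_2, 1 - \pi_i)\big)^{1-r_i} - \big(\mathrm{WTA}_i^R(\bar{x}_1, \bar{x}_2, p = 1 - \pi_i)\big)^{1-r_i}}{1 - r_i}
= -\alpha_i^k \big[u_i(\bar{x}_1) - u_i(\bar{x}_2)\big] - \delta_i^k,
\]
if and only if
\[
\delta_i^k = \frac{\big(\mathrm{WTA}_i^R(\bar{x}_1, \bar{x}_2, 1 - \pi_i)\big)^{1-r_i} - \big(\mathrm{WTA}_i^k(\bar{x}_1, \bar{x}_2, 1 - \pi_i)\big)^{1-r_i} - \alpha_i^k \big[\bar{x}_1^{1-r_i} - \bar{x}_2^{1-r_i}\big]}{1 - r_i}.
\tag{3.4}
\]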
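Setting Equation 3.3 equal to Equation 3.4 yields
\[
\begin{aligned}
&\big(\mathrm{WTA}_i^R(x_1, x_2, p = \pi_i)\big)^{1-r_i} - \big(\mathrm{WTA}_i^k(x_1, x_2, \pi_i)\big)^{1-r_i} + \alpha_i^k \big[x_1^{1-r_i} - x_2^{1-r_i}\big] \\
&\quad = \big(\mathrm{WTA}_i^R(\bar{x}_1, \bar{x}_2, 1 - \pi_i)\big)^{1-r_i} - \big(\mathrm{WTA}_i^k(\bar{x}_1, \bar{x}_2, 1 - \pi_i)\big)^{1-r_i} - \alpha_i^k \big[\bar{x}_1^{1-r_i} - \bar{x}_2^{1-r_i}\big].
\end{aligned}
\]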
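Therefore
\[
\begin{aligned}
&\big(\mathrm{WTA}_i^R(x_1, x_2, p = \pi_i)\big)^{1-r_i} - \big(\mathrm{WTA}_i^R(\bar{x}_1, \bar{x}_2, 1 - \pi_i)\big)^{1-r_i} \\
&\quad + \big(\mathrm{WTA}_i^k(\bar{x}_1, \bar{x}_2, 1 - \pi_i)\big)^{1-r_i} - \big(\mathrm{WTA}_i^k(x_1, x_2, \pi_i)\big)^{1-r_i} \\
&\quad = -\alpha_i^k \big[x_1^{1-r_i} + \bar{x}_1^{1-r_i} - x_2^{1-r_i} - \bar{x}_2^{1-r_i}\big].
\end{aligned}
\]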
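Hence,
\[
\alpha_i^k = \frac{A}{B} \quad \text{for } k = S, E,
\tag{3.5}
\]
with
\[
\begin{aligned}
A &= \big(\mathrm{WTA}_i^R(x_1, x_2, \pi_i)\big)^{1-r_i} - \big(\mathrm{WTA}_i^R(\bar{x}_1, \bar{x}_2, 1 - \pi_i)\big)^{1-r_i} \\
&\quad + \big(\mathrm{WTA}_i^k(\bar{x}_1, \bar{x}_2, 1 - \pi_i)\big)^{1-r_i} - \big(\mathrm{WTA}_i^k(x_1, x_2, \pi_i)\big)^{1-r_i}
\end{aligned}
\]
and
\[
B = -\big[x_1^{1-r_i} + \bar{x}_1^{1-r_i} - x_2^{1-r_i} - \bar{x}_2^{1-r_i}\big] \quad \text{for } k = S, E,
\]
where $B < 0$ whenever $r_i < 1$ (for $r_i > 1$ the sign of $B$ reverses, which does not affect the ratio $A/B$).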
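For subjects with $r_i = 1$,
\[
\begin{aligned}
A &= \ln \mathrm{WTA}_i^R(x_1, x_2, \pi_i) - \ln \mathrm{WTA}_i^R(\bar{x}_1, \bar{x}_2, 1 - \pi_i) \\
&\quad + \ln \mathrm{WTA}_i^k(\bar{x}_1, \bar{x}_2, 1 - \pi_i) - \ln \mathrm{WTA}_i^k(x_1, x_2, \pi_i), \\
B &= -\big[\ln x_1 + \ln \bar{x}_1 - \ln x_2 - \ln \bar{x}_2\big],
\end{aligned}
\]
and
\[
\delta_i^k = \ln \mathrm{WTA}_i^R(x_1, x_2, p = \pi_i) - \ln \mathrm{WTA}_i^k(x_1, x_2, \pi_i) + \alpha_i^k \big[\ln x_1 - \ln x_2\big].
\]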
Note that $(x_1, x_2) = (20, 15) \Leftrightarrow (\bar{x}_1, \bar{x}_2) = (25, 5)$ and $(x_1, x_2) = (25, 5) \Leftrightarrow
(\bar{x}_1, \bar{x}_2) = (20, 15)$. In both games, if $(x_1, x_2) = (25, 5)$, $\pi_i$ is the probability that
the other player chooses R. If $(x_1, x_2) = (20, 15)$, $\pi_i$ is the probability that the other
player chooses L.
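The identification of $(\alpha_i^k, \delta_i^k)$ in a game can be sketched numerically as follows. This is an illustrative sketch under the assumptions of Equations 3.3 to 3.5, not the estimation code used for the results; the function names `_pow` and `game_parameters` and the argument names are hypothetical.

```python
import math


def _pow(value: float, r: float) -> float:
    """Return value**(1 - r), or ln(value) in the log-utility case r = 1."""
    if math.isclose(r, 1.0):
        return math.log(value)
    # Note: value = 0 with r > 1 raises ZeroDivisionError (0 to a negative power).
    return value ** (1.0 - r)


def game_parameters(wta_risk, wta_game, wta_risk_alt, wta_game_alt, x, x_bar, r):
    """Solve Equations 3.3-3.5 for (alpha, delta) of one subject in one game.

    wta_risk, wta_game: WTAs for the chosen lottery x = (x1, x2), under risk
        with p = pi and in the game, respectively.
    wta_risk_alt, wta_game_alt: the same for the counterfactual lottery
        x_bar = (x1_bar, x2_bar), evaluated at probability 1 - pi.
    r: the subject's estimated relative risk aversion.
    """
    a = (_pow(wta_risk, r) - _pow(wta_risk_alt, r)
         + _pow(wta_game_alt, r) - _pow(wta_game, r))
    b = -(_pow(x[0], r) + _pow(x_bar[0], r) - _pow(x[1], r) - _pow(x_bar[1], r))
    alpha = a / b
    # Equation 3.3; the division by (1 - r) drops out in the log case.
    delta = (_pow(wta_risk, r) - _pow(wta_game, r)
             + alpha * (_pow(x[0], r) - _pow(x[1], r)))
    if not math.isclose(r, 1.0):
        delta /= (1.0 - r)
    return alpha, delta
```

As a check, take a risk-neutral subject ($r_i = 0$) with $\pi_i = 0.6$, $\alpha_i^k = 0.1$ and $\delta_i^k = 0.5$ who chooses L. The model implies WTAs of 18 (risk) and 18 (game) for the chosen lottery and 13 (risk) and 10.5 (game) for the counterfactual one, and `game_parameters(18, 18, 13, 10.5, (20, 15), (25, 5), 0.0)` recovers the two parameters up to floating-point error.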
Under ambiguity ($k = A$), we elicit the WTAs for two lotteries with payoffs
$(25, 5)$ and $(20, 15)$ along with a subjective probability $\pi_i$ for receiving the higher
payoff in both of these lotteries. Setting utility of stated WTAs equal to $W_i^A(25, 5, \pi_i)$
and $W_i^A(20, 15, \pi_i)$, respectively, identifies parameters $(\alpha_i^A, \delta_i^A)$. To see this, define
$(x_1, x_2) = (20, 15)$ in Equation 3.3 and use the same equation also for $(x_1', x_2') =
(25, 5)$. Then, by setting the right-hand sides of these equations equal to each other,
we get
\[
\begin{aligned}
&\big(\mathrm{WTA}_i^R(20, 15, p = \pi_i)\big)^{1-r_i} - \big(\mathrm{WTA}_i^A(20, 15, \pi_i)\big)^{1-r_i} + \alpha_i^A \big[20^{1-r_i} - 15^{1-r_i}\big] \\
&\quad = \big(\mathrm{WTA}_i^R(25, 5, p = \pi_i)\big)^{1-r_i} - \big(\mathrm{WTA}_i^A(25, 5, \pi_i)\big)^{1-r_i} + \alpha_i^A \big[25^{1-r_i} - 5^{1-r_i}\big] \\
&\Leftrightarrow \quad \alpha_i^A = \frac{A'}{B'},
\end{aligned}
\tag{3.6}
\]
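with
\[
\begin{aligned}
A' &= \big(\mathrm{WTA}_i^R(20, 15, \pi_i)\big)^{1-r_i} - \big(\mathrm{WTA}_i^R(25, 5, \pi_i)\big)^{1-r_i} \\
&\quad + \big(\mathrm{WTA}_i^A(25, 5, \pi_i)\big)^{1-r_i} - \big(\mathrm{WTA}_i^A(20, 15, \pi_i)\big)^{1-r_i}
\end{aligned}
\]
and
\[
B' = 25^{1-r_i} - 20^{1-r_i} + 15^{1-r_i} - 5^{1-r_i},
\]
where $B' > 0$ whenever $r_i < 1$ (for $r_i > 1$ the sign of $B'$ reverses).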
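Plugging the result of Equation 3.6 into Equation 3.3 for either of the two lotteries also yields $\delta_i^A$. For
subjects with $r_i = 1$,
\[
\begin{aligned}
A' &= \ln \mathrm{WTA}_i^R(20, 15, \pi_i) - \ln \mathrm{WTA}_i^R(25, 5, \pi_i) \\
&\quad + \ln \mathrm{WTA}_i^A(25, 5, \pi_i) - \ln \mathrm{WTA}_i^A(20, 15, \pi_i), \\
B' &= \ln 25 - \ln 20 + \ln 15 - \ln 5,
\end{aligned}
\]
and
\[
\delta_i^k = \ln \mathrm{WTA}_i^R(x_1, x_2, p = \pi_i) - \ln \mathrm{WTA}_i^k(x_1, x_2, \pi_i) + \alpha_i^k \big(\ln x_1 - \ln x_2\big).
\]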
These calculations show that both parameters are identified through comparing
WTAs between treatments. By calculating our parameters from differences between
WTAs, we avoid the possibility that any systematic bias stemming from the BDM
mechanism affects our parameter estimates.
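For the ambiguity condition, the same logic applies with both lotteries evaluated at the same stated probability. The sketch below illustrates Equation 3.6 together with Equation 3.3; it is an assumption-laden illustration, and the names `_pow` and `ambiguity_parameters` are hypothetical.

```python
import math


def _pow(value: float, r: float) -> float:
    """Return value**(1 - r), or ln(value) in the log-utility case r = 1."""
    if math.isclose(r, 1.0):
        return math.log(value)
    return value ** (1.0 - r)


def ambiguity_parameters(wta_risk_20_15, wta_amb_20_15,
                         wta_risk_25_5, wta_amb_25_5, r):
    """Solve Equation 3.6 for alpha^A, then Equation 3.3 for delta^A.

    Each lottery is evaluated once under risk (p = pi) and once under
    ambiguity, where pi is the subject's stated probability of the higher payoff.
    """
    a_prime = (_pow(wta_risk_20_15, r) - _pow(wta_risk_25_5, r)
               + _pow(wta_amb_25_5, r) - _pow(wta_amb_20_15, r))
    b_prime = _pow(25, r) - _pow(20, r) + _pow(15, r) - _pow(5, r)
    alpha = a_prime / b_prime
    # Equation 3.3 applied to the lottery (20, 15).
    delta = (_pow(wta_risk_20_15, r) - _pow(wta_amb_20_15, r)
             + alpha * (_pow(20, r) - _pow(15, r)))
    if not math.isclose(r, 1.0):
        delta /= (1.0 - r)
    return alpha, delta
```

For a risk-neutral subject with $\pi_i = 0.5$, $\alpha_i^A = 0.1$ and $\delta_i^A = 0.5$, the model implies WTAs of 17.5 (risk) and 17.5 (ambiguity) for the lottery (20, 15), and 15 (risk) and 16.5 (ambiguity) for the lottery (25, 5). Calling `ambiguity_parameters(17.5, 17.5, 15.0, 16.5, 0.0)` recovers these two parameters up to floating-point error. Because only differences between WTAs enter, a bias that shifts the utility of all stated WTAs by the same amount cancels out.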
3.4.2 Hypotheses
Our goal is to find out whether the source of uncertainty affects uncertainty attitudes.
Based on the theoretical model, our numerical predictions for the model
parameters are given by Bayesian behavior:
Hypothesis 1: There are no systematic attitudes towards or against ambiguity
or strategic uncertainty. Parameters $\alpha_i^k$ and $\delta_i^k$ are distributed around 0.
Here, we test for each condition $k \in \{A, S, E\}$ whether the parameters $\alpha_i^k$ and
$\delta_i^k$ from different subjects $i$ are distributed around zero. As the literature has generally
found the average subject to be ambiguity averse, we expect that Hypothesis 1 will be
rejected.
Subjects are likely to differ in their uncertainty attitudes and our experiment
is designed to capture how individual attitudes are affected by the source of un-
certainty being another human’s action and by the nature of strategic interaction.
Here, we exploit the within-subject design and use as null hypothesis:
Hypothesis 2: Subjects do not make any distinction between the sources of uncertainty
or between the considered strategic situations (strategic complementarity
versus substitutability): $\alpha_i^A = \alpha_i^S = \alpha_i^E$ and $\delta_i^A = \delta_i^S = \delta_i^E$.
A positive (negative) $\delta_i^k$ is interpreted as a general aversion against (preference
for) ambiguity or strategic uncertainty. A positive (negative) $\alpha_i^k$ is interpreted as
optimism (pessimism) for receiving the higher payoff under ambiguity or strategic
uncertainty.
3.5 Results
This section outlines the main empirical results based on the pre-registered procedures
of sample selection and data analysis. They can be summarized as follows.
Subjects react to the presence of uncertainty (contrary to Hypothesis 1),
but also make a systematic distinction between the different sources of uncertainty
(contrary to Hypothesis 2). Importantly, the magnitude of the latter effect depends
on the strategic context. Regarding the two parameters of our structural
model, we find that the majority of subjects exhibit pessimism [optimism] in the
stag-hunt [entry] game, while the median subject has neither a preference for nor an
aversion against strategic uncertainty.
3.5.1 Data Selection
We begin by applying the data selection criteria to the initial sample of 228 subjects.
The elicitation of both WTAs, for the preferred and the non-preferred action,
provides us with a consistency measure, since it should hold that
$\mathrm{WTA}_{\text{preferred}} \geq \mathrm{WTA}_{\text{not preferred}}$. If a participant violates this criterion, such that her WTA for
participating with her preferred action is lower than the WTA for the not preferred
action in at least one of the games, we exclude this participant from our main data
analysis. The reason is that such a reversal indicates a systematic misunderstanding
of the BDM procedure that could affect all stated WTAs, so data from these subjects
might just introduce noise. For the same reason, we exclude subjects whose
stated WTA for a lottery that pays the higher payoff with probability 1 is lower than
the stated WTA for an otherwise equal lottery that pays the higher payoff with probability 0.
These criteria were pre-specified. We also pre-specified a robustness check
using the full sample.
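The two exclusion criteria can be expressed as a simple filter. This is a minimal sketch under an assumed data layout; the `Subject` class, its fields, and `in_restricted_sample` are hypothetical names, and special cases such as unrecorded actions are not handled.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    # Per game: (WTA for the preferred action, WTA for the not preferred action).
    game_wtas: dict
    # Per lottery type: (WTA at probability 1, WTA at probability 0) of the higher payoff.
    lottery_wtas: dict


def in_restricted_sample(subject: Subject) -> bool:
    """Return True if the subject passes both pre-specified criteria."""
    # Choice consistency: preferred action valued at least as much in every game.
    consistent = all(pref >= other for pref, other in subject.game_wtas.values())
    # Rationality in lotteries: the sure high payoff is valued at least as
    # much as the sure low payoff.
    rational = all(sure_high >= sure_low
                   for sure_high, sure_low in subject.lottery_wtas.values())
    return consistent and rational
```

A single violation in either game or in either lottery type is enough to exclude a subject, which is why the criteria turn out to be stringent when applied jointly.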
In 45 [63] cases we observe a violation of choice consistency in the stag-hunt
[entry] game: the stated WTA for the preferred action is lower than the WTA for
the not preferred one.$^{12}$ Nineteen subjects violate our rationality criterion in the lotteries:
the stated WTA for a lottery that surely pays a high payoff is lower than the stated
WTA for an otherwise equal lottery that never pays a high payoff. Taken together,
these criteria turn out to be stringent.$^{13}$ In total, there are 102 subjects to whom
at least one of these exclusion criteria applies. We call the remaining 126 subjects
the restricted sample.
Ex post, after conducting the experiments, we detected that certain combinations
of choices lead to extreme values of estimated relative risk aversion (beyond +/-100)
and thereby also to estimated values for $\alpha$ and $\delta$ of astronomical magnitude. In
total, there are 15 subjects with an estimated RRA outside [-100, +100], 7 of them in the
restricted sample. We exclude them from the parametric analysis. Five other subjects
(1 from the restricted sample) have an estimated RRA > 1, but stated a WTA of 0
for at least one of the games or lotteries needed to calculate uncertainty parameters.
For these subjects, some or all pairs $(\alpha_i^k, \delta_i^k)$, $k = A, S, E$, cannot be calculated,
because $0^{1-r_i}$ is undefined for $r_i > 1$. So,
we exclude these subjects from all analyses of uncertainty parameters.
$^{12}$ For 5 subjects, the selected action in one of the games was not recorded due to a minor software
glitch. One of them failed to comply with the inclusion criterion for lottery choices, the remaining
4 are included in the restricted sample.
$^{13}$ While lack of understanding of the BDM mechanism may partially account for deviations from
expected utility, we also note that the BDM performs no worse than the alternative elicitation
methods for certainty equivalents in terms of bias and noise (Hey et al., 2009).
3.5.2 Comparison of Certainty Equivalents
In the experiment, we elicit the WTAs for two lotteries with outcomes depending
on the strategy of another player or on ambiguity simultaneously with subjective
probabilities for the possible outcomes. As an initial descriptive step of our analyses,
we can directly compare the WTA of a lottery in a game (where the outcome is de-
termined by another player’s action) with the WTA of a lottery that yields the same
payoffs with exogenously given probabilities that match the subjective probabilities
in the game. Similarly, the WTA for an ambiguous lottery with unknown probabil-
ities can be compared to the WTA of a lottery yielding the same payoffs with given
probabilities that match the subject’s stated probabilities for the ambiguous lottery.
Note that in theory, the WTA for a game is the higher of the two WTAs for the
two possible actions. As a first step in analyzing uncertainty attitudes, we count
the number of subjects whose WTA for a game or for an ambiguous situation is
higher than, equal to or lower than the WTA for the analogous lottery played under
risk. This informs us about the average preference for, or aversion against, a given
source of uncertainty. Note that the size of these deviations may depend on payoffs
associated with the chosen strategy, but also on the subjective probabilities.
Table 3.4 presents the results of this comparison separately for the two lotteries
under ambiguity, for the lottery implied by the actually chosen action in each game,
but also for the counterfactual lottery implied by “replacing” the subject’s actual
choice with the alternative action.
Table 3.4: Comparison of certainty equivalents

                                                   Ambiguity: k=A       Stag hunt: k=S        Entry: k=E
$(x_1, x_2)$                                       (20,15)   (25,5)     chosen   replaced     chosen   replaced
$WTA_i^k(x_1,x_2,\pi_i) > WTA_i^R(x_1,x_2,\pi_i)$  43        49         45       60           80       30
$WTA_i^k(x_1,x_2,\pi_i) = WTA_i^R(x_1,x_2,\pi_i)$  26        29         29       23           12       23
$WTA_i^k(x_1,x_2,\pi_i) < WTA_i^R(x_1,x_2,\pi_i)$  57        48         52       43           34       73

Notes: In the stag-hunt [entry] game, 81 [101] out of 126 subjects choose the action R.
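The counts in each column of Table 3.4 come from a three-way comparison per subject. A minimal sketch of that tally, assuming each subject contributes one pair of WTAs (the function name `compare_wtas` and the category labels are illustrative):

```python
def compare_wtas(pairs):
    """Tally subjects by whether the WTA under uncertainty exceeds, equals,
    or falls below the WTA for the matched risky lottery.

    pairs: iterable of (wta_uncertain, wta_risk), one pair per subject.
    """
    counts = {"higher": 0, "equal": 0, "lower": 0}
    for wta_uncertain, wta_risk in pairs:
        if wta_uncertain > wta_risk:
            counts["higher"] += 1
        elif wta_uncertain == wta_risk:
            counts["equal"] += 1
        else:
            counts["lower"] += 1
    return counts
```

In each column of the table the three counts add up to the 126 subjects of the restricted sample.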
The WTAs under ambiguity and for the stag-hunt game are not significantly
different from the WTAs under risk. A Wilcoxon signed rank test yields p-values
above 0.2. For the entry game, however, we find that subjects have a higher WTA
for the lottery implied by their own choice in the game than for the respective lottery
with exogenously given probability (p-value <0.001). The opposite effect occurs
once we look at the WTA for the lottery implied by replacing the actual choice with
the opposite action: it is lower than the WTA for the respective lottery under risk
(p-value <0.001). This indicates that the median subject tends to be optimistic
about the behavior of her partner in the entry game. The weight a player puts on
the payoff corresponding to her partner choosing a different action than her own
exceeds her stated probability of that outcome.
Direct comparisons between WTAs of different games or between a strategic
game and an ambiguous situation could only be possible if a subject stated the same
probability for getting the higher payoff in both contexts. Unfortunately, restricting
analysis to these observations would leave us with just a few matched pairs and
possibly introduce a selection bias. Thus, for further econometric analysis, we use
strategic-uncertainty attitudes as characterized by the parameters $\alpha_i^k$ and $\delta_i^k$ of our
structural model. In order to identify these parameters, we need to estimate a utility
function for each subject.
3.5.3 Main Results
Our identification strategy relies on a two-step procedure. As a first step, we use the
individual certainty equivalents (WTA) elicited in 22 lotteries to estimate the individual
parameter $r_i$ of the CRRA utility function. We adopt a parametric procedure from
Hey et al. (2009). For a given lottery $(x_1, x_2, \pi_i)$, the observed $\mathrm{WTA}_i(x_1, x_2, \pi_i)$
corresponds to the certainty equivalent of the latent expected utility $Eu_i(x_1, x_2, \pi_i)$, but is also subject to an
i.i.d. error $e_i \sim N(0, s_i^2)$:
\[
\mathrm{WTA}_i(x_1, x_2, \pi_i) = u_i^{-1}\big(Eu_i(x_1, x_2, \pi_i)\big) + e_i.
\]
For each individual $i$, the pair of parameters $(r_i, s_i)$ is estimated through standard
maximum likelihood (ML) estimation.$^{14}$ As a second step, the estimated coefficient
$\hat{r}_i$ is used to compute two individual parameters $(\alpha_i^k, \delta_i^k)$ for each context of
uncertainty $k = A, S, E$ following Equations 3.3, 3.5, and 3.6.
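The first step can be illustrated with a small sketch. With i.i.d. normal errors, maximizing the likelihood over $r_i$ is equivalent to minimizing the sum of squared deviations between stated and predicted certainty equivalents, and the ML estimate of $s_i$ is the root mean squared residual. The code below uses a coarse grid search as a stand-in for a numerical optimizer; the grid bounds, the step size, and the names `LOTTERIES`, `certainty_equivalent`, and `fit_crra` are assumptions made for illustration only.

```python
import math

# The 22 lotteries: two payoff pairs, probability of the higher payoff 0, 0.1, ..., 1.
LOTTERIES = [(x1, x2, p / 10) for (x1, x2) in ((20, 15), (25, 5)) for p in range(11)]


def certainty_equivalent(x1, x2, p, r):
    """CRRA certainty equivalent of a lottery paying x1 with probability p, else x2."""
    if math.isclose(r, 1.0):
        return math.exp(p * math.log(x1) + (1 - p) * math.log(x2))
    return (p * x1 ** (1 - r) + (1 - p) * x2 ** (1 - r)) ** (1 / (1 - r))


def fit_crra(wtas, grid=None):
    """Return (r_hat, s_hat) for one subject from her 22 stated WTAs.

    Under normal errors the ML estimate of r minimizes the sum of squared
    residuals; s_hat is the implied ML estimate of the error standard deviation.
    """
    if grid is None:
        grid = [i / 20 for i in range(-60, 61)]  # assumed search range [-3, 3]
    best_r, best_ssr = None, math.inf
    for r in grid:
        ssr = sum((w - certainty_equivalent(x1, x2, p, r)) ** 2
                  for w, (x1, x2, p) in zip(wtas, LOTTERIES))
        if ssr < best_ssr:
            best_r, best_ssr = r, ssr
    return best_r, math.sqrt(best_ssr / len(wtas))
```

If the stated WTAs are generated without noise from a given value of $r_i$ on the grid, `fit_crra` returns that value with a zero error term. An estimate that lands on the boundary of the search range would be a warning sign that the likelihood keeps improving as $r_i$ grows in absolute value.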
$^{14}$ For 14 subjects (among which 6 appear in the restricted sample) the ML procedure cannot
converge since their parameter $r$ is unbounded and takes extreme values: it either tends to plus
infinity or to minus infinity. For the sake of nonparametric tests, these subjects are assigned
extreme realizations of $r$ going beyond values observed in the remainder of the sample: either
200 or -200, respectively. In parametric analyses, we only consider cases where the estimated
$r \in [-100, 100]$, which requires removing all the subjects mentioned above as well as another one
with the estimated $r$ of -112; this subject appears in both the restricted and the unrestricted
sample. The resulting range of estimated values of $r$ is (-33, 4) in a sample of 207 observations.
Table 3.5: Summary of estimated uncertainty attitudes

Parameters        Median    #N > 0    #N = 0    #N < 0    Sign test

Restricted sample
$\hat{r}$         0         60        11        54        -
$\hat{\sigma}$    2.356     108       11        -         -
$\alpha^A$        0         53        14        58        0.704
$\alpha^S$        -0.065    41        11        73        0.003
$\alpha^E$        0.214     92        7         26        <0.001
$\delta^A$        0         58        14        53        0.704
$\delta^S$        0         51        12        62        0.347
$\delta^E$        -0.073    51        4         70        0.101

Unrestricted sample
$\hat{r}$         -0.191    88        16        119       -
$\hat{\sigma}$    2.535     193       16        -         -
$\alpha^A$        -0.005    89        20        114       0.092
$\alpha^S$        -0.133    72        11        140       <0.001
$\alpha^E$        0.104     131       10        82        0.001
$\delta^A$        0         103       21        99        0.833
$\delta^S$        0         92        16        115       0.126
$\delta^E$        -0.018    96        6         121       0.103

Notes: Columns 3-5 summarize the absolute frequencies of estimated parameter values (as listed
in column 1) being positive, null, or negative, respectively. The last column provides p-values from
a sign test of nullity of the median value of the respective parameter. Top (bottom) part of the
table: N = 125, restricted sample (N = 223, unrestricted sample).
Accordingly, the top part of Table 3.5 summarizes the first-step risk attitudes
and the second-step uncertainty attitude parameters, as estimated in the restricted
sample. Most subjects are found to be either risk seeking or risk averse, both types
of preferences emerging in similar proportions. Moving to the domain of uncertainty,
we find that, in our benchmark AMBIGUITY condition, most subjects are either
pessimistic ($\alpha^A < 0$) or optimistic ($\alpha^A > 0$), both of which again happen in similar
proportions. In a similar vein, most subjects are found to exhibit either aversion
against ($\delta^A > 0$) or preference for ($\delta^A < 0$) ambiguity. In purely descriptive terms,
the parameter of uncertainty aversion is not significantly different from zero in any
of the conditions. However, we observe that the median subject is pessimistic about
the behavior of the other player in the stag-hunt game ($\alpha^S < 0$) and optimistic in
the entry game ($\alpha^E > 0$).$^{15}$
$^{15}$ Wilcoxon signed rank tests also reject $\alpha^S = 0$ and $\alpha^E = 0$ at the 1% level and across samples.
Statistical evidence provided in the last column in Table 3.5 does not corroborate
Hypothesis 1 stating that across all conditions, both parameters are located at zero.
The nonparametric sign test strongly rejects the nullity of the median of $\alpha$ in both
games; the nullity of the median cannot be rejected at the 5% level for any other
parameter.
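A sign test of a zero median only uses the number of positive and negative observations. The following sketch assumes an exact two-sided binomial test that discards zeros; whether this matches the exact implementation behind Table 3.5 is an assumption, and the function name `sign_test` is illustrative.

```python
from math import comb


def sign_test(n_pos: int, n_neg: int) -> float:
    """Exact two-sided sign test p-value for the null of a zero median.

    Observations equal to zero are assumed to have been dropped, so under
    the null the number of positive signs is Binomial(n_pos + n_neg, 0.5).
    """
    n = n_pos + n_neg
    if n == 0:
        return 1.0
    k = min(n_pos, n_neg)
    # Probability of a split at least as lopsided in one direction.
    tail = sum(comb(n, j) for j in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Applied to the restricted-sample counts for $\alpha^A$ in Table 3.5 (53 positive, 58 negative), `sign_test(53, 58)` gives a p-value of roughly 0.70, in line with the reported value of 0.704, while the counts for $\alpha^E$ (92 positive, 26 negative) give a p-value far below 0.001.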
Next, we provide a complementary parametric analysis using a Seemingly Unrelated
Regression (SUR) estimation. For the $i$th subject, parameters $\alpha_i^k$ and $\delta_i^k$ are
assumed to depend on the experimental condition $k \in \{A, S, E\}$ in the following
way:
\[
\alpha_i^k = a_0 + a_S \times \mathbb{1}[k = S] + a_E \times \mathbb{1}[k = E] + u_i^k,
\tag{3.7}
\]
\[
\delta_i^k = d_0 + d_S \times \mathbb{1}[k = S] + d_E \times \mathbb{1}[k = E] + v_i^k,
\tag{3.8}
\]
where $\mathbb{1}[k = X] = 1$ if a decision is made in condition $X$, and $\mathbb{1}[k = X] = 0$
otherwise. The AMBIGUITY condition $A$ is set as the reference condition. Hence,
$E(\alpha_i^A) = a_0$, $E(\alpha_i^S) = a_0 + a_S$, $E(\alpha_i^E) = a_0 + a_E$, and $E(\delta_i^A) = d_0$, $E(\delta_i^S) = d_0 + d_S$,
$E(\delta_i^E) = d_0 + d_E$. In each of the two equations, errors are clustered at the individual
level due to the within-subject implementation of the experimental conditions.
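Because the only regressors in Equations 3.7 and 3.8 are condition indicators, the least-squares point estimates are differences of condition means. The sketch below illustrates this mapping for one equation; it covers point estimates only, leaves out the clustered standard errors and the cross-equation structure of the SUR, and uses the hypothetical name `dummy_coefficients`.

```python
from statistics import mean


def dummy_coefficients(values_by_condition):
    """Least-squares coefficients of a regression on condition dummies.

    values_by_condition: dict with keys "A", "S", "E" mapping to lists of
    parameter values (e.g. alpha) observed in each condition. Condition A
    is the reference category.
    """
    base = mean(values_by_condition["A"])
    return {
        "const": base,                                   # a_0 = E(alpha^A)
        "S": mean(values_by_condition["S"]) - base,      # a_S
        "E": mean(values_by_condition["E"]) - base,      # a_E
    }
```

With this parametrization, a test of `S == 0` compares the stag-hunt game with the ambiguity benchmark, whereas a test of `const + S == 0` asks whether the mean in the stag-hunt game itself is zero.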
The main virtue (and relative advantage with respect to the nonparametric methods)
of this approach is that it provides a one-size-fits-all framework for fitting our
experimental data that fully accounts for the within-subject treatment variation
and the presence of two distinct preference parameters, $\alpha_i^k$ and $\delta_i^k$, that simultaneously
arise as dependent variables. Furthermore, it allows us to go beyond single-parameter
tests, and instead test the joint hypothesis that a group of parameters
has zero mean through a standard Wald test. It also allows us to test for order
effects.$^{16}$ However, the challenge here is to account for the presence of outliers arising
for two reasons. First, extreme risk preferences can drive the estimated uncertainty
parameters to astronomical values. Second, due to the cardinality of the value function
in Equation 3.1, the parameter $\delta_i^k$ is expressed in units of subjective utility.
For both reasons, $\delta_i^k$ can take extreme values, whether positive or negative. We
tackle this issue in two ways. First, as explained above, parametric analyses consider
only subjects whose estimated RRA lies in $[-100, 100]$. For this sample, we
apply the negative logarithm (neglog) transformation to $\delta_i^k$, i.e. we replace $\delta_i^k$ in Equation
3.8 by $\mathrm{sign}(\delta_i^k) \log(1 + |\delta_i^k|)$ in order to de-emphasize extreme realizations. Second,
we estimate the SUR without logarithmic transformation, by only looking at individuals
whose estimated RRA lies in $[-3, +3]$, a range that should be considered
reasonable in the light of existing literature (see Charness et al. 2020). Applied
to the restricted and unrestricted samples in turn, this procedure delivers four regression
specifications that are reported in Table 3.6. Table 3.7 further summarizes
additional parametric mean tests based on the estimated coefficients.
$^{15}$ (continued) Since the distribution of these parameters is asymmetric, we prefer to report the outcomes of a
more conservative sign test which does not require the symmetry assumption.
$^{16}$ Order effects may arise since the order of S and E treatments is random, yet balanced across
sessions. To check for possible order effects, regression models 3.7 and 3.8 can be extended
by including an indicator variable for the order of the experimental conditions along with its
interactions with both independent treatment indicator variables. This specification allows us
to compare outcomes across treatments for a given order (in analogy to comparisons made in
models 3.7 and 3.8). It also allows for a formal statistical test of order effects in the data through
a Chow test that we run simultaneously for both extended regressions to check whether the order-related
coefficients are jointly insignificant. This exercise points to the lack of order effects at the
conventional 5% significance level. See
Table C.1 in Appendix C.2 for details.
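The neglog transformation compresses large magnitudes while preserving the sign and leaving values near zero almost unchanged. A minimal sketch, assuming the natural logarithm (the text writes log without stating the base):

```python
import math


def neglog(x: float) -> float:
    """Return sign(x) * log(1 + |x|), using the natural logarithm."""
    return math.copysign(math.log1p(abs(x)), x)
```

The function is odd and maps zero to zero, so positive and negative values of $\delta_i^k$ are treated symmetrically. A value in the hundreds is mapped to a single-digit number, which limits the influence of extreme realizations on the estimated means.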
Table 3.6: Uncertainty attitudes across treatments: parametric estimates from seemingly unrelated
regressions

                        (1)                      (2)                      (3)                      (4)
Dep. variable:   $\alpha_i^k$  $\delta_i^k$   $\alpha_i^k$  $\delta_i^k$   $\alpha_i^k$  $\delta_i^k$   $\alpha_i^k$  $\delta_i^k$
1[k=S]           -0.038      -1.194       -0.123*     946.12**     0.027       -1.957       -0.090      214.61
                 (0.078)     (1.429)      (0.065)     (413.71)     (0.125)     (2.048)      (0.101)     (216.37)
1[k=E]           0.347**     -1.557       0.232**     672.49*      0.575**     -2.353       0.385**     154.93
                 (0.154)     (1.434)      (0.108)     (345.11)     (0.259)     (2.114)      (0.173)     (123.52)
Constant         -0.120***   0.364        -0.107***   -819.18*     -0.103**    0.789        -0.093*     -162.32
                 (0.040)     (0.853)      (0.041)     (435.00)     (0.047)     (1.251)      (0.048)     (106.58)
Observations     624                      561                      354                      321
Clusters         208                      187                      118                      107

Notes: Standard errors are clustered at the subject level and reported in parentheses. 1[k=X] is a binary variable set to 1 for
condition X, and to 0 otherwise. In all models, we exclude cases with indefinite $\delta_i^k$ as well as those with estimated $r_i$ outside the range
$[-100, 100]$. Specifications (1) and (3) use the neglog transformation of $\delta_i^k$. In specifications (2) and (4), we consider only subjects with an
estimated $r_i$ in the range $[-3, 3]$. Specifications (1) and (2) use the unrestricted sample, (3) and (4) the restricted sample. Significance
levels: * p<0.1 ** p<0.05 *** p<0.01.
Table 3.7: Results of mean testing across specifications

Tests                                                      (1)       (2)       (3)       (4)
$E(\alpha_i^A) = 0$                                        0.003     0.009     0.030     0.053
$E(\alpha_i^S) = 0$                                        0.031     <0.001    0.536     0.067
$E(\alpha_i^E) = 0$                                        0.111     0.204     0.054     0.077
$E(\delta_i^A) = 0$                                        0.670     0.060     0.528     0.128
$E(\delta_i^S) = 0$                                        0.345     0.551     0.366     0.763
$E(\delta_i^E) = 0$                                        0.176     0.389     0.242     0.949
$E(\alpha_i^A) = E(\alpha_i^S) = E(\alpha_i^E)$            0.016     <0.001    0.055     0.001
$E(\delta_i^A) = E(\delta_i^S) = E(\delta_i^E)$            0.283     0.075     0.404     0.443
$E(\alpha_i^A) = E(\alpha_i^S) = E(\alpha_i^E) = 0$        0.001     <0.001    0.045     0.001
$E(\delta_i^A) = E(\delta_i^S) = E(\delta_i^E) = 0$        0.401     0.151     0.580     0.375
Nullity of all parameters                                  0.001     <0.001    0.026     0.003

Notes: p-values corresponding to the stated mean test in column 1 are based on the
coefficient estimates from the four specifications reported in Table 3.6. Respective
samples contain 208, 187, 118 and 107 subjects for specifications (1), (2), (3) and (4).
Overall, results reported in Tables 3.5 (nonparametric median tests) and 3.7
(parametric mean tests) lead us to reject Hypothesis 1 on the absence of attitudes
towards uncertainty.$^{17}$ These attitudes strongly vary across contexts. The nonparametric
tests (Table 3.5) indicate that the median of $\alpha_i^S$ is negative while the median
of $\alpha_i^E$ is positive. The parametric tests (Table 3.7) indicate that the mean of $\alpha_i^A$
differs from zero, while we cannot reject (at the 5% level) that the means of $\alpha_i^E$ and $\alpha_i^S$
are zero. Note, however, that the estimated values of $\alpha_i^k$ and $\delta_i^k$ are not normally
distributed. The p-values from the Shapiro-Wilk W test are all below 0.001. Thus,
the main empirical rationale for rejecting Hypothesis 1 comes from the rejection of
the joint nullity of all parameters, from pessimism by the median subject in the stag-hunt
game and optimism by the median subject in the entry game. The estimates
for the AMBIGUITY condition alone would not be sufficient for rejecting Hypothesis 1,
because the median of $\alpha_i^A$ equals zero (Table 3.5). The nullity of parameter
$\delta_i^k$, in turn, comes as a persistent empirical finding across all tests and all treatments.
$^{17}$ Strictly speaking, joint tests reported at the bottom of Table 3.7 constitute the target testbed
for Hypothesis 1, although it should also be noted that they remain silent on the precise reasons
(i.e., which parameters are non-null) for the potential rejection. From this perspective, single-parameter
tests reported in Table 3.5 and the first six lines of Table 3.7 provide complementary
information.
Result 1 – We document systematic attitudes toward uncertainty. Parameters
$\alpha_i^k$ are not distributed around 0 under strategic uncertainty, pointing to pessimism
regarding the behavior of the other player under strategic complementarity, and to
optimism under strategic substitutability. Besides this, we do not find a systematic
preference for or aversion against strategic uncertainty.
Building on this result, we now turn to the formal comparisons of $\alpha$ and $\delta$
between the three experimental conditions of uncertainty (ambiguity, stag-hunt,
and entry game) and test Hypothesis 2. Table 3.8 summarizes pairwise median
comparisons based on the Wilcoxon signed rank test, as estimated on either the
restricted or the unrestricted sample. Once again, the general finding goes against
our initial hypothesis: we observe more optimism in the entry game than in the
stag-hunt game or in the benchmark AMBIGUITY condition.$^{18}$ Figure 3.1 provides
additional visual support of this result: the cumulative distribution function of $\alpha$
in the entry game first-order stochastically dominates the remaining ones, while no
such differences arise for $\delta$.
Table 3.8: Nonparametric comparisons of uncertainty attitudes across treatments

                           Restricted (N = 125)           Unrestricted (N = 223)
Comparison                 $\alpha_i^k$    $\delta_i^k$    $\alpha_i^k$    $\delta_i^k$
Ambiguity – Stag hunt      0.226         0.478         0.021         0.571
Ambiguity – Entry          <0.001        0.407         <0.001        0.893
Stag hunt – Entry          <0.001        0.671         <0.001        0.410

Notes: Columns 2-5 provide p-values from two-sided Wilcoxon signed rank tests.
Parametric estimates presented in Table 3.6 point to similar conclusions: the entry
game induces significantly stronger optimism as compared to both AMBIGUITY
and the stag-hunt game (p < 0.05 in all comparisons).$^{19}$ A parametric comparison
of $\delta$ across treatments does not yield significant results at the 5% level.
$^{18}$ Echoing Footnote 15, one caveat here is that the symmetry assumption required by the Wilcoxon
signed rank test may not hold in our data. An alternative nonparametric sign test yields the
same results with one exception: $\alpha_i^k$ is significantly different between the stag-hunt game and the
AMBIGUITY condition (see Appendix C.2 for details).
$^{19}$ These comparisons require testing for the equality of $E(\alpha_i^E)$ with $E(\alpha_i^A)$ and $E(\alpha_i^S)$.
Figure 3.1: Cumulative distribution functions of uncertainty attitude parameters across
conditions. Panels: (a) Optimism, (b) Aversion.
Notes: Data from the restricted sample trimmed to $r \in [-3, 3]$ (N = 107). The x axis in the second graph shows the
neglog transformation of $\delta_i^k$, $\mathrm{sign}(\delta_i^k) \log(1 + |\delta_i^k|)$, to account for the wide range of values taken by this variable.
Result 2 – Subjects distinguish between the different sources of uncertainty.
Uncertainty coming from interaction under strategic substitutability gives rise to
more optimism as compared to both ambiguity and interaction under strategic com-
plementarity. Strategic complementarity does not induce significant changes in atti-
tudes towards uncertainty as compared to ambiguity. We do not find significant and
systematic differences across the three treatments in terms of preferences towards
the source of uncertainty.
In Appendix C.3, we provide additional analyses on the individual underpinnings
of attitudes towards uncertainty based on the individual characteristics described
in our pre-results reviewed report. We do not find any systematic association of
individual characteristics with the six parameters of interest.
3.6 Conclusion
We have developed a method for measuring strategic-uncertainty attitudes and dis-
tinguishing them from risk and ambiguity attitudes. We elicit certainty equivalents
of participating in two strategic 2x2 games (stag-hunt and market-entry games) as
well as certainty equivalents of related lotteries that yield the same possible payoffs
with exogenously given probabilities (risk) and lotteries with unknown probabilities
(ambiguity). We use this information to identify for each game and for the ambigu-
ous environment two parameters of a structural model of uncertainty attitudes. The
parameters of this model capture subject-specific uncertainty aversion and optimism
regarding the subject’s subjective probability for the desired outcome. We then test
whether there are significant differences in the distribution of uncertainty attitudes
between games with strategic complements, games with strategic substitutes, and
ambiguous lotteries.
We find systematic attitudes towards uncertainty that vary across contexts.
While there is no evidence for a preference for, nor for an aversion against, ambigu-
ity or strategic uncertainty (in the sense of a fixed effect of the source of uncertainty
on utility), the median subject seems to be pessimistic about the behavior of the
other player in the stag-hunt game, and optimistic in the entry game, where opti-
mism/pessimism are proportional to the difference between the utility expressed by
stated WTAs in a given game and the subjective expected utility derived from the
stated probability for the other player’s choice.
In the entry game, optimism means that the median subject’s evaluation of the
game is shifted from her expected utility in the direction of the higher payoff that arises
if the other player chooses the action opposing her own. In the stag-hunt game, the
median subject’s evaluation is shifted from her expected utility towards the lower
payoff. In stag hunt, the lower payoff arises if the two players choose opposing actions.
Thus, the median subject evaluates both games with an extra weight on the other
player choosing the action opposed to her own.
Our results also show that the entry game stands out, because the distribution of
optimism in the entry game stochastically dominates the distribution of optimism in
stag-hunt and ambiguity treatments. This reflects the results by Nagel et al. (2018)
that indicate a higher degree of strategic uncertainty and higher levels of reasoning
in entry games than in stag-hunt games and lotteries.
Stag-hunt and entry games differ in the reasoning process leading to a decision. If
a player has an initial preference for one action, say L, and considers what she should
do had the other player also chosen L, then her initial preference is confirmed in the
stag-hunt game. If her partner reasons in the same way, it is optimal for both to
choose L. In the entry game, however, if the other player thinks like her and chooses
L, then she should choose action R instead; however, if the other player follows the
same reasoning as her, then she should switch back to L. This inconclusive reasoning
process may be the underlying reason for higher brain activity in Nagel et al. (2018)
and for the deviation between the stated value (WTA) of a game and its subjective
expected utility. Eventually, the extra weight associated with an opposing action
expressed by optimism in entry and pessimism in stag-hunt games is a precaution
against the other player applying a different reasoning process leading to a different
action.
Our findings further complement the literature on the choice/preference rela-
tionship in games. For instance, Chark and Chew (2015) compare choices in a
coordination game with strategic complementarities when the opponent is another
human or a die in presence of a safe opt-out option. Their findings indicate that the
source of uncertainty does not significantly alter choices in this game.$^{20}$ In another
related experiment, Calford (2020) finds that uncertainty aversion measured with a
game can account for choices in another game. Our results are consistent with the
literature finding source-dependence in uncertainty attitudes (e.g., Abdellaoui et al.,
2011; Chark and Chew, 2015). We add to this literature by developing a general
method for identifying and comparing attitudes towards strategic uncertainty. We
focus on attitude measurement in two prototypical games, but the method can be
easily applied to other settings.
Finally, our empirical evidence highlights the general importance of individual
probability distortion (rather than a domain-specific utility function) for under-
standing decision-making under uncertainty. This finding corroborates some of the
previous research on modelling uncertainty in individual (i.e., non-strategic) choices
(see, e.g., Abdellaoui et al., 2011; Attema et al., 2013) and further extends it by
showing that the relative importance of probability weighting also applies to strate-
gic contexts.
$^{20}$ However, they find indicative evidence for the role of the source of uncertainty in another game (i.e.,
matching pennies) relative to the coordination game. Their accompanying evidence from neuroimaging
data favors ambiguity attitudes over social preferences in explaining these results.
Appendix C
Appendix
C.1 Instructions
Whether Game 1 or 2 is played first is randomly chosen by the computer. Here we
only present instructions where Game 1 is played first.
Welcome!
You are about to take part in an economic experiment. You are not allowed to
talk to other participants during the experiment. If you have a cell phone, please
switch it off. If you have a question at any time, please raise your hand, and someone
will come to help you. Please do not ask your question aloud. If the question is
relevant for all participants, we will repeat it and answer it aloud. If you violate
these rules, we must exclude you from the experiment and from payment.
All the information you provide, the decisions you make, as well as the amount
of your gains from this experiment will remain strictly confidential and anonymous.
Participation in this experiment will earn you money. Your earnings will depend
on your decisions and may also be affected by the decisions made by others.
The experiment consists of five parts. You will receive specific instructions for
each part as the experiment goes on. At the end of the experiment, only one part
out of Parts 1 to 4 will be chosen at random to determine your final payoff for the
experiment, where each of these four parts has the same chance to be randomly
drawn. Within each part, you make several decisions. If a part is randomly chosen
for payment, one of those decisions will be drawn for payment by another random
mechanism of the computer, where each decision has the same chance to be ran-
domly drawn. Hence, only one of your decisions will affect your final payoff, but it
could be any one of your decisions. For showing up in time, you additionally obtain
5 Euros. The fifth part does not offer you the chance to earn money.
Specific Instructions for Part 1
In this part of the experiment, you will face 22 lotteries. 11 of them pay either
15 or 20 Euros. The others pay either 5 or 25 Euros. For both payoff types, the
probability to get the higher of the two possible payoffs varies from 0 to 100% in
steps of 10%.
For each of the 22 lotteries, we ask you the following question:
• Which amount (in Euros) would you prefer to receive with certainty instead
of letting the lottery determine your payoff?
You need to enter your answers for these questions in the columns “Opt-out
value for lottery that pays either 15 or 20 Euros” and “Opt-out value for lottery
that pays either 5 or 25 Euros”, respectively. You can state any value from 0 to
30 Euros, with up to two decimals. Your answers to these questions will determine
your candidate payoff for this part of the experiment with the following two-step
procedure:
If this part is selected for payoff, the computer will first randomly select one of the 22
lotteries. Second, the computer will randomly draw an amount from 0.00 to 30.00
Euros with two decimals (each value in the interval is equally likely).
• If the randomly drawn amount is larger than or equal to your stated “Opt-out
value” for the selected lottery, your payoff is the amount drawn by the
computer.
• If the amount drawn by the computer is smaller than your stated “Opt-out
value” for the selected lottery, your payoff will be determined by the rules of
this lottery. This means, you will get the higher of the two possible payoffs
with the probability p stated in the left column. You will get the lower of the
two possible payoffs with the remaining probability 1 − p.
Example:
• Suppose that the computer selects the lottery that pays either 15 or 20 Euros
with a probability of receiving the higher payoff p = 90%. Suppose that your
stated “Opt-out value” for this lottery is equal to 17.50.
• If the amount drawn by the computer is at least 17.50, you will receive this
amount. So, if the drawn amount is 26.09, you receive 26.09 Euros.
• If the number drawn by the computer is smaller than your opt-out value, say
9.79, your payoff for this part is determined by the selected lottery. Here, you
will receive 20 Euros with probability p = 90%. With probability 1 − p = 10%,
you will receive 15 Euros.
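The two-step procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the software used in the experiment: the function names, the one-cent grid for the random draw, and the fixed seed are assumptions made for the example.

```python
import random


def lottery_payoff(low, high, p_high, rng):
    """Pay `high` with probability p_high, otherwise pay `low`."""
    return high if rng.random() < p_high else low


def draw_amount(rng):
    """Uniform draw from 0.00 to 30.00 Euros on a one-cent grid (assumed)."""
    return rng.randint(0, 3000) / 100


def two_step_payoff(opt_out, drawn_amount, low, high, p_high, rng):
    """Pay the drawn amount if it is at least the stated opt-out value;
    otherwise let the selected lottery determine the payoff."""
    if drawn_amount >= opt_out:
        return drawn_amount
    return lottery_payoff(low, high, p_high, rng)


rng = random.Random(0)  # fixed seed, for illustration only

# Example from the instructions: lottery paying 15 or 20, p = 90%,
# stated opt-out value 17.50.
print(two_step_payoff(17.50, 26.09, 15, 20, 0.9, rng))  # draw >= 17.50: pays 26.09
print(two_step_payoff(17.50, 9.79, 15, 20, 0.9, rng))   # draw < 17.50: lottery pays 15 or 20
```

Because the drawn amount, not the stated value, is what gets paid out, stating a value different from one's true valuation can only lead to a less preferred outcome; this is the logic behind using such a procedure to elicit certainty equivalents.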
You will see those 22 lotteries listed on your screen as described in the Table
below. Once you state both of your “Opt-out values” for each of the 22 lotteries
given in the Table below, you need to confirm these answers by clicking on the
“CONFIRM” button. You can change these “Opt-out values” as long as you have
not confirmed them.
Probability with which the     Opt-out values for lottery     Opt-out values for lottery
computer selects the higher    that pays either 15 € or 20 €  that pays either 5 € or 25 €
payoff
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
Before beginning the actual Part 1, you will perform the same task with five
different lotteries. This phase is for practice purposes and will not influence your
payoff. You will also receive feedback about the random selections of the computer
in this practice round and about the consequence of the two-step procedure using
your stated opt-out values. Note that in the real experiment, you will not be in-
formed about these outcomes before the end of the experiment.
Specific Instructions for Part 2
In this part of the experiment, you will face 2 lotteries. One of them pays either
15 or 20 Euros. The other pays either 5 or 25 Euros. Note that these are the same
payoffs offered by the lotteries as in the previous part. But now, you will not be
informed about the probability with which the computer chooses the higher payoff.
The computer is programmed in such a way that the probability with which the
higher payoff is paid is one of the probabilities stated in Part 1, i.e., 0%, 10%, 20%,
..., 100%. The computer selects this probability before you submit your decision
for this part. Each of these 11 probabilities might be the one applied to the lotteries
in this part, but they are not equally likely. This means, some probabilities are
more likely to be drawn than others. However, you will not receive any further
information about the precise random mechanism.
For each of the two lotteries, we ask you the following question:
• Which amount (in Euros) would you prefer to receive with certainty instead
of letting the lottery determine your payoff?
You need to enter your answers for these questions in the boxes “Opt-out value
for lottery that pays either 15 or 20 Euros” and “Opt-out value for lottery that
pays either 5 or 25 Euros”, respectively. You can state any value from 0.00 to 30.00
Euros, with up to two decimals. Your answers to these questions will determine
your payoff for this part of the experiment with the following two-step procedure:
If this part is selected for payoffs, the computer will first randomly select one of the
two lotteries. Second, the computer will randomly draw an amount from 0.00 to
30.00 Euros (each amount in the interval is equally likely).
• If the randomly drawn amount is larger than or equal to your stated “Opt-out
value” for the selected lottery, your payoff is the amount drawn by the
computer.
• If the amount drawn by the computer is smaller than your stated “Opt-out
value” for the selected lottery, your payoff will be determined by the rules of
the chosen lottery. This means, you will get either of the two possible payoffs
15 or 20 Euros if the lottery that pays either 15 or 20 is selected and 5 or 25
Euros if the lottery that pays either 5 or 25 is selected.
Once you have stated the two “Opt-out values”, we will ask you for your guess about how
likely it is that the computer selects the higher payoff. We are asking for your guess on
the following question:
• Out of 10 draws, how many times does the computer select the higher payoff?
If your guess exactly matches the true number of draws that the computer selects
the higher payoff, your payoff from this decision will be 20 Euros. If your guess is
not exactly accurate, then you may receive 20 or 10 Euros. The likelihood to receive
the high payoff (20 €) is higher, the closer your guess is to the expected number
of draws. This means the more accurate your guess is, the higher your payoff from
this decision will be. You can look up the precise mechanism rewarding your stated
beliefs by clicking on the button “more information.” The mechanism makes sure
that it is in your best interest to state your true belief about the expected number
of draws.1
Summary and Payoff Procedure for Part 2:
In this part of the experiment, you will first state your “Opt-out values” for
the two lotteries. Second, you will state your guess about the likelihood that the
computer selects the higher payoff.
A random mechanism will decide how your candidate payoff for this part of the
experiment will be determined. 2 out of 3 times, it will be determined based on the
two-step procedure which uses your stated “Opt-out values” as described in Part 1.
1 out of 3 times, it will be determined based on the accuracy of your stated guess.
C.1.1 Instructions for Games
Specific Instructions for Part 3
In this part, you are randomly matched with another participant in this session.
We will never inform you about the identity of this other participant. You and this
other participant will each choose between two Actions L and R.
The payoffs (in Euro) for you and the other participant are presented in the
Table below: in each cell, the first amount is your payoff, and the second amount is
the other participant’s payoff. These payoffs can be summarized as follows:
• If you and the participant you are matched with both choose L, you both
receive 20 Euros;
• If you choose L and the participant you are matched with chooses R, then you
receive 15 Euros and the other participant receives 5 Euros;
• If you and the participant you are matched with both choose R, you both
receive 25 Euros;
• If you choose R and the participant you are matched with chooses L, then you
receive 5 Euros and the other participant receives 15 Euros.
1 The computer interface contains a button opening a pop-up window with a specific description
of this procedure:
If this decision is selected for final payment, your gain will be determined according to the following
procedure. First, the computer calculates DIFF: the difference between your answer and the correct
answer, and then computes its square value: DIFF² = DIFF × DIFF. Second, the computer
randomly draws an integer number between 0 and 100 (each realization being equally likely).
If the value of DIFF² is below that random integer, your payoff equals 20 euros; otherwise, your
payoff equals 10 euros.
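To illustrate how this belief-elicitation rule rewards accuracy, the following Python sketch computes the exact probability of receiving 20 euros for a given guess. It follows the wording of the pop-up text literally; treating both endpoints 0 and 100 as possible draws, and the function name itself, are assumptions made for the example.

```python
from fractions import Fraction


def prob_high_payoff(guess, correct, lo=0, hi=100):
    """Probability of receiving 20 euros: the chance that DIFF squared is
    strictly below a uniformly drawn integer in [lo, hi] (endpoints assumed)."""
    diff_sq = (guess - correct) ** 2
    draws = range(lo, hi + 1)
    wins = sum(1 for u in draws if diff_sq < u)
    return Fraction(wins, len(draws))


# Chance of the high payoff for every possible guess when the correct
# answer is 6 (a hypothetical value chosen for illustration).
for guess in range(0, 11):
    print(guess, prob_high_payoff(guess, correct=6))
```

The probability of the high payoff falls with the squared distance between the guess and the correct answer, which is what makes an accurate report attractive regardless of the participant's risk attitude. Under this literal reading, an exact guess wins unless the integer drawn is 0.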
Decision situation in Part 3 and associated payoffs in Euro.
                       The other participant’s decision
                       L               R
Your decision    L     20 €, 20 €      15 €, 5 €
                 R     5 €, 15 €       25 €, 25 €
First, you and the other participant will decide between Actions L and R. We
call this “Decision 1”.
If this part is selected for payoffs, with 1/3 probability, your payoff as well as
the other participant’s payoff are determined by your and the other participant’s
Decision 1 as described above.
Once you have made your Decision 1 (and before payoffs are determined), we ask you
to state two “Opt-out values” similar to the ones in Parts 1 and 2. The precise
questions are the following:
• If the computer replaces your decision with Action L, which amount (in Euro)
would you prefer to receive with certainty instead of continuing with Action
L?
• If the computer replaces your decision with Action R, which amount (in Euro)
would you prefer to receive with certainty instead of continuing with Action
R?
Just as in Parts 1 and 2, you need to state an amount from 0.00 to 30.00 Euros
for both questions above. You need to enter your answers for these questions in the
columns “Opt-out value for Action L” and “Opt-out value for Action R”, respec-
tively. You can state any value from 0.00 to 30.00 Euros, up to two decimals. Your
answers to these questions will determine your payoff for this part of the experiment
with the following two-step procedure: If Part 3 is selected for payoffs, with 1/3
probability, your payoff will be determined based on the two-step procedure which
uses your stated “Opt-out values”. In this case, the computer will randomly select
one of the two actions L or R for you. Second, the computer will randomly draw an
amount from 0.00 to 30.00 Euros (each amount in the interval is equally likely).
• If the randomly drawn amount is larger than or equal to your stated “Opt-out
value” for the action selected by the computer, your payoff is the amount
drawn by the computer.
• If the amount drawn by the computer is smaller than your stated “Opt-out
value” for the action selected by the computer, your payoff will be determined
by this action and the action chosen in “Decision 1” by the participant you
are matched with.
Example:
Suppose the computer replaces your action by R and draws the amount 21.24.
If your opt-out value for Action R is smaller than 21.24, you receive 21.24 Euros. If
your opt-out value is larger, your payoff depends on the other participant’s Decision
1. If the other participant has chosen L, you receive 5 Euros. If the other participant
has chosen R, you receive 25 Euros.
Once you have stated the two “Opt-out values”, we will ask you for your guess about how
likely it is that the other participants in this room choose Action R. We are asking
for your guess on the following question:
• How many of the other 10 participants in this session choose Action R?
The payoff for your guess will be determined in the same way as in Part 2.
If your guess exactly matches the true number of choices for Action R, your
payoff from this decision will be 20 Euros. If your guess is not exactly accurate,
then you may receive 20 or 10 Euros. The likelihood to receive the high payoff (20 €)
is higher, the closer your guess is to the expected number of draws. This means
the more accurate your guess is, the higher your payoff from this decision will be.
You can look up the precise mechanism rewarding your stated beliefs by clicking
on the button “more information.” The mechanism makes sure that it is in your
interest to state your true belief about the expected number of draws.
Finally, you need to confirm your decisions by clicking on the “CONFIRM” but-
ton. You can change your decisions as long as you have not confirmed them.
Summary and Payoff Procedure for Part 3:
In this part of the experiment, you will answer four questions. First, you will
state your preferred action (either L or R) for Decision 1. Second, you will state the
two “Opt-out values” in case the computer replaces your decision by L or R. Third,
you will state your guess on how many out of 10 randomly drawn other participants
would choose Action R as their preferred action.
Another random mechanism will decide how your candidate payoff for this part
of the experiment will be determined. 1 out of 3 times, it will be determined based
on your and the other participant’s preferred action. 1 out of 3 times, it will be
determined based on the two-step procedure that uses your stated “Opt-out values”.
1 out of 3 times, it will be determined based on the accuracy of your stated guess.
Specific Instructions for Part 4
In Part 4, you will make exactly the same decisions as in Part 3. You are matched
with another participant (possibly different from Part 3). The only difference compared
to Part 3 is in the payoffs that you and the other participant receive depending
on your choices between Action L and Action R.
The payoffs (in Euro) for you and the other participant are presented in the table
below: in each cell, the first amount is your payoff, and the second amount is the
other participant’s payoff. These payoffs can be summarized as follows:
• If you and the participant you are matched with both choose L, you both
receive 5 Euros;
• If you choose L and the participant you are matched with chooses R, then you
receive 25 Euros and the other participant receives 20 Euros;
• If you and the participant you are matched with both choose R, you both
receive 15 Euros;
• If you choose R and the participant you are matched with chooses L, then you
receive 20 Euros and the other participant receives 25 Euros.
Decision situation in Part 4 and associated payoffs in Euro.
                       The other participant’s decision
                       L               R
Your decision    L     5 €, 5 €        25 €, 20 €
                 R     20 €, 25 €      15 €, 15 €
Summary and Payoff Procedure:
You will answer the same four questions as in Part 3 and your candidate payoff
for this part of the experiment will be determined based on the same mechanism.
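The payoff tables of Parts 3 and 4 can be compared with a short Python sketch that computes a participant's expected payoff from each action for a given belief about the other participant. The dictionary layout, the function names, and the belief parameter q are assumptions made for illustration; the payoffs are taken from the two tables above.

```python
from fractions import Fraction

# Own payoffs in euros, indexed as GAME[own_action][other_action].
GAME_1 = {"L": {"L": 20, "R": 15}, "R": {"L": 5, "R": 25}}  # Part 3 table
GAME_2 = {"L": {"L": 5, "R": 25}, "R": {"L": 20, "R": 15}}  # Part 4 table


def expected_payoff(game, action, q):
    """Expected own payoff from `action` when the other participant
    chooses R with probability q (a hypothetical belief)."""
    return (1 - q) * game[action]["L"] + q * game[action]["R"]


def indifference_belief(game):
    """Belief q at which L and R yield the same expected payoff."""
    # Expected payoffs are linear in q: E[a] = a_L + q * (a_R - a_L).
    l, r = game["L"], game["R"]
    slope_gap = (l["R"] - l["L"]) - (r["R"] - r["L"])
    return Fraction(r["L"] - l["L"], slope_gap)


for name, game in (("Game 1", GAME_1), ("Game 2", GAME_2)):
    q_star = indifference_belief(game)
    print(name, "indifference belief:", q_star,
          "expected payoff there:", expected_payoff(game, "L", q_star))
```

In both tables each action amounts to one of the two lotteries from Parts 1 and 2 (one paying 15 or 20 Euros, the other 5 or 25 Euros), with the other participant's choice playing the role of the random draw. A risk-neutral participant would be indifferent between L and R at a belief of 3/5 in either game.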
Completion and Questionnaires
You have completed the first four parts of the experiment. In each part, the
payoff resulting from one of your decisions is chosen as the candidate payoff for that
part of the experiment. One of these four candidate payoffs will be selected as your
final payoff by a random mechanism (each candidate payoff is equally likely to be
your final payoff). Before announcing your final payoff, we ask you to answer a series
of questions (Part 5). You will answer these questions using the interface on your
computer screen. Please follow the specific instructions on your screen to answer
these questions.
C.1.2 Example of Comprehension Quiz for the Treatment
(inserted on screens before Parts 3 and 4; information will be adapted to the games
used in the respective parts)
Before making your decisions for Part 3, please answer the following questions:
1. You will interact with another, randomly matched, participant;
• True
• False
Answer: True
2. In the decision situation of Part 3, if you choose L and the other participant
chooses R, your associated payoff is
• 5 €
• 15 €
• 20 €
• 25 €
Answer (Game 1): 15 €
Answer (Game 2): 25 €
3. In the decision situation of Part 3, if you choose R and the other participant
chooses L, your associated payoff is
• 5 €
• 15 €
• 20 €
• 25 €
Answer (Game 1): 5 €
Answer (Game 2): 20 €
C.2 Additional Tables and Figures
Table C.1: Seemingly unrelated regressions with treatment order effects
Sample:                         (1)                    (2)                    (3)                    (4)
Dep. variable:            α_i^k      δ_i^k       α_i^k      δ_i^k       α_i^k      δ_i^k       α_i^k      δ_i^k
Indep. variable
1[k=S]                    .031       -.661       -.087      880.34*     .113       -2.089      -.048      88.012
                          (.099)     (1.190)     (.057)     (531.73)    (.146)     (1.681)     (.067)     (79.483)
1[k=E]                    .363**     -.623       .349**     742.04      .539**     -1.850      .541**     22.871
                          (.151)     (1.190)     (.147)     (533.36)    (.228)     (1.707)     (.218)     (27.070)
StagFirst                 -.014      .730        -.020      663.03      -.093      -.429       -.097      11.172
                          (.079)     (1.861)     (.082)     (766.11)    (.099)     (2.892)     (.100)     (198.11)
StagFirst × 1[k=S]        -.158      -1.216      -.086      147.57      -.211      .324        -.110      330.39
                          (.159)     (3.120)     (.148)     (845.60)    (.266)     (4.707)     (.250)     (554.20)
StagFirst × 1[k=E]        -.037      -2.136      -.284      -168.900    .089       -1.236      -.406      344.64
                          (.329)     (3.129)     (.215)     (637.31)    (.591)     (4.871)     (.354)     (317.72)
Constant                  -.114**    .045        -.099*     -1092.19    -.064      .964        -.056      -166.60
                          (.055)     (.718)      (.056)     (710.73)    (.057)     (.984)      (.059)     (154.77)
Observations (clusters)   624 (208)              561 (187)              354 (118)              321 (107)
Chow test                 0.702      0.448       0.466      0.232       0.552      0.483       0.317      0.493
Joint Chow test           0.627                  0.269                  0.483                  0.299
Notes: StagFirst is a binary variable set to 1 if the stag-hunt game is played before the entry game, and to 0 otherwise. 1[k=T] is a
binary variable set to 1 for condition T, and to 0 otherwise. Standard errors are clustered at the subject level and reported in parentheses.
In all models, we exclude cases with indefinite δ_i^k as well as those with estimated r_i outside the range (-100, 100). Specifications (1) and
(3) use the neglog transformation of δ_i^k. In specifications (2) and (4), estimated r_i is trimmed to the range [−3, 3]. Specifications (1) and
(2) use the unrestricted sample; specifications (3) and (4) use the restricted sample. The last two rows provide the p-values from Chow
tests of the joint insignificance of all the coefficients in front of the dummy, for the specific parameter and for the entire SUR model.
Significance levels: * p<0.1, ** p<0.05, *** p<0.01.
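The two data treatments mentioned in the table notes can be sketched as follows. The sketch assumes the common definition of the neglog transformation, sign(x) · ln(1 + |x|), and a simple clipping rule for trimming; both function names are hypothetical, and the definitions given in the main text of the chapter take precedence if they differ.

```python
import math


def neglog(x):
    """Neglog transformation (assumed definition): sign(x) * ln(1 + |x|).
    It compresses large absolute values while preserving sign and zero."""
    return math.copysign(math.log1p(abs(x)), x)


def trim(x, lower=-3.0, upper=3.0):
    """Clip x to the closed interval [lower, upper]."""
    return max(lower, min(upper, x))


# Hypothetical parameter values, for illustration only.
for value in (-1092.19, -2.0, 0.0, 0.5, 880.34):
    print(value, round(neglog(value), 3), trim(value))
```

A transformation of this kind keeps extreme estimates of δ from dominating the regressions without discarding them, which may help explain why the δ coefficients in specifications (1) and (3) are so much smaller in magnitude than those in specifications (2) and (4).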
Table C.2: Nonparametric comparisons of strategic uncertainty attitudes across treatments
                          Restricted (N = 125)      Unrestricted (N = 223)
Comparison                α_i^k       δ_i^k         α_i^k       δ_i^k
Ambiguity – Stag hunt     0.005       0.275         <0.001      0.734
Ambiguity – Entry         <0.001      0.203         0.007       0.497
Stag hunt – Entry         <0.001      0.779         <0.001      0.441
Notes: Columns 2-5 provide p-values from two-sided Wilcoxon signed rank tests.
C.3 Individual Underpinnings of Attitudes towards
Uncertainty
In this appendix, we explore individual underpinnings of attitudes towards uncer-
tainty. We use a seemingly unrelated regression model to estimate six simultaneous
equations. Each of the six individual preference parameters
y_i ∈ {α_i^A, α_i^S, α_i^E, δ_i^A, δ_i^S, δ_i^E}
is regressed on a set of individual characteristics:
\[
y_i = b_{y,0} + b_{y,1}\,\hat{s}_i + b_{y,2}\,\text{Raven Score}_i + b_{y,3}\,\text{RMET Score}_i + b_{y,4}\,\text{SSS Score}_i
      + \sum_{k} c_{y,k}\,\text{SocDemInf}_i^{k} + w_i,
\]
where:
• ŝ_i is the individual noise parameter estimated by ML from the RISK treatment
data;
• Raven Score_i is the Raven test score;
• RMET Score_i is the Reading the Mind in the Eyes Test score;
• SSS Score_i is the total score on the Sensation Seeking Scale (SSS);
• SocDemInf_i^k is a set of k basic socio-demographic variables: age, gender (Female
is an indicator variable that takes the value one for female subjects)
and major (Econ Buss and Engineer are also indicator variables that take
the value one when subjects’ major is economics or business and engineering,
respectively);
• w_i is the residual.
Table C.3 reports the estimated results. Although there is no systematic association
between any of the explanatory variables and the six parameters of interest,
we do reject the joint hypothesis of coefficient nullity across the three α regressions
with p < 0.001; we cannot do so, however, for the three δ regressions. This suggests
that the heterogeneity in pessimism (α) observed in our (restricted) experimental
sample is partially driven by individual differences which, however, cannot
account for the heterogeneity in the general preferences towards uncertainty (δ).
We note, though, that this result should be handled with care, since it is not
entirely confirmed in the unrestricted-sample estimations. Estimates provided in Table
C.4 point to a weak statistical link between our set of explanatory variables and the
six parameters of interest.
Table C.3: Seemingly unrelated regressions with individual characteristics: restricted sample
                                 α_i^A      α_i^S      α_i^E      δ_i^A      δ_i^S      δ_i^E
σ̂_i                              -.031      -.147      -.068      .630       -.028      -.202
                                 (.029)     (.127)     (.168)     (.589)     (.640)     (.669)
Raven Score                      -.019      -.123      -.068      -.332      .214       .219
                                 (.015)     (.086)     (.063)     (.744)     (.739)     (.762)
RMET Score                       .005       -.003      -.020      -.453      .724       .820
                                 (.016)     (.016)     (.023)     (.517)     (.513)     (.546)
SSS total                        .015       .035       -.010      .505*      -.292      -.324
                                 (.010)     (.031)     (.059)     (.277)     (.284)     (.286)
Female                           -.163*     .418       .704*      -.202      -2.595     -2.604
                                 (.098)     (.314)     (.419)     (2.283)    (2.322)    (2.362)
Age                              .002       -.006      -.015      .001       -.062      -.081
                                 (.005)     (.014)     (.018)     (.192)     (.191)     (.199)
Econ Buss                        -.030      -.032      .757       -.636      -3.485     -4.612
                                 (.146)     (.172)     (1.174)    (4.264)    (4.568)    (4.478)
Engineer                         .097       .202       -.323      1.726      -4.353**   -4.614**
                                 (.090)     (.274)     (.406)     (2.181)    (2.215)    (2.286)
Constant                         -.284      1.037      2.307*     1.668      -8.631     -9.287
                                 (.294)     (.941)     (1.339)    (7.534)    (7.870)    (8.216)
Joint insignificance (p-value):  0.336      0.716      0.005      0.359      0.690      0.672
Notes: Standard errors are clustered at the subject level and reported in parentheses. Data correspond to specification
(3) in Table 3.6 (N = 118). Parameter δ_i^k is neglog-transformed. Significance levels: * p < 0.1, ** p < 0.05, ***
p < 0.01. Joint insignificance of coefficients for the three α (δ) regressions: p < 0.001 (p = 0.707). Joint insignificance
of coefficients across the six models: p < 0.001.
Table C.4: Seemingly unrelated regressions with individual characteristics: unrestricted sample
                                 α_i^A      α_i^S      α_i^E      δ_i^A      δ_i^S      δ_i^E
σ̂_i                              -.035*     -.057      -.049      -.078      -.178      -.229
                                 (.021)     (.065)     (.083)     (.465)     (.477)     (.476)
Raven Score                      -.021      -.051      -.016      -.160      -.109      -.080
                                 (.013)     (.049)     (.037)     (.477)     (.477)     (.474)
RMET Score                       .010       -.008      -.025*     -.174      .456       .543
                                 (.013)     (.012)     (.015)     (.346)     (.334)     (.351)
SSS total                        .014*      .023       -.019      .138       .044       .008
                                 (.007)     (.020)     (.036)     (.217)     (.225)     (.214)
Female                           -.091      .257       .454*      -1.250     -1.668     -1.663
                                 (.082)     (.194)     (.266)     (1.824)    (1.867)    (1.822)
Age                              .000       .002       -.009      .022       -.099      -.127
                                 (.005)     (.010)     (.013)     (.166)     (.163)     (.170)
Econ Buss                        .070       .014       .386       -1.533     -2.293     -3.399
                                 (.110)     (.137)     (.679)     (2.790)    (2.875)    (2.861)
Engineer                         .112       .039       -.297      -.438      -1.100     -2.335
                                 (.083)     (.177)     (.249)     (1.622)    (1.639)    (1.636)
Constant                         -.350      .088       1.667*     3.591      -6.278     -6.582
                                 (.298)     (.590)     (.901)     (6.233)    (6.369)    (6.248)
Joint insignificance (p-value):  0.087      0.001      0.023      0.729      0.870      0.839
Notes: Standard errors are clustered at the subject level and reported in parentheses. Data correspond to
specification (1) in Table 3.6 (N = 208). Parameter δ_i^k is neglog-transformed. Significance levels: * p < 0.1, **
p < 0.05, *** p < 0.01. Joint insignificance of coefficients for the three α (δ) regressions: p < 0.001 (p = 0.606).
Joint insignificance of coefficients across the six models: p < 0.001.
C.4 Screenshots
Figure C.1: Screen used in RISK treatment
Figure C.2: Screen used in STRATEGICUNCERTAINTY treatment (stag-hunt
game)
Chapter 4
Better than Perceived? Correcting
Misperceptions about Central
Bank Inflation Forecasts
An earlier version of this paper was circulated as: Bulutay, M. (2024). Better
than Perceived? Correcting Misperceptions about Central Bank Inflation Forecasts.
Berlin School of Economics Discussion Paper, No 34.
4.1 Introduction
Central banks publish their macroeconomic forecasts not only to inform the pub-
lic about the future of the economy, but also to manage expectations. However,
disagreements about the future persist between central banks and private agents.
For central banks, disagreement is particularly troubling when it comes to future
inflation, because inflation expectations can translate into inflation and deanchoring
can hinder the transmission of monetary policy. For private agents, it is inefficient
not to adopt the central bank’s inflation forecast because forming personal forecasts
is costly in terms of time and resources.1
Could the inflation disagreement between central banks and private agents be
explained by the latter underestimating the former’s forecasting ability and therefore
relying on their own assessments? Recent research shows that the public pays more
attention to inflation news when inflation is high and volatile (Weber et al., 2023;
Pfäuti, 2023; Korenok et al., 2023). Thus, larger forecast errors may be overweighted
when people assess forecast accuracy. If so, the public’s perception of
the central bank’s accuracy may be biased downward.
1 Furthermore, previous research shows that central banks have an information advantage over
private agents, especially in times of high uncertainty (Gavin and Mandal, 2003; El-Shagi et al.,
2016). See Binder and Sekkel (2023) for a review of the literature on central bank forecasts.
Central banks should correct such possible misperceptions in order to better influ-
ence inflation expectations. Public perception is also crucial for the independence
of central banks (Ehrmann and Fratzscher, 2011).
This article uses two novel survey modules to (i) measure German households’
beliefs about the ECB’s inflation forecasting accuracy and (ii) test the causal effect of
central bank communication on any misperceptions. These modules are integrated
into the Deutsche Bundesbank’s Survey on Consumer Expectations in two waves
in 2022 and 2023, when inflation was high and volatile. Both modules include pre-
registered experiments that exogenously vary information sets and thus show the
causal effect of information on private expectations.
The first experiment, conducted in September 2022, elicits beliefs about the over-
all accuracy of the ECB’s inflation forecasts up to the time of the survey. Partici-
pants are then randomly assigned to receive treatment-specific information. While
all treatment groups (except the control group) are informed about the ECB’s most
recent medium-term inflation forecast, some are also informed about the average
accuracy of past inflation forecasts. Thanks to the random assignment, one can
estimate the causal effect of being exposed to correct information about the ECB’s
inflation forecast accuracy on variables measured later in the survey, such as inflation
expectations, consumption plans, and trust in the ECB.
The results show that only 13% of German households believe that the ECB’s
forecasts are as accurate as they actually are. A larger share (21%) believes that
the average absolute forecast error is larger than the largest forecast error ever
made by the ECB. Cross-sectional correlations show that the underestimation of
the ECB’s forecasting accuracy is negatively related to self-reported trust in the
ECB, even after controlling for a rich set of covariates, including education. Thus,
these misperceptions cannot be explained by illiteracy or misunderstanding alone
and seem to reflect trust in the ECB.
Information about the accuracy of past forecasts lowers inflation expectations,
reduces uncertainty about future inflation, promotes trust in the ECB, and discour-
ages the consumption of certain goods, such as major items (e.g., cars, furniture)
and clothing. Using instrumental variable estimation to account for endogeneity,
I identify the causal relationship between trust in the ECB and inflation expecta-
tions. This analysis shows that the information shifts inflation expectations through
its effect on trust in the ECB. In terms of marginal effects, a one standard deviation
increase in trust in the ECB reduces inflation expectations by 5.3% to 8.5% and
inflation uncertainty by 2.4% to 7.1%.
The second experiment takes place one year after the first, in September 2023.
This time, respondents are asked to report their short-term inflation expectations
and to guess the ECB’s one-year-ahead forecast for the same inflation. These two
responses make it possible not only to document possible misperceptions, but also
to identify the source and expected direction of the error. Later, respondents are
provided with information on the current inflation rate in the euro area and/or the
ECB’s inflation forecast. After the information phase, the survey measures trust in
the ECB, but in an indirect way. Instead of asking how much the respondent trusts
the ECB on a scale of 0 to 10, as in the previous experiment and as is common in the
literature, the question elicits the degree of agreement with six statements. These
statements capture different facets of central banking related to trust, including
the honesty of the ECB’s communication, the credibility of the inflation target,
the inclusiveness of monetary policy, and the adherence to the mandate. Thus,
the question measures trust without actually using the term trust and with more
granularity.2
A majority of respondents (62%) believe that the ECB’s inflation forecast will
undershoot actual inflation, reflecting a belief in the ECB’s optimism. While 19%
believe in the opposite deviation (i.e. pessimism), another 19% think that the ECB’s
forecast will be exactly right. Cross-sectional correlations again show that these be-
liefs reflect trust in the ECB. Using self-reported trust in the ECB from an earlier
question in the same wave of the survey, I show that those who believe the ECB
is optimistic report 0.24 standard deviations less trust in the ECB. However, be-
liefs about pessimism are not significantly correlated with self-reported trust in the
ECB. Thus, optimism in forecasts seems to be more dangerous for a central bank’s
reputation than pessimism.
Information treatments change public opinion about the ECB. Respondents who
receive information about the actual inflation forecast report higher trust in the
ECB, as measured by the average agreement with the six statements. However,
information about the current inflation rate does not significantly affect agreement
with statements and even neutralizes the positive effect of information about the
forecast.3 Regarding the subscales, the results show that the positive effect of information
on public opinion is strongest for the credibility of the inflation target.
Respondents report stronger agreement with the statement that the ECB will en-
sure price stability within three years. In addition to credibility, information also
improves the perceived honesty of the ECB’s communication, and better convinces
2 This approach also provides a way to harmonize the results in this literature, as it is not clear
what is reported in response to “trust in the central bank” questions. Different studies also use
different wording. While most studies ask respondents to indicate their trust in the central bank
on a scale of 0 to 10 (Christelis et al., 2020), some mention specific characteristics such as “trust
in the ability to achieve price stability” (Hoffmann et al., 2022) or trust in the central bank “to
care about the economic well-being of all (citizens)” (D’Acunto et al., 2021).
3 At the time of the experiment, inflation was already below the ECB’s one-year-ahead inflation
forecast of 6.3%. This information may have revealed significant forecast errors, albeit in the
opposite direction of respondents’ initial beliefs.
respondents of the benefits of the ECB’s policy for their household.
This study contributes to the literature that examines the effects of macroeconomic
information on households’ inflation expectations (Armantier et al., 2016;
Cavallo et al., 2017; Binder and Rodrigue, 2018; Coibion et al., 2022; Kostyshyna
and Petersen, 2024), consumption decisions (Roth and Wohlfart, 2020; Dräger et al.,
2022; Coibion et al., 2023), and attitudes toward the central bank (Bholat et al.,
2019; D’Acunto et al., 2021; Brouwer and de Haan, 2022; Dräger and Nghiem, 2023;
Ehrmann et al., 2023; Méon and Hayo, 2023; Ash et al., 2024). In summary, trust in
(the credibility of) the central bank turns out to be a very sticky variable in terms
of the response to information interventions. In a closely related study, McMahon
and Rholes (2024) conduct an online experiment and introduce exogenous variation
in the forecast accuracy of a hypothetical central bank. They find that forecast ac-
curacy systematically affects the credibility of a central bank. The results presented
here complement the findings of this literature by demonstrating the benefits of in-
forming the public about forecast accuracy and by highlighting the caveats of such
a communication campaign. To the best of my knowledge, this is the first study to
document beliefs about a central bank’s forecasts.
Second, my results contribute to the literature that investigates the implications
of trust in central banks for monetary policy. Using dynamic general equilibrium
models, the previous literature shows that trust in central banks matters for the
transmission of monetary policy through its influence on expectations and risk
attitudes (Bursian and Faia, 2018; Hommes and Lustenhouwer, 2019; Haldane et al.,
2020). Besides these models, several studies use household surveys to show the
causal effect of central bank trust on inflation expectations through instrumental
variable estimation (Mellina and Schmidt, 2018; Christelis et al., 2020). Using a
similar analysis, I find a very similar quantitative relationship between trust and
inflation expectations. Monetary authorities can use these measures to assess the
importance of trust for inflation expectations.
Finally, the results presented here are related to the growing literature on
interventions that correct misperceptions about economic facts. Recent evidence includes
misperceptions about outside options in the labor market (Jäger et al., 2022), public debt
(Roth et al., 2022), returns to active investment (Haaland and Naess, 2023), or the
gender wage gap (Settele, 2022). These studies typically show that large segments
of society are uninformed or misinformed. Information interventions show that cor-
recting these misperceptions on seemingly niche topics has a significant return in
terms of beliefs, decisions, and policy demand.
4.2 First Experiment
This section describes an information provision experiment that aims to generate
exogenous variation in information sets. The experiment is implemented in the
September 2022 wave of the Bundesbank Online Panel - Households (BOP-HH)
survey. This survey, which has been running monthly since 2019, elicits the per-
ceptions and expectations of about 2000-6000 households in Germany on variables
such as inflation, house prices, and income, as well as consumption plans, policy
preferences, and so on. Table D.1 in the appendix summarizes the demographic
profile of the sample.
4.2.1 Design and Implementation
Figure 4.1: Flow of the first experiment
[Flow chart: Q1 (prior inf); random assignment to T1 = Placebo, T2 = Forecast, T3 = T2 + Accuracy, or T4 = T3 + Q2 (perc inaccuracy); Q3 (post max, post min); Q4 (ecb trust).]
Notes: The graph shows the timeline of the first experiment. The blue boxes show the questions with their labels.
The green boxes show the information treatments with their labels.
Design
The experiment consists of four stages. These stages are shown in Figure 4.1. In
the first stage, inflation expectations in the euro area for the calendar year 2024 are
elicited by the following question:
”What do you think the rate of inflation or deflation in the euro area
will roughly be in 2024?”
If the respondent expects deflation, they enter a negative value. Answers to this
question are coded as prior inf.4 After this stage, respondents answer a series of
4 The original German texts, along with all other experimental material, can be found in
Appendix D.1.
questions about their income and house prices in the core of the survey. At the
end of these questions, respondents enter the second stage to receive information.
They are randomly assigned to one of four treatment conditions. These treatments
are referred to as T1, T2, T3, and T4. There is a higher probability of assignment
(30%) for T2 and T4 because they are the main treatments.
The first group (T1) serves as an active control group. Respondents are given
placebo information, namely the population growth rate in Germany (2% between
2010-21). This serves as a remedy for numerical anchoring bias, as the information
in the other groups is numerically similar.
The remaining groups receive a text with information about the ECB’s inflation
forecast. All start with an introductory text explaining the frequency of the fore-
casts and their relevance for the Governing Council’s decision-making process. In
addition, the groups receive treatment-specific information.
The second group (T2) is informed of the ECB’s inflation forecast for 2024, as
announced in the September 2022 press release, with the following text:
”This September, the ECB projected a decline in annual euro area
inflation to 2.3% by the end of 2024”.
The last two groups (T3 and T4) are also informed of the ECB’s inflation forecast,
but the same text includes the following additional information:
”The ECB’s projections for the euro area inflation rate deviated by
less than one percentage point on average from the actual inflation rates
in the period from 2001 (when the projections began) to 2021.”
So there is no difference between T3 and T4 in terms of the information provided.
The difference is in the implementation. Respondents in T4 are asked to indicate
their beliefs about past forecast accuracy immediately before being shown the true
answer, while respondents in T3 are only shown the information. The following
question is used to elicit beliefs about past forecast accuracy:
”By how much do you think the ECB’s projections deviated on av-
erage from the actual inflation rates in the period from 2001 (when the
projections began) to 2021? Please give your best estimate.”
The response options are four ranges: ”0-1 percentage point (pp)”, ”1-2 pp”, ”2-3
pp”, and ”3 pp or more”.5 Responses are coded as perc inaccuracy.
5 The choice of ranges is motivated by two factors. First, households typically round their
responses to whole numbers, especially when they are very uncertain (Binder, 2017). The closest
whole numbers to the correct answer are 0 and 1, hence ”0-1 pp”. Second, there are four ranges to
ensure that respondents do not heuristically choose the middle range.
Immediately after reading the treatment-specific information in the second stage,
respondents are again asked to form expectations about the euro area inflation rate
in 2024. The question asks
”What are the minimum and maximum values you think the inflation
rate or deflation rate could have in the euro area in 2024?”
The answers are coded as post min and post max and used to infer two key
moments. The difference post max − post min is used as a proxy for the variance of
expectations (i.e., inflation uncertainty) and is coded as inf unc. The midpoint of
post max and post min is used as a proxy for the point expectation and is coded as
post inf.6
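The construction of these two proxies can be sketched in a few lines of Python. This is an illustration only, not the code used for the analysis; the function name belief_moments and the underscore-separated variable names are assumptions made for the example.

```python
def belief_moments(post_min, post_max):
    """Return (post_inf, inf_unc) for one respondent's reported range.

    post_inf: midpoint of the range, a proxy for the point expectation.
    inf_unc:  width of the range, a proxy for inflation uncertainty.
    """
    post_inf = (post_max + post_min) / 2
    inf_unc = post_max - post_min
    return post_inf, inf_unc


# Hypothetical respondent who reports a range of 4% to 10%.
post_inf, inf_unc = belief_moments(post_min=4.0, post_max=10.0)
print(post_inf, inf_unc)  # 7.0 6.0
```

A respondent who reports identical minimum and maximum values would have inf_unc equal to zero; such respondents are excluded under the criteria described in the data selection.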
The final stage of the module elicits respondents’ trust in institutions on a scale
of 0-10, where 0 indicates ”no trust at all” and 10 indicates ”absolute trust”. I elicit
trust in five institutions. These are the ECB, the Federal Government, the Court
of Justice of the European Union (CJEU), the Bundesbank, and media enterprises
(presented in random order and coded name trust). Such a formulation reduces the
experimental demand effect through obfuscation. In addition, two of the institutions
(the CJEU and media enterprises) will serve as instruments in an instrumental
variable estimation (see section 4.2.2).
Procedures and Data Selection
A total of 5527 respondents participated in the September 2022 wave of the survey.
Invitations are sent between 15 and 29 September 2022, shortly after the release of the
ECB’s inflation forecasts on 8 and 9 September. Euro area inflation stood at 9.1%
in August 2022 and had not yet peaked.
Using pre-registered exclusion criteria (see AsPredicted #107388), I drop respon-
dents who chose not to answer (N=63) or chose the ”Do not know” option (N=300)
to at least one question in the module (except the perc inaccuracy question). I also
exclude respondents who expect the maximum inflation rate to be the same as the
minimum (N=198). Finally, 185 respondents who spent more than 2 hours in stages
(2)-(3) or less than 3 seconds in the information provision stage are excluded from
the sample to ensure data quality. This leaves 4863 respondents for the analysis.
Inflation expectations are further winsorized at the 5th/95th percentiles to reduce
the impact of outliers.7
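Winsorization at the 5th/95th percentiles replaces values beyond those cutoffs with the cutoffs themselves. The following is a minimal sketch using only the Python standard library. The percentile definition (linear interpolation, "inclusive" method) is an assumption, since the text does not state which definition is used, and the input values are hypothetical.

```python
import statistics


def winsorize(values, lower_pct=5, upper_pct=95):
    """Clamp values outside the given percentiles to the percentile cutoffs."""
    # With n=100, quantiles() returns the 1st..99th percentile cut points.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, lo), hi) for v in values]


# Hypothetical expectations: the integers 0, 1, ..., 100.
raw = list(range(101))
clean = winsorize(raw)
print(min(clean), max(clean))  # 5.0 95.0
```

Unlike trimming, this keeps every respondent in the sample and only limits the influence of extreme answers.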
6 Although the midpoint is at best a noisy proxy for the point expectation, this assumption has
no downside for treatment effect analyses as long as the measurement error is orthogonal to the
treatment assignment.
7 To illustrate, there are 65 respondents (1.3% of the final sample) who expect deflation or an
inflation rate above 40% pre-information.
4.2.2 Results
This section presents the results of the first experiment. Two predictions are pre-
registered:
1. Correcting misperceptions about the ECB’s forecast accuracy increases trust
in the ECB.
2. Trust mediates the impact of information on inflation expectations.
Appendix D.2 presents a simple model framework that can be used to organize
these predictions.
Perceived vs. Objective Forecast Accuracy
Figure 4.2: Beliefs about inaccuracy and the actual distribution of forecast errors
Notes: The bars show the distribution of forecast errors, as measured by the absolute difference between the ECB
staff forecasts (i.e. excluding the Eurosystem forecasts) and realizations. The numbers above the braces refer to the
proportion of respondents in the T4 group who believe that the average absolute forecast error is within the range
covered by the brace (N = 1440). The last brace covers >3. Data for forecasts cover the period from March 2001
to September 2021. Source: Author’s calculation based on Eurostat data.
The perc inaccuracy data from the fourth treatment can be used to document
the extent of misperception in the entire sample, since the question is asked before
treatment assignment. Figure 4.2 compares perc inaccuracy with actual data, as
shown by the distribution of absolute forecast errors over all forecasts. On the one
hand, only 13.5% of respondents think that the average absolute forecast error is
in the range ”0-1 pp”, while 77 of the 94 calculated absolute forecast errors (82%)
are actually in this range. On the other hand, at least 21% of the sample
believe that the average absolute forecast error is larger than the maximum error
made by the ECB over this period (2.3 pp). Taken together, these facts suggest that
German households significantly underestimate the ECB’s forecast accuracy.
Could it be that the responses to perc inaccuracy are due to misunderstanding or
noise? Three observations suggest that this is not the case. First, only 3 respondents
choose the ”do not know” option in response to this question, significantly fewer than
the usual number in the survey for the other questions. Second, the distribution of
perc inaccuracy is asymmetric, which is difficult to reconcile with random responses.
Third, the correlation between responses to perc inaccuracy and ecb trust is strong
even after controlling for demographic covariates including education (see Table D.2
in the Appendix). Taken together, confusion does not appear to be the main source
of variation in perc inaccuracy.
Treatment Effects on Inflation Expectations
How does correcting misperceptions with information affect the subjective distribu-
tion of inflation expectations? To measure the treatment effects on the revision of
point expectations, I run the following regression:
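$$\underbrace{\text{post inf}_i - \text{prior inf}_i}_{\text{revision}_i} = \beta_0 + \beta_1 \times (2\% - \text{prior inf}_i) + \sum_{j=2}^{4} \beta_j \times \text{treat}_{j,i} \times (2.3\% - \text{prior inf}_i) + e_i \tag{4.1}$$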
where the revision in inflation expectations is explained by the difference between
the signal (2% in T1, 2.3% in others) and the prior and its interaction with the
treatment. The regressors $\text{treat}_{j,i}$ are dummy variables that take the value one
if the observation is in treatment $j$. This specification follows from the Bayesian
updating framework, where the posterior belief is the weighted average of the prior
and the signal:
$$\text{post} = \omega \times \text{signal} + (1 - \omega) \times \text{prior}.$$
With respect to equation (4.1), $\omega_j = \beta_1 + \beta_j$. Thus, the parameters $\beta_j$ measure
treatment-specific learning rates, while $\beta_0$ and $\beta_1$ reflect mismeasurement due to
different question formats, experimenter demand, anchoring, etc. The main tests
are $H_0: \hat{\beta}_4 - \hat{\beta}_2 = 0$ and $H_0: \hat{\beta}_3 - \hat{\beta}_2 = 0$.
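To make the mapping from regression coefficients to learning rates concrete, the short Python sketch below applies the relation between the learning rate and the coefficients (the learning rate in treatment j is the sum of the coefficient on the placebo signal gap and the treatment-specific coefficient) to the point estimates reported in column (1) of Table 4.1. The helper bayesian_posterior and the example prior of 8% are hypothetical, and the calculation ignores the intercept and sampling uncertainty.

```python
# Point estimates from column (1) of Table 4.1.
beta_1 = 0.352
beta_j = {"T2": 0.219, "T3": 0.209, "T4": 0.253}

# Treatment-specific learning rate: omega_j = beta_1 + beta_j.
omega = {name: round(beta_1 + b, 3) for name, b in beta_j.items()}
print(omega)  # {'T2': 0.571, 'T3': 0.561, 'T4': 0.605}


def bayesian_posterior(prior, signal, weight):
    """Posterior belief as a weighted average of the signal and the prior."""
    return weight * signal + (1 - weight) * prior


# Hypothetical T4 respondent with a prior of 8% who sees the 2.3% forecast.
example_posterior = bayesian_posterior(prior=8.0, signal=2.3, weight=omega["T4"])
```

Under these assumptions, a weight above one half means that the posterior lies closer to the signal than to the prior.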
To test the effect of the treatments on inflation uncertainty, I estimate the fol-
lowing regression:
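$$\text{inf unc}_i = \alpha_1 + \sum_{j=2}^{4} \alpha_j \times \text{treat}_{j,i} + \varepsilon_i \tag{4.2}$$
where inflation uncertainty is explained by the treatment indicators. Similarly, the
main tests are $H_0: \hat{\alpha}_4 - \hat{\alpha}_2 = 0$ and $H_0: \hat{\alpha}_3 - \hat{\alpha}_2 = 0$.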
Table 4.1 reports the regression estimates. Three results emerge. First, respon-
dents in all treatment conditions learn from the information they are given and
express lower uncertainty relative to the control condition. Second, T4 works better
than T2 in terms of leading to more revision (t-stat= 1.74, p-value= 0.082) and
decreasing uncertainty across treatments (t-stat= −3.46, p-value= 0.001). Third,
T3 does not significantly affect either the revision of inflation expectations or infla-
tion uncertainty compared to T2. Thus, the forecast accuracy information is only
effective when the misperceptions are ”explicitly” revealed.8
Table 4.1: Treatment effects on learning and uncertainty
                          (1)               (2)
                          revision (β_j)    inf unc (α_j)
T2 (= Forecast)           0.219***          -1.152***
                          (0.019)           (0.148)
T3 (= T2 + Accuracy)      0.209***          -1.293***
                          (0.022)           (0.158)
T4 (= T3 + Question)      0.253***          -1.567***
                          (0.019)           (0.145)
β_1, α_1                  0.352***          6.418***
                          (0.017)           (0.119)
β_0                       1.706***          –
                          (0.067)           (–)
N                         4863              4863
R²                        0.35              0.03
Notes: Regression results based on equations (4.1) and (4.2) are reported in columns (1) and (2), respectively.
Robust standard errors are reported in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.
Information, Trust, and Inflation Expectations
In this section, I examine the effect of information treatments on trust in the ECB
and the quantitative relationship between trust and inflation expectations. Fig-
ure 4.3 illustrates the proposed causal chain between these three variables using a
directed acyclic graph, which amounts to a mediation framework.
8 These results are robust to the inclusion of demographic control variables and to
non-winsorization of the data (see Table D.4 in the Appendix).
Figure 4.3: Directed acyclic graph showing the causal mechanisms
[Diagram: T → M (ϕ1), M → Y (ϕ2), T → Y (ϕ3).]
Notes: T: information (treatment), M: ecb trust (mediator), and Y: post inf or inf unc (outcome). Solid lines
refer to the causal mechanisms of interest, the dotted line refers to the direct effect and the dashed line refers to the
possible presence of post-treatment confounders.
Causal identification of the parameters ϕ1 and ϕ2 shown in Figure 4.3 with the standard mediation
analysis of Baron and Kenny (1986) is problematic in the presence of confounders
and endogeneity (Bullock et al., 2010; Imai et al., 2011; Acharya et al., 2016).
Instrumental variables (IVs) can be used to identify the causal effects (Imai et al., 2011).
The following two-stage least squares (2SLS) estimation can be used to estimate ϕ1,
ϕ2, and ϕ3:
$$M_i = \zeta + \phi_1 T + \rho Z_i + XB + \epsilon_{i2}, \tag{4.3}$$
$$Y_i = \psi + \phi_3 T + \phi_2 \hat{M}_i + XB + \epsilon_{i1} \tag{4.4}$$
where M is ecb trust, Z refers to the instruments, and X is a vector of controls. The
treatment dummy T takes the value one if the observation is from either T3 or T4,
and zero if it is from T2. The outcome variables in the second stage (Y) are either
post inf or inf unc. I propose trust in two non-economic institutions as instruments
for trust in the ECB. These institutions are the CJEU and media enterprises. The
main identifying assumption is that trust in these institutions is related to trust in
the ECB, while they are exogenous to post inf and inf unc.9
9 In a similar analysis, Mellina and Schmidt (2018) use trust in three European institutions
including the CJEU and three German institutions as instruments. Christelis et al. (2020) use trust
in other people and the frequency of being cheated by a repair person in the past as instruments
for trust in the central bank.
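The logic of the two-stage procedure in equations (4.3) and (4.4) can be illustrated on simulated data. The sketch below is not the estimation code behind the reported results: all data-generating numbers are hypothetical, control variables are omitted, and the small ols helper is written out only to keep the example free of external libraries. An unobserved confounder u raises both the mediator and the outcome, so a naive regression of Y on M is biased, while the instrumented estimate recovers the true coefficient of -0.5.

```python
import random


def ols(X, y):
    """Return least-squares coefficients by solving the normal equations."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Simulated sample (hypothetical parameters): true phi_2 = -0.5, phi_3 = 0.
rng = random.Random(0)
T, Z1, Z2, M, Y = [], [], [], [], []
for _ in range(4000):
    u = rng.gauss(0, 1)                      # unobserved confounder
    t = float(rng.random() < 0.5)            # treatment dummy
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)  # instruments
    m = 0.05 * t + 0.6 * z1 + 0.2 * z2 + 0.5 * u + rng.gauss(0, 0.5)
    y = 4.0 - 0.5 * m + 1.0 * u + rng.gauss(0, 1)
    T.append(t); Z1.append(z1); Z2.append(z2); M.append(m); Y.append(y)

# First stage: regress the mediator on treatment and instruments.
X1 = [[1.0, t, a, b] for t, a, b in zip(T, Z1, Z2)]
gamma = ols(X1, M)
M_hat = [sum(g * x for g, x in zip(gamma, row)) for row in X1]

# Second stage: regress the outcome on treatment and the fitted mediator.
phi2_2sls = ols([[1.0, t, mh] for t, mh in zip(T, M_hat)], Y)[2]

# Naive OLS for comparison (biased by the confounder).
phi2_ols = ols([[1.0, t, m] for t, m in zip(T, M)], Y)[2]
print(round(phi2_2sls, 2), round(phi2_ols, 2))
```

Running the second stage by hand gives correct point estimates but not correct standard errors, so a dedicated instrumental-variables routine is preferable for inference.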
Table 4.2: Causal Mechanisms with 2SLS
                        (1)          (2)          (3)
Dep. var.:              ecb trust    post inf     inf unc
treatment (T)           0.049**      -0.035       -0.274***
                        (0.023)      (0.083)      (0.100)
ecb trust (M-hat)                    -0.556***    -0.294***
                                     (0.065)      (0.075)
cjeu trust (Z1)         0.572***
                        (0.014)
media trust (Z2)        0.197***
                        (0.015)
constant                0.111        4.034***     7.004***
                        (0.077)      (0.275)      (0.335)
N                       3886         3886         3886
adjusted R²             0.526        0.282        0.165
F-statistic             2147.27      87.037       50.866
J-statistic             -            0.575        0.341
p-value                 -            0.448        0.559
Notes: The table reports the results of 2SLS regressions described in equations (4.3) and (4.4). Trust-related
variables are standardized (mean zero, sd one). The p-value shows the results of the overidentification test of
all instruments (based on Hansen’s J-stat). The following control variables are included: age (discrete), female
(binary), university graduate (binary), personal income below 1500 Euro (binary), single-person household
(binary), born in the GDR before 1989 (binary), terms for the regions (binary), and prior inflation expectations
(continuous). Robust standard errors are reported in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.
Table 4.2 shows the results. I begin by assessing the quality of the instruments.
First, the first stage F-statistic of 2147 is well above the conventional threshold
used to detect weak instruments. Second, both instruments are positively correlated
with the mediator. Third, the test for overidentification of the instruments (the
Hansen J statistic) does not reject the hypothesis of joint validity of the instruments.
Overall, these diagnostics do not indicate a lack of validity of the instruments.
The results of the first stage show that exposing households to information about
forecast accuracy (as was done in T3 and T4) increases trust in the ECB ($\hat{\phi}_1 = 0.049$,
t-stat = 2.12, p-value = 0.034). The results of the second stage show that trust in the
ECB is negatively related to inflation expectations and uncertainty. A one standard
deviation increase in trust is associated with a 0.56 percentage point decrease in
post inf (95% CI [-0.68,-0.43]) and a 0.29 percentage point decrease in inf unc (95%
CI [-0.44,-0.15]). However, the treatment indicator is only significant for inf unc,
which indicates the absence of a direct effect on post inf.10
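As a plausibility check, the reported confidence intervals can be reproduced from the coefficients and robust standard errors in Table 4.2. The sketch below assumes a normal approximation with a critical value of 1.96. The text does not state how the intervals were computed, so this is an assumption, and the function name normal_ci is hypothetical.

```python
def normal_ci(coef, se, z=1.96):
    """Two-sided confidence interval under a normal approximation."""
    return coef - z * se, coef + z * se


# Second-stage coefficients on ecb trust and their standard errors (Table 4.2).
estimates = {"post_inf": (-0.556, 0.065), "inf_unc": (-0.294, 0.075)}
for name, (coef, se) in estimates.items():
    lo, hi = normal_ci(coef, se)
    print(f"{name}: [{lo:.2f}, {hi:.2f}]")
# post_inf: [-0.68, -0.43]
# inf_unc: [-0.44, -0.15]
```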
Persistence, Consumption, and Attention
A natural question to ask is whether the information interventions have lasting
effects on trust and translate into economic behavior. This section addresses such
questions using the panel dimension of the survey.11 In the month following the
experiment (October 2022), the following variables are measured: (i) trust in the
ECB, (ii) changes in attention to inflation news, and (iii) changes in consumption
plans. Trust in the ECB is measured as before on a scale from 0 to 10. Changes in
attention are measured by the following question:
”Has your interest in inflation developments changed in recent weeks?”
Respondents could indicate that they pay more, less or the same attention to infla-
tion. Consumption plans are measured by the following question:
”Are you likely to spend more or less on the following items over the
next twelve months than in the last twelve months?”
Respondents answer for nine different categories. I focus on consumption of durable
goods, such as major purchases (e.g., cars, furniture) and clothing, and on savings.
Table 4.3 shows the results of linear regressions. For ecb trust, the effects are
already attenuated after one month. For the other outcome variables, T4 shows
persistent effects. Respondents in this group are 2.5 percentage points less likely to
report having paid less attention to inflation news in recent weeks (t-stat = −1.75,
p-value = 0.081), 3.1 percentage points less likely to report an increase in spending
intentions for major purchases (t-stat = −1.69, p-value = 0.091), and 2.7 percentage
points less likely to report more spending on clothing and footwear (t-stat = −1.91,
p-value = 0.056). Relative to T2, information in T4 also reduces the probability of
reporting an increase in consumption for major purchases by 3.7 percentage points
(t-stat = −2.21, p-value = 0.027). For savings plans and
paying more attention to inflation news, none of the treatments have a significant
coefficient. In summary, information about accuracy has discernible effects on some
behaviors that persist for at least four weeks when delivered after a question, as was
done in T4.
10 These results are robust to excluding T3 from the sample and using only cjeu trust as the
instrument (see Tables D.5 and D.6 in the Appendix).
11 57% of respondents who participated in the experiment remain in the panel the following
month (October 2022). This number drops abruptly to 27% in November 2022. The scope of this
exercise is therefore limited to testing the effects that last four weeks.
Table 4.3: Treatment effects on trust, attention, and consumption one month later
                        (1)          (2)        (3)        (4)        (5)        (6)
                                     Attention             Spend more
Dep. var.               ecb trust    More       Less       Major      Clothing   Savings
T2 (= Forecast)         -0.093       0.025      -0.026*    0.006      -0.007     -0.003
                        (0.138)      (0.027)    (0.014)    (0.019)    (0.015)    (0.016)
T3 (= T2 + Accuracy)    -0.002       0.007      -0.002     -0.019     -0.005     -0.002
                        (0.144)      (0.029)    (0.017)    (0.020)    (0.016)    (0.018)
T4 (= T3 + Question)    -0.190       0.011      -0.025*    -0.031*    -0.027*    0.004
                        (0.137)      (0.027)    (0.014)    (0.018)    (0.014)    (0.016)
constant                4.275***     0.390***   0.107***   0.243***   0.171***   0.239***
                        (0.298)      (0.059)    (0.029)    (0.043)    (0.035)    (0.040)
N                       2772         2794       2794       2738       2738       2738
R²                      0.03         0.00       0.01       0.02       0.02       0.03
Notes: The table reports OLS regressions on six outcome variables measured in the October wave of the survey (one
month after the experiment). The following control variables are included: age (discrete), female (binary), university
graduate (binary), personal income below 1500 Euro (binary), single-person household (binary), born in the GDR
before 1989 (binary), and terms for regions (binary). Regressions where the dependent variable is consumption plan
also control for expected changes in income (continuous). Robust standard errors are reported in parentheses. *
p < 0.1, ** p < 0.05, *** p < 0.01.
4.3 Second Experiment
This section presents a second complementary experiment implemented in BOP-
HH Wave 45 (September 2023). There are two main design changes. The first is
that households’ quantitative beliefs about short-term inflation and beliefs about
the ECB’s short-term inflation forecast are elicited. From the answers to these
questions, it is possible to infer the expected forecast error and its direction. Second,
post-information trust in the ECB is measured indirectly via a questionnaire and
without using the term trust in any part of the question. This approach allows
for more granularity and provides information on the effects of such information
treatments.
4.3.1 Design and Implementation
Design
Figure 4.4: Flow of the second experiment
[Flow chart: Q1 (prior trust); Q2 (exp inf, perc forecast); random assignment to G1 = Control, G2 = Inflation, G3 = Forecast, or G4 = G2 + G3; Q3 (post trust).]
Notes: The graph shows the timeline of the second experiment. The blue boxes show the questions with their labels.
The green boxes show the information treatments with their labels.
The experiment consists of four stages. These stages are shown in Figure 4.4. In
the first stage, trust in the ECB is measured on a scale from 0 to 10, using the same
question as in stage (4) of the first experiment. This is coded as prior trust. In the
second stage, respondents are asked to indicate their expectation for inflation in the
euro area in the current calendar year and their belief about the ECB’s one-year-ahead
forecast for the same calendar year. The following question is used:
”What do you think the inflation rate in the euro area will be in 2023
overall, i.e. between December 2022 and December 2023? And what
inflation rate do you think the ECB forecasted in its projections for 2023
back in December 2022?”
The answers are coded as exp inflation and perc forecast.
After this question, respondents move to the third stage where they receive
information. They are randomly assigned to one of four groups. The first group
(G1) is a pure control group that receives no information.12 The group G2 receives
the most recent annual inflation rate as information, along with the following text:
”You will now be shown up-to-date information on the inflation rate
in the euro area. According to the latest statistics, the inflation rate in
the euro area between July 2022 and July 2023 was 5.3%.”
The group G3 receives the ECB’s one-year ahead inflation forecast for calendar year
2023 with the following text:
”You will now be shown up-to-date information on the inflation rate
in the euro area. In December 2022, the ECB forecasted that the inflation
rate in the euro area would be 6.3% by December 2023.”
12 Because the information and the beliefs elicited after the information are on a different scale,
there is no need to account for the numerical anchoring bias.
The group G4 receives both sets of information.
After receiving the information, respondents indicate their level of agreement
(on a scale of 1-7) with six statements. These statements are shown in Table 4.4.
They are designed to capture different facets of trust in the ECB.13 The average
agreement across items is used as a measure of trust in the ECB and is coded as
post trust.
Table 4.4: Statements used to indirectly measure trust in the ECB
No    Statement                                                                           Label
(a)   The ECB will ensure price stability in the euro area over the next three years.     credibility
(b)   The ECB looks after the economic well-being of everyone in the euro area.           inclusivity
(c)   The ECB acts within the limits of its remit.                                        legitimacy
(d)   The ECB communicates with the public in a transparent and honest manner.            honesty
(e)   The ECB has sufficient expertise to understand general economic developments.       expertise
(f)   The ECB makes decisions that benefit people like me.                                interest
Notes: Presented in random order. Scale of answers is 1 (completely disagree) to 7 (completely agree).
Procedures and Data Selection
In total, 3999 respondents participated in the September 2023 wave of the BOP-HH.
Using pre-registered exclusion criteria (see AsPredicted #145978), I drop respondents
who stated either ”No answer” or ”Don’t know” to one of the three questions before
treatments. Respondents who do not know the ECB or who expect inflation (deflation)
to be above 25% or below -5% are also excluded. This leaves 3643 respondents
for the analysis and amounts to an exclusion rate of less than 10%. Demographic
variables do not vary much across the initial and the final samples (see Table D.7).
Exclusion is also balanced across treatments.
4.3.2 Results
This section presents the results of the second experiment. Three directional pre-
dictions are pre-registered:
1. Most households believe that the ECB forecast will undershoot actual inflation.
2. This belief (of optimism) is associated with lower trust in the ECB.
13 These items are drawn and adapted from a variety of studies in the literature. For example,
statement (a) is used by Ehrmann et al. (2023) as a measure of the ECB’s credibility. A combination
of statements (b) and (f) is used by D’Acunto et al. (2021) to infer trust in the Fed. Similarly, Kril
et al. (2016) use a 17-item questionnaire to measure trust in and credibility of the Bank of Israel.
3. Information interventions have a positive impact on trust in the ECB.
Expected Inflation and Perceived Forecast
Figure 4.5: Expected inflation and perceived inflation forecast
[Scatter plot. x-axis: Euro area inflation expectations for 2023 (as of September 2023), 0 to 30. y-axis: Perceived one-year ahead ECB forecast for 2023, 0 to 30.]
Notes: Circle size indicates frequency of observations. The solid line has a slope of 45 degrees. Two observations
where perc forecast was greater than 30% are removed from the graph for visual illustration (N = 3641).
Figure 4.5 shows respondents’ one-quarter ahead inflation expectations and their
beliefs about the ECB’s one-year ahead inflation forecast. A majority of respondents
believe that the ECB’s forecast will be below actual inflation (N= 2255, 62%). This
reflects the belief that the ECB’s inflation forecast is undershooting and therefore too
optimistic.14 While some believe that the ECB will overshoot inflation (N= 689,
19%), there are also many who believe that the ECB is right on target (N= 699,
19%). Among those who think the ECB will undershoot, 82% perceive the forecast
to be below what it actually is (perc forecast <6.3%) and 78% expect inflation to
be higher than the current rate (exp inflation >5.3%).
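The three-way classification used in this paragraph follows directly from comparing the two elicited numbers. A minimal Python sketch follows. The label ecb_exact, the function name, and the example respondents are hypothetical, while the counts used for the shares are those reported in the text.

```python
from collections import Counter


def forecast_belief(exp_inflation, perc_forecast):
    """Classify the perceived direction of the ECB's forecast error."""
    if perc_forecast < exp_inflation:
        return "ecb_optimistic"   # forecast believed to undershoot inflation
    if perc_forecast > exp_inflation:
        return "ecb_pessimistic"  # forecast believed to overshoot inflation
    return "ecb_exact"            # forecast believed to be exactly right


# Hypothetical respondents: (exp_inflation, perc_forecast) in percent.
sample = [(8.0, 5.0), (6.0, 6.0), (5.0, 7.0), (10.0, 4.0)]
counts = Counter(forecast_belief(e, p) for e, p in sample)

# Shares implied by the counts reported in the text (N = 3643).
reported = {"ecb_optimistic": 2255, "ecb_pessimistic": 689, "ecb_exact": 699}
shares = {k: round(100 * v / 3643) for k, v in reported.items()}
print(shares)  # {'ecb_optimistic': 62, 'ecb_pessimistic': 19, 'ecb_exact': 19}
```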
14 Rather than optimism, this difference may reflect the effect of having more information, as
households make their forecast at a later date than the ECB. In Figure D.1 in Appendix D.3, the
x-axis variable is replaced by 12-month ahead inflation expectations for Germany in January 2023
in order to equalize the forecasting horizons. The results show a similar share of respondents who
believe that the ECB will undershoot.
Table 4.5: Perceived direction of the ECB’s forecast error and trust in the ECB
                        (1)          (2)
Dep. var.:              prior trust
ecb optimistic          -0.243***    -0.236***
                        (0.043)      (0.042)
ecb pessimistic         0.095*       0.082
                        (0.052)      (0.051)
constant                0.133***     0.044
                        (0.038)      (0.087)
Controls included       No           Yes
N                       3643         3643
R²                      0.02         0.06
Notes: The dependent variable (prior trust) is standardized to have a mean of zero and a standard
deviation of one. The following control variables are included in regression (2): age (discrete), female
(binary), university graduate (binary), personal income below 1500 Euro (binary), single-person household
(binary), born in the GDR before 1989 (binary), and terms for regions (binary). Robust standard errors
are reported in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.
Could respondents’ beliefs about the direction of the forecast error be an in-
dicator of (lack of) trust? Correlational evidence supports this insight. Table 4.5
shows the results of linear regressions in which prior trust is explained by dummy
variables that take the value one if a respondent believes that the ECB will under-
shoot (ecb optimistic) or overshoot (ecb pessimistic) inflation. The reference group
believes that the ECB is exactly right. The results show an asymmetric relationship.
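While beliefs that the central bank will undershoot inflation are strongly negatively related
to self-reported trust in the ECB, beliefs that it will overshoot inflation are at most weakly
positively related to trust (significant only at the 10% level, and only without controls). Thus,
the perception of optimistic forecasts hurts the ECB more.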
Effects of Information Treatments
How do the information treatments affect public opinion about the ECB? Table 4.6
shows the results of linear regressions explaining either the average agreement across
six statements (i.e., post trust) or individual responses to each of the six statements
with treatment dummies.15 A positive coefficient reflects an improvement in the
public’s perception of the institution. All regressions control for prior trust and the
initial deviations of perc forecast and exp inflation from 6.3% and 5.3%, respectively.
Table 4.6: Effects of information treatments on the public’s perception of the ECB

                    (1)         (2)          (3)          (4)         (5)       (6)        (7)
Dep. var.:          post trust  credibility  inclusivity  legitimacy  honesty   expertise  interest
G2 (= Inflation)    0.041       0.045        0.028        0.028       0.052     0.020      0.047
                    (0.030)     (0.036)      (0.035)      (0.036)     (0.035)   (0.036)    (0.035)
G3 (= Forecast)     0.081***    0.106***     0.065*       0.048       0.074**   0.049      0.072**
                    (0.030)     (0.037)      (0.035)      (0.036)     (0.035)   (0.037)    (0.035)
G4 (= G2 + G3)      0.035       0.060*       0.046        0.001       0.046     -0.003     0.039
                    (0.030)     (0.036)      (0.035)      (0.036)     (0.035)   (0.036)    (0.035)
constant            -0.010      -0.009       -0.011       -0.001      -0.014    -0.007     -0.012
                    (0.023)     (0.028)      (0.026)      (0.028)     (0.027)   (0.028)    (0.027)
N                   3629        3629         3629         3629        3629      3629       3629
R²                  0.59        0.41         0.44         0.40        0.43      0.41       0.46
Notes: Results of linear regressions in which agreement on six statements is explained by treatment dummies. All regressions control
for prior trust and the initial deviations of perc forecast and exp inflation from 6.3% and 5.3%, respectively. The dependent variables
are standardized to have a mean of zero and a standard deviation of one. Robust standard errors are reported in parentheses. * p < 0.1,
** p < 0.05, *** p < 0.01.
The results show that informing respondents about the ECB’s inflation forecast,
as in the G3 group, has a positive effect on trust in the ECB. The breakdown by
statement shows a similar result. The main effect of the information in G3 is on the
credibility of the ECB’s inflation target (t-stat = 2.87, p-value = 0.004).
Treated respondents are also more likely to agree with the statement about the
ECB’s honesty (t-stat = 2.09, p-value = 0.037), to perceive more personal benefits
from monetary policy (t-stat = 2.04, p-value = 0.042), and to view the ECB’s policy
as more inclusive in the euro area (t-stat = 1.85, p-value = 0.065). In contrast, current
inflation information does not significantly affect public opinion in any category.
Interestingly, adding current inflation information to inflation forecast information,
as in G4, neutralizes the positive effect of the latter.
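The reported p-values can be cross-checked against the reported t-statistics. The following is a minimal sketch, assuming a standard normal approximation to the t-distribution (reasonable given N = 3629); the function name and the dictionary layout are illustrative and not part of the original analysis. Small differences from the reported values are to be expected because the t-statistics are rounded to two decimals.

```python
import math


def two_sided_p_value(t_stat: float) -> float:
    """Two-sided p-value under a standard normal approximation.

    Uses the identity 2 * (1 - Phi(|t|)) = erfc(|t| / sqrt(2)).
    """
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


# (t-statistic, reported p-value) for the G3 effects quoted in the text,
# keyed by the column names of Table 4.6.
reported = {
    "credibility": (2.87, 0.004),
    "honesty": (2.09, 0.037),
    "interest": (2.04, 0.042),
    "inclusivity": (1.85, 0.065),
}

for statement, (t_stat, p_reported) in reported.items():
    p_approx = two_sided_p_value(t_stat)
    print(f"{statement:12s} t = {t_stat:.2f}  "
          f"approx. p = {p_approx:.3f}  reported p = {p_reported:.3f}")
```

The approximate and reported p-values agree up to rounding, which supports reading the significance stars in Table 4.6 as two-sided tests.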
4.4 Conclusion
Using two surveys of German households conducted over two years, this article
shows that public perceptions of the ECB’s forecast errors matter for monetary
15 Fourteen respondents who chose not to answer at least one of the statements are excluded from this
analysis.
policy. The first survey documents that households significantly underestimate the
ECB’s forecast accuracy. The second survey shows that this underestimation reflects
beliefs about overly optimistic forecasts. In both cases, information that challenges
these perceptions has the intended effects on inflation expectations, uncertainty,
consumption plans, and trust in the central bank.
The policy implications of these results are subject to several caveats. First, the
first experiment demonstrates the importance of correcting misperceptions in a salient
way, as one of the treatments did by asking a question before presenting the
belief-challenging information. Second, these experiments were conducted under special
circumstances, when inflation was high and volatile, and the results may not replicate under
the opposite conditions of low and stable inflation. Third, information may have
countervailing effects when it is delivered in a bundle, as in one of the treatments
of the second experiment. Information experiments provide an inexpensive way
to study these caveats before implementing communication policies for the general
population.
Appendix D
Appendix
D.1 Experimental Materials
All the material used in the experiment (e.g., questions, treatment texts) can be
found on the Bundesbank’s website in both German and English:
https://www.bundesbank.de/en/bundesbank/research/survey-on-consumer-expectations
Waves 33 and 45 are relevant for the treatment texts and questions.
D.2 A Model of Belief Updating with Trust
This section presents a model of Bayesian belief updating that can be used to gen-
erate predictions in the first experiment. In the model, the quality of the so-called
primary signal (i.e., information about the state) is unknown, but can be partially
inferred from a secondary signal (i.e., information about the quality of the primary
signal). I conceptualize the framework of the model based on the expectation for-
mation process of a household, where the central bank’s inflation forecast is the
signal.
D.2.1 Model
Household $i$ has the prior belief that future inflation $\pi$ is normally distributed with
mean $\pi_0(i)$ and variance $\sigma_0^2(i)$. The central bank publishes its inflation forecast
$\pi_f$, an unbiased but noisy signal of future inflation distributed as $N(\pi, \sigma_f^2)$. The
variance $\sigma_f^2$ is referred to as the forecast uncertainty. For convenience, $\sigma_0^2$ and $\sigma_f^2$ are
assumed to be orthogonal.
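Using Bayes rule, I can express the posterior of the household for inflation with
\[
\pi_1(i) = \pi_0(i) + \alpha(i)\,\bigl(\pi_f - \pi_0(i)\bigr), \tag{D.1}
\]
where $\alpha(i)$ is the weight household $i$ gives to the inflation forecast. It equals
\[
\alpha(i) =
\begin{cases}
\dfrac{\sigma_0^2(i)}{\sigma_0^2(i) + \sigma_f^2}, & \text{if } \{\pi_f, \sigma_f^2\} \in \Omega(i) \\[2ex]
0, & \text{otherwise}
\end{cases}
\tag{D.2}
\]
where $\Omega(i)$ is the information set of the household $i$. The variance of the posterior
(hereafter referred to as the posterior inflation uncertainty) equals
\[
\sigma_1^2(i) =
\begin{cases}
\dfrac{\sigma_0^2(i)\,\sigma_f^2}{\sigma_0^2(i) + \sigma_f^2}, & \text{if } \{\pi_f, \sigma_f^2\} \in \Omega(i) \\[2ex]
\sigma_0^2(i), & \text{otherwise.}
\end{cases}
\tag{D.3}
\]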
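The updating rule in (D.1) to (D.3) can be illustrated numerically. The following is a minimal sketch, not the author's code: the function names and the prior values (mean 8.0, variance 4.0, forecast variance 1.0) are hypothetical, while the forecast value of 6.3 is the ECB forecast referred to in Chapter 4.

```python
def forecast_weight(prior_var: float, forecast_var: float,
                    informed: bool = True) -> float:
    """Weight alpha(i) on the forecast, following (D.2).

    `informed` stands for the condition {pi_f, sigma_f^2} in Omega(i).
    """
    if not informed:
        return 0.0
    return prior_var / (prior_var + forecast_var)


def update_belief(prior_mean: float, prior_var: float,
                  forecast: float, forecast_var: float,
                  informed: bool = True) -> tuple:
    """Return the posterior mean (D.1) and posterior variance (D.3)."""
    alpha = forecast_weight(prior_var, forecast_var, informed)
    post_mean = prior_mean + alpha * (forecast - prior_mean)
    if informed:
        post_var = prior_var * forecast_var / (prior_var + forecast_var)
    else:
        post_var = prior_var
    return post_mean, post_var


# Hypothetical household: expects 8% inflation with variance 4.
alpha = forecast_weight(prior_var=4.0, forecast_var=1.0)
post_mean, post_var = update_belief(prior_mean=8.0, prior_var=4.0,
                                    forecast=6.3, forecast_var=1.0)
print(f"alpha = {alpha:.2f}, posterior mean = {post_mean:.2f}, "
      f"posterior variance = {post_var:.2f}")
```

In this example the household puts a weight of 0.8 on the forecast, so its expectation moves most of the way from 8.0 toward 6.3 and its uncertainty falls below both the prior and the forecast variance.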
The setup shown so far is a standard Bayesian updating model. From now on, I
assume that the household does not know the forecast uncertainty, i.e., $\sigma_f^2$ is not
in $\Omega(i)$. The household has the prior that the forecast uncertainty is normally distributed with
mean $\sigma_{f,0}^2(i)$. To convey the quality of the inflation forecast $\pi_f$, the central bank
can publish its past forecast performance as a secondary signal. Statistically, this
corresponds to the variance of past forecast errors $\sigma_h^2$, where the subscript $h$ refers
to history. Past forecast errors are unbiased and normally distributed. Using this
setup, one can express the posterior about forecast uncertainty as
\[
\sigma_{f,1}^2(i) =
\begin{cases}
\sigma_{f,0}^2(i) + \omega(i)\,\bigl(\sigma_h^2 - \sigma_{f,0}^2(i)\bigr), & \text{if } \{\pi_f, \sigma_h^2\} \in \Omega(i) \\[1ex]
\sigma_{f,0}^2(i), & \text{otherwise}
\end{cases}
\tag{D.4}
\]
where $\omega(i)$ is the weight $i$ assigns to the past performance.
Trust is introduced into the model in a reduced-form manner through the household’s
belief about forecast uncertainty $\sigma_{f,0}^2(i)$ (i.e., perceived performance). This belief
can potentially deviate from the objective performance $\sigma_h^2$, as in
\[
\kappa_0(i) \equiv \frac{\sigma_{f,0}^2(i)}{\sigma_h^2} = \frac{g(\tau_0(i))}{\sigma_h^2} \tag{D.5}
\]
where $\kappa_0(i)$ is a coefficient that reflects the magnitude and direction of these deviations
prior to the central bank communication. If $\kappa_0(i)$ is greater (smaller) than
1, then $i$ underestimates (overestimates) the central bank’s forecast performance.
Trust in the central bank $\tau_0(i) \in [0,1]$ enters the framework through its relation to
$\kappa_0(i)$. It is assumed that $g' < 0$, so that trust and underestimation are negatively
related, ceteris paribus.
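The secondary-signal step in (D.4) and the misperception coefficient in (D.5) can be sketched in the same way. This is an illustration under stated assumptions, not the author's implementation: the function names, the numerical values, and the post-communication coefficient `kappa_1` (defined here by analogy with (D.5), using the posterior perceived uncertainty) are all hypothetical.

```python
def update_forecast_uncertainty(perceived_var: float, historical_var: float,
                                omega: float, informed: bool = True) -> float:
    """Posterior belief about forecast uncertainty, following (D.4).

    `informed` stands for the condition {pi_f, sigma_h^2} in Omega(i).
    """
    if not informed:
        return perceived_var
    return perceived_var + omega * (historical_var - perceived_var)


def misperception_ratio(perceived_var: float, historical_var: float) -> float:
    """Ratio of perceived to historical forecast-error variance, as in (D.5)."""
    return perceived_var / historical_var


# Hypothetical household that believes forecast errors are four times
# as variable as they have been historically.
historical_var = 1.0
prior_perceived_var = 4.0
omega = 0.5  # assumed weight on the communicated past performance

kappa_0 = misperception_ratio(prior_perceived_var, historical_var)
posterior_perceived_var = update_forecast_uncertainty(
    prior_perceived_var, historical_var, omega)
kappa_1 = misperception_ratio(posterior_perceived_var, historical_var)
print(kappa_0, kappa_1)  # prints 4.0 2.5
```

With any weight between zero and one, the communicated history pulls the perceived uncertainty toward its historical value, so an initial underestimation of forecast performance (a ratio above one) shrinks but does not vanish.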
D.2.2 Hypotheses
The model makes a number of predictions. The following predictions require that the
public initially underestimates the central bank’s performance ($\kappa_0 > 1$). Let us call
the central bank’s decision to communicate past forecast performance “correcting
misperceptions”.
Hypothesis 1: Correcting misperceptions increases the weight that households
place on the inflation forecast (i.e., $\alpha(i)$).
Hypothesis 2: Correcting misperceptions reduces inflation uncertainty (i.e.,
$\sigma_1^2(i)$).
These hypotheses follow from assuming that information reduces underestimation,
i.e., $\kappa_1 < \kappa_0$. Thus, the household notices that the signal is of higher quality.
These simple predictions hold even in the absence of the concept of trust (i.e.,
$\tau_0 \perp \sigma_{f,0}^2$).
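The comparative statics behind Hypotheses 1 and 2 can be sketched as follows. This derivation is illustrative and not part of the original appendix: it assumes that the household replaces $\sigma_f^2$ in (D.2) and (D.3) with its perceived value $\kappa(i)\,\sigma_h^2$, and that $\kappa_1(i)$ is the post-communication analogue of $\kappa_0(i)$ in (D.5).

```latex
\begin{align*}
\alpha(i) &= \frac{\sigma_0^2(i)}{\sigma_0^2(i) + \kappa(i)\,\sigma_h^2},
&
\frac{\partial \alpha(i)}{\partial \kappa(i)}
  &= -\frac{\sigma_0^2(i)\,\sigma_h^2}
           {\bigl(\sigma_0^2(i) + \kappa(i)\,\sigma_h^2\bigr)^2} < 0, \\[1ex]
\sigma_1^2(i) &= \frac{\sigma_0^2(i)\,\kappa(i)\,\sigma_h^2}
                      {\sigma_0^2(i) + \kappa(i)\,\sigma_h^2},
&
\frac{\partial \sigma_1^2(i)}{\partial \kappa(i)}
  &= \frac{\bigl(\sigma_0^2(i)\bigr)^2\,\sigma_h^2}
          {\bigl(\sigma_0^2(i) + \kappa(i)\,\sigma_h^2\bigr)^2} > 0.
\end{align*}
```

Under these assumptions, a fall from $\kappa_0(i)$ to $\kappa_1(i)$ raises the weight $\alpha(i)$ and lowers the posterior inflation uncertainty $\sigma_1^2(i)$, which is the content of Hypotheses 1 and 2.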
Hypothesis 3: Correcting misperceptions increases trust in the central bank
(i.e., $\tau_1 > \tau_0$).
This hypothesis is not based on the model, but finds its justification in the literature.
Information interventions on factual misperceptions typically change people’s
attitudes towards the misperceived topic.
Hypothesis 4: Trust in the central bank mediates the effects of correcting
misperceptions on inflation expectations.
The justification for this hypothesis is as follows. First, the third hypothesis
must be corroborated (i.e., $\tau_1 > \tau_0$). Higher trust leads to a weaker underestimation
by the public due to the assumption of $g' < 0$. This leads to the same effects as in the
first two hypotheses. Thus, there is a dual effect of communication policy to correct
misperceptions when it also influences trust.
D.3 Supporting Material
D.3.1 First Experiment
Table D.1: Demographic profile
Pre-exclusion Post-exclusion Refreshers
Age 57.99 58.16 54.93
(15.10) (15.03) (15.27)
Female 0.410 0.387 0.456
(0.492) (0.487) (0.499)
University graduate 0.499 0.523 0.473
(0.500) (0.500) (0.500)
Personal income <1500 euros 0.291 0.280 0.277
(0.454) (0.449) (0.448)
Single-person household 0.257 0.254 0.260
(0.437) (0.436) (0.439)
Born in the GDR before 1990 0.155 0.153 0.181
(0.362) (0.360) (0.385)
Northern Germany 0.168 0.170 0.150
(0.374) (0.376) (0.357)
Western Germany 0.258 0.258 0.231
(0.438) (0.438) (0.422)
Southern Germany 0.405 0.405 0.410
(0.491) (0.491) (0.492)
Eastern Germany 0.168 0.167 0.210
(0.374) (0.373) (0.407)
N 5527 4863 520
Notes: Descriptives for demographic variables are reported for the full and restricted (post-exclusion)
samples. Standard deviations are reported in parentheses below means.
Table D.2: Ordered logistic regression on the ECB per-
ception
(1) (2)
Dep. var.: perc accuracy
ecb trust -0.109∗∗∗ -0.109∗∗∗
(0.023) (0.031)
Age 0.010∗∗∗ 0.010∗
(0.004) (0.005)
Female 0.434∗∗∗ 0.381∗∗
(0.120) (0.185)
University graduate -0.323∗∗∗ -
(0.118) (.)
Personal income <1500 euros 0.294 0.247
(0.304) (0.394)
Single-person household -0.385 -0.420
(0.321) (0.429)
Born in the GDR before 1990 0.362 0.269
(0.300) (0.406)
Northern Germany -0.057 -0.063
(0.309) (0.415)
Western Germany -0.121 -0.209
(0.285) (0.384)
Southern Germany -0.094 -0.008
(0.275) (0.371)
N 1022 517
Pseudo R² 0.0237 0.0186
Notes: The dependent variable perc inaccuracy quantifies the per-
ceived value of average forecast errors. Trust in the ECB is measured
post-treatment. Column (2) restricts the sample to university graduates.
Robust standard errors are reported in parentheses. * p < 0.1,
** p < 0.05, *** p < 0.01.
Table D.3: Demographic characteristics of respondents by the level of
misperception
perc accuracy 0-1 pp. 1-2 pp. 2-3 pp. ≥3 pp.
Age 56.17 58.21 59.58 61.66
(15.10) (15.22) (14.89) (13.55)
Female 0.297 0.370 0.433 0.436
(0.458) (0.483) (0.496) (0.497)
University degree 0.626 0.552 0.421 0.475
(0.485) (0.498) (0.494) (0.500)
Personal income <1500 euros 0.297 0.251 0.277 0.304
(0.458) (0.434) (0.448) (0.461)
Single-person household 0.277 0.228 0.247 0.264
(0.449) (0.420) (0.432) (0.442)
Born in the GDR before 1989 0.128 0.111 0.165 0.178
(0.335) (0.314) (0.371) (0.383)
Northern Germany 0.221 0.137 0.159 0.185
(0.416) (0.344) (0.366) (0.389)
Western Germany 0.200 0.309 0.216 0.231
(0.401) (0.463) (0.412) (0.422)
Southern Germany 0.415 0.414 0.439 0.389
(0.494) (0.493) (0.497) (0.488)
Eastern Germany 0.164 0.140 0.186 0.195
(0.371) (0.347) (0.390) (0.397)
N 195 614 328 303
Notes: The table reports the mean demographic characteristic of respondents by response
categories to the ECB perception question. Standard deviations are reported in parentheses
below means.
Table D.4: Robustness regressions for reduced-form treatment effects: With controls and without winsorization

                     (1)         (2)         (3)         (4)
                     revision    revision    inf unc     inf unc
β2                   0.266***    0.217***
                     (0.067)     (0.019)
β3                   0.166**     0.208***
                     (0.083)     (0.022)
β4                   0.292***    0.250***
                     (0.076)     (0.019)
α2                                           -1.170***   -1.081***
                                             (0.369)     (0.141)
α3                                           -1.519***   -1.241***
                                             (0.354)     (0.151)
α4                                           -2.023***   -1.447***
                                             (0.325)     (0.137)
β1                   0.390***    0.344***
                     (0.068)     (0.017)
β0 or α1             2.201***    2.753***    7.378***    9.899***
                     (0.223)     (0.234)     (0.255)     (0.320)
Winsorized           No          Yes         No          Yes
Controls included    No          Yes         No          Yes
N                    4863        4863        4863        4863
R²                   0.40        0.36        0.01        0.11
Notes: The table reports regression estimates from equations (4.1) and (4.2).
Columns (1) and (3) use non-winsorized data, while columns (2) and (4) include
control variables. The following control variables are included: age (discrete),
female (binary), university graduate (binary), personal income below 1500 Euro
(binary), single-person household (binary), born in the GDR before 1989 (binary),
and terms for the regions (binary). Robust standard errors are reported in paren-
theses. * p < 0.1, ** p < 0.05, *** p < 0.01.
Table D.5: Robustness regressions for the causal mechanisms: Only T4

                              (1)          (2)          (3)
Dep. var.:                    ecb trust    post inf     inf unc
treatment (T)                 0.051**      -0.113       -0.390***
                              (0.026)      (0.093)      (0.110)
ecb trust ($\widehat{M}$)                  -0.596***    -0.321***
                                           (0.073)      (0.085)
cjeu trust (Z1)               0.567***
                              (0.016)
media trust (Z2)              0.208***
                              (0.017)
constant                      0.142        4.141***     6.867***
                              (0.087)      (0.316)      (0.375)
N                             2905         2905         2905
adjusted R²                   0.534        0.283        0.169
F-stat                        1631.93      67.938       41.180
J-stat                        –            0.246        0.059
p-value                       –            0.620        0.808
Notes: The table reports the results of 2SLS regressions where
the treatment indicator takes a value of one only for T4. Trust-
related variables are standardized. The following control variables
are included: age (discrete), female (binary), university graduate
(binary), personal income below 1500 Euro (binary), single-person
household (binary), born in the GDR before 1989 (binary), terms
for the regions (binary), and prior inflation expectations (continu-
ous). Robust standard errors are reported in parentheses. * p < 0.1,
** p < 0.05, *** p < 0.01.
Table D.6: Robustness regressions for the causal mechanisms: Only cjeu trust as instrument

                              (1)          (2)          (3)
Dep. var.:                    ecb trust    post inf     inf unc
treatment (T)                 0.050**      -0.035       -0.274***
                              (0.024)      (0.083)      (0.100)
ecb trust ($\widehat{M}$)                  -0.544***    -0.283***
                                           (0.067)      (0.077)
cjeu trust (Z1)               0.670***
                              (0.011)
constant                      0.168**      4.028***     6.999***
                              (0.079)      (0.275)      (0.336)
N                             3886         3886         3886
adjusted R²                   0.498        0.283        0.165
F-statistic                   3586.79      86.002       50.644
Notes: The table reports the results of 2SLS regressions with a
single instrument (cjeu trust). Trust-related variables are stan-
dardized. The following control variables are included: age (dis-
crete), female (binary), university graduate (binary), personal
income below 1500 Euro (binary), single-person household (bi-
nary), born in the GDR before 1989 (binary), terms for the re-
gions (binary), and prior inflation expectations (continuous). Ro-
bust standard errors are reported in parentheses. * p < 0.1, **
p < 0.05, *** p < 0.01.
D.3.2 Second Experiment
Table D.7: Demographic profile
Pre-exclusion Post-exclusion Refreshers
Age 56.87 56.86 49.94
(15.92) (15.84) (17.04)
Female 0.410 0.387 0.454
(0.492) (0.487) (0.498)
University graduate 0.506 0.518 0.550
(0.500) (0.500) (0.498)
Personal income <1500 euros 0.301 0.290 0.285
(0.459) (0.454) (0.452)
Single-person household 0.266 0.263 0.247
(0.442) (0.440) (0.432)
Born in the GDR before 1990 0.152 0.151 0.155
(0.359) (0.358) (0.363)
Northern Germany 0.169 0.164 0.165
(0.374) (0.370) (0.372)
Western Germany 0.260 0.261 0.235
(0.439) (0.439) (0.424)
Southern Germany 0.406 0.410 0.420
(0.491) (0.492) (0.494)
Eastern Germany 0.166 0.165 0.179
(0.372) (0.371) (0.384)
N 3999 3643 502
Notes: Descriptives for demographic variables are reported for the full and restricted (post-exclusion)
samples. Standard deviations are reported in parentheses below means.
Table D.8: Correlation matrix for different facets of public trust in the ECB
prior trust tCredibility inclusivity legitimacy honesty expertise self interest
prior trust 1.0000
tCredibility 0.6176 1.0000
inclusivity 0.6705 0.6563 1.0000
legitimacy 0.6405 0.5798 0.5998 1.0000
honesty 0.6505 0.6234 0.6975 0.6278 1.0000
expertise 0.6222 0.5699 0.6035 0.6439 0.6017 1.0000
self interest 0.6673 0.6460 0.7099 0.5800 0.6933 0.5793 1.0000
Notes: The table reports correlations only for the control group G1.
Figure D.1: Expected inflation (for Germany) and perceived inflation forecast
[Scatter plot. x-axis: Inflation expectations for Germany (as of January 2023), 0 to 30; y-axis: Perceived one-year ahead ECB forecast for 2023, 0 to 30.]
Notes: Circle size indicates frequency of observations. The solid line has a slope of 45 degrees. Five observations
where perc forecast or inflation expectations were greater than 30% are removed from the graph for visual illustration
(N= 744).
Chapter 5
Conclusion
The chapters of this dissertation aim at demonstrating the usefulness of controlled
experiments in addressing issues relevant to macroeconomics. This chapter provides
a brief discussion of the results, highlighting the main implications and limitations.
Chapter 1 provides evidence for the robustness of the experience effects to a
particular real-world complication of heterogeneous shocks. Such a demonstra-
tion would not be clean outside the laboratory, because each economy experiences
a unique sequence of shocks with non-comparable institutions/agents across and
within time. In the laboratory, one can repeat the same decision situation as many
times as necessary, with control over the environment. The main limitation of this
experiment is that we have only one treatment condition, so we cannot generalize
to all shock sequences. We leave this to future work. In terms of expectations, we
find that the Heuristic Switching Model (HSM) fits the aggregate data well. The
best-fitting HSM model suggests that there is no increase in the share of funda-
mentalist/rational expectations. Adaptive learning seems to be responsible for the
acceleration of convergence. Overall, these results support the view that gradual
convergence to equilibrium is a good description of economies with inexperienced
agents, but may fall short when agents have a lot of experience with large shocks.
Chapter 2 finds that prices react asymmetrically to positive and negative cost
shocks, even in the absence of the frictions often claimed to cause the phenomenon.
With experiments, we can solve the identification problem of market power, intro-
duce arbitrarily large shocks into the equilibria, and drop real-world complications
such as capacity constraints or search costs to isolate the root cause. Our results
suggest that imperfect competition in the form of tacit collusion is a sufficient cause
of the phenomenon. However, the usual antidote to tacit collusion, the introduc-
tion of more competitors, has no discernible effect. Future work can use alternative
competition policies as treatments. Also, our oligopolists do not have perfect in-
formation about each other’s prices, which may have facilitated collusion. From
a macroeconomic modeling perspective, these results support the view that asymmetric
price transmission is the rule rather than the exception. Hence, the game-theoretic
foundations of DSGE models need to be reconsidered.1
Chapter 3 uses laboratory experiments to identify deep parameters of the utility
function. Notably, we ask whether strategic uncertainty is treated differently
from normal ambiguity, and if so, whether this is due to aversion to this source
or pessimism about other players’ decisions. This identification requires measuring
beliefs and certainty equivalents of comparable lotteries and games, which are
unlikely to be available in field data. We find that pessimism is significantly higher in the
stag-hunt game than in the market entry game and a comparable ambiguous lottery.
Thus, strategic-uncertainty attitudes may not be stable across games. Applying this
method to other games can inform us about the characteristics of a game that affect
stability. Future studies can also compare our parameters with those obtained by other methods, such
as the belief-hedges method of Baillon et al. (2018, 2021). The implications of these
attitudes for behavior such as equilibrium selection are also an open avenue.
Chapter 4 employs survey experiments to test the effects of central bank communication
about its inflation forecasts and their accuracy on the general population.
There is a shorter path between the results of these experiments and policy because
the subjects are drawn from a representative sample and are unaware of the exper-
iment. The results show that communicating forecast accuracy in a short message
with a neutral tone is largely beneficial for the European Central Bank. However,
there are caveats. The salience of the intervention is found to be critical, as well as
the bundling of information in a text. Moreover, these results come from a period
of high and volatile inflation. Future studies could test alternative formulations of
the communication and look more closely at longer-term effects. Forecast accuracy
information partially affects consumption plans in our experiment, although these
effects are modest. It is worth replicating these findings and extending their scope
to a wide range of decisions that households make based on future inflation.
1 Recently, Wang and Werning (2022) introduce a dynamic oligopoly game into an otherwise
standard New Keynesian DSGE. Also, Rebelo et al. (2024) propose a macro model that generates
such an asymmetry using behavioral foundations (e.g., dual-system theory).
Bibliography
Abbink, K. and J. Brandts (2008): “Pricing in Bertrand competition with
increasing marginal costs,” Games and Economic Behavior, 63, 1–31.
Abdellaoui, M., A. Baillon, L. Placido, and P. P. Wakker (2011): “The
rich domain of uncertainty: Source functions and their experimental implementa-
tion,” American Economic Review, 101, 695–723.
Acharya, A., M. Blackwell, and M. Sen (2016): “Explaining causal findings
without bias: Detecting and assessing direct effects,” American Political Science
Review, 110, 512–529.
Ahrens, S., I. Pirschel, and D. J. Snower (2017): “A theory of price adjust-
ment under loss aversion,” Journal of Economic Behavior and Organization, 134,
78–95.
Amir, R., P. Erickson, and J. Jin (2017): “On the microeconomic foundations
for linear demand for differentiated products,” Journal of Economic Theory, 169,
641–665.
Angeletos, G.-M. and J. La’o (2010): “Noisy business cycles,” NBER Macroe-
conomics Annual, 24, 319–378.
Anufriev, M. and C. Hommes (2012): “Evolutionary selection of individual
expectations and aggregate outcomes in asset pricing experiments,” American
Economic Journal: Microeconomics, 4, 35–64.
Anufriev, M., C. H. Hommes, and R. H. Philipse (2013): “Evolutionary
selection of expectations in positive and negative feedback markets,” Journal of
Evolutionary Economics, 23, 663–688.
Arifovic, J. and J. Duffy (2018): “Heterogeneous agent modeling: Experimen-
tal evidence,” in Handbook of Computational Economics, ed. by C. Hommes and
B. LeBaron, Elsevier, vol. 4, 491–540.
Armantier, O., S. Nelson, G. Topa, W. Van der Klaauw, and B. Zafar
(2016): “The price is right: Updating inflation expectations in a randomized price
information experiment,” Review of Economics and Statistics, 98, 503–523.
Ash, E., H. Mikosch, A. Perakis, and S. Sarferaz (2024): “Seeing and
hearing is believing: The role of audiovisual communication in shaping inflation
expectations,” Working Paper no. 515, KOF Swiss Economic Institute.
Assenza, T., P. Heemeijer, C. H. Hommes, and D. Massaro (2021): “Man-
aging self-organization of expectations through monetary policy: A macro exper-
iment,” Journal of Monetary Economics, 117, 170–186.
Attema, A. E., W. B. Brouwer, and O. L’Haridon (2013): “Prospect theory
in the health domain: A quantitative assessment,” Journal of Health Economics,
32, 1057–1065.
Bacon, R. W. (1991): “Rockets and feathers: the asymmetric speed of adjustment
of UK retail gasoline prices to cost changes,” Energy Economics, 13, 211–218.
Baillon, A., H. Bleichrodt, C. Li, and P. P. Wakker (2021): “Belief
hedges: Measuring ambiguity for all events and all models,” Journal of Economic
Theory, 198, 105353.
Baillon, A., Z. Huang, A. Selim, and P. P. Wakker (2018): “Measuring
ambiguity attitudes for all (natural) events,” Econometrica, 86, 1839–1858.
Baillon, A., N. Liu, and D. van Dolder (2017): “Comparing uncertainty
aversion towards different sources,” Theory and Decision, 83, 1–18.
Ball, L. and N. G. Mankiw (1994): “Asymmetric price adjustment and eco-
nomic fluctuations,” Economic Journal, 104, 247–261.
Bao, T., J. Duffy, and C. Hommes (2013): “Learning, forecasting and opti-
mizing: An experimental study,” European Economic Review, 61, 186–204.
Bao, T., C. Hommes, J. Sonnemans, and J. Tuinstra (2012): “Individual
expectations, limited rationality and aggregate outcomes,” Journal of Economic
Dynamics and Control, 36, 1101–1120.
Baron, R. M. and D. A. Kenny (1986): “The moderator–mediator variable
distinction in social psychological research: Conceptual, strategic, and statistical
considerations,” Journal of Personality and Social Psychology, 51, 1173–1182.
Baron-Cohen, S., S. Wheelwright, J. Hill, Y. Raste, and I. Plumb
(2001): “The “Reading the Mind in the Eyes” Test revised version: a study with
normal adults, and adults with Asperger syndrome or high-functioning autism,”
Journal of Child Psychology and Psychiatry and Allied Disciplines, 42, 241–251.
Bayer, R.-C. and C. Ke (2018): “What causes rockets and feathers? An ex-
perimental investigation,” Journal of Economic Behavior and Organization, 153,
223–237.
Beard, T. R. and R. O. Beil (1994): “Do people rely on the self-interested
maximization of others? An experimental test,” Management Science, 40, 252–
262.
Becker, G. M., M. H. DeGroot, and J. Marschak (1964): “Measuring
utility by a single-response sequential method,” Behavioral Science, 9, 226–232.
Benabou, R. and R. Gertner (1993): “Search with learning from prices: does
increased inflationary uncertainty lead to higher markups?” Review of Economic
Studies, 60, 69–93.
Bholat, D., N. Broughton, J. Ter Meer, and E. Walczak (2019): “En-
hancing central bank communications using simple and relatable information,”
Journal of Monetary Economics, 108, 1–15.
Binder, C. and A. Rodrigue (2018): “Household informedness and long-run
inflation expectations: Experimental evidence,” Southern Economic Journal, 85,
580–598.
Binder, C. C. (2017): “Measuring uncertainty based on rounding: New method
and application to inflation expectations,” Journal of Monetary Economics, 90,
1–12.
Binder, C. C. and R. Sekkel (2023): “Central bank forecasting: A survey,”
Journal of Economic Surveys, 1–23.
Bohnet, I. and R. Zeckhauser (2004): “Trust, risk and betrayal,” Journal of
Economic Behavior and Organization, 55, 467–484.
Borenstein, S., A. C. Cameron, and R. Gilbert (1997): “Do gasoline prices
respond asymmetrically to crude oil price changes?” Quarterly Journal of Eco-
nomics, 112, 305–339.
Borenstein, S. and A. Shepard (1996): “Sticky Prices, Inventories, and Market
Power in Wholesale Gasoline Markets,” Working Paper 5468, National Bureau of
Economic Research.
Bos, I. and D. Vermeulen (2022): “On the microfoundation of linear oligopoly
demand,” The BE Journal of Theoretical Economics, 22, 1–15.
Brandenburger, A. (1996): “Strategic and structural uncertainty in games,”
in Wise Choices: Games, Decisions, and Negotiations, ed. by R. Zeckhauser,
R. Keene, and J. Sibenius, Harvard Business School Press, 221–232.
Brandts, J., D. J. Cooper, and R. A. Weber (2015): “Legitimacy, com-
munication and leadership in the turnaround game,” Management Science, 61,
2627–2645.
Bray, M. (1982): “Learning, estimation, and the stability of rational expectations,”
Journal of Economic Theory, 26, 318–339.
——— (1983): “Convergence to rational expectations equilibrium,” in Individual
Forecasting and Aggregate Outcomes, ed. by R. Frydman and E. S. Phelps, Cam-
bridge: Cambridge University Press, 123–137.
Brock, W. A. and C. H. Hommes (1997): “A rational route to randomness,”
Econometrica, 65, 1059–1095.
Brouwer, N. and J. de Haan (2022): “The impact of providing information
about the ECB’s instruments on inflation expectations and trust in the ECB:
Experimental evidence,” Journal of Macroeconomics, 73, 103430.
Brown, S. P. and M. K. Yucel (2000): “Gasoline and crude oil prices: why the
asymmetry?” Economic & Financial Review, 23–29.
Bullock, J. G., D. P. Green, and S. E. Ha (2010): “Yes, but what’s the
mechanism? (don’t expect an easy answer),” Journal of Personality and Social
Psychology, 98, 550–558.
Bursian, D. and E. Faia (2018): “Trust in the monetary authority,” Journal of
Monetary Economics, 98, 66–79.
Byrne, D. P. and N. De Roos (2019): “Learning to coordinate: A study in
retail gasoline,” American Economic Review, 109, 591–619.
Calford, E. (2020): “Uncertainty Aversion in Game Theory: Experimental Evi-
dence,” Journal of Economic Behavior and Organization, 176, 720–734.
Camerer, C. F., T.-H. Ho, and J.-K. Chong (2004): “A cognitive hierarchy
model of games,” Quarterly Journal of Economics, 119, 861–898.
Camerer, C. F. and R. Karjalainen (1994): “Ambiguity aversion and non-
additive beliefs in non-cooperative games: Experimental evidence,” in Models
and Experiments in Risk and Rationality, ed. by B. Munier and M. J. Machina,
Springer.
Camerer, C. F. and M. Weber (1992): “Recent developments in modeling
preferences: Uncertainty and ambiguity,” Journal of Risk and Uncertainty, 5,
325–370.
Cason, T. N. and D. Friedman (2002): “A laboratory study of customer mar-
kets,” BE Journal of Economic Analysis & Policy, 2, 1–43.
Cavallo, A., G. Cruces, and R. Perez-Truglia (2017): “Inflation expec-
tations, learning, and supermarket prices: Evidence from survey experiments,”
American Economic Journal: Macroeconomics, 9, 1–35.
Chark, R. and S.-H. Chew (2015): “A neuroimaging study of preference for
strategic uncertainty,” Journal of Risk and Uncertainty, 50, 209–227.
Charness, G., T. Garcia, T. Offerman, and M. C. Villeval (2020): “Do
measures of risk attitude in the laboratory predict behavior under risk in and
outside of the laboratory?” Journal of Risk and Uncertainty, 60, 99–123.
Chateauneuf, A., J. Eichberger, and S. Grant (2007): “Choice under un-
certainty with the best and worst in mind: Neo-additive capacities,” Journal of
Economic Theory, 137, 538–567.
Chen, Y. and M. Riordan (2007): “Price and variety in the spokes model,”
Economic Journal, 117, 897–921.
Chierchia, G., R. Nagel, and G. Coricelli (2018): “Betting ‘on nature’ or
‘betting on others’: Anti-coordination induces uniquely high levels of entropy,”
Scientific Reports, 8, 1–11.
Christelis, D., D. Georgarakos, T. Jappelli, and M. van Rooij (2020):
“Trust in the central bank and inflation expectations,” International Journal of
Central Banking, 16, 1–37.
Coibion, O., D. Georgarakos, Y. Gorodnichenko, and M. Van Rooij
(2023): “How does consumption respond to news about inflation? Field evidence
from a randomized control trial,” American Economic Journal: Macroeconomics,
15, 109–152.
Coibion, O., Y. Gorodnichenko, and M. Weber (2022): “Monetary policy
communications and their effects on household inflation expectations,” Journal of
Political Economy, 130, 1537–1584.
Cooper, K., H. Schneider, and M. Waldman (2021): “Limited rationality
and the strategic environment: Further evidence from a pricing game,” Journal
of Behavioral and Experimental Economics, 90, 101632.
Cooper, K. B., H. S. Schneider, and M. Waldman (2017): “Limited ratio-
nality and the strategic environment: Further theory and experimental evidence,”
Games and Economic Behavior, 106, 188–208.
Cooper, R. and A. John (1988): “Coordinating coordination failures in Keyne-
sian models,” Quarterly Journal of Economics, 103, 441–463.
Cornand, C. and F. Heinemann (2019a): “Experiments in macroeconomics:
methods and applications,” in Handbook of Research Methods and Applications in
Experimental Economics, Edward Elgar Publishing, 269–294.
——— (2019b): “Monetary policy obeying the Taylor principle turns prices into
strategic substitutes,” Journal of Economic Behavior and Organization, forth-
coming.
Cornea-Madeira, A., C. Hommes, and D. Massaro (2019): “Behavioral het-
erogeneity in US inflation dynamics,” Journal of Business & Economic Statistics,
37, 288–300.
Costa-Gomes, M. A. and G. Weizsäcker (2008): “Stated beliefs and play in
normal-form games,” Review of Economic Studies, 75, 729–762.
D’Acunto, F., A. Fuster, and M. Weber (2021): “Diverse policy committees
can reach underrepresented groups,” Working Paper no. 29275, National Bureau
of Economic Research.
Danz, D., L. Vesterlund, and A. J. Wilson (2020): “Belief Elicitation: Limit-
ing Truth Telling with Information on Incentives,” Working Paper 27327, National
Bureau of Economic Research.
Davidson, R. and J. G. MacKinnon (2000): “Bootstrap tests: How many
bootstraps?” Econometric Reviews, 19, 55–68.
Davis, D. (2009): “Pure numbers effects, market power, and tacit collusion in
posted offer markets,” Journal of Economic Behavior and Organization, 72, 475–
488.
Davis, D. and O. Korenok (2011): “Nominal shocks in monopolistically com-
petitive markets: An experiment,” Journal of Monetary Economics, 58, 578–589.
Deaton, A. and J. Muellbauer (1980): “An Almost Ideal Demand System,”
American Economic Review, 70, 312–326.
Deck, C. A. and B. J. Wilson (2008): “Experimental gasoline markets,” Journal
of Economic Behavior and Organization, 67, 134–149.
Dräger, L., M. Lamla, and D. Pfajfar (2022): “How to limit the spillover
from the 2021 inflation surge to inflation expectations?” Working Paper Series in
Economics 407, Lüneburg.
Dräger, L. and G. Nghiem (2023): “Inflation literacy, inflation expectations,
and trust in the central bank: A Survey experiment,” Working Paper no. 10539,
Center for Economic Studies & Ifo Institute.
Duersch, P. and T. A. Eife (2019): “Price competition in an inflationary envi-
ronment,” Journal of Monetary Economics, 104, 48–66.
Duffy, J. (2017): “Macroeconomics: A Survey of Laboratory Research,” in The
Handbook of Experimental Economics, Volume 2, ed. by J. H. Kagel and A. E.
Roth, Princeton: Princeton University Press, 1–90.
Dufwenberg, M. and U. Gneezy (2000): “Price competition and market con-
centration: an experimental study,” International Journal of Industrial Organi-
zation, 18, 7–22.
Dufwenberg, M., T. Lindqvist, and E. Moore (2005): “Bubbles and expe-
rience: An experiment,” American Economic Review, 95, 1731–1737.
Ehrmann, M. and M. Fratzscher (2011): “Politics and monetary policy,”
Review of Economics and Statistics, 93, 941–960.
Ehrmann, M., D. Georgarakos, and G. Kenny (2023): “Credibility gains
from communicating with the public: Evidence from the ECB’s new monetary
policy strategy,” Working Paper no. 2785, European Central Bank.
El-Shagi, M., S. Giesen, and A. Jung (2016): “Revisiting the relative fore-
cast performances of Fed staff and private forecasters: A dynamic approach,”
International Journal of Forecasting, 32, 313–323.
Evans, G. W. and S. Honkapohja (2001): Learning and expectations in macroe-
conomics, Princeton University Press.
Farjam, M. (2019): “On whom would I want to depend; humans or computers?”
Journal of Economic Psychology, 72, 219–228.
Fehr, E. and J.-R. Tyran (2001): “Does money illusion matter?” American
Economic Review, 91, 1239–1262.
——— (2005): “Individual irrationality and aggregate outcomes,” Journal of Eco-
nomic Perspectives, 19, 43–66.
——— (2008): “Limited rationality and strategic interaction: the impact of the
strategic environment on nominal inertia,” Econometrica, 76, 353–394.
Fiala, L. and S. Suetens (2017): “Transparency and cooperation in repeated
dilemma games: a meta study,” Experimental Economics, 20, 755–771.
Fischbacher, U. (2007): “z-Tree: Zurich toolbox for ready-made economic exper-
iments,” Experimental Economics, 10, 171–178.
Fischhoff, B., P. Slovic, and S. Lichtenstein (1977): “Knowing with cer-
tainty: The appropriateness of extreme confidence.” Journal of Experimental Psy-
chology: Human Perception and Performance, 3, 552.
Fonseca, M. A. and H.-T. Normann (2012): “Explicit vs. tacit collusion—The
impact of communication in oligopoly experiments,” European Economic Review,
56, 1759–1772.
Frederick, S. (2005): “Cognitive reflection and decision making,” Journal of
Economic Perspectives, 19, 25–42.
Freeman, K. (1948): Ancilla to the pre-Socratic philosophers: A complete transla-
tion of the fragments in Diels, Fragmente der Vorsokratiker, Harvard University
Press.
Frey, G. and M. Manera (2007): “Econometric models of asymmetric price
transmission,” Journal of Economic Surveys, 21, 349–415.
Gavin, W. T. and R. J. Mandal (2003): “Evaluating FOMC forecasts,” Inter-
national Journal of Forecasting, 19, 655–667.
Gilboa, I. and D. Schmeidler (1989): “Maxmin expected utility with a non-
unique prior,” Journal of Mathematical Economics, 18, 141–153.
——— (1995): “Case-based decision theory,” Quarterly Journal of Economics, 110,
605–639.
Gode, D. K. and S. Sunder (1993): “Allocative efficiency of markets with zero-
intelligence traders: Market as a partial substitute for individual rationality,”
Journal of Political Economy, 101, 119–137.
Green, E. J. and R. H. Porter (1984): “Noncooperative collusion under im-
perfect price information,” Econometrica, 52, 87–100.
Greiner, B. (2015): “Subject pool recruitment procedures: organizing experi-
ments with ORSEE,” Journal of the Economic Science Association, 1, 114–125.
——— (2016): “Strategic uncertainty aversion in bargaining: Experimental evi-
dence,” Mimeo.
Haaland, I. and O.-A. E. Naess (2023): “Misperceived Returns to Active In-
vesting,” Working Paper no. 10257, Center for Economic Studies & Ifo Institute.
Haldane, A., A. Macaulay, and M. McMahon (2020): “The 3 E’s of central
bank communication with the public,” Staff Working Paper no. 847, Bank of
England.
Haltiwanger, J. and M. Waldman (1985): “Rational expectations and the
limits of rationality: An analysis of heterogeneity,” American Economic Review,
75, 326–340.
——— (1989): “Limited rationality and strategic complements: the implications for
macroeconomics,” Quarterly Journal of Economics, 104, 463–483.
Hanaki, N., E. Akiyama, and R. Ishikawa (2018): “Effects of different ways of
incentivizing price forecasts on market dynamics and individual decisions in asset
market experiments,” Journal of Economic Dynamics and Control, 88, 51–69.
Hanaki, N., Y. Koriyama, A. Sutan, and M. Willinger (2019): “The strate-
gic environment effect in beauty contest games,” Games and Economic Behavior,
113, 587–610.
Hanaki, N. and A. Masiliūnas (2021): “Market concentration and incentives
to collude in Cournot oligopoly experiments,” Discussion paper 1131, ISER.
Haruvy, E., Y. Lahav, and C. N. Noussair (2007): “Traders’ expectations
in asset markets: experimental evidence,” American Economic Review, 97, 1901–
1920.
Heemeijer, P., C. Hommes, J. Sonnemans, and J. Tuinstra (2009): “Price
stability and volatility in markets with positive and negative expectations feed-
back: An experimental investigation,” Journal of Economic Dynamics and Con-
trol, 33, 1052–1072.
Heinemann, F. (2012): “Understanding financial crises: The contribution of ex-
perimental economics,” Annales d’Economie et de Statistique, 107-108, 7–29.
Heinemann, F., R. Nagel, and P. Ockenfels (2009): “Measuring strategic
uncertainty in coordination games,” Review of Economic Studies, 76, 181–221.
Hey, J. D., A. Morone, and U. Schmidt (2009): “Noise and bias in eliciting
preferences,” Journal of Risk and Uncertainty, 39, 213–235.
Hoffmann, M., E. Moench, L. Pavlova, and G. Schultefrankenfeld
(2022): “Would households understand average inflation targeting?” Journal of
Monetary Economics, 129, S52–S66.
Hommes, C. (2011): “The heterogeneous expectations hypothesis: Some evidence
from the lab,” Journal of Economic Dynamics and Control, 35, 1–24.
——— (2021): “Behavioral and experimental macroeconomics and policy analysis:
A complex systems approach,” Journal of Economic Literature, 59, 149–219.
Hommes, C. and J. Lustenhouwer (2019): “Inflation targeting and liquidity
traps under endogenous credibility,” Journal of Monetary Economics, 107, 48–62.
Hommes, C., J. Sonnemans, J. Tuinstra, and H. Van de Velden (2005):
“Coordination of expectations in asset pricing experiments,” Review of Financial
Studies, 18, 955–980.
Hommes, C. H. (2006): “Heterogeneous agent models in economics and finance,”
in Handbook of Computational Economics, ed. by L. Tesfatsion and K. Judd,
Elsevier, vol. 2, 1109–1186.
——— (2018): “Behavioral & experimental macroeconomics and policy analysis: a
complex systems approach,” Working Paper No. 2201, European Central Bank.
Horstmann, N., J. Krämer, and D. Schnurr (2018): “Number effects and
tacit collusion in experimental oligopolies,” Journal of Industrial Economics, 66,
650–700.
Hossain, T. and R. Okui (2013): “The binarized scoring rule,” Review of Eco-
nomic Studies, 80, 984–1001.
Huck, S., H.-T. Normann, and J. Oechssler (2004): “Two are few and
four are many: number effects in experimental oligopolies,” Journal of Economic
Behavior and Organization, 53, 435–446.
Hussam, R. N., D. Porter, and V. L. Smith (2008): “Thar she blows: Can
bubbles be rekindled with experienced subjects?” American Economic Review,
98, 924–37.
Hyndman, K., E. Y. Ozbay, A. Schotter, and W. Z. Ehrblatt (2012):
“Convergence: an experimental study of teaching and learning in repeated
games,” Journal of the European Economic Association, 10, 573–604.
Imai, K., L. Keele, D. Tingley, and T. Yamamoto (2011): “Unpacking the
black box of causality: Learning about causal mechanisms from experimental and
observational studies,” American Political Science Review, 105, 765–789.
Isaac, R. M. and J. M. Walker (1988): “Group size effects in public goods
provision: The voluntary contributions mechanism,” The Quarterly Journal of
Economics, 103, 179–199.
Ivanov, A. (2011): “Attitudes to ambiguity in one-shot normal-form games: An
experimental study,” Games and Economic Behavior, 71, 366–394.
Jäger, S., C. Roth, N. Roussille, and B. Schoefer (2022): “Worker beliefs
about outside options,” Working paper no. 29623, National Bureau of Economic
Research.
Kahneman, D. and A. Tversky (1972): “Subjective probability: A judgment
of representativeness,” Cognitive Psychology, 3, 430–454.
Kelsey, D. and S. le Roux (2015): “An experimental study on the effect of
ambiguity in a coordination game,” Theory and Decision, 79, 667–688.
Kocher, M. G. and M. Sutter (2006): “Time is money—Time pressure, in-
centives, and the quality of decision-making,” Journal of Economic Behavior and
Organization, 61, 375–392.
Kopányi-Peuker, A. and M. Weber (2021): “Experience does not eliminate
bubbles: Experimental evidence,” The Review of Financial Studies, 34, 4450–
4485.
Korenok, O., D. Munro, and J. Chen (2023): “Inflation and attention thresh-
olds,” Review of Economics and Statistics, 1–28.
Kostyshyna, O. and L. Petersen (2024): “Communicating inflation uncer-
tainty and household expectations,” Working Paper no. 2023–63, Bank of Canada.
Kril, Z., D. Leiser, and A. Spivak (2016): “What determines the credibility
of the central bank of Israel in the public eye,” International Journal of Central
Banking, 12, 67–93.
Lerner, A. P. (1934): “The concept of monopoly and the measurement of
monopoly power,” Review of Economic Studies, 1, 157–175.
Li, C., U. Turmunkh, and P. P. Wakker (2020): “Social and strategic ambi-
guity versus betrayal aversion,” Games and Economic Behavior, 123, 272–287.
Loy, J.-P., C. R. Weiss, and T. Glauben (2016): “Asymmetric cost pass-
through? Empirical evidence on the role of market power, search and menu costs,”
Journal of Economic Behavior and Organization, 123, 184–192.
Marquardt, P., C. N. Noussair, and M. Weber (2019): “Rational expecta-
tions in an experimental asset market with shocks to market trends,” European
Economic Review, 114, 116–140.
Maskin, E. and J. Tirole (1988): “A theory of dynamic oligopoly, II: Price
competition, kinked demand curves, and Edgeworth cycles,” Econometrica, 56,
571–599.
Mauersberger, F. and R. Nagel (2018): “Levels of reasoning in Keynesian
Beauty Contests: a generative framework,” in Handbook of computational eco-
nomics, Elsevier, vol. 4, 541–634.
McMahon, M. and R. Rholes (2024): “Building Central Bank Credibility: The
Role of Forecast Performance.”
Mellina, S. and T. Schmidt (2018): “The role of central bank knowledge
and trust for the public’s inflation expectations,” Discussion Paper no. 32/2018,
Deutsche Bundesbank.
Méon, P.-G. and B. Hayo (2023): “Preaching to the agnostic: Inflation reporting
can increase trust in the central bank but only among people with weak priors,”
Working Paper no. 10636, Center for Economic Studies & Ifo Institute.
Meyer, J. and S. von Cramon-Taubadel (2004): “Asymmetric price trans-
mission: a survey,” Journal of Agricultural Economics, 55, 581–611.
Milgrom, P. and J. Roberts (1990): “Rationalizability, learning, and equilib-
rium in games with strategic complementarities,” Econometrica, 58, 1255–1277.
Morgan, J., H. Orzen, and M. Sefton (2006): “An experimental study of
price dispersion,” Games and Economic Behavior, 54, 134–158.
Morris, S. and H. S. Shin (2003): “Global games: Theory and applications,”
in Advances in Economics and Econometrics, ed. by M. Dewatripont, L. Hansen,
and S. Turnovsky, Cambridge University Press, vol. 1, 56–114.
Muth, J. F. (1961): “Rational expectations and the theory of price movements,”
Econometrica, 29, 315–335.
Nagel, R. (1995): “Unraveling in guessing games: An experimental study,” Amer-
ican Economic Review, 85, 1313–1326.
Nagel, R., A. Brovelli, F. Heinemann, and G. Coricelli (2018): “Neu-
ral mechanisms mediating degrees of strategic uncertainty,” Social Cognitive and
Affective Neuroscience, 13, 52–62.
Offerman, T., J. Sonnemans, G. Van De Kuilen, and P. P. Wakker
(2009): “A truth serum for non-bayesians: Correcting proper scoring rules for
risk attitudes,” Review of Economic Studies, 76, 1461–1489.
Orzen, H. (2008): “Counterintuitive number effects in experimental oligopolies,”
Experimental Economics, 11, 390–401.
Peltzman, S. (2000): “Prices rise faster than they fall,” Journal of Political Econ-
omy, 108, 466–502.
Perdiguero-García, J. (2013): “Symmetric or asymmetric oil prices? A
meta-analysis approach,” Energy Policy, 57, 389–397.
Petersen, L. and A. Winn (2014): “Does money illusion matter? Comment,”
American Economic Review, 104, 1047–62.
Pfajfar, D. and B. Žakelj (2014): “Experimental evidence on inflation
expectation formation,” Journal of Economic Dynamics and Control, 44, 147–168.
Pfäuti, O. (2023): “The inflation attention threshold and inflation surges,” arXiv
preprint arXiv:2308.09480.
Plonsky, O., K. Teodorescu, and I. Erev (2015): “Reliance on small samples,
the wavy recency effect, and similarity-based learning.” Psychological Review, 122,
621–647.
Potters, J. and S. Suetens (2009): “Cooperation in experimental games of
strategic complements and substitutes,” Review of Economic Studies, 76, 1125–
1147.
Prevost, M., M.-E. Carrier, G. Chowne, P. Zelkowitz, L. Joseph, and
I. Gold (2014): “The Reading the Mind in the Eyes test: Validation of a French
version and exploration of cultural variations in a multi-ethnic city,” Cognitive
Neuropsychiatry, 19, 189–204.
Reagan, P. B. and M. L. Weitzman (1982): “Asymmetries in price and quan-
tity adjustments by the competitive firm,” Journal of Economic Theory, 27, 410–
420.
Rebelo, S., M. Santana, and P. Teles (2024): “Behavioral Sticky Prices,”
Working Paper 32214, National Bureau of Economic Research.
Roth, C., S. Settele, and J. Wohlfart (2022): “Beliefs about public debt and
the demand for government spending,” Journal of Econometrics, 231, 165–187.
Roth, C. and J. Wohlfart (2020): “How do expectations about the macroe-
conomy affect personal expectations and behavior?” Review of Economics and
Statistics, 102, 731–748.
Salop, S. C. (1979): “Monopolistic competition with outside goods,” Bell Journal
of Economics, 10, 141–156.
Schotter, A. and I. Trevino (2014): “Belief elicitation in the laboratory,”
Annual Review of Economics, 6, 103–128.
Settele, S. (2022): “How do beliefs about the gender wage gap affect the demand
for public policy?” American Economic Journal: Economic Policy, 14, 475–508.
Shestakova, N., O. Powell, and D. Gladyrev (2019): “Bubbles, experience
and success,” Journal of Behavioral and Experimental Finance, 22, 206–213.
Smith, V. L. (1962): “An experimental study of competitive market behavior,”
Journal of Political Economy, 70, 111–137.
Smith, V. L., G. L. Suchanek, and A. W. Williams (1988): “Bubbles,
crashes, and endogenous expectations in experimental spot asset markets,” Econo-
metrica, 56, 1119–1151.
Sonnemans, J. and J. Tuinstra (2010): “Positive expectations feedback exper-
iments and number guessing games as models of financial markets,” Journal of
Economic Psychology, 31, 964–984.
Stahl, D. O. (1996): “Boundedly rational rule learning in a guessing game,”
Games and Economic Behavior, 16, 303–330.
Sutan, A. and M. Willinger (2009): “Guessing with negative feedback: An
experiment,” Journal of Economic Dynamics and Control, 33, 1123–1133.
Tversky, A. and D. Kahneman (1973): “Availability: A heuristic for judging
frequency and probability,” Cognitive Psychology, 5, 207–232.
Wang, O. and I. Werning (2022): “Dynamic oligopoly and price stickiness,”
American Economic Review, 112, 2815–2849.
Weber, M., B. Candia, T. Ropele, R. Lluberas, S. Frache, B. H. Meyer,
S. Kumar, Y. Gorodnichenko, D. Georgarakos, O. Coibion, et al.
(2023): “Tell me something I don’t already know: Learning in low and high-
inflation settings,” Working Paper no. 31485, National Bureau of Economic Re-
search.
World Health Organization (2018): “WHO methods and data sources for
country-level causes of death 2000–2016,” Tech. rep., Global Health Estimates
Technical Paper WHO/HMM/IER/GHE/2018.1, WHO, Geneva.
Yang, H. and L. Ye (2008): “Search with learning: understanding asymmetric
price adjustments,” RAND Journal of Economics, 39, 547–564.
Zhelobodko, E., S. Kokovin, M. Parenti, and J.-F. Thisse (2012): “Mo-
nopolistic competition: beyond the constant elasticity of substitution,” Econo-
metrica, 80, 2765–2784.
Publications
The following publications have been used as chapters of this dissertation:
1. Accepted manuscript: Bulutay, M., C. Cornand, and A. Zylbersztejn (2022).
“Learning to deal with repeated shocks under strategic complementarity: An
experiment,” Journal of Economic Behavior and Organization, 200, 1318–1343.
https://doi.org/10.1016/j.jebo.2020.05.023
2. Accepted manuscript: Bulutay, M., D. Hales, P. Julius, and W. Tasch (2021).
“Imperfect tacit collusion and asymmetric price transmission,” Journal of Eco-
nomic Behavior and Organization, 192, 584–599.
https://doi.org/10.1016/j.jebo.2021.10.018
3. Accepted manuscript: Bruttel, L., M. Bulutay, C. Cornand, F. Heinemann,
and A. Zylbersztejn (2023). “Measuring strategic-uncertainty attitudes,” Ex-
perimental Economics, 26, 522–549.
https://doi.org/10.1007/s10683-022-09779-2
4. Preprint: Bulutay, M. (2024). “Better than Perceived? Correcting Mispercep-
tions about Central Bank Inflation Forecasts,” Working Paper no. 34, Berlin
School of Economics.
https://doi.org/10.48462/opus4-5355