
Essays in Experimental Macroeconomics

submitted by

Muhammed Bulutay, M.Res.

ORCID: 0000-0002-8148-8815

to Faculty VII - Economics and Management

of Technische Universität Berlin

for the award of the academic degree of

Doctor of Economics

(Dr. rer. oec.)

approved dissertation

Doctoral committee:

Chair: Prof. Dr. Dorothea Kübler

Reviewer: Prof. Dr. Frank Heinemann

Reviewer: Prof. Dr. Georg Weizsäcker

Date of the oral defense: 18 July 2024

Berlin 2024

Acknowledgements

I would like to begin by thanking my principal supervisor, Frank Heinemann, for

giving me the opportunity and autonomy to pursue my research aspirations. His

relentless rigor, patience, availability, and adeptness at seeing between the lines have

enriched my research. I am also grateful to Georg Weizsäcker for agreeing to be the

second reader of this dissertation. His intellectual guidance has been elemental from

the very beginning of my doctoral studies. I could not have asked for better role

models in my career than my supervisors.

I am deeply indebted to my co-authors, Lisa Bruttel, Camille Cornand, Dave

Hales, Frank Heinemann, Patrick Julius, Weiwei Tasch, and Adam Zylbersztejn.

Their contributions have improved the chapters of this thesis and resulted in several

publications, of which I am proud. I am grateful to my colleagues, friends, and

mentors at the Berlin School of Economics for their feedback and support, including

Ciril Bosch-Rosa, Michael Burda, Francesco Capozza, Dirk Engelmann, Samuel

Fahim, Gökhan Ider, Dorothea Kübler, Levent Neyse, and Verena Fuhr. I have

also received help and advice from many people outside of Berlin during seminars,

summer schools, and conferences, including John Duffy, Nicolas Jacquemet, Ryan

Rholes, and Tobias Schmidt.

Last but not least, I would like to express my gratitude to all those who gave

me moral support and encouragement when I needed it. My friends have made this

journey a joyful one. My mother Şemsigül and my brother Fatih have made me the

man I am and have never wavered in their support. My partner Lâl Yolgeçenli has

left her fingerprints on every part of this thesis, from brainstorming research ideas

to polishing every draft. Her calming presence has carried me through the ups and

downs. I dedicate this dissertation to her.

Summary

This dissertation consists of four essays that use controlled experiments to address

macroeconomic issues.

Chapter 1 (coauthored with Camille Cornand and Adam Zylbersztejn) investigates

the robustness of experience effects in laboratory experiments. We compare

two scenarios in a beauty-contest game characterized by strategic complementarity:

a baseline in which the same shock hits the economy four times, and a treatment

in which the shocks are heterogeneous. We find that convergence to equilibrium

accelerates with each shock in both scenarios. We then run a horse-race exercise

between models of expectation formation.

Chapter 2 (coauthored with David Hales, Patrick Julius, and Weiwei Tasch)

examines the sources of asymmetric price transmission. We show that prices respond

asymmetrically to cost shocks in laboratory markets where subjects act as producers

and demand is computerized. We show that subjects engage in tacit collusion, which

is accentuated after negative shocks and is not expectation driven. Increasing the

number of competitors from 3 to 10 does not significantly change the degree of

asymmetry.

Chapter 3 (coauthored with Lisa Bruttel, Camille Cornand, Adam Zylbersztejn,

and Frank Heinemann) develops an experimental method for measuring

strategic-uncertainty attitudes. We compare certainty equivalents and beliefs across lotteries

where the payoff depends on either the outcome of a stag-hunt game, a market-entry

game, an ambiguous lottery, or a risky lottery. We use a model to determine whether

participants exhibit optimism/pessimism or strategic-uncertainty aversion/seeking.

We find that the median player is pessimistic in the stag-hunt game and optimistic

in the entry game, but neutral toward strategic uncertainty.

Chapter 4 employs survey experiments to measure German households’ beliefs

about the European Central Bank’s inflation forecasts. I find that the accuracy

of these forecasts is severely underestimated. Information treatments are effective

in changing inflation expectations and uncertainty. A causal mechanism analysis

shows that the intervention increases trust in the central bank, which in turn changes

inflation expectations.

Chapter 5 concludes with a discussion of these results.

Zusammenfassung (German Summary)

This dissertation comprises four essays that use controlled experiments to address macroeconomic questions.

Chapter 1 (coauthored with Camille Cornand and Adam Zylbersztejn) examines the robustness of experience effects in laboratory experiments. We compare two scenarios in a beauty-contest game characterized by strategic complementarity: a baseline scenario in which the same economic shock occurs four times, and a treatment in which the shocks are heterogeneous. We find that convergence to equilibrium becomes faster with each shock. We then compare the predictions of different models of expectation formation.

Chapter 2 (coauthored with David Hales, Patrick Julius, and Weiwei Tasch) examines the causes of asymmetric price transmission. We show that prices respond asymmetrically to cost shocks in laboratory markets in which subjects act as producers and demand is computerized. We show that subjects engage in tacit collusion, which intensifies after negative shocks and is not expectation driven. Increasing the number of competitors from 3 to 10 does not substantially change the degree of asymmetry.

Chapter 3 (coauthored with Lisa Bruttel, Camille Cornand, Adam Zylbersztejn, and Frank Heinemann) develops an experimental method for measuring attitudes toward strategic uncertainty. We compare subjects’ certainty equivalents and beliefs for lotteries whose payoffs depend on the outcome of either a stag-hunt game, a market-entry game, a lottery with given probabilities, or a lottery with unknown probabilities. Using a model, we determine whether participants exhibit optimism/pessimism or an aversion to/preference for strategic uncertainty. We find that the average player is pessimistic in the stag-hunt game and optimistic in the entry game, but neutral toward strategic uncertainty.

Chapter 4 uses survey experiments to measure German households’ beliefs about the European Central Bank’s inflation forecasts. The accuracy of these forecasts is severely underestimated. Information treatments prove effective in changing inflation expectations and the associated uncertainty. An analysis of the causal mechanism shows that the intervention increases trust in the central bank, which in turn changes inflation expectations.

Chapter 5 concludes with a discussion of these results.

Contents

0 Introduction
1 Learning to Deal with Repeated Shocks under Strategic Complementarity: An Experiment
  1.1 Introduction
  1.2 Related Literature and Hypotheses
  1.3 Method
    1.3.1 Guessing Game under Strategic Complementarity
    1.3.2 Experimental Design
    1.3.3 Procedures
  1.4 Results
    1.4.1 Adjustment Dynamics
    1.4.2 Expectation Formation
  1.5 Discussion
  1.6 Conclusion
A Appendix
  A.1 Experimental Material
    A.1.1 Instructions and Comprehension Questions Translated to English
    A.1.2 Tests Translated to English
    A.1.3 Score Measurement, Procedures and References for Tests
  A.2 Additional Figures and Tables
  A.3 Robustness Analyses
    A.3.1 Calibration of Free Parameters in HSM
    A.3.2 Cognitive Skills and Individual Expectations
2 Imperfect Tacit Collusion and Asymmetric Price Transmission
  2.1 Introduction
  2.2 Related Literature
    2.2.1 Field Evidence
    2.2.2 Theoretical Explanations
    2.2.3 APT and Experiments
  2.3 Method
    2.3.1 Pricing Game
    2.3.2 Experimental Design
    2.3.3 Procedures
  2.4 Hypotheses
  2.5 Results
    2.5.1 Estimation of the Asymmetry
    2.5.2 Market Power
    2.5.3 Deviations from Best-Response
  2.6 Discussion
B Appendix
  B.1 Quadratic Utility and Linear Demand
  B.2 Experimental Material
    B.2.1 Instructions
    B.2.2 Comprehension Questions
    B.2.3 Payoff Table and Experimental Interface
  B.3 Additional Analysis
    B.3.1 Regression Results of Asymmetry
    B.3.2 Pass-through Rates
    B.3.3 Non-parametric Test on Excess Market Power
3 Measuring Strategic-Uncertainty Attitudes
  3.1 Introduction
  3.2 Related Literature
  3.3 Experimental Design and Procedures
    3.3.1 Treatments
    3.3.2 Implementation Details
  3.4 Theoretical Framework
    3.4.1 Identification and Uncertainty Attitudes
    3.4.2 Hypotheses
  3.5 Results
    3.5.1 Data Selection
    3.5.2 Comparison of Certainty Equivalents
    3.5.3 Main Results
  3.6 Conclusion
C Appendix
  C.1 Instructions
    C.1.1 Instructions for Games
    C.1.2 Example of Comprehension Quiz for the Treatment
  C.2 Additional Tables and Figures
  C.3 Individual Underpinnings of Attitudes towards Uncertainty
  C.4 Screenshots
4 Better than Perceived? Correcting Misperceptions about Central Bank Inflation Forecasts
  4.1 Introduction
  4.2 First Experiment
    4.2.1 Design and Implementation
    4.2.2 Results
  4.3 Second Experiment
    4.3.1 Design and Implementation
    4.3.2 Results
  4.4 Conclusion
D Appendix
  D.1 Experimental Materials
  D.2 A Model of Belief Updating with Trust
    D.2.1 Model
    D.2.2 Hypotheses
  D.3 Supporting Material
    D.3.1 First Experiment
    D.3.2 Second Experiment
5 Conclusion

List of Figures

1.1 Decision Screen
1.2 Guesses by treatments and rounds
1.3 Median individual guesses
1.4 Impact of heuristics across periods
A.1 Test picture in the RMET
A.2 Individual guesses across periods
A.3 Cluster guesses across periods
A.4 Impact factors from various HSMs
2.1 Average pricing behavior across periods and group sizes
2.2 Cumulative response to shocks
2.3 Excess market power across periods and group size
2.4 Deviations from best-response action and errors in expectations
B.1 Payoff matrix
B.2 Screen: Price setting
B.3 Screen: Guess setting
B.4 Screen: Feedback
3.1 Cumulative density functions of uncertainty attitude parameters across conditions
C.1 Screen used in RISK treatment
C.2 Screen used in STRATEGICUNCERTAINTY treatment (stag-hunt game)
4.1 Flow of the first experiment
4.2 Beliefs about inaccuracy and the actual distribution of forecast errors
4.3 Directed acyclic graph showing the causal mechanisms
4.4 Flow of the first experiment
4.5 Expected inflation and perceived inflation forecast
D.1 Expected inflation (for Germany) and perceived inflation forecast

List of Tables

1.1 Convergence in the previous experiments
1.2 Experimental design parameters
1.3 Post-shock deviations from NE
1.4 The effect of heterogeneity in shocks on guesses of last post-shock phase
1.5 Description of selected expectation rules for comparison
1.6 RMSE of expectation rules
A.1 Description of subject pool
A.2 Improvements from similarity refinement in the initial post-shock period
A.3 Average of impact factors in the HSM
A.4 RMSE of HSM under different values of free parameters
A.5 RMSE of expectation rules and individual characteristics
A.6 Prediction errors and individual test scores
2.1 Experimental design parameters
2.2 Asymmetry in the immediate pass-through rates
2.3 Excess market power
2.4 Deviations from best-response
B.1 Estimation of asymmetry
B.2 Asymmetry in the pass-through rates after 14 periods
B.3 Non-parametric test on excess market power
3.1 Game 1 and associated payoffs
3.2 Game 2 and associated payoffs
3.3 Decision table in the RISK treatment
3.4 Comparison of certainty equivalents
3.5 Summary of estimated uncertainty attitudes
3.6 Uncertainty attitudes across treatments: parametric estimates from seemingly unrelated regressions
3.7 Results of mean testing across specifications
3.8 Nonparametric comparisons of uncertainty attitudes across treatments
C.1 Seemingly unrelated regressions with treatment order effects
C.2 Nonparametric comparisons of strategic uncertainty attitudes across treatments
C.3 Seemingly unrelated regressions with individual characteristics: restricted sample
C.4 Seemingly unrelated regressions with individual characteristics: unrestricted sample
4.1 Treatment effects on learning and uncertainty
4.2 Causal Mechanisms with 2SLS
4.3 Treatment effects on trust, attention, and consumption one month later
4.4 Statements used to indirectly measure trust in the ECB
4.5 Perceived direction of the ECB’s forecast error and trust in the ECB
4.6 Effects of information treatments on the public’s perception of the ECB
D.1 Demographic profile
D.2 Ordered logistic regression on the ECB perception
D.3 Demographic characteristics of respondents by the level of misperception
D.4 Robustness regressions for reduced-form treatment effects: With controls and without winsorization
D.5 Robustness regressions for the causal mechanisms: Only T4
D.6 Robustness regressions for the causal mechanisms: Only cjeu trust as instrument
D.7 Demographic profile
D.8 Correlation matrix for different facets of public trust in the ECB

Chapter 0

Introduction

(I would) rather discover one cause than

gain the Kingdom of Persia.

– Democritus¹

Many questions in macroeconomics, such as what drives (dis)inflation or how

fiscal stimulus affects economic outcomes, are causal in nature. The most credi-

ble approach to establishing causality is through controlled experiments, because

this method allows researchers to manipulate isolated variables at will. Tradition-

ally, many have viewed such a method as unavailable for macroeconomic research.

Despite these reservations, the utility of controlled experiments in macroeconomic

research is increasingly recognized (see Cornand and Heinemann 2019a; Hommes

2021 for reviews). Experiments can be used to evaluate models under ideal con-

ditions, to test behavioral assumptions underlying microfoundations, to estimate

otherwise unobservable parameters, and to resolve theoretical ambiguities, such as

those regarding equilibrium selection. This dissertation consists of four papers that

use controlled experiments, both in and out of the laboratory, to demonstrate how

experiments can help address macroeconomic issues.

¹ Democritus, fragment 118, in Ancilla to the Pre-Socratic Philosophers: A Complete Translation of the Fragments in Diels, Fragmente der Vorsokratiker, translated by Kathleen Freeman (Cambridge: Harvard University Press, 1948).

Chapter 1 (coauthored with Camille Cornand and Adam Zylbersztejn) contributes

to the long-standing debate about whether individual irrationality can have

aggregate effects. A common argument of those who claim that irrationality has

weak consequences is that biases cannot persist in the long run due to learning.

Indeed, a large literature shows that inertia in convergence to equilibrium in repeated

games fades as subjects gain experience with the environment. In this chapter,

we investigate the robustness of such experience effects, documented in controlled

experiments, to a real-world complication. Subjects in our experiments play a

repeated beauty contest game characterized by strategic complementarity.² We shock

the Nash equilibrium in two ways. In the baseline, the Nash equilibrium is repeat-

edly shifted by the same shock, and in the treatment, the shocks are non-identical.

We argue that non-identical shocks lead to reduced transfer among adaptive learn-

ers. Our data do not support this hypothesis. Subjects in our treatment and control

groups learn to deal with shocks at similar rates. Thus, we confirm the robustness

of the experience effect to our particular real-world complication. The rest of the

chapter is devoted to determining the best fitting expectation formation model.

In Chapter 2 (coauthored with David Hales, Patrick Julius, and Weiwei Tasch),

we use another laboratory experiment to investigate the sources of the so-called

“rockets and feathers” effect. This effect refers to the observation that producer

prices rise rapidly but fall slowly in response to corresponding changes in marginal

costs. It is difficult to reconcile such an asymmetric response function with the

monopolistic competition at the core of most contemporary DSGE models. Many

explanations have been put forward to address this puzzle. Most attribute it to

market frictions, while some see imperfect competition as the root cause. However,

econometric tests based on observational data are inconclusive, especially due to the

tacit nature of collusion and the joint presence of frictions and imperfect competition.

In this experiment, we address these challenges by setting up a frictionless price-

setting game with exogenously determined marginal costs and number of producers.

Participants interact for a finite number of periods and experience positive and

negative shocks to the marginal cost of the unique good they sell, while demand is

computerized. We document asymmetric price transmission in most markets. This

asymmetry is robust to the number of sellers in a market, except in the case of

duopoly, where we observe near-perfect collusion. In other markets, we also observe

periods of “excess” market power, which is accentuated after negative cost shocks. We

use reported beliefs about other sellers’ prices to show that asymmetric increases in

market power are not driven by expectational errors.

² Mauersberger and Nagel (2018) argue that this game is analogous to New Keynesian dynamic stochastic general equilibrium (DSGE) models that incorporate sentiment (e.g., Angeletos and La’O, 2010).

Chapter 3 (coauthored with Lisa Bruttel, Camille Cornand, Adam Zylbersztejn,

and Frank Heinemann) focuses on the role of strategic-uncertainty attitudes in

equilibrium selection in coordination games. Coordination problems underlie many

macroeconomic phenomena, including bank runs, debt crises, liquidity crises, and

speculative attacks. A major challenge is that standard theoretical frameworks, such

as Nash equilibrium, offer limited guidance on which equilibria are likely to prevail.

Strategic uncertainty can be used as a refinement tool, especially if individuals

exhibit systematic attitudes toward strategic uncertainty. However, such attitudes

are difficult to measure because one must compare the utility of a game with the

utility of an appropriately matched non-game situation and take beliefs and risk

attitudes into account. In this study, we develop a method with these properties.

We test the method experimentally on two coordination games that differ only in

their strategic environment: a stag-hunt game (complements) and a market-entry

game (substitutes). We measure certainty equivalents and beliefs for lotteries whose

payoff depends on either the outcome of a stag-hunt game, a market-entry game, an

ambiguous lottery, or a risky lottery. Using a model, we decompose differences in

certainty equivalents into either (a) optimism or pessimism about the probability of

achieving the desired payoff, or (b) a fixed (dis)utility caused by the source of

uncertainty. Overall, we find that participants exhibit pessimism in the stag-hunt game

and optimism in the entry game, while they do not experience a flat (dis)utility due

to the game.

Chapter 4 examines the sources of disagreement between households and central

banks about future inflation. Such disagreement is bad for central banks, which are

trying to anchor private expectations, and for households, which have lower quality

information. To understand the driving forces, I survey German households’ beliefs

about the European Central Bank’s (ECB’s) inflation forecasts by embedding two

modules in the Bundesbank’s Survey on Consumer Expectations in 2022 and 2023.

Notably, the survey measures perceived forecast accuracy and the expected direction

of forecast error. The results indicate a widespread underestimation of the ECB’s

inflation forecast accuracy. Households also consider the ECB’s forecast in 2022 to

be overly optimistic. Both beliefs are strongly correlated with self-reported trust

in the ECB, even after controlling for socio-economic characteristics. Challenging

misperceptions about forecast accuracy reduces households’ disagreement with the

ECB’s inflation forecast, reduces uncertainty about future inflation, improves self-

reported trust in the ECB, and has some effect on consumption plans. Further

analysis shows that trust in the ECB is causally related to inflation expectations.

The remainder of this chapter is devoted to understanding which specific aspects

of public opinion about the ECB are changed by the communication of inflation

forecasts and how they are related to trust.

The last chapter provides a brief discussion of these papers and an outlook on

future research.

Chapter 1

Learning to Deal with Repeated

Shocks under Strategic

Complementarity: An Experiment

An earlier version of this chapter is published as: Bulutay, M., Cornand, C. and

Zylbersztejn, A., 2022. Learning to deal with repeated shocks under strategic

complementarity: An experiment. Journal of Economic Behavior & Organization, 200,

pp. 1318-1343.

1.1 Introduction

How long would it take for market outcomes to fully adjust to the new equilibrium

level in response to an exogenous shock? In a seminal paper on the rational

expectations (RE) hypothesis, Muth (1961) demonstrates that convergence to equilibrium

is instantaneous in a frictionless economy if the errors in agents’ expectations are

not highly correlated, because they then cancel out at the aggregate level. However, the

empirical evidence points to systematic errors due to heuristic-based reasoning, under

which the aggregate outcomes may exhibit substantial inertia. Whether and how

the adjustment would be delayed in the presence of nonrational expectations is a

key question for policy-makers – central banks that aim at engineering structural

changes – and for actors in markets where equilibrium is frequently shifting due to

shocks. If adjustment is sluggish and shocks occur frequently, aggregates may rarely

be in accordance with the equilibrium path predictions generated by the impulse-

response analyses of RE-based models.
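
A short worked calculation illustrates why weakly correlated errors wash out in the aggregate while systematic ones need not. It is only a sketch under assumptions that are not taken from Muth (1961): the n agents' expectation errors ε_i are assumed to have mean zero, a common variance σ², and a common pairwise correlation ρ.

```latex
\[
\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n}\varepsilon_i\right)
= \frac{1}{n^{2}}\Bigl[n\sigma^{2} + n(n-1)\rho\,\sigma^{2}\Bigr]
= \frac{\sigma^{2}}{n}\bigl[1+(n-1)\rho\bigr]
\;\xrightarrow[\;n\to\infty\;]{}\; \rho\,\sigma^{2}.
\]
```

Under these assumptions, the average error vanishes in a large population when ρ = 0, but keeps a variance of ρσ² when ρ > 0, so correlated errors can survive aggregation and delay adjustment.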

Early experimental evidence from double auctions shows that equilibrium prices

emerge within a few periods (Smith, 1962). Convergence occurs even in the presence

of zero-intelligence computer traders who submit random bids and asks, provided these bids

are constrained by a budget (Gode and Sunder, 1993). Nonetheless, persistent

deviations from equilibrium are reported in different types of competitive markets

(e.g., asset market experiments, AMEs henceforth, Smith et al. 1988). Thus, the

extent to which limited rationality influences market outcomes depends on the char-

acteristics of the market.

The type of strategic environment governing the market is one of the key char-

acteristics determining the impact of limited rationality on behavior and outcomes.

Following the theoretical work of Haltiwanger and Waldman (1985, 1989), Fehr and

Tyran (2005, 2008) experimentally test the role of the strategic environment on the

adjustment dynamics after a monetary shock. In accordance with the theoretical

predictions, the adjustment is immediate when actions are strategic substitutes,

and gradual when actions are strategic complements. The role of the strategic

environment has been further experimentally investigated in Learning-to-Forecast

Experiments (LtFEs, Heemeijer et al. 2009, Bao et al. 2012), guessing games (Sutan

and Willinger 2009, Cooper et al. 2017, Hanaki et al. 2019) and duopoly games

(Potters and Suetens, 2009).¹ The main pattern emerging from these studies is that

deviations from equilibrium tend to be larger and more persistent under strategic

complementarity as compared to strategic substitutability.²

Herein, we focus on strategic complementarity, which is an important

feature of various economic contexts including macroeconomic coordination, bank

runs, and oligopoly competition.³ As argued by Hommes (2006), strategic

complementarity is crucial for modeling asset markets characterized by a positive feedback

mechanism between expectations on asset prices and the realizations of these prices.

The literature still lacks consensus on how repeated shocks (whether they are

identical or not) could affect adjustment under strategic complementarity. On the

one hand, the initial deviations from RE may subsequently disappear due to expe-

rience effects, as commonly reported in AMEs (e.g., Smith et al. 1988). In a recent

study, Cooper et al. (2017) show that these results can be extended to guessing

games. They introduce three identical shocks to the Nash equilibrium (NE) in a

periodic manner and report a slight acceleration in the adjustment speed over shocks. On

the other hand, experimental studies based on AMEs and LtFEs question the

robustness of experience effects (Kopányi-Peuker and Weber 2021; Shestakova et al. 2019).

Hussam et al. (2008) argue that experience effects critically rely on the stationarity

of the environment. Accordingly, both Cooper et al. (2017, p. 207) and Fehr and

Tyran (2008, p. 387) conjecture that in case of repeated nonidentical shocks, the

impact of nonrational expectations would persist. However, neither paper provides

an empirical test of this conjecture. Our work aims at filling this gap.

¹ See Hommes (2011) and Arifovic and Duffy (2018) for an overview of the Learning-to-Forecast literature.

² Hanaki et al. (2019) are the first to term this phenomenon the strategic environment effect.

³ See Milgrom and Roberts (1990) for more examples.

We experimentally test the conjecture of a relative persistence of nonidentical

shocks in a guessing game with strategic complementarity (based on Cooper et al.,

2017). We introduce large periodic negative shocks to the NE and compare adjust-

ment dynamics between two experimental conditions: one where shocks are identical

and another where they are not. During the first and last post-shock phases, the NE

are the same in both conditions. Through this design, we are able to measure the

treatment effect of experiencing nonidentical shocks (i) on the aggregate adjustment

speed, and (ii) on the way individuals form expectations. Related to (i), we find

that post-shock adjustment accelerates due to repetition. Compared to the initial

post-shock adjustment, it takes fewer periods for the adjustment to occur after fur-

ther shocks. However, we fail to identify a significant effect of nonidentical shocks

on the pace of adjustment. Related to (ii), our results show that experience may

not be enough to deplete naïvety, at least not within four repetitions of the game.

Our contribution to the literature is twofold. Firstly, we document the robustness

of the findings of Cooper et al. (2017) in the context of identical shocks, and fur-

ther extend their findings to a more complex environment with nonidentical shocks.

Based on this experimental variation of negative shocks, we report that the inertia

in adjustment is a robust feature of markets governed by strategic complementarity

and that it does not depend on the stationarity of periodic shocks. Secondly, the

data on expectations across subjects and over time allow us to study the individual

underpinnings of the observed aggregate dynamics. To avoid arbitrariness in model

selection, we consider a wide range of backward-looking expectation rules and take

their predictions to the experimental data. This novel horse race exercise reveals

that augmenting expectation rules with a similarity-based learning approach improves
their predictive power under identical shocks. Notably, the best performing model
is a simple nonparametric reformulation of naïve expectations with similarity-based
learning (first proposed by Cooper et al., 2017). We discuss its behavioral foundations
and relate it to the previous literature.

The remainder of this chapter is organized as follows. Section 1.2 outlines our

research hypotheses. Section 1.3 presents our methodology: the guessing game and

the way we implement it in the lab. Section 1.4 summarizes the main results which

are then discussed in Section 1.5. Lastly, Section 1.6 concludes by summarizing the

main findings, as well as the implications and limitations of the study.

1.2 Related Literature and Hypotheses

Table 1.1: Convergence in the previous experiments

Study                        Type of environment        Shock size (in %)   Convergence period
Fehr and Tyran (2001)¹       Pricing decision           -67% & +100%        13 & 4
Fehr and Tyran (2008)²       Pricing decision           -50%                9
Davis and Korenok (2011)³    Monopolistic competition   +100%               21
Petersen and Winn (2014)¹    Pricing decision           -67% & +92%         8 & 4
Cooper et al. (2017)⁴        Guessing game              -77%                8

¹ Shock as the change in average equilibrium price in the nominal treatment with human opponents.
² Shock as the change in average equilibrium price in nominal treatment.
³ Shock as the change in monopolistically competitive prices in the BASE/PUB treatment. Prices remain significantly different than competitive level in the first reported 20 post-shock periods.
⁴ Shock as the change in the NE guess in first round.

Table 1.1 summarizes the evidence from experimental studies that investigate the

dynamics of convergence following large shocks when actions are strategic complements.
Here, the "Convergence period" reported in column 4 indicates the number
of periods for the general activity level (price, guess, etc.) to become statistically
indistinguishable (at the 5% level) from the post-shock theoretical equilibrium value.4

The general pattern in those data is that convergence takes time when actions are

strategic complements. In particular, adjustment to the NE tends to be slow after

the initial shock, even though acceleration may still occur when markets are repeated
(Cooper et al., 2017).5 We expect to observe the same pattern in a slightly

modified environment.6

Hypothesis 1: When shocks are identical, adjustment to the NE is slow and

gradual after the initial shock, but accelerates over repetition of the same market.

Albeit robust in stationary environments, experience effects are argued to be

sensitive to the complexity of the environment. For instance, in the AME and LtFE

studies by Kopányi-Peuker and Weber (2021) and Hussam et al. (2008), bubbles
do not disappear despite repetition.7 Hussam et al. (2008) also report that bubbles
reignite even with twice-experienced subjects following drastic changes in the

environment (e.g., the amount of liquidity in the market). Moreover, Cooper et al.

4We retrieved the information about the convergence period directly from each article. Thus,

the table does not account for the differences in experimental designs and statistical methods used

across these studies.

5This is also a standard finding across AMEs. For instance, Smith et al. (1988), Dufwen-

berg et al. (2005) and Haruvy et al. (2007) show that repeating market interactions three times

eliminates bubbles.

6The scope of these modifications along with their rationale are described in Section 1.3.

7 According to Kopányi-Peuker and Weber (2021), a possible explanation is that this occurs

because interactions in their experiment have an indefinite horizon.

(2017) and Fehr and Tyran (2008) conjecture that nonidentical shocks may thwart

experience effects, which constitutes one rationale for our second hypothesis.8 The

other rationale builds on Cooper et al. (2017)’s theoretical result of adjustment ac-

celeration under identical shocks. Extending their reasoning to nonidentical shocks,

we formulate the following hypothesis.

Hypothesis 2: The adjustment accelerates with market repetition at a slower

pace in the presence of nonidentical shocks than with identical shocks.

We now turn to the possible explanations of adjustment dynamics. Several

studies provide a descriptive explanation of the observed inertia based on nonrational

expectations. Yet, they strongly diverge in terms of the best fitting model. For

instance, Fehr and Tyran (2008) report that their data are best organized by a model

in which all agents exhibit naïve expectations.9 Cooper et al. (2017), in turn, obtain
the best fit with heterogeneous groups: one rational player and three nonrational
players whose expectations follow a version of the naïve expectations rule adapted to a

repeated shocks design. Other studies point to trend-following expectations (Haruvy

et al. 2007) or even RE (Marquardt et al. 2019) as best describing their experimental

evidence.

We note, however, that the aforementioned studies either do not compare the fit

of their model with other expectation rules, or only consider a relatively narrow set

of competing rules.10 More systematic comparisons exist in the LtFE literature.

The design of LtFEs is particularly well-suited for investigating expectations since

the experimental task is to forecast the prices one-period-ahead. Trend-following

has been repeatedly shown to outperform all the others under homogeneous expectations
(Bao et al. 2012; Anufriev et al. 2013; Heemeijer et al. 2009). Pfajfar
and Žakelj (2014) estimate the share of RE and simple expectations in their New
Keynesian LtFEs. They arrive at the conclusion that RE (simple rules) cannot

be rejected for 30-45% (35-50%) of subjects. This finding has been subsequently

confirmed by Marquardt et al. (2019). In the context of the evolutionary heuristic

switching model (HSM, Anufriev and Hommes 2012), Cornea-Madeira et al. (2019)

8Bao et al. (2012) study large nonidentical shocks in LtFEs by introducing two large shocks

to the rational expectations (RE) equilibrium. However, their design does not propose a way to

test the effect of nonidentical shocks with respect to identical ones. To the best of our knowledge,

our study is the first to directly test the impact of the heterogeneity of shocks in a controlled

environment.

9In their model of fully adaptive expectations, players expect the outcome of the last period to

reoccur.

10For instance, Marquardt et al. (2019) only consider three models: myopic, trend and RE.

Moreover, the parameters of their trend model resemble what other studies denote as strong trend-

following rule (e.g., Anufriev and Hommes 2012). In Section 1.5, we discuss why strong trend-

following may not be a suitable rule for environments like AMEs.

estimate the weights of naïve and fundamentalist rules in inflation expectations using
U.S. inflation data spanning from 1968:Q4 to 2015:Q2. Despite substantial
time variation, they find that 65% of individuals form naïve expectations, the share

of which increases in reaction to large inflationary shocks, thus creating self-fulfilling

inflation persistence.

Based on this body of empirical literature, we conclude the following. First, the

best fitting expectation models vary across different experimental settings. Second,

for the experimental settings closest to ours (i.e., guessing games and LtFEs) simple

backward-looking expectation models outperform RE. This observation leads us to

our third hypothesis.

Hypothesis 3: Backward-looking expectation rules in the form of heuristics fit

the data better than RE.

Finally, we provide the first out-of-sample test of the relative performance of

the expectation rule proposed by Cooper et al. (2017). This rule seems promising in

the context of repeated shocks since it echoes the similarity-based learning approach

(Gilboa and Schmeidler, 1995; Plonsky et al., 2015). Under this rule, a player expects
the outcome of the last period to reoccur in stable phases. After observing a shock,
the player reviews all the past periods and expects the outcome of the period following
the previous occurrence of the same shock. We denote this rule as similarity-based
naïve expectations (SBNE). In the same vein, we extend the adaptive and
trend-following models and denote them respectively as similarity-based adaptive
expectations (SBAE) and similarity-based trend-following expectations (SBTF). We
conjecture that this class of rules best explains behavior under repeated identical
shocks.

Hypothesis 4: Under identical shocks, the rules that are augmented with the

similarity-based learning approach provide the best fit to the experimental data.

1.3 Method

1.3.1 Guessing Game under Strategic Complementarity

To investigate whether repeating identical shocks improves the speed of adjustment,

and whether nonidentical shocks slow down this process, we refer to a repeated

guessing game under strategic complementarity that is adapted from Nagel (1995).

Our experimental game also resembles those used in LtFEs with positive feedback.

The fundamental difference between these two designs is that while guessing games

provide full information on the game structure (including the parameters), LtFEs

provide only qualitative information about the market structure. Nevertheless, Son-

nemans and Tuinstra (2010) show that convergence dynamics are similar when the

feedback parameters are equal.11

In each period $t \in [1, T]$, a group of $N$ players simultaneously choose a number
(rounded up to two decimals) from the closed set $p_{i,t} \in [0, 100]$, where $i = 1, \dots, N$.
Each player $i$ has a target number $y_{i,t}$ that is calculated as

$$y_{i,t} = b\,\bar{p}_{-i,t} + a + \xi_t, \tag{1.1}$$

where $\bar{p}_{-i,t}$ is the average number chosen by the remaining players12 at period $t$, $a$
and $b$ are positive constant numbers with $b \in (0, 1)$, and $\xi_t$ is a deterministic large
shock which takes the values

$$\xi_t = \begin{cases} 0, & \text{if } t \le T/2 \\ \bar{\xi}, & \text{if } t > T/2. \end{cases} \tag{1.2}$$

The constant term $b$ generates strategic complementarity among the players'
actions. The player with the smallest guessing error $|y_{i,t} - p_{i,t}|$ wins the fixed stage
game payoff $F$. In case of a tie, the payoff is equally split among the winners.
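The stage game described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `stage_payoffs`, its argument names, and the tie tolerance `tol` are hypothetical and are not taken from the experimental software.

```python
def stage_payoffs(guesses, a, b, xi, prize, tol=1e-9):
    """Return the stage-game payoff of each player (hypothetical helper).

    Each target is y_i = b * (average guess of the others) + a + xi,
    as in equation (1.1). The prize goes to the player with the smallest
    guessing error |y_i - p_i| and is split equally in case of a tie.
    """
    n = len(guesses)
    total = sum(guesses)
    errors = []
    for p in guesses:
        mean_others = (total - p) / (n - 1)   # own guess is excluded
        target = b * mean_others + a + xi
        errors.append(abs(target - p))
    best = min(errors)
    # `tol` is an assumed tolerance so that floating-point noise does not break ties
    winners = [i for i, e in enumerate(errors) if e - best <= tol]
    share = prize / len(winners)
    return [share if i in winners else 0.0 for i in range(n)]
```

For example, `stage_payoffs([60, 60, 60, 60, 50], a=15, b=0.75, xi=0, prize=4.40)` splits the prize among the four players who guessed 60, since their guessing errors are identical and smaller than that of the fifth player.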

This game has a unique Nash equilibrium which corresponds to an interior solution:
$p^{NE}_t = \frac{a + \xi_t}{1 - b}$.13 Here, $p^{NE}_t$ is invariant for the first $T/2$ periods, called the
pre-shock periods. The negative shock $\bar{\xi}$ shifts the equilibrium downwards at period
$T/2 + 1$.14 These remaining periods with a new equilibrium are the post-shock periods.
In addition, shocks are repeated: a sequence of $T$ periods (pre- and post-shock)
is repeated over $R$ rounds.
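As a quick consistency check on the equilibrium formula, one can look for a symmetric fixed point of equation (1.1). This is a sketch only and not the proof itself, which relies on the iterated elimination of dominated strategies (see footnote 13). The numerical values in the comments use the calibration $a = 15$, $b = 0.75$ and a shock of $-9$ reported in Section 1.3.2.

```latex
% Symmetric fixed point of equation (1.1): if every player guesses p,
% the average of the others is also p and the target must equal the guess.
\begin{align*}
p &= b\,p + a + \xi_t \\
(1 - b)\,p &= a + \xi_t \\
p^{NE}_t &= \frac{a + \xi_t}{1 - b}
\end{align*}
% Pre-shock:  (15 + 0) / (1 - 0.75) = 60
% Post-shock: (15 - 9) / (1 - 0.75) = 24
```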

1.3.2 Experimental Design

Our experimental manipulation consists in varying the value of $\bar{\xi}_r$ over rounds.
For identical shocks (baseline), $\bar{\xi}_r = \bar{\xi}$ for all $r \in [1, R]$. For nonidentical shocks

11See Sonnemans and Tuinstra (2010) for a detailed comparison of guessing games and LtFEs.

12Sutan and Willinger (2009) report that the inclusion of own guesses causes a significant amount

of confusion among subjects. Therefore, we opted for excluding player’s own guess from the target

formula.

13The proof is based on the iterated elimination of dominated strategies. See Nagel (1995)

for details. This equilibrium is also a RE equilibrium. Bray (1983) shows that when b < 1, a

misspecified expectation rule – ordinary least-squares learning – almost surely converges to the

RE equilibrium. However, she also emphasizes that this does not imply unbiased expectations. As

she notes (Bray 1982, pp. 330), “[r]ational expectations are, if anything, a long run rather than a

short run phenomenon.”

14Of course, future studies could consider alternative implementations, such as positive shocks.

(treatment), the size of the shock varies across rounds. Importantly, the equilibrium

solution outlined above applies to both cases, so that players are always incentivized

to play the NE.

The calibration of the experimental game is summarized in Table 1.2. A group

of 5 participants play the guessing game for 4 rounds, and each round is composed

of 16 periods. This yields a total of 64 guessing decisions per player. In period 9 of

each round, a negative shock $\bar{\xi}$ shifts parameter $a$ from 15 to a value that depends
on the experimental condition.

In baseline, shocks are identical and the shock component equals −9 in every
round. In treatment, shocks are not identical and the shock component is characterized
by the sequence (−9, −6, −12, −9) in rounds (1, 2, 3, 4). Thus, the post-shock

NE is the same in the first and last rounds in both conditions. This allows us to cap-

ture the effect of experiencing nonidentical shocks by comparing adjustment speeds

in round 4.

Table 1.2: Experimental design parameters

General parameters
Number of periods per round       $T = 16$
Number of rounds per session      $R = 4$
Group size                        $N = 5$
Stage game prize                  $F = 4.40$ euros
Slope of target formula           $b = 0.75$
Pre-shock value of constant       $a = 15$
Pre-shock equilibrium             $p^{NE}_{pre} = 60$

Post-shock equilibrium ($p^{NE}_{post}$)   Baseline   Treatment
Round 1                                    24         24
Round 2                                    24         36
Round 3                                    24         12
Round 4                                    24         24
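The equilibrium values in Table 1.2 follow directly from the NE formula and the shock sequences stated in the text. The short Python sketch below recomputes them; the names `nash_equilibrium`, `post_shock_equilibria` and `SHOCKS` are illustrative choices, not part of the original materials.

```python
A = 15.0   # pre-shock value of the constant a (Table 1.2)
B = 0.75   # slope of the target formula b (Table 1.2)

# Shock component by round, as stated in the text
SHOCKS = {
    "baseline": [-9, -9, -9, -9],
    "treatment": [-9, -6, -12, -9],
}

def nash_equilibrium(a, b, xi=0.0):
    """NE guess (a + xi) / (1 - b), valid for 0 < b < 1."""
    return (a + xi) / (1.0 - b)

def post_shock_equilibria(condition):
    """Post-shock NE in rounds 1 to 4 for the given condition."""
    return [nash_equilibrium(A, B, xi) for xi in SHOCKS[condition]]
```

Calling `post_shock_equilibria("treatment")` gives the sequence 24, 36, 12, 24, which matches the treatment column of the table, while `nash_equilibrium(A, B)` gives the pre-shock value of 60.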

The design of this experiment closely follows Cooper et al. (2017), with some

noteworthy modifications. First, in their study, groups are composed of 4 players.

Following Hanaki et al. (2019), we increase the number of players to 5 per group.15

Second, in Cooper et al. (2017) there are three rounds of 20 periods. We decrease the

length of each round to 16 periods to be able to add one additional round without

15They show that the effect of strategic environment is statistically significant for groups of five

or more agents.

extending the duration of the experiment excessively. Third, since post-shock phases

are shorter in our study, the post-shock equilibrium in baseline groups is set at a

higher level. Finally, Cooper et al. (2017) elicit expectations of subjects in addition

to their guesses while we elicit expectations jointly with the guesses. An advantage

of our method compared to the previous study of Cooper et al. (2017) is that it

ensures consistency between guess and expectations and thus reduces the scope of

decision-making errors.16

We implement a fixed matching protocol within each round, and a random re-

matching protocol between rounds. To reduce the scope of session effects due to

random rematching, in each session we divide each group of twenty participants

into two equal and permanent rematching clusters. Random rematching only oc-

curs within a rematching cluster which makes observations potentially correlated

within a cluster, but strictly independent between clusters.
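The matching protocol can be illustrated with a minimal Python sketch. It assumes that the twenty subjects of a session are split into two clusters by their position in a list, which is a simplification: how subjects were actually assigned to clusters is not described here. The function names are hypothetical.

```python
import random

def rematch(cluster, group_size, rng):
    """Randomly split one rematching cluster into groups of equal size."""
    members = list(cluster)
    if len(members) % group_size != 0:
        raise ValueError("cluster size must be a multiple of group_size")
    rng.shuffle(members)
    return [members[k:k + group_size]
            for k in range(0, len(members), group_size)]

def session_matching(subjects, n_rounds=4, group_size=5, seed=None):
    """Return, for each round, the list of groups in one session.

    Groups are fixed within a round and redrawn between rounds, but
    only inside each of the two permanent rematching clusters.
    """
    rng = random.Random(seed)
    half = len(subjects) // 2
    clusters = [subjects[:half], subjects[half:]]   # assumed assignment rule
    schedule = []
    for _ in range(n_rounds):
        groups = []
        for cluster in clusters:
            groups.extend(rematch(cluster, group_size, rng))
        schedule.append(groups)
    return schedule
```

Because no group ever mixes subjects from different clusters, any dependence created by rematching stays inside a cluster, which is why the cluster is the natural unit for statistical inference.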

1.3.3 Procedures

Experimental sessions were conducted at the GATE-Lab in Lyon by using z-Tree

(Fischbacher, 2007).17 A total of 120 participants were recruited for 6 sessions in October

2019. Each session has 20 subjects recruited through a between-subjects design

and divided into two separate rematching clusters of ten players.18 This yields six

independent clusters of observations per condition, and twelve clusters in total.

Subjects are provided with the instructions of the game in paper form that are

read aloud by the experimenter.19 These instructions specify all the rules of the

game except the values of shocks. Participants are informed that this value will be

displayed on the decision screen, may be subject to variation during the experiment,

and that in a given period it remains the same for everyone. Once the instructions are

read, subjects are asked to answer nine comprehension questions displayed on their

screens. They are also informed about the correct answers with brief explanations.

In the main part of the experiment, each participant makes a series of 64 guessing

16 The elicitation method is explained in Section 1.3. Ex ante, the design of our baseline condition

provides more suitable circumstances to observe rapid adjustment than the one in Cooper et al.

(2017), since we have an extra round and a belief elicitation mechanism that emphasizes best

replying to one’s expectations. However, the data suggest that the patterns of adjustment are

similar in both experiments.

17The experimental procedures have been approved by the GATE-Lab Review Board.

18See Appendix A.1 for the experimental materials and Appendix A.2 for a description of the

subject pool.

19Before reading the instructions and playing the game, subjects solve a series of questionnaires

which contain cognitive reflection test (CRT, Frederick 2005), reading the mind in the eyes test

(RMET, Baron-Cohen et al. 2001), a short-term memory test (STM) adapted from Wechsler digit

span test and seven questions designed to measure subjects’ propensity to reason in a heuristic

manner. We use these scores to test the sensitivity of our findings with respect to several individual

characteristics. This analysis is reported in Appendix A.3. The content and the measurement

method of each test are provided in Appendix A.1.

decisions. Each time, subjects first see a decision screen (see Figure 1.1) where they

enter their guesses. For a given guess, the computer automatically provides the

corresponding expectation of the average guess of others in their group. After seeing

these expectations, subjects can either revise or confirm their guesses.20 After each

decision screen, subjects pass to the feedback screen where they receive feedback on

the realized target number and their own payoffs.
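One way to read the joint elicitation is as an inversion of the target formula: given a guess, the implied expectation is the average guess of the others for which that guess would be a best reply. The Python sketch below makes this explicit. It is an assumption about the computation, based on equation (1.1), and not a description of the actual z-Tree code; the function names are hypothetical.

```python
def best_reply(expectation, a, b, xi):
    """Guess that hits the target if the others average `expectation`."""
    return a + xi + b * expectation

def implied_expectation(guess, a, b, xi):
    """Expectation about the others' average for which `guess` is a best reply.

    Obtained by solving guess = a + xi + b * expectation for the expectation.
    """
    return (guess - a - xi) / b
```

With the pre-shock calibration ($a = 15$, $b = 0.75$, no shock), a guess of 60 implies an expectation of 60. Note that this simple inversion can return values outside $[0, 100]$ for very low or very high guesses; the sketch does not handle that case.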

Figure 1.1: Decision Screen

Notes: Example of a decision screen used in the experiment.

To smoothen the game, in each period the decision screen has a nonbinding

timer set to 60 seconds (except for the initial period of the game which has 120

seconds).21 Both screens also display the ongoing period and round, and provide a

summary of the previous outcomes: a figure representing the time series of previous

guesses and realized target numbers as well as a historical period-by-period table

containing own expectations and the actual average guesses of others, as well as own

stage game payoffs. To minimize any potential wealth or end-game effects, the final

payoff corresponds to the payoffs accumulated in all periods of a randomly chosen

round of the game. At the end, subjects also reply to a demographic questionnaire.

20This method provides a consistent way for joint elicitation of guesses and expectations. Some

of the previous studies document systematic inconsistencies between expectations and decisions

that are elicited separately (Costa-Gomes and Weizs¨acker,2008). Moreover, LtFEs and AMEs

studies show that having both forecasting and optimizing tasks may be detrimental to learning

and cause mispricing (Bao et al.,2013;Hanaki et al.,2018).

21Kocher and Sutter (2006) show that average decision time in the first period of guessing games

is around 50 seconds and decreases gradually until the range 10-15 seconds after 20 periods. Thus,

this feature of our design should not put participants under excessive time pressure.

An experimental session took 150 minutes on average. Subjects were paid 7

euros for their participation and 14.08 euros on average for the experimental game.

1.4 Results

First, we analyze the group-level deviations from the NE and measure the adjust-

ment speed across rounds and experimental conditions. We rely on the statistical

framework previously adopted by Cooper et al. (2017). Second, we investigate the

within-period variation of individual guesses across experimental conditions. Lastly,

we compare the descriptive power of various expectation models by their one-period-

ahead forecast accuracy, as measured through the root-mean-squared-error (RMSE).

We also evaluate the changes in performance of models across rounds by comput-

ing their impact factors in an evolutionary learning model – the heuristic switching

model (HSM). This allows us to investigate whether the observed acceleration is due

to an increase in the share of subjects forming RE, or rather due to the adaptive

dynamics of simple expectation rules.

1.4.1 Adjustment Dynamics

Figure 1.2: Guesses by treatments and rounds

Notes: Dotted lines represent NE. In each round, a dot (triangle) corresponds to the median value of the average

group guesses in the baseline (treatment) condition.

We first turn to the aggregate outcomes and look at the evolution of the average

group guesses over time. Figure 1.2 summarizes the observed median values of

average group guesses (with 12 groups per experimental condition) across rounds

and periods. As expected, round 1 – in which the environment remains strictly

identical in both experimental conditions – generates the same patterns in the data

in baseline and treatment. In particular, in the pre-shock phase of round 1 this

median never fully converges to the NE level.22 The second salient observation is

that the convergence to the NE following shocks systematically exhibits a convex

pattern, but happens at varying speed.23

For a formal statistical comparison of the patterns of convergence in both ex-

perimental conditions, we use median quantile regression to estimate the following

model:24

$$\bar{p}_{g,t} - p^{NE}_t = a_0 + a_t\, I[\text{Period} = t] + \epsilon_{g,t}, \tag{1.3}$$

where the dependent variable is the difference between the average group guess
($\bar{p}_{g,t}$) and the NE guess ($p^{NE}$) in a given period $t$, while the independent variables
are 63 period indicators ($I[\text{Period} = t]$ equals 1 for period $t$, and 0 otherwise). The
coefficient of the final period is dropped to avoid linear dependencies.25 We run this

regression separately for baseline and treatment. Table 1.3 reports a subset of the

estimated coefficients corresponding to the post-shock periods in all four rounds of

the baseline and in rounds 1 and 4 of the treatment (in which the NE is the same

as in the baseline).
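Because the regressors in equation (1.3) are an intercept plus a full set of period indicators, the point estimates of a median regression can be read off period-wise medians: $a_0$ is the median deviation in the omitted final period, and $a_t$ is the median deviation in period $t$ minus $a_0$ (up to the usual non-uniqueness of the median when the number of observations is even). The Python sketch below illustrates this with the standard library only. The data layout and the function name are assumptions, and the sketch does not reproduce the clustered bootstrap standard errors.

```python
from collections import defaultdict
from statistics import median

def period_coefficients(observations, final_period=64):
    """Point estimates of a0 and a_t from (period, deviation) pairs.

    `observations` holds one pair per group and period, where the
    deviation is the average group guess minus the NE guess.
    """
    by_period = defaultdict(list)
    for period, deviation in observations:
        by_period[period].append(deviation)
    a0 = median(by_period[final_period])          # omitted period = intercept
    coefs = {t: median(values) - a0
             for t, values in by_period.items() if t != final_period}
    return a0, coefs
```

In practice one would rely on a quantile regression routine from a statistics package; the point of the sketch is only to show what the coefficients measure.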

This model allows us to analyze the patterns of adjustment to the NE in two

steps. First, the intercept $a_0$ provides an empirical benchmark for convergence,
i.e., the ability to adjust guesses to the NE play as measured by the degree of
convergence to the NE that can be reached after 63 periods and with a full scope of
experience accumulation and learning (that we further investigate in Section 1.4.2)
in the experimental game. Testing whether this empirical benchmark differs from the
formal prediction based on the NE ($H_0$: $\bar{p}_{64} - p^{NE}_{64} = 0$) boils down to testing for the
statistical significance of $a_0$. For both baseline and treatment, the estimated values
of $a_0$ are similar and very close to zero (-0.38 and -0.42, respectively); in neither case
do we reject the null ($p = 0.133$ and $p = 0.285$, respectively). Second, we build
on this first step and define the speed of adjustment as the earliest period in which
the outcome attains (in statistical terms) our empirical benchmark of convergence.
This period (denoted $t_c$) indicates the point of reaching the adjustment benchmark:
for each period $t \ge t_c$ of a given phase we fail to reject $H_0$: $a_t = 0$.26

22We note that Cooper et al. (2017) observe the same pattern of adjustment under identical

shocks.

23The evolution of guesses in the pre-shock periods of rounds 3 and 4 also suggests that the path

of shocks experienced in the past does not affect per se the adjustment to the NE. Despite the

different sizes of shocks in round 2 between baseline and treatment, the adjustment to the NE in

the pre-shock phase of the following round is identical in both conditions. The same holds for the

adjustment to the NE in the pre-shock phase in round 4.

24This approach closely follows Cooper et al. (2017) and minimizes the role of potential outliers.

25The estimated standard errors are based on clustering with bootstrap resampling to take into

account the possible correlation between guesses within a rematching cluster. The number of

bootstrap samples follows Davidson and MacKinnon (2000).

26This definition echoes the definition of convergence proposed by Hyndman et al. (2012).

Table 1.3: Post-shock deviations from NE

Post-shock   Baseline                                      Treatment
periods      Round 1    Round 2    Round 3    Round 4     Round 1    Round 4
1            22.76***   22.08***   19.31***   18.38***    24.89***   20.86***
2            17.88***   16.13***   13.19***   9.46***     17.95***   12.32***
3            13.93***   10.13***   8.38***    4.57***     13.15***   5.70***
4            7.39***    6.38***    4.66***    1.51***     9.32***    1.64*
5            5.22**     3.02***    2.38***    -0.22       6.67**     0.17
6            1.64       1.14       0.99       -0.54       3.50       -0.82
7            3.40       1.26       0.37       -0.44*      3.98       -0.80*
8            2.65       0.74       0.15       -0.38       3.18       -0.42

Notes: Coefficients $a_t$ and intercept term $a_0$ (the period-8 entries of round 4) from a median quantile regression model specified in
(1.3). Standard errors are clustered at rematching cluster level (6 clusters per condition) and bootstrapped with
1999 replications. *, **, *** indicate statistical significance at the 10, 5, 1% level, respectively.

In accordance with Hypothesis 1, we observe gradual adjustment in the first

round of both baseline and treatment conditions. Adjustment accelerates across

rounds, but this acceleration is not necessarily monotonic. The adjustment periods

$t_c$ are respectively {6, 6, 6, 5} in baseline and {6, 4} in the first and the last rounds of

treatment when a 5% significance level is considered. Initial deviations remain high

in each post-shock phase, indicating that experience does not prevent deviations

despite repetition over four rounds.

Result 1a: In both experimental conditions, after the initial shock guesses grad-

ually adjust to the post-shock NE in a convex manner.

Result 1b: In both experimental conditions, adjustment occurs earlier in re-

sponse to the last shock (round 4) compared to the initial shock (round 1).
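The adjustment periods reported above can be recovered mechanically from the significance stars in Table 1.3. The Python sketch below encodes the stars and applies the definition of $t_c$ at the 5% level. The encoding and the function name `adjustment_period` are illustrative assumptions; the period-8 entry of round 4 is the intercept and is treated like any other entry.

```python
# Significance stars from Table 1.3 (0 = none, 1 = 10%, 2 = 5%, 3 = 1%),
# listed for post-shock periods 1 to 8 of each reported round.
STARS = {
    ("baseline", 1): [3, 3, 3, 3, 2, 0, 0, 0],
    ("baseline", 2): [3, 3, 3, 3, 3, 0, 0, 0],
    ("baseline", 3): [3, 3, 3, 3, 3, 0, 0, 0],
    ("baseline", 4): [3, 3, 3, 3, 0, 0, 1, 0],
    ("treatment", 1): [3, 3, 3, 3, 2, 0, 0, 0],
    ("treatment", 4): [3, 3, 3, 1, 0, 0, 1, 0],
}

def adjustment_period(stars, min_stars=2):
    """Earliest period t_c such that no estimate from t_c onwards is significant.

    `min_stars=2` corresponds to the 5% level. If even the last period is
    significant, the function returns len(stars) + 1 (no convergence).
    """
    t_c = len(stars) + 1
    # walk backwards from the last period until a significant estimate appears
    for period in range(len(stars), 0, -1):
        if stars[period - 1] >= min_stars:
            break
        t_c = period
    return t_c
```

Applying `adjustment_period` to the four baseline rounds gives 6, 6, 6 and 5, and to rounds 1 and 4 of the treatment gives 6 and 4, in line with the values stated in the text.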

Contrary to Hypothesis 2, these results indicate that the number of

periods required for adjustment in round 4 is similar in both conditions. We also

propose another way to test (and eventually reject) Hypothesis 2. First, we in-

vestigate the within-period variation of guesses between the two conditions. We

estimate the following median quantile regression model separately for each of the

eight post-shock periods of round 4:

$$p_i = b_0 + b_1\, \mathbf{1}[\text{Treatment}] + \epsilon_i, \tag{1.4}$$

where the independent variable $\mathbf{1}[\text{Treatment}] = 1$ for observations coming from the
treatment sessions (and 0 otherwise), and the dependent variable is the individual
guess ($N = 120$ per regression). Like before, we employ bootstrapped standard
errors clustered at rematching cluster level (999 replications). The coefficients $b_1$

(reported in Table 1.4) remain insignificant in each of the eight models, suggesting

that the evolution of guesses over time (and thus their gradual adjustment to the

NE) in the final periods is statistically indistinguishable in both conditions. Figure

1.3 provides a visual representation of this pattern.
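To give an idea of how clustered bootstrap standard errors for $b_1$ can be obtained, the sketch below resamples whole rematching clusters with replacement and recomputes the difference in medians each time. With a single treatment indicator, the median regression estimate of $b_1$ is the median guess in treatment minus the median guess in baseline. The resampling scheme (clusters drawn separately within each condition) and all names are assumptions; the exact procedure used for Table 1.4 may differ.

```python
import random
from statistics import median, stdev

def median_gap(clusters_base, clusters_treat):
    """Median guess in treatment minus median guess in baseline."""
    base = [g for cluster in clusters_base for g in cluster]
    treat = [g for cluster in clusters_treat for g in cluster]
    return median(treat) - median(base)

def cluster_bootstrap_se(clusters_base, clusters_treat, reps=999, seed=0):
    """Standard deviation of the median gap across cluster-level resamples."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        # resample entire clusters so that within-cluster correlation is preserved
        b = rng.choices(clusters_base, k=len(clusters_base))
        t = rng.choices(clusters_treat, k=len(clusters_treat))
        draws.append(median_gap(b, t))
    return stdev(draws)
```

Each argument is a list of clusters, and each cluster is a list of individual guesses for the period under study.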

Table 1.4: The effect of heterogeneity in shocks on guesses of last post-shock phase

Periods   57         58         59         60         61         62         63         64
$b_0$     39.70***   32.89***   28.40***   25.10***   23.49***   23.07***   23.21***   23.60***
          (1.54)     (1.08)     (0.50)     (0.47)     (0.44)     (0.31)     (0.11)     (0.11)
$b_1$     4.60*      3.21       1.35       -0.10      0.15       -0.67      -0.41      -0.35
          (2.70)     (2.02)     (1.19)     (0.94)     (1.09)     (0.79)     (0.88)     (0.71)

Notes: Coefficients from median quantile regression models specified in (1.4). Each coefficient comes from one regression ($N = 120$ per
regression). Standard errors are reported in parentheses below the coefficients. Standard errors are clustered at the rematching cluster level (6
clusters per condition) and bootstrapped with 999 replications. *, **, *** indicate statistical significance at the 10, 5, 1% level, respectively.

Figure 1.3: Median individual guesses

Notes: Median individual guess by experimental conditions for periods 55-64. Dots represent the NE. Circles

(triangles) represent guesses in baseline (treatment) condition. Whiskers denote standard deviations. Unit of

observation is the individual guesses.

Result 2: As compared to identical shocks, nonidentical shocks do not cause a

significant slowdown in adjustment.

1.4.2 Expectation Formation

In this section, we exploit the data on expectations retrieved from (and consistent

with) the guesses, as previously explained in Section 1.3.3. We consider a set of

expectation rules to provide a descriptive explanation for the observed aggregate

results.27 These rules, summarized in Table 1.5, are mainly derived from two classes

of learning models: adaptive (rules 1 to 6) and extrapolative (rules 7 to 13). Under

the data generation process described in equation 1.1, the adaptive learning can be

represented in a recursive form with the following formula:

$$p^e_{i,t} = p^e_{i,t-1} + w\,(p_{-i,t-1} - p^e_{i,t-1}), \tag{1.5}$$

where a player expects the weighted average of the most recent outcome and his/her
own previous expectation. In the extrapolative expectations, a player tracks the
most recent change in the realized outcome in the following manner:

$$p^e_{i,t} = p_{-i,t-1} + \gamma\,(p_{-i,t-1} - p_{-i,t-2}). \tag{1.6}$$

The coefficients $w$ and $\gamma$ are learning parameters. We predetermine these parameters
based on their computational ease as an attempt to imitate different kinds of
boundedly rational reasoning. We also include two models (rules 6 and 13) where
parameters $w$ and $\gamma$ are estimated from the individual expectations data with fixed
effects regressions and one equilibrium model (rule 14) where a player's expectation
corresponds to the NE.28
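The two families of rules in equations (1.5) and (1.6) translate directly into code. The following Python sketch is illustrative only; the function and argument names are not taken from the original analysis.

```python
def adaptive(prev_outcome, prev_expectation, w):
    """Equation (1.5): move the previous expectation toward the last outcome.

    w = 1 gives naive expectations (rule 1) and w = 0 gives obstinacy (rule 2).
    """
    return prev_expectation + w * (prev_outcome - prev_expectation)

def extrapolative(prev_outcome, prev_prev_outcome, gamma):
    """Equation (1.6): last outcome plus gamma times its most recent change.

    gamma > 0 gives trend-following rules, gamma < 0 contrarian rules.
    """
    return prev_outcome + gamma * (prev_outcome - prev_prev_outcome)
```

For instance, if the others' average fell from 60 to 50 and the player had expected 60, the adaptive rule with $w = 0.75$ (rule 3) predicts 52.5, while the weak trend-following rule with $\gamma = 0.25$ (rule 9) predicts 47.5.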

The next three rules (15 to 17) reformulate rules 1, 3 and 9 (respectively) with

similarity-based learning. Under rule 15, for instance, a player expects the outcome
of the last period to reoccur, as in the case of naïve expectations, if the parameters in the
target formula did not change (i.e., if there was no shock). Once the target formula
has changed, the player reviews all the past periods and the new expectation now
coincides with the outcome of the most recent period involving an analogous change,
$t - m$, such that $\Delta\xi_t = \Delta\xi_{t-m}$, where $m$ refers to the distance between the current period
and its most recent analogue from the past. If there is no such analogue, the player
simply expects the outcome of the previous period. Rules 16 and 17 apply the same
logic to adaptive (v1.) and weak trend-following rules.29
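The verbal description of rule 15 (SBNE) can be sketched as follows in Python. This is one possible reading of the rule under stated assumptions: periods are indexed from 0, the shock term is known for the current period, and changes in the shock are compared by exact equality. The function name `sbne_forecast` is hypothetical.

```python
def sbne_forecast(outcomes, shocks, t):
    """Similarity-based naive expectation for period t (0-based index).

    outcomes[s]: realized average guess of the others in period s (s < t).
    shocks[s]:   value of the shock term xi in period s (known up to t).
    """
    if t < 1:
        raise ValueError("need at least one past period")
    change = shocks[t] - shocks[t - 1]
    if change == 0:
        return outcomes[t - 1]            # stable phase: naive expectation
    # search backwards for the most recent period with the same change in xi
    for s in range(t - 1, 0, -1):
        if shocks[s] - shocks[s - 1] == change:
            return outcomes[s]
    return outcomes[t - 1]                # no analogue: fall back to naive
```

With the shock path `[0, 0, -9, -9, 0, 0, -9]`, the forecast for the last period reuses the outcome observed in the period of the first shock (index 2), whereas the forecast for the first shock itself falls back on the previous period because no analogue exists yet.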

27 Note that by taking the conditional expectation on both sides, equation 1.1 can be rewritten
as $p_{i,t} = E_{i,t-1}[p_{i,t}] = (a + \xi_t) + b\,E^h_{i,t-1}[p_{-i,t}]$. Thus, subjects choose their guesses $p_{i,t}$ as best
response to their one-period-ahead expectation about the average guess of other players in the
group that is $E^h_{i,t-1}[p_{-i,t}]$. The superscript $h$ is placed to indicate that the process of expectation
formation is more general than RE and may be based on any rule $h$. For the sake of simplicity,
throughout the chapter we use the notation $p^e_{i,t}$ instead of $E^h_{i,t-1}[p_{-i,t}]$.

28Note that fundamentalism is not equivalent to RE since it ignores the fact that other play-

ers might be nonrational. For these two to be equivalent, one needs to assume homogeneous

expectations and common knowledge of rationality.

29For the sake of illustration, suppose that a subject uses rule 17. In period 25 of the baseline

game, s/he should extrapolate the change between periods 8 and 9 rather than between 23 and 24

(as would be the case under the normal trend rule). Following this logic, the similarity-based reformulation
changes expectations in periods 25, 33, 41, 49, 57 for identical shocks, and in period 57 for the

nonidentical ones.

Table 1.5: Description of selected expectation rules for comparison

No  Description                    Functional form
1   Naïve exp.                     $p^e_{i,t} = p_{-i,t-1}$
2   Obstinacy                      $p^e_{i,t} = p^e_{i,t-1}$
3   Adaptive exp. v1.              $p^e_{i,t} = 0.75\,p_{-i,t-1} + 0.25\,p^e_{i,t-1}$
4   Adaptive exp. v2.              $p^e_{i,t} = 0.50\,p_{-i,t-1} + 0.50\,p^e_{i,t-1}$
5   Adaptive exp. v3.              $p^e_{i,t} = 0.25\,p_{-i,t-1} + 0.75\,p^e_{i,t-1}$
6   Fitted adaptive exp.           $p^e_{i,t} = 0.89\,p_{-i,t-1} + 0.11\,p^e_{i,t-1}$
7   Strong trend-following exp.    $p^e_{i,t} = p_{-i,t-1} + 0.75\,(p_{-i,t-1} - p_{-i,t-2})$
8   Medium trend-following exp.    $p^e_{i,t} = p_{-i,t-1} + 0.50\,(p_{-i,t-1} - p_{-i,t-2})$
9   Weak trend-following exp.      $p^e_{i,t} = p_{-i,t-1} + 0.25\,(p_{-i,t-1} - p_{-i,t-2})$
10  Strong contrarian exp.         $p^e_{i,t} = p_{-i,t-1} - 0.75\,(p_{-i,t-1} - p_{-i,t-2})$
11  Medium contrarian exp.         $p^e_{i,t} = p_{-i,t-1} - 0.50\,(p_{-i,t-1} - p_{-i,t-2})$
12  Weak contrarian exp.           $p^e_{i,t} = p_{-i,t-1} - 0.25\,(p_{-i,t-1} - p_{-i,t-2})$
13  Fitted extrapolative exp.      $p^e_{i,t} = p_{-i,t-1} + 0.08\,(p_{-i,t-1} - p_{-i,t-2})$
14  Fundamentalist                 $p^e_{i,t} = p^{NE}$
15  SBNE                           $p^e_{i,t} = \begin{cases} p_{-i,t-1}, & \text{if } \xi_t = \xi_{t-1} \\ p_{-i,t-m}, & \text{if } \xi_t \neq \xi_{t-1} \end{cases}$
16  SBAE                           $p^e_{i,t} = \begin{cases} 0.75\,p_{-i,t-1} + 0.25\,p^e_{i,t-1}, & \text{if } \xi_t = \xi_{t-1} \\ 0.75\,p_{-i,t-m} + 0.25\,p^e_{i,t-m}, & \text{if } \xi_t \neq \xi_{t-1} \end{cases}$
17  SBTF                           $p^e_{i,t} = \begin{cases} p_{-i,t-1} + 0.25\,\Delta p_{-i,t-1}, & \text{if } \xi_t = \xi_{t-1} \\ p_{-i,t-m} + 0.25\,\Delta p_{-i,t-m}, & \text{if } \xi_t \neq \xi_{t-1} \end{cases}$

Notes: Each description provides a rule for how players expect the average guess of other players one period ahead.
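As an illustration, a few of the rules in Table 1.5 can be sketched in Python. This is a minimal sketch rather than the code used for the analysis: the function names, the list-based data layout (index equals period), and the example numbers are assumptions made here. Contrarian rules are obtained from `trend` by passing a negative `gamma`.

```python
from typing import Optional, Sequence


def naive(avg_others: Sequence[float], t: int) -> float:
    """Rule 1: expect the others' average guess from period t-1."""
    return avg_others[t - 1]


def adaptive(avg_others: Sequence[float], prev_expectation: float, t: int,
             weight: float = 0.75) -> float:
    """Rules 3 to 6: mix the last observation with the last expectation."""
    return weight * avg_others[t - 1] + (1.0 - weight) * prev_expectation


def trend(avg_others: Sequence[float], t: int, gamma: float = 0.25) -> float:
    """Rules 7 to 13 (requires t >= 2): last observation plus gamma times the last change."""
    return avg_others[t - 1] + gamma * (avg_others[t - 1] - avg_others[t - 2])


def most_recent_analogue(shocks: Sequence[float], t: int) -> Optional[int]:
    """Latest period s < t whose shock change equals the change in period t."""
    change = shocks[t] - shocks[t - 1]
    for s in range(t - 1, 0, -1):
        if shocks[s] - shocks[s - 1] == change:
            return s
    return None


def sbne(avg_others: Sequence[float], shocks: Sequence[float], t: int) -> float:
    """Rule 15: naive in stationary phases, analogue outcome after a change."""
    if shocks[t] == shocks[t - 1]:
        return naive(avg_others, t)
    s = most_recent_analogue(shocks, t)
    # Without an analogue, fall back on the previous period's outcome.
    return naive(avg_others, t) if s is None else avg_others[s]


# Hypothetical data: a shock of -5 arrives in periods 3 and 8.
shocks = [0, 0, 0, -5, -5, -5, 0, 0, -5, -5]
avg_others = [50.0, 48.0, 47.0, 40.0, 36.0, 34.0, 45.0, 46.0, 41.0, 37.0]
print(sbne(avg_others, shocks, 8))   # uses the outcome of period 3, the analogue
print(naive(avg_others, 8))          # uses the outcome of period 7
```

In this hypothetical series, the second shock (period 8) has an analogue in period 3, so the similarity-based rule departs from the naïve one only in that period.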

These rules have been selected for two main reasons. First, they are commonplace

in the literature (see Section 1.2).30 Second, they are based on backward-looking

heuristics (with the exception of the fitted rules and rule 14), so that their functional

forms are easy to compute (e.g., rule 1). Finally, we have excluded level-k types of

expectations in the vein of the rule learning model of Stahl (1996) and the cognitive

hierarchy model of Camerer et al. (2004), since there is no common prior through

which level-0 type can form imitation after the first period.31

The goodness of fit of a given rule to the experimental data is based on the

30Note that the nomenclature used to describe these rules may vary across fields. For instance,

our rule 1 is equivalent to Cournot play in standard game theory and to the random-walk belief

in finance.

31Alternatively, we could assume that level-0 type always selects randomly; however, this would

be unreasonable when imitation dynamics and evolution over time are considered.

aggregate one-period-ahead forecast error that is computed as the root mean squared

error (RMSE):

\[ \mathrm{RMSE}(p^h_t) = \sqrt{\frac{\sum_{t=3}^{64} \sum_{g=1}^{G} \left(p^h_{g,t} - p^e_{g,t}\right)^2}{62 \times G}}, \tag{1.7} \]

where $p^h_{g,t}$ is the prediction of rule $h \in \{1, ..., 17\}$ for the average expectation of group $g$ in period $t$ and $p^e_{g,t}$ is the actual average expectation of group $g$ in period $t$. Here, $G$, the number of groups included, sets the scale of the RMSE. A lower value of RMSE points to a

better fit. We measure the RMSE in three different ways: at the rematching cluster

level (G= 2), for each experimental condition (G= 12), and for the pooled data

(G= 24). We exclude the data from the first two periods since certain rules require

at least two past observations.
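A minimal sketch of the computation in equation (1.7) is given below. The function name and the nested-list layout (one inner list of retained periods per group) are assumptions made for illustration; the number of periods is taken from the data rather than fixed at 62.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[Sequence[float]],
         observed: Sequence[Sequence[float]]) -> float:
    """Root mean squared error over G groups and all retained periods.

    predicted[g][k] is the rule's prediction and observed[g][k] the actual
    average expectation of group g in the k-th retained period.
    """
    n_groups = len(predicted)
    n_periods = len(predicted[0])
    squared_errors = sum(
        (p - o) ** 2
        for pred_g, obs_g in zip(predicted, observed)
        for p, o in zip(pred_g, obs_g)
    )
    return math.sqrt(squared_errors / (n_periods * n_groups))


# Hypothetical example with G = 2 groups and 2 periods.
print(rmse([[1.0, 2.0], [3.0, 4.0]], [[1.0, 4.0], [3.0, 2.0]]))
```

With periods 3 to 64 retained, `n_periods` equals 62 and the denominator matches the one in equation (1.7).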

Panel A of Table 1.6 reports the RMSE for each of the seventeen rules. For

baseline as well as pooled data, SBNE (rule 15) achieves the best fit. For the treat-

ment data, it is slightly outperformed by SBTF (rule 17). In line with Hypothesis

4, the rules that are augmented with similarity-based learning (rules 15 to 17) yield

a better fit than the remaining ones, with an RMSE roughly half that of the

worst-performing rule, fundamentalism (rule 14). The last line in Panel C shows how much

switching to similarity-based learning improves the fits. As can be seen, the im-

provements emerge in the baseline condition in which identical shocks reoccur in

a periodic manner. In the treatment condition, only the first and the last shocks

are identical, which leaves less room for applying similarity-based reasoning.32 At

the cluster level, the rules that are augmented with similarity-based reasoning pro-

vide the best fit for most (10 out of 12) rematching clusters. The data from the

baseline (treatment) rematching clusters are best organized by the rules that are

derived from adaptive (extrapolative) learning. Lastly, in line with Hypothesis 3,

fundamentalism (rule 14) is always the worst fitting model regardless of the data

aggregation level.

Result 3: Backward-looking expectation rules describe the expectations better

than the RE.

Result 4a: Reformulating the rules with similarity-based reasoning improves

their fit to the data, especially when shocks are identical.

Result 4b: Overall, the SBNE rule has the best fit among all homogeneous

expectation rules.

32This is most starkly observable when we compare the improvements in RMSE from refining

rules with similarity-based learning for period 57, i.e. the first post-shock period in round 4. More

details are provided in Appendix A.2.

Note that these comparisons rely on the central assumption that all players

refer to the same expectation rule (i.e., under the homogeneity of expectations).

There is, however, a wide range of evidence indicating heterogeneity in expectations

(Hommes, 2011). Moreover, expectation rules may vary not only across individuals,

but also over time: a given rule may at first perform poorly, but later on become

more relevant as players gain experience. For this reason, we first refer to an evolutionary

model of expectations: the HSM of Anufriev and Hommes (2012).33 According to

the HSM, agents choose the expectation rule from a set of heuristics, evaluate the

performance of each heuristic over time and switch to the heuristic that performs

best in terms of the forecasting error. Accordingly, the one-period-ahead expectation

of the HSM for group $g$ is

\[ p^{HSM}_{t+1} = \sum_{h=1}^{H} n_{h,t}\, p^h_{t+1}, \tag{1.8} \]

where $n_{h,t}$ is the impact factor of heuristic $h$ at period $t$ and $H$ is the number of heuristics. This impact factor can be interpreted as the weight attributed by agents to a given heuristic. The impact factor depends on the performance of the heuristic measured with the current and past squared forecast errors

\[ U_{h,t} = -\left(\bar{p}_{-i,t} - p^h_t\right)^2 + \eta\, U_{h,t-1}, \tag{1.9} \]

where $\eta \in [0,1]$ is a free parameter representing the weight assigned to past performance relative to current performance. $\eta = 0$ implies that only the performance in the most recent period matters. The impact of each heuristic is updated through a discrete choice model with asynchronous updating described by

\[ n_{h,t} = \delta\, n_{h,t-1} + (1 - \delta)\, \frac{\exp(\beta U_{h,t-1})}{\sum_{h=1}^{H} \exp(\beta U_{h,t-1})}, \tag{1.10} \]

where the impact of the expectation heuristic $h$ at period $t$ depends on its accumulated impact and its relative performance normalized with the sum of all competing heuristics. There are two free parameters in (1.10). The first one, $\delta \in [0,1]$, represents the proportion of agents who do not update their heuristic each period, or the individual inertia in beliefs. The second parameter, $\beta > 0$, represents agents' sensitivity toward differences in performances.34
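The three updating steps in equations (1.8) to (1.10) can be sketched as follows. This is an illustrative sketch, not the implementation used for the results: the function names and the list-based layout (one entry per heuristic) are assumptions, and subtracting the maximum performance before exponentiating is a numerical safeguard added here that leaves the logit shares unchanged.

```python
import math
from typing import List, Sequence


def update_performance(prev_u: Sequence[float], observed_avg: float,
                       predictions: Sequence[float], eta: float) -> List[float]:
    """Eq. (1.9): minus the squared forecast error plus eta times past performance."""
    return [-(observed_avg - p) ** 2 + eta * u
            for p, u in zip(predictions, prev_u)]


def update_impacts(prev_n: Sequence[float], prev_u: Sequence[float],
                   delta: float, beta: float) -> List[float]:
    """Eq. (1.10): inertia plus a logit share based on last period's performance."""
    u_max = max(prev_u)  # shift for numerical stability; shares are unaffected
    weights = [math.exp(beta * (u - u_max)) for u in prev_u]
    total = sum(weights)
    return [delta * n + (1.0 - delta) * w / total
            for n, w in zip(prev_n, weights)]


def hsm_expectation(impacts: Sequence[float],
                    predictions: Sequence[float]) -> float:
    """Eq. (1.8): impact-weighted sum of the heuristics' predictions."""
    return sum(n * p for n, p in zip(impacts, predictions))


# Hypothetical one-step example with four heuristics and equal initial impacts.
impacts = [0.25, 0.25, 0.25, 0.25]
performance = update_performance([0.0] * 4, observed_avg=50.0,
                                 predictions=[50.0, 48.0, 46.0, 60.0], eta=0.1)
impacts = update_impacts(impacts, performance, delta=0.4, beta=0.1)
print(impacts)
```

Because the logit shares sum to one, the impacts remain a probability vector whenever the initial impacts do; heuristics with smaller past errors gain weight at a speed governed by $\delta$ and $\beta$.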

To compute expectations with HSM and compare fits, one must first determine

which expectation heuristics to include, and then set their initial impacts as well

as assign the values to free parameters η, δ, β. Following a common practice in

33Their model inherits its main features from Brock and Hommes (1997).

34β= 0 would imply equal impacts regardless of the differences in performances.

the literature, we consider four different heuristics. We consider three classes of

expectation rules – adaptive, extrapolative and similarity-based – and, for each

of them, choose the best-fitting specification (rules 3, 9, 15, respectively).35 We

also include an equilibrium-based rule: fundamentalism (rule 14). We assign the

initial impact factors equal to $n_{h,3} = 0.25$ for all $h$ and set the free parameters to

$\eta = 0.1$, $\delta = 0.4$, $\beta = 0.1$. By trial and error, we discover that this combination of

free parameters fits the data best.36

The second heterogeneous expectation model we consider – MIXEXP – follows

Haltiwanger and Waldman (1985, 1989) and is based on the assumption that each group is composed of $n$ rational players and $5 - n$ naïve players. The main reason for including it in our comparison is that it provides a good descriptive fit to the experimental data in Cooper et al. (2017). Like them, we consider a single rational player per group ($n = 1$). The remaining players make naïve forecasts according to the SBNE. In each round, the subject whose prediction error in the first post-shock period is the smallest in a given group is considered the rational player in that group.37 The rational player forecasts consistently with RE:

\[ p^{RE}_{i,t} = \bar{p}_{-i,t} + \epsilon_{i,t}, \tag{1.11} \]

where $\epsilon_{i,t} \sim N(0,1)$.
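The two ingredients of MIXEXP described above, selecting the rational player and drawing a forecast from equation (1.11), can be sketched as follows. The function names, the dictionary of prediction errors keyed by hypothetical subject labels, and the seeded random generator are assumptions made for illustration.

```python
import random
from typing import Dict


def pick_rational_player(first_post_shock_errors: Dict[str, float]) -> str:
    """Subject with the smallest absolute prediction error in the first post-shock period."""
    return min(first_post_shock_errors,
               key=lambda subject: abs(first_post_shock_errors[subject]))


def rational_forecast(realized_avg_others: float, rng: random.Random) -> float:
    """Eq. (1.11): realized average guess of the others plus standard normal noise."""
    return realized_avg_others + rng.gauss(0.0, 1.0)


# Hypothetical group of five subjects and their first post-shock errors.
errors = {"s1": 3.0, "s2": -0.5, "s3": 1.2, "s4": -4.1, "s5": 2.6}
rng = random.Random(0)  # seeded only to make the sketch reproducible
print(pick_rational_player(errors))
print(rational_forecast(40.0, rng))
```

The remaining four subjects in the group would then be assigned SBNE forecasts.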

Panel B in Table 1.6 reports RMSE for each of the heterogeneous expectation

models.38 These results show that the HSM fits better than all the other rules, in-

cluding MIXEXP, at every data aggregation level. The improvement in fit compared

to the best homogeneous expectation rule ranges between 15% and 53% depending

on the scale and equals 32% at the pooled level. Although it does not perform as well as the

HSM, MIXEXP outperforms all the homogeneous expectation rules, which

makes it the second-best expectation model in our exercise.

Result 5a: Models with heterogeneous expectations better fit the data than

those with homogeneous expectations.

Result 5b: Overall, the HSM has the best fit among all the compared expecta-

tion models.

35 Note that for the class of adaptive expectations, the best-fitting rule is naïve expectations

(rule 1). However, since SBNE (rule 15) is already a refined version of that rule, we include rule 3

instead.

36As a robustness check, in Appendix A.3 we provide RMSE obtained under various combinations

of free parameters. While the outcomes do vary in absolute terms, the relative standing of different

rules for a given period remains fairly stable.

37 We are indebted to Michael Waldman for suggesting this strategy to us.

38 For the HSM, RMSE is computed for periods 4-64, where expectations are endogenously determined.

Figure 1.4: Impact of heuristics across periods

(a) Baseline

(b) Treatment

Notes: Impact factors of the HSM across periods and experimental conditions. The symbols circle, cross, square

and triangle represent respectively SBNE, adaptive expectations (v1), weak trend-following and fundamentalist

expectations.

Finally, Figure 1.4 shows the evolution of the impact factors of different heuristics

over time in the baseline and treatment conditions. Notwithstanding the claim that

"experience eliminates naïvety", for both experimental conditions we find that SBNE

attains the highest average impact factor in the final round of the game.39 On the

opposite extreme, fundamentalist expectations attract very low weights during the

first two rounds, but their role increases toward the final round.

39Appendix A.2 reports the average of impact factors separately for pre-shock and post-shock

phases.

Table 1.6: RMSE of expectation rules

Rule    Baseline: cluster-by-cluster              Treatment: cluster-by-cluster             All-cluster
        1      2      3      4      5      6      1      2      3      4      5      6      B      T      B+T

Panel A - Homogeneous expectation rules
1       6.45   6.59   5.71   7.24   5.80   8.18   6.71   6.23   6.53   6.50   7.21   6.56   6.71   6.63   6.67
2       7.55   7.02   6.49   8.14   7.14   8.28   7.55   7.48   7.86   7.56   7.80   7.57   7.47   7.64   7.55
3       6.65   6.58   5.78   7.31   6.03   8.05   6.83   6.42   6.74   6.66   7.12   6.66   6.78   6.74   6.76
4       6.91   6.65   5.93   7.49   6.34   8.02   7.01   6.71   7.04   6.90   7.20   6.87   6.93   6.96   6.94
5       7.21   6.80   6.17   7.77   6.71   8.10   7.26   7.06   7.42   7.20   7.43   7.17   7.16   7.26   7.21
6       6.53   6.58   5.73   7.25   5.89   8.11   6.75   6.30   6.61   6.56   7.15   6.59   6.73   6.66   6.70
7       7.64   8.33   7.43   9.30   6.97   9.85   8.40   7.75   6.84   7.69   9.39   7.78   8.32   8.01   8.17
8       6.68   7.33   6.45   8.24   6.04   8.83   7.32   6.77   6.15   6.74   8.34   6.87   7.33   7.07   7.20
9       6.25   6.71   5.83   7.52   5.62   8.24   6.72   6.23   6.04   6.31   7.58   6.44   6.76   6.57   6.67
10      9.87   9.00   8.07   9.12   9.09   10.85  9.74   9.02   10.32  9.90   8.68   9.59   9.37   9.56   9.47
11      8.41   7.84   6.94   8.10   7.69   9.58   8.36   7.75   8.81   8.45   7.80   8.26   8.13   8.25   8.19
12      7.22   6.99   6.11   7.44   6.55   8.65   7.30   6.77   7.51   7.26   7.29   7.21   7.20   7.23   7.22
13      6.64   6.67   5.79   7.25   5.99   8.28   6.84   6.35   6.80   6.69   7.19   6.72   6.82   6.77   6.79
14      10.02  10.10  11.25  14.29  11.55  13.80  10.68  12.32  12.47  11.07  17.07  13.65  11.95  13.05  12.51
15      4.61   4.58   4.00   6.62   4.08   6.13   6.14   6.23   6.47   6.58   7.26   6.47   5.10   6.54   5.86
16      4.82   4.55   3.99   6.61   4.18   5.99   6.29   6.40   6.66   6.72   7.16   6.57   5.11   6.64   5.93
17      4.91   5.09   4.56   7.13   4.31   6.42   6.11   6.27   6.00   6.48   7.64   6.36   5.50   6.50   6.02

Panel B - Heterogeneous expectation models
HSM     2.45   3.45   3.37   5.46   2.56   5.08   3.02   3.47   4.56   2.95   5.51   4.34   3.90   4.08   3.99
MIXEXP  4.13   3.94   3.44   5.63   3.38   5.15   4.71   5.62   5.09   5.12   6.31   5.34   4.36   5.39   4.90

Panel C - Changes in fit between models
∆HSM    47%    24%    16%    17%    37%    15%    51%    44%    24%    53%    23%    32%    24%    37%    32%
∆SB     28%    31%    31%    10%    30%    26%    9%     0%     1%     -3%    0%     1%     24%    1%     12%

Notes: RMSE calculated at different data aggregation levels. The first column indicates the number assigned to the rule in Table 1.5 for panel A. Columns 2 to 7 correspond to the clusters in the baseline condition and columns 8 to 13 to the clusters in the treatment sessions. The last three columns indicate the experimental conditions, where "B" refers to baseline (G = 6), "T" refers to treatment (G = 6) and "B+T" refers to pooled data (G = 12). Bolded values in panel A indicate the lowest RMSE among homogeneous expectation rules. HSM and MIXEXP in panel B refer respectively to the RMSE of the HSM and of the model with one rational player. Panel C reports the changes in fit for three comparison groups. ∆HSM reports the improvement in fit from using the HSM compared to the best-fitting homogeneous expectation rule. ∆SB reports the change in fit between the best rule upgraded with similarity-based learning and the nonupgraded version of the same rule. A negative value implies a loss in fit.
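The Panel C entries appear to be relative reductions in RMSE. The sketch below rests on that assumption (the formula is inferred from the table values, not stated in the notes) and uses a hypothetical helper name; the inputs are the pooled (B+T) values from Panels A and B.

```python
def improvement(reference_rmse: float, new_rmse: float) -> float:
    """Relative reduction in RMSE when moving from the reference to the new model."""
    return (reference_rmse - new_rmse) / reference_rmse


# Pooled data (column B+T of Table 1.6).
delta_hsm = improvement(5.86, 3.99)  # best homogeneous rule (15) versus the HSM
delta_sb = improvement(6.67, 5.86)   # rule 1 versus its similarity-based upgrade (15)
print(f"{delta_hsm:.0%} {delta_sb:.0%}")
```

Under this reading, the pooled values reproduce the 32% and 12% reported in the last column of Panel C, and a case in which the upgraded rule fits worse (for example 6.31 versus 6.48 in treatment cluster 4) yields a negative entry.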

1.5 Discussion

The results outlined in Section 1.4.2 reveal that the refinement of expectation rules

with similarity-based learning improves their fit considerably when shocks are iden-

tical. Notably, the SBNE rule performs well under both homogeneous and hetero-

geneous expectations. Under this rule, a player expects the last period’s outcome

to reoccur in stationary phases, as is the case under naïve expectations. In case of a

change in the environment, this expectation rule points to the outcome of the most

recent period characterized by the same change.40 This rule is consistent with the

40The model used herein is based on a particular definition of similarity which rules out any dis-

crepancy between similar events. Note that this definition can be relaxed allowing for an arbitrary

degree of discrepancy between similar events.

case-based decision theory of Gilboa and Schmeidler (1995) and the similarity-based learning model of Plonsky et al. (2015), both of which suggest that agents choose the action that generated the best outcome under similar circumstances that they can recall from the past. Thus, the SBNE rule can be viewed as a combination of the similarity-based learning process with the naïve heuristic, applied to the domain of expectation formation.

This framework proposes a potential explanation for why the trend-following

rule with a strong extrapolation parameter fits the data well in LtFEs and poorly in

guessing games like ours. The main difference between LtFEs and guessing games

is that while the LtFEs inform subjects only qualitatively on the data generating

process, guessing games provide quantitative information by disclosing the target

formula. If there are also unexpected large shocks as in the LtFEs of Bao et al.

(2012), the last period is less likely to be perceived as the most similar state. A

stronger extrapolation of recent changes may thus help detect the arrival of a shock.

This strong extrapolation also creates a self-fulfilling prophecy, since it endogenously

generates large oscillations around equilibrium. In guessing games, players may

judge the similarity with certainty so that there is less necessity for extrapolation.

The naïve and weak trend-following rules therefore tend to perform better. For

instance, this may explain why a trend model with extrapolation factor 1 performs

poorly in the AMEs of Marquardt et al. (2019): weak trend-following and naïve

heuristics create slow convergence toward NE, analogously to what we observe in

our experiment.

Lastly, the results in the second part of Section 1.4.2 indicate that allowing

heterogeneity in expectations through the HSM or MIXEXP improves the fits sub-

stantially in line with the evidence from the LtF literature.41 The evolution of

impact factors computed through the HSM implies that there is more heterogeneity in the

nonidentical-shocks condition. This pattern, if proven to be robust, suggests

that the presence of real-world complications such as nonidentical shocks may make

coordination over an expectation rule more difficult.

1.6 Conclusion

We investigate the evolution of adjustment speed across repeated negative shocks

that can be either identical (baseline) or nonidentical (treatment). We ask whether

adjustment accelerates over repetitions and whether this acceleration varies across

the different types of shocks. For this experimental variation, we find that ad-

justment accelerates thanks to repetition, yet only slightly: despite four repetitions,

41 Heterogeneous expectation models of this kind are gaining momentum in policy applications. See Cornea-Madeira et al. (2019) for an application of HSM and Hommes (2018) for an overview.

convergence speeds up by only two periods at best. Nonidentical shocks do not cause

a significant slowdown in adjustment, and adjustment acceleration remains weak

regardless of the type of shock. A descriptive analysis of the expectation formation

process reveals that the backward-looking rules organize the data well. In particular,

rules refined with the similarity-based learning approach outperform the others in

terms of predictive power.

Our experiment successfully documents the robustness of the finding of Cooper

et al. (2017) from a guessing game with strategic complementarity: a gradual and

convex adjustment in response to identical shocks and its acceleration over repeti-

tion. This evidence strengthens the empirical validity of the strategic environment

effect, in line with an early conjecture by Haltiwanger and Waldman (1985). Further-

more, these patterns of learning to play equilibrium under strategic complementarity

persist in a more complex environment with time-varying shocks. We fit a large set

of expectation rules to provide an individual-based explanation to the observed ag-

gregate dynamics. The SBNE rule, a simple learning rule first proposed by Cooper

et al. (2017) and not yet studied in a comparative analysis, outperforms other rules

in terms of descriptive accuracy.

The main implication of these findings is that inertia in adjustment may rather

persist over time. The fact that the type of shock does not affect behavioral dynam-

ics suggests that sluggishness is an inherent feature of strategic complementarity.

Importantly, our design also does not include any market frictions that are usually

considered as the main drivers of sticky behavior. This, in turn, suggests that cogni-

tive frictions such as nonrational expectations suffice to create stickiness, and poten-

tially opens the door for policy implications. Although an experimental testbed for

policy instruments is beyond the scope of this study, we note that monetary policy

interventions may prove to be effective.42

Despite its virtues, our study may also have certain limitations. Here, we set

the number of shocks to four and one may claim that this is not enough for a

major acceleration. While this might be a limitation of experiments in general, we

reckon that four rounds should be sufficient for observing accelerated adjustment in a

relatively simple environment like ours. Another design choice is to rematch subjects

at the beginning of each new round to control for factors that are accumulating

across rounds such as the degree of strategic uncertainty. This random rematching

mechanism might partially be the reason for limited acceleration and it may stand

42Cornand and Heinemann (2019b) show that in a New Keynesian framework, monetary policy

obeying the Taylor principle decreases the degree of complementarity between pricing decisions of

firms and even turns them into strategic substitutes if its effect on aggregate demand is sufficiently

strong. In a similar vein, Assenza et al. (2021) show through their New Keynesian LtFEs that the

Taylor principle with a sufficiently strong interest rate rule ($\xi_\pi = 1.5$ in their experiment) manages

convergence to the forward stable solution.

at odds with certain real-world environments, such as asset markets.43 Lastly, we

only look at negative shocks. Even though the sign of a shock should not matter

in guessing games, it may matter in a pricing context. We believe that varying the

form of the heterogeneity of shocks constitutes a possible agenda for future studies.

43Cooper et al. (2017) test this argument in an auxiliary treatment and find that the main results

are qualitatively unchanged. In the light of this result, the question of matching scheme should be

less of a concern.

Appendix A

Appendix

A.1 Experimental Material

A.1.1 Instructions and Comprehension Questions Translated

to English

General Instructions

The experiment has 4 rounds and each round has 16 periods, a total of 64

periods. At the beginning of the experiment, you will be randomly assigned to

groups of five. You will only interact with other players in your group. At the

beginning of each new round, the groups will be reconstituted in a random manner.

This means that you will play in the same group during a round, and that the

composition of your group will vary randomly from one round to another.

Your Task

At the beginning of each period, each participant will be asked to choose a

number between 0 and 100, inclusive. This number can be up to two decimals, such

as 11.35 or 95.23. No participant will be able to see the number chosen by another

participant.

In each period, each player has a "Target Number". At the end of each period,

the player in your group whose chosen number is closest to his or her target

number will win the prize of 4.40 Euros for that period. The other players will earn

0 euros for this period. If several players have the same distance from their target

number, the prize of 4.40 euros will be divided equally between these players, while

the others will win 0 euros.

The target number of each player is calculated using the following formula:

Target number = 0.75 ×(average of the numbers chosen by the other players in

your group) + a constant

Here, "the average of the numbers chosen by the other players in your group"

is equal to the sum of the numbers chosen by the other players in your group

divided by four. This average is calculated in the same way for all participants

in the experiment. All participants will be informed about the constant through

their decision screen. This constant is the same for all participants but may change

during the experiment. When a change occurs, this change will be announced to all

participants on their screen. Please check the formula for each period.

Decision Screen

The target number formula will be shown on the screen. On this screen,

you can enter your decision in a cell. When you click on the "OK" button, the

program will show you the "Average of the numbers chosen by the other players

in your group" to which your decision corresponds. After seeing this

information, you can change your decision as many times as you want. Once you

click on the "Confirm" button, your decision for this period will be final.

Note that there is limited time for decisions in each period and you can track

the remaining time on your screen. You will have 120 seconds for your first decision

and 60 seconds for each decision in the remaining periods. A table and a figure also

allow you to follow your previous decisions and the previous average decision

of the other players in your group.

Payment

At the end of the experiment, the computer will randomly select one of the

rounds played, and your final payment will be based on the payoffs that you have

accumulated during this round, plus 5 + 2 = 7 euros for participation and the

questionnaire you answered.

Comprehension Questions

You will now answer several questions designed to check whether you understood

the rules of the game. The button in the middle of the screen will allow you to access

a calculator when you need it.

True or False Questions:

Question 1: There are 4 other players in my group.

Question 2: I play with the same group of players throughout the experiment.

Question 3: The formula for the target number may change during the experi-

ment.

Question 4: All players have their own formula for the target number.

Question 5: I will be paid based on my accumulated winnings during a randomly

chosen round.

Questions Based on an Example:

Imagine that the formula for the target number is equal to

Target number = 0.75 ×(Average of the numbers chosen by the other players)

+ 15

Question 6: If the other players in the group chose 10, 30, 35, 85 as decisions for

the target number, what do you think is the "Average of the numbers chosen by the

other players in your group"?

Question 7: What would your target number be equal to in this situation?

Question 8: Imagine that you chose number 55 as the decision for this period.

What is the distance between the target number and your decision?

Question 9: In this example, the distances between the chosen numbers and the

target numbers for the other players are respectively: 47.18, 17.5, 0.31 and 41.87.

In this example, are you the winner?

Answers and Explanations Provided to the Subjects:

Question 1: True.

Explanation: There are 5 players in each group and 4 others when you are

excluded.

Question 2: False.

Explanation: At the beginning of each new round (17th, 33rd and 49th periods),

the groups will be reconstituted in a random manner. This means that you will play

with same group members during a round, and that the composition of your group

will vary from round to round.

Question 3: True.

Explanation: The formula for the target number may change. Please pay atten-

tion in each period.

Question 4: False.

Explanation: The formula for the target number for a given period is the same

for all players.

Question 5: True.

Explanation: At the end of the experiment, one of the four rounds will be ran-

domly selected and you will get your winnings that are accumulated during this

round.

Question 6: 40.

Explanation: The correct answer is 40. This is the average number of the other

players in the group, in this example: (10 + 30 + 35 + 85) / 4 = 40.

Question 7: 45.

Explanation: The correct answer is 45. The target number is calculated using

the formula for the target number: 0.75 x 40 + 15 = 45.

Question 8: 10.

Explanation: The correct answer is 10. The target number is 45 and you have

chosen 55. The distance between these two numbers is 10.

Question 9: No.

Explanation: Your distance (10) is not the smallest. 0.31 is the smallest distance

in this group.
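The arithmetic behind the answers to Questions 6 to 9 can be checked with a short sketch. The function name and its default slope argument are assumptions made for illustration; the numbers are those of the example above.

```python
from typing import Sequence


def target_number(others: Sequence[float], constant: float,
                  slope: float = 0.75) -> float:
    """Target number = slope x (average of the others' numbers) + constant."""
    return slope * (sum(others) / len(others)) + constant


others = [10, 30, 35, 85]
average = sum(others) / len(others)           # Question 6
target = target_number(others, constant=15)   # Question 7
distance = abs(55 - target)                   # Question 8

# Question 9: the winner is the player with the smallest distance.
all_distances = [distance, 47.18, 17.5, 0.31, 41.87]
is_winner = distance == min(all_distances)
print(average, target, distance, is_winner)
```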

A.1.2 Tests Translated to English

Note that subjects solve these tests before the main part of the experiment. So,

instructions presented here are the initial instructions that subjects see.

Initial Instructions

Welcome!

You will participate in an economic experiment. During this experiment, you are

not allowed to communicate with other participants. If you have a cell phone, please

turn it off. If you have a question, press the red button on your left or raise your

hand, the experimenter will come to see you; don’t ask your question out loud. If

the question is relevant to all participants, we will repeat it and answer it out loud.

If you do not respect these rules, we will have to exclude you from the experiment

and therefore from the payment.

All the information you provide, as well as the amount of your payoffs during

this experiment, will be kept strictly confidential and anonymous. Participating in

this experiment will gain you money. Your winnings will be paid to you privately

at the end of the experiment. You earn 5 euros for showing up on time, 2 euros

for answering a series of questions and an additional amount that varies between 0

and 70 euros. The additional payment depends on your decisions and may also be

influenced by decisions made by others.

First of all, before starting the actual experiment, we ask you to answer a series

of preliminary questions. You will answer these questions using the interface on

your computer screen.

CRT Questions

1) A notebook and a pencil cost 1.10 Euros. The notebook costs 1 Euro more

than the pencil. How many cents does the pencil cost? (correct answer: 5 cents)

2) Assuming that 5 machines take 5 minutes to make 5 pens, how long would it

take 100 machines to make 100 pens? (correct answer: 5)

3) In a lake, there is a patch of lily pads. Every day, the patch doubles in size.

If it takes 48 days for the patch to cover the entire lake, how long would it take for

the patch to cover half of the lake? (correct answer: 47)

Representativeness Heuristic Questions

Please read the descriptions below and answer the questions.

Description 1: Linda is 31 years old, she is single, frank and very bright. She has

a master’s degree in philosophy. As a student, she was very concerned about issues

of discrimination and social justice, she also participated in anti-nuclear demonstra-

tions.

Please rank the following statements based on their likelihood, using 1 for the

most likely and 7 for the least likely:

1) Linda is a primary school teacher.

2) Linda works in a bookstore and takes yoga classes.

3) Linda is active in the feminist movement.

4) Linda is a bank teller.

5) Linda is a social worker in a psychiatric environment.

6) Linda is an insurance salesperson.

7) Linda is a bank teller and is active in the feminist movement.

Correct answer: If option 7 is judged less probable than both option 3 and option 4, then

the answer is correct.

Description 2: A certain city is served by two hospitals. About 45 babies are

born every day in the big hospital and about 15 babies are born every day in the

small hospital. As you know, about 50% of new born babies are boys. The exact

percentage of baby boys, however, varies from day to day. Sometimes it can be more

than 50%, sometimes less.

For a period of one year, each hospital recorded the days when more than 60%

of the babies born were boys. Which hospital do you think had the highest number

of such days?

A) The big hospital

B) The small hospital

C) Same for the two hospitals.

Correct answer: Option B is the correct answer.

Availability Heuristic Questions

Below, each item includes two possible causes of death. The question you must

answer is the following: among the two possible causes of death, which is

the most frequent, in general, in France? For each pair of possible causes of death,

(a) and (b), we want you to choose the cause that you think is the most common.

Pair 1 (a) Road accidents (b) Diabetes (correct answer: b)

Pair 2 (a) Homicide (b) Suicide (correct answer: b)

Pair 3 (a) Stroke (b) All accidents (correct answer: a)

Pair 4 (a) Falls (b) Drug use (correct answer: a)

Pair 5 (a) Lightning strike (b) Poisoning (correct answer: b)

Reading the Mind in the Eyes Test

You will now see a series of images presenting pairs of eyes, as well as 4 words.

For each pair of eyes, choose the word that best describes what the person in the

image thinks or feels. You may feel that more than one word may apply, but choose

only the word that you consider most appropriate. Before making your choice, make

sure that you have read the 4 propositions correctly.

There will be a total of 36 questions to be answered within 10 minutes. Try to

respond as quickly and accurately as possible. To respond, select one of the choices

displayed below the image and click OK at the bottom. Please note that once you

click OK, you will not be able to return to the previous questions.

Click OK to proceed to the test question.

Training question:

Figure A.1: Test picture in the RMET

Options: 1: jealous, 2: panicked, 3: arrogant, 4: hateful

Correct answer and explanations:

Among the choices: “jealous”, “panicked”, “arrogant”, “hateful”, the correct
answer was: “panicked”. Click OK to start the other questions. From now on,
correct answers will no longer be shown.

As a reminder, there will be a total of 36 questions which you will have to answer

within 10 minutes. Try to respond as quickly and accurately as possible.

Short Term Memory Test

You will now look at several series of slides. At the start of each series, you will
see the word “Ready?” followed by a sequence of numbers appearing one after the other.
At the end of each series, you will enter in the input zone the sequence of numbers
that you have observed.

The sequences will become longer as you enter correct answers. Your goal is to

go as far as possible. You will have two tries. For example, if in a series you see 1,

then 3 and then 5, you would enter 135.

First sequence: 69, 929, 1021, 34634, 943453, 7374865, 69358267, 690875725,

6457803021, 26456897198, 601518340985, 1285246589042.

Second sequence: 25, 217, 8618, 48629, 727240, 1203439, 32904142, 750572970,

1720378975, 62617825067, 609295956490, 1678606889148.

A.1.3 Score Measurement, Procedures and References for Tests

The cognitive reflection task is taken from Frederick (2005) and adapted to
French. Each correct answer counts as one point in the score calculation.
There is a non-binding time limit of 30 seconds for this part.

The heuristic questions are taken from the studies of Kahneman and Tversky (1972),
Tversky and Kahneman (1973) and Fischhoff et al. (1977) and adapted to the French
population. The correct answers are determined from the World Health Organization’s
(WHO) report on global health estimates for 2000-2016 (World Health Organization,
2018). Each wrong answer counts as one point in the score calculation.

The French version of the Reading the Mind in the Eyes Test is taken from
Prevost et al. (2014). Each correct answer counts as one point in the score
calculation. There is a binding time limit of 10 minutes for this part.

The short-term memory test is taken from the Wechsler digit span test. The
score is the maximum number of digits accurately remembered across the two
sequences. Each number stays on the screen for 2 seconds, and the box where
subjects type the number appears with a 2-second delay after the last number.
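As an illustration of the memory score, the sketch below scores two tries and keeps the better one. It assumes that a try ends at the first incorrect entry and that the score of a try is the length of the longest number recalled correctly up to that point; the typed answers are hypothetical, and only the first few numbers of each sequence listed above are used.

```python
def span_score(shown, typed):
    """Digits in the longest number recalled correctly before the first error."""
    score = 0
    for target, answer in zip(shown, typed):
        if answer != target:
            break  # assumption: the try stops at the first mistake
        score = len(target)
    return score


# First items of the two sequences from the instructions (truncated for brevity).
FIRST_SEQUENCE = ["69", "929", "1021", "34634", "943453", "7374865"]
SECOND_SEQUENCE = ["25", "217", "8618", "48629", "727240", "1203439", "32904142"]

# Hypothetical answers of one subject.
first_try = ["69", "929", "1021", "34634", "943435"]    # error on the 6-digit number
second_try = ["25", "217", "8618", "48629", "727240", "1203439", "32904124"]

memory_score = max(
    span_score(FIRST_SEQUENCE, first_try),
    span_score(SECOND_SEQUENCE, second_try),
)
print(memory_score)
```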

A.2 Additional Figures and Tables

Table A.1: Description of subject pool

                      Pooled    Baseline  Treatment
Age                   21.43     21.83     21.03
                      (1.91)    (2.32)    (1.26)
Share of women        42%       38%       47%
                      (0.49)    (0.49)    (0.50)
Baccalaureate grade   15.97     15.76     16.18
                      (2.03)    (1.99)    (2.04)
CRT score             1.49      1.4       1.58
                      (1.14)    (1.07)    (1.20)
Eyes score            27.2      26.97     27.43
                      (3.92)    (4.30)    (3.48)
Memory score          7.05      7.02      7.08
                      (1.67)    (1.95)    (1.33)
Represent. score      1.29      1.30      1.28
                      (0.65)    (0.64)    (0.66)
Availability score    2.13      2.00      2.27
                      (1.02)    (1.05)    (0.98)

Notes: Below averages, the standard deviations are reported in parentheses.

Table A.2: Improvements from similarity refinement in the initial post-shock period

Baseline Treatment

Periods 25 41 57 57

∆SBNE -0.117 0.383 0.682 0.286

∆SBTF -0.363 0.152 0.451 0.210

∆SBAE 0.000 0.502 0.723 0.335

Notes: Each row reports the percent change in RMSE when the SBNE, SBTF or SBAE
rules are used instead of their non-refined equivalents. The columns correspond
to the data from the initial post-shock periods (one per round).

Table A.3: Average of impact factors in the HSM

Panel A - Average impact factors of pre-shock phases
          Baseline                             Treatment
Rounds    Rule 15  Rule 3  Rule 8  Rule 14     Rule 15  Rule 3  Rule 8  Rule 14
1         0.31     0.37    0.28    0.03        0.32     0.34    0.30    0.04
2         0.29     0.27    0.24    0.21        0.31     0.34    0.26    0.09
3         0.47     0.19    0.21    0.12        0.29     0.26    0.31    0.14
4         0.42     0.17    0.19    0.23        0.23     0.20    0.21    0.35

Panel B - Average impact factors of post-shock phases
          Baseline                             Treatment
Rounds    Rule 15  Rule 3  Rule 8  Rule 14     Rule 15  Rule 3  Rule 8  Rule 14
1         0.33     0.36    0.30    0.00        0.36     0.31    0.34    0.00
2         0.34     0.29    0.31    0.06        0.27     0.24    0.35    0.13
3         0.45     0.22    0.27    0.06        0.31     0.22    0.43    0.03
4         0.37     0.15    0.32    0.16        0.37     0.16    0.34    0.13

Notes: Each value in Panels A and B is the average impact factor computed for the HSM described in
Section 1.4.2 over the pre-shock or post-shock phase. The pre-shock phase of the first round includes periods 4 to 8.

Figure A.2: Individual guesses across periods
Notes: Distribution of individual guesses across periods. 120 subjects per period. Plus signs (cross signs) represent
baseline (treatment) conditions.

Figure A.3: Cluster guesses across periods
Notes: Average guess of clusters by period. The identification number of each cluster is shown above each panel.
Clusters 3, 4, 7, 8, 11, 12 belong to baseline sessions and clusters 1, 2, 5, 6, 9, 10 belong to treatment sessions.

A.3 Robustness Analyses

A.3.1 Calibration of Free Parameters in HSM

The HSM has three free parameters, each with its own behavioral implication.
To avoid arbitrariness, we provide a robustness analysis by comparing our
benchmark HSM, denoted as HSM 1, with the benchmark HSM of Bao et al. (2012),
denoted as HSM 4. For analytical tractability, we add two intermediate versions of
the HSM in which, at each step, one free parameter moves closer to its value in
Bao et al. (2012). Table A.4 reports the results and Figure A.4 plots the impact
factors as time series. The results show that the fit worsens when any of the
parameters increases ceteris paribus, and that the RMSE of HSM 4 is almost equal to
that of the best homogeneous expectation rule. Nonetheless, the fits of HSM 2 and 3
are better than that of any homogeneous rule. The step-wise changes imply that the
HSM applied to our data is not very sensitive to changes in the parameters η and β,
but it is somewhat sensitive to abrupt changes in δ, at least in the treatment
condition. This parameter represents the proportion of agents who do not update
their impact factor each period, and a value of 0.9 is behaviorally hard to justify.
In conclusion, our benchmark results are robust to medium-level changes in the
parameters.
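To make the role of the three parameters concrete, the sketch below implements one updating step of a heuristic switching model. It assumes the discrete-choice-with-inertia form that is standard in this literature (performance memory η, inertia δ, intensity of choice β); the exact specification of Section 1.4.2 may differ in detail. The function names, the forecasts and the realized value are hypothetical, and only the parameter values are taken from Table A.4.

```python
import math


def update_performance(prev_perf, forecast, realized, eta):
    """Performance = minus squared forecast error plus eta times past performance."""
    return -(realized - forecast) ** 2 + eta * prev_perf


def update_impacts(prev_impacts, performances, delta, beta):
    """A share delta keeps its rule; the rest chooses by a logit on performance."""
    best = max(performances)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (u - best)) for u in performances]
    total = sum(weights)
    return [
        delta * n + (1.0 - delta) * w / total
        for n, w in zip(prev_impacts, weights)
    ]


# Hypothetical one-period example with four rules starting from equal impacts.
forecasts = [50.0, 55.0, 60.0, 65.0]
realized = 52.0
start = [0.25, 0.25, 0.25, 0.25]

# Parameter values of HSM 1 and HSM 4 as listed in Table A.4.
hsm1 = dict(eta=0.1, delta=0.4, beta=0.1)
hsm4 = dict(eta=0.7, delta=0.9, beta=0.4)

results = {}
for name, par in (("HSM 1", hsm1), ("HSM 4", hsm4)):
    perf = [update_performance(0.0, f, realized, par["eta"]) for f in forecasts]
    results[name] = update_impacts(start, perf, par["delta"], par["beta"])
    print(name, [round(n, 3) for n in results[name]])
```

With δ = 0.9 only a tenth of the weight is reallocated each period, so the impact factors move much more slowly toward the best-performing rule than with δ = 0.4.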

Table A.4: RMSE of HSM under different values of free parameters

Comparison Level   HSM 1   HSM 2   HSM 3   HSM 4
Baseline           3.90    4.43    4.54    5.10
Treatment          4.08    5.68    5.68    6.46
Pool               3.99    5.09    5.14    5.82
η                  0.1     0.7     0.7     0.7
δ                  0.4     0.4     0.4     0.9
β                  0.1     0.1     0.4     0.4

Notes: HSM 1 is equivalent to our benchmark HSM reported in Table 1.6.
HSM 4 has the benchmark parameters combination from Bao et al. (2012)
and HSM 2 and 3 provide an intermediate change in parameters.

A.3.2 Cognitive Skills and Individual Expectations

In this Appendix section, we ask whether individual scores on cognitive skills tests
predict the accuracy of expectations and the best-fitting expectation rule. To this
end, we compute the average relative prediction error (ARPE) of subject i in a given round,

\[
\mathrm{ARPE}_i = \frac{1}{T} \sum_{t=1}^{T} \frac{\lvert p^{e}_{i,t} - \bar{p}_{-i,t} \rvert}{\bar{p}_{-i,t}}, \qquad T = 16, \tag{A.1}
\]

as well as the goodness of fit of the different expectation models per subject, as
measured through RMSE. We then regress each of these measures on the set of subjects’
test scores. These test scores are designed to measure some of the cognitive skills of
subjects. The baccalaureate grade is the score that subjects obtained upon completion
of their secondary education. The CRT score is computed through three items and
is designed to measure one’s ability to reflect on a question and override
the gut response. The RMET score is computed through thirty-six items and is
designed to measure one’s capacity to infer the internal emotional states of others.
The memory task is designed to measure one’s short-term memory capacity through the
number of items that one can remember in the correct order after seeing a sequence
of numbers. The representativeness (availability) score is computed through
two (five) items and is designed to measure one’s propensity for reasoning according
to the representativeness (availability) heuristic.

Figure A.4: Impact factors from various HSMs
(a) Baseline - HSM 2 (b) Treatment - HSM 2
(e) Baseline - HSM 4 (f) Treatment - HSM 4
Notes: Impact factors calculated with the HSM 2, 3, 4 across periods and experimental conditions. The symbols
circle, cross, square and triangle represent respectively SBNE, adaptive expectations (v1), weak trend-following and
fundamentalist expectations.

All test scores are standardized as
$\frac{\mathrm{Score}_i - \min(\mathrm{Score})}{\max(\mathrm{Score}) - \min(\mathrm{Score})}$.
Round and treatment dummies, as well as their interactions, are also included in the
model. We use a random effects specification (N = 480 per regression).
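The measures used in these regressions can be sketched in a few lines. The code below is a minimal illustration, not the estimation code of this chapter: it assumes that the ARPE averages the absolute relative deviation of a subject's forecast from the corresponding benchmark over the periods of a round, that the RMSE compares a rule's predictions with the subject's forecasts, and that scores are rescaled to the unit interval. All function names and numbers are hypothetical.

```python
from math import sqrt


def arpe(own_forecasts, benchmarks):
    """Average relative prediction error over the periods of one round."""
    terms = [abs(pe - pbar) / pbar for pe, pbar in zip(own_forecasts, benchmarks)]
    return sum(terms) / len(terms)


def rmse(rule_predictions, observed_forecasts):
    """Root mean squared error of a rule's predictions for one subject."""
    squared = [(p - o) ** 2 for p, o in zip(rule_predictions, observed_forecasts)]
    return sqrt(sum(squared) / len(squared))


def min_max(scores):
    """Rescale raw test scores to [0, 1] across subjects."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        raise ValueError("all scores are identical; min-max scaling is undefined")
    return [(s - lo) / (hi - lo) for s in scores]


# Hypothetical two-period example.
example_arpe = arpe([110.0, 90.0], [100.0, 100.0])
example_rmse = rmse([1.0, 2.0], [1.0, 4.0])
example_scores = min_max([0, 1, 3])  # e.g. raw CRT scores of three subjects
print(example_arpe, example_rmse, example_scores)
```

Rescaling every test score to the same range makes the regression coefficients comparable across tests that have different numbers of items.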

Tables A.5 and A.6 report the corresponding estimates. Overall, the included

test scores only weakly explain the variation in either measure of interest. A higher

RMET score predicts a lower forecast error and a better fit for the fundamentalist

rule. By contrast, in groups where subjects are more prone to the representativeness
heuristic, Nash play is less likely to occur.

The round dummies are systematically found to explain the dependent variables.

Compared to round 1, in round 4 the values of ARPE drop and the fits of all the

compared rules improve. Note that even with this improvement, the fundamental-

ist rule still performs worse than the other rules. The treatment dummy has no

significant impact on any variable, but its interaction with round 4 worsens the fit

of all the rules. So, the improvement in the goodness of fit over time is attenuated

under non-identical shocks. The coefficient of this interaction term is also positive

for ARPE (in line with Hypothesis 2), but not statistically significant.

Table A.5: RMSE of expectation rules and individual characteristics

Variables             SBNE       SBAE       SBTF       Fundamentalism
Intercept             10.52***   10.37***   11.42***   19.14***
                      (3.36)     (3.13)     (3.39)     (2.83)
Treatment             -0.75      -0.58      -0.81      0.24
                      (1.42)     (1.41)     (1.55)     (1.62)
Round 2               0.74       0.52       0.81       -2.26***
                      (1.19)     (1.22)     (1.26)     (0.99)
Round 3               -1.69      -1.95*     -1.37      -4.38***
                      (1.14)     (1.13)     (1.23)     (0.95)
Round 4               -3.67***   -3.81***   -3.43***   -8.05***
                      (1.28)     (1.202)    (1.38)     (0.87)
Round 2 × Treatment   -0.18      -0.06      -0.16      -2.98**
                      (1.31)     (1.34)     (1.38)     (1.16)
Round 3 × Treatment   2.28       2.51       1.74       4.38**
                      (1.55)     (1.54)     (1.62)     (1.71)
Round 4 × Treatment   6.47***    6.43***    6.25***    3.24***
                      (1.40)     (1.34)     (1.48)     (1.24)
Baccalaureate grade   2.99**     2.66**     2.66**     2.35*
                      (1.22)     (1.24)     (1.11)     (1.32)
CRT score             -1.53      -1.54*     -1.65*     -0.81
                      (0.95)     (0.88)     (0.97)     (0.74)
RMET score            -2.56      -1.88      -3.05      -5.41***
                      (1.94)     (1.73)     (1.97)     (1.99)
Memory score          -2.95      -2.43      -3.13      -2.55
                      (3.44)     (3.33)     (3.51)     (3.52)
Represent. score      0.81       0.61       1.16       2.89***
                      (1.23)     (1.15)     (1.24)     (0.89)
Availability score    -0.75      -0.56      -0.93      -0.49
                      (2.07)     (1.95)     (1.99)     (1.36)
R² overall            0.08       0.08       0.08       0.22

Notes: Coefficients from random effect regression models. The dependent variables are the RMSE of four

expectation rules per subject computed for each round (N= 480 per regression). The independent variables

are standardized test scores, round and treatment dummy variables and their interactions. Robust standard

errors clustered at the rematching cluster level (12 clusters per condition) are reported. *, **, *** indicate

statistical significance at the 10, 5, 1% level, respectively.

Table A.6: Prediction errors and individual test scores

Variables             (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
Intercept             0.174***   0.176***   0.186***   0.219***   0.208***   0.154***   0.176***   0.224***
                      (0.022)    (0.027)    (0.023)    (0.027)    (0.034)    (0.025)    (0.024)    (0.041)
Treatment             -0.001     -0.001     -0.000     -0.000     -0.001     -0.001     -0.001     0.001
                      (0.024)    (0.024)    (0.025)    (0.024)    (0.025)    (0.024)    (0.024)    (0.024)
Round 2               -0.043**   -0.043**   -0.043**   -0.043**   -0.043**   -0.043**   -0.043**   -0.043
                      (0.019)    (0.019)    (0.019)    (0.019)    (0.019)    (0.019)    (0.019)    (0.019)
Round 3               -0.066***  -0.066***  -0.066***  -0.066***  -0.066***  -0.066***  -0.066***  -0.066***
                      (0.025)    (0.025)    (0.025)    (0.025)    (0.025)    (0.025)    (0.025)    (0.025)
Round 4               -0.107***  -0.107***  -0.107***  -0.107***  -0.107***  -0.107***  -0.107***  -0.107***
                      (0.021)    (0.021)    (0.021)    (0.021)    (0.021)    (0.021)    (0.021)    (0.021)
Round 2 × Treatment   -0.023     -0.023     -0.023     -0.023     -0.023     -0.023     -0.023     -0.023
                      (0.020)    (0.020)    (0.020)    (0.020)    (0.020)    (0.020)    (0.020)    (0.020)
Round 3 × Treatment   0.065*     0.065*     0.065*     0.065*     0.065*     0.065*     0.065*     0.065*
                      (0.035)    (0.035)    (0.035)    (0.035)    (0.035)    (0.035)    (0.035)    (0.035)
Round 4 × Treatment   0.039      0.039      0.039      0.039      0.039      0.039      0.039      0.039
                      (0.024)    (0.024)    (0.024)    (0.024)    (0.024)    (0.024)    (0.024)    (0.025)
Baccalaureate grade              -0.002                                                            0.016
                                 (0.021)                                                           (0.019)
CRT score                                   -0.024*                                                -0.020
                                            (0.014)                                                (0.014)
RMET score                                             -0.069**                                    -0.070**
                                                       (0.029)                                     (0.031)
Memory score                                                      -0.053                           -0.021
                                                                  (0.042)                          (0.037)
Represent. score                                                             0.030*                0.030
                                                                             (0.017)               (0.016)
Availability score                                                                      -0.003     -0.018
                                                                                        (0.026)    (0.022)
R² overall            0.138      0.138      0.147      0.152      0.145      0.148      0.138      0.170

Notes: Coefficients from random effect regression models. The dependent variable is the individual ARPE computed for each round (N= 480 per regression). The independent

variables are standardized test scores, round and treatment dummy variables and their interactions. Robust standard errors clustered at rematching cluster level (12 clusters per

condition) are reported. *, **, *** indicate statistical significance at the 10, 5, 1% level, respectively.

Chapter 2

Imperfect Tacit Collusion and

Asymmetric Price Transmission

An earlier version of this chapter is published as: Bulutay, M., Hales, D., Julius, P.

and Tasch, W., 2021. Imperfect tacit collusion and asymmetric price transmission.

Journal of Economic Behavior & Organization, 192, pp.584-599. This article is

licensed under a Creative Commons license (CC-BY 4.0).

2.1 Introduction

The phenomenon of Asymmetric Price Transmission (APT), whereby supplier
prices rise quickly after positive input cost shocks but fall relatively more slowly after
similarly sized negative shocks, has been documented so often in the literature
that we can rightly describe it as a stylized fact.¹ However, while empirical
evidence for the existence of APT is ample, the identification of its causal forces is
not settled. Many theoretical explanations have been proposed, but the empirical
literature has yet to conclusively determine which of these are valid or most
influential.

Empirical studies of APT predominantly examine aggregate-level variables (e.g.
inflation, concentration) proposed to be relevant in the theory literature. The
focus on such variables occurs because firm-level determinants are either not directly
observable, or are not adequately measurable in panel data form. This approach
yields helpful correlations between such variables, but the effort to identify causal
relationships has met with only limited success, most notably in the context of
firm-level underpinnings of the phenomenon. While the search for accurate firm-level
data should certainly be continued, and where discovered used to further inform
our understanding of pricing behavior, experimental methods offer a comparative
advantage: testing theories that involve variables which are unobservable in the
field (e.g. agents’ information sets) lies outside the reach of empirical methods;² if,
however, these same variables can be controlled through experimental design, we can
overcome this obstacle to testing theory.

¹ This is also sometimes termed “positive APT” to distinguish it from the opposite
phenomenon of “negative APT.” In this chapter, we will simply refer to “APT” when we mean
positive APT, except where doing so would create ambiguity. See Section 2.2.1 for an overview of
the evidence.

A question of primary interest is whether tacit collusion drives APT-like pricing

behavior.³ The field data does not convincingly exclude the possibility that market

competitors secretly communicate, given the strong legal and even criminal incen-

tives for firms to conceal – or avoid engaging in – such activities. This provides an

obvious challenge for identification and motivates turning to the controlled setting

of the laboratory, where we can directly observe competitor behavior and credibly

prevent communication between sellers.⁴

An argument put forth by Borenstein et al. (1997) is that a variation of the

“trigger price” model of oligopolistic coordination originally introduced by Green

and Porter (1984) may explain the emergence of APT-type dynamics through tacit

collusion. In their model, when positive shocks occur firms immediately raise prices

in order to preserve profit margins; however, when negative shocks occur firms react

adaptively, holding prices at pre-shock levels until they see convincing evidence that

a rival has cut their prices. Rapidly lowering prices in response to a downward cost

shock could be perceived as defection from a mutually beneficial regime of tacit

collusion, thus inviting retaliation from other firms. In contrast, rapidly raising

prices in response to an upward cost shock poses no such threat to one’s competitors,

and therefore incurs no corresponding risk of retaliation. Although their arguments

are sound and are consistent with a deep empirical literature finding correlations

suggestive of tacit collusion, Borenstein et al. (1997) conclude that they are unable to

conclusively draw support for this hypothesis from their data. As no other empirical

study of which we are aware has accomplished this either, we thus find motivation

to turn to the laboratory to examine the role of tacit collusion in driving the APT

dynamic.⁵

² Meyer and von Cramon-Taubadel (2004) and Frey and Manera (2007) provide extensive
discussions of methodological issues in econometric tests of APT.

³ In this chapter we will use the term “tacit collusion” to mean the phenomenon in which
suppliers coordinate on prices above the competitive equilibrium level, through the channel of
publicly visible pricing alone. Tacit collusion can also take the form of coordination on quantities
below competitive equilibrium levels, but in this chapter we will focus strictly on the role of
coordination on prices.

⁴ Furthermore, the laboratory may be the only environment in which we can reliably detect
collusion, since the non-collusive prices or profits are unavailable without imposition of strong
structural assumptions.

⁵ There are some studies that regress the estimated asymmetry on measures of market
concentration, e.g. Loy et al. (2016). Counter-intuitively, the authors find that asymmetry decreases with
higher concentration in German milk markets. However, it is difficult to associate this estimate
with the causal impact of collusion on APT, as their observed higher concentration index may stem
from higher efficiency or product differentiation rather than from competitor conduct.

A second question of interest is whether the number of competing sellers in a
given market plays a significant role in the realization of the APT phenomenon.
Notably, in his broad study of U.S. wholesale and retail markets, Peltzman (2000) finds
a negative relation between the number of competitors in a market and the
magnitude of APT observed. As with any empirical study, however, this study does not
exclude the possibility that explicit (but unobserved) communication between firms
lies behind this result. Several non-APT focused studies of experimental oligopoly
markets find that there is an inverse relation between the number of sellers in a
market and the size of deviations from the Nash equilibrium (NE) outcomes (for
example, see Huck et al. (2004), Dufwenberg and Gneezy (2000), and Fonseca and
Normann (2012)). However, we are unaware of any experimental study that
specifically studies the role of the number of sellers in driving the APT phenomenon. We
therefore incorporate the number of sellers in our markets as a treatment variable
in our experimental design.

To our knowledge, Bayer and Ke (2018) is the only experimental study that
directly targets the topic of APT. The authors’ study employs a Bertrand duopoly
setting in which sellers’ costs either increase, decrease or stay constant at the halfway
point of the experiment. With two extensions of this baseline condition, they further
test the impacts of search costs and asymmetric information on APT. They find APT
across all treatments, even in the absence of search and information frictions. They
argue that the asymmetry can be explained with a backward-looking learning model:
If a seller fails (manages) to sell the good in the period prior to the shock, it is more
(less) likely that she will adjust her price downwards (upwards) in the following
period. The authors’ results support this regularity when the shock is negative, but
not when it is positive. Hence, although this learning model may account for the
downward rigidity, it falls short of explaining the asymmetry.⁶

While Bayer and Ke (2018)’s study provides a useful benchmark to our own,
our design choices differ substantially from theirs, as we pursue different research
questions. Whereas we aim to assess the roles of cooperative behavior and tacit
collusion in pricing asymmetries, they deliberately try to attenuate their impacts to
isolate the role of learning.⁷ In particular, in their experiment sellers whose stores
are not visited by a buyer receive only limited information on the market price,
due to the feedback structure. In our experiment, we inform sellers of the average
market price of the other sellers, as we want to create the conditions in which price
signalling can be studied more explicitly.

In our experimental setting, subjects play the role of sellers and a computer plays
the role of buyers. Each seller faces demand that linearly decreases with one’s own
price and linearly increases with the average price of others. We vary the size of
groups across sessions as 2, 3, 4, 6, and 10, while calibrating the demand function to
hold the best-response functions of each seller identical across all group sizes. Thus,
we isolate and study the impact of group size on behavior, while holding the
price-based incentives facing individual sellers constant across markets.⁸ Throughout
our experiment, sellers experience a series of input price shocks – either large or
small – that shift the NE price either up or down. Through this design, we are
able (i) to test whether APT emerges despite the absence of market frictions and
information asymmetries that are often theorized to be the causal forces behind
pricing asymmetries; and, (ii) if APT does occur, to assess the impact of the number
of sellers on the magnitude of the resulting asymmetries. To our knowledge, ours is
not only the first experiment that studies the role of the number of sellers in shaping
APT, but also the first to study the pure number effect in a price competition setting.

⁶ Bayer and Ke (2018) also argue that, following positive cost shocks, sellers expect all other
sellers to raise their prices immediately, and so they do the same, while following a negative
cost shock sellers do not see any reason to cut their prices unless and until they subsequently lose
sales. They cite factors such as bounded rationality as explanations for this behavior, but do not
offer a more precise explanation of the channels through which the observed behavior emerges.

⁷ Although Bayer and Ke (2018) exert effort to minimize the role played by tacit collusion with
their study, their typed-stranger matching protocol significantly reduces but does not completely
eliminate the possibility that subjects might repeatedly interact, and thus have the opportunity
to establish reputation over time. By contrast, the perfect-stranger matching protocol, in which

a subject is assured they will be matched with another only once in a session, does more credibly
eliminate this possibility. Moreover, the duopoly setting of their study makes collusion presumably
more achievable, since coordination is easier when there is only one other market participant. As
a result, it is hard to assess the extent of the role that cooperative behavior played in their
study.

⁸ This is akin to the “pure number effect” studied by Isaac and Walker (1988) in the context of
public goods games. See Hanaki and Masiliūnas (2021) for a similar approach in Cournot oligopolies.

Our contributions to the literature are two-fold: First, we document the
prevalence of the APT phenomenon through experiments in which we possess strict control
over the environment. In particular, our results indicate that APT may emerge
even in the absence of market frictions and information asymmetries that are often
theorized to be the causes of pricing asymmetries. This suggests that in markets
with three or more sellers, the presence of agents who attempt to coordinate on
prices via price signaling may suffice for APT pricing dynamics to emerge. In our
duopoly markets, however, our results suggest that coordination can be so
successful that rather than the (positive) APT phenomenon, persistent pricing at collusive
price levels, or even negative APT, may instead result.

Second, keeping incentives the same across differing group sizes, we are able to
isolate and perform hypothesis tests on the pure effect of increased group size on
APT. For markets with three or more sellers, we do not find significant differences in
the magnitude of observed APT. Together, the results of our study support theories
that highlight the role of tacit collusion in APT. We conclude that APT may be the
product of environments in which collusion is significant, but imperfect (i.e., unstable).
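The demand structure described in this introduction can be illustrated with a small numerical sketch. It assumes a linear demand of the form q = a - b·p_own + c·p_avg_others with a constant unit cost; the parameter values a, b, c and the costs are hypothetical and are not the calibration used in the experiment. Because demand depends on rivals only through their average price, the best response is the same for any group size, and a cost shock shifts the symmetric NE price.

```python
def demand(own_price, avg_other_price, a=100.0, b=2.0, c=1.0):
    """Linear demand: falls in own price, rises in the average price of others."""
    return a - b * own_price + c * avg_other_price


def profit(own_price, avg_other_price, cost, a=100.0, b=2.0, c=1.0):
    return (own_price - cost) * demand(own_price, avg_other_price, a, b, c)


def best_response(avg_other_price, cost, a=100.0, b=2.0, c=1.0):
    """Maximizer of profit: first-order condition a - 2*b*p + c*p_avg + b*cost = 0."""
    return (a + b * cost + c * avg_other_price) / (2.0 * b)


def nash_price(cost, a=100.0, b=2.0, c=1.0):
    """Symmetric fixed point p = best_response(p, cost); requires 2*b > c."""
    return (a + b * cost) / (2.0 * b - c)


def average_of_others(prices, i):
    others = prices[:i] + prices[i + 1:]
    return sum(others) / len(others)


# Hypothetical costs before and after a positive input price shock.
ne_before = nash_price(cost=10.0)
ne_after = nash_price(cost=16.0)

# Same best response in a duopoly and in a ten-seller market when rivals charge 40.
br_duopoly = best_response(average_of_others([0.0, 40.0], 0), cost=10.0)
br_ten = best_response(average_of_others([0.0] + [40.0] * 9, 0), cost=10.0)
print(ne_before, ne_after, br_duopoly, br_ten)
```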

2.2 Related Literature

2.2.1 Field Evidence

Bacon (1991) provides an early empirical study suggesting that retail gasoline prices

in the United Kingdom experience faster and more concentrated responses to crude

oil price increases than they do to similar crude oil price decreases. Bacon termed

this phenomenon “Rockets and Feathers,” and since this paper was published dozens

of other researchers have detected the presence of this sort of asymmetry in a variety

of consumer and intermediate goods markets.

Peltzman (2000) provides one of the most comprehensive empirical examinations

of APT. He conducts a broad study of pricing behavior of 77 consumer and 165

producer goods markets in the U.S., and he concludes that in more than two-thirds

of these markets prices rise faster than they fall, in response to input cost changes.

Peltzman also seeks correlations between various features of markets and industries,

and the degree to which evidence of APT is present. Most notably, he finds that

markets with fewer competitors tend to exhibit more pricing asymmetry, while on

the other hand markets with higher levels of concentration tend to be less likely to

exhibit pricing asymmetry, as in Loy et al. (2016). Peltzman’s study, however, does

not provide an explanation for these correlations.

In an early survey of field evidence, Meyer and von Cramon-Taubadel (2004) find

that (excluding Peltzman (2000)’s study), symmetry in price response is rejected in

almost one-half of all cases in the literature. Their survey also shows that different

test methods yield highly varying rejection rates (between 6% and 80%). Frey

and Manera (2007) and Perdiguero-García (2013) provide meta-regression analyses

with more comprehensive and recent data sets. Both studies confirm that APT is

very likely to occur but also emphasize the variation of reported outcomes. Their

results show that this heterogeneity can be explained by certain characteristics of the

data (e.g., data frequency) and of the employed econometric model. Most notably,

Perdiguero-García (2013) reports that the asymmetry tends to decrease in more

competitive segments of the industry.

2.2.2 Theoretical Explanations

There is a growing body of literature on the theoretical accounts of APT, an
unsurprising fact given that pricing asymmetries are not predicted by standard price
competition models.⁹ These studies propose explanations of APT mainly by
introducing market frictions, information asymmetries or boundedly rational agents
into the underlying models. One reason for the variety of explanations is that
these studies typically focus on specific market
structures (e.g., wholesale petroleum markets) and their idiosyncrasies. In this
subsection, we review some of these studies in an attempt to categorize them as well as
to highlight discrepancies.¹⁰

Borenstein et al. (1997) consider the role of search costs in facilitating APT.

They hypothesize that negative cost shocks in the presence of costly search provide

firms temporary pricing power, which they then use to delay reductions in prices,

yielding temporarily superior profits. Benabou and Gertner (1993) and Yang and
Ye (2008) also develop explanations based on consumer search costs, combined with
the volatility of input costs. They reason that volatility should reduce search incentives
for consumers; producers, realizing that they face temporarily more inelastic demand
and hence greater pricing power over the short term, respond by
reducing prices more slowly. Reagan and Weitzman (1982) and Borenstein and

Shepard (1996) propose explanations based on inventory costs, reasoning that it is

relatively more costly for manufacturers and suppliers facing capacity constraints or

sharply rising short-term production costs to deal with unanticipated increases in

demand resulting from price drops, than it is to respond to corresponding drops in

demand due to price increases. Ball and Mankiw (1994) consider a menu-cost model

in conjunction with positive trend inflation as an explanation of APT. In another

study, Ahrens et al. (2017) show that the presence of consumers with loss aversion

may explain why prices are more sluggish to adjust downwards than upwards in

response to permanent demand shocks.

The various explanations and models described above provide differing
implications for government policy: if APT occurs due to collusion, there may be room for
regulation to improve economic efficiency; if however APT is primarily caused by
the presence of inventory costs, asymmetric menu costs, or search costs, then
regulation that controls pricing behavior may actually induce inefficiency rather than
attenuate it. Given the robust evidence of the widespread existence of APT and its
non-trivial magnitude and impact on consumer outcomes, identifying which theories
best describe the asymmetric pricing behavior is key to determining effective public
policy.

⁹ A notable exception is the case of Markov-perfect equilibria, and in particular the case of the
Edgeworth cycle. In this phenomenon, firms undercut each other’s prices successively until prices
approach marginal cost; at this point, one of the firms decides with some positive probability to
spike its price, and once this occurs the cycle is repeated, yielding each firm positive economic
profits. Maskin and Tirole (1988) further show that these cycles provide a case where asymmetric
pricing can be sustained in equilibrium. However, the Edgeworth cycle model requires that firms
make price decisions alternately; the model does not support an equilibrium when price decisions
are made simultaneously or continuously. Moreover, the emergence of the phenomenon seems in
practice to be limited to environments in which competitors rapidly and publicly change prices (see,
for example, Byrne and De Roos (2019) for an interesting case in Perth, Australia petrol markets,
in which a government mandate for retail suppliers to publish their prices daily seems to have
facilitated the emergence of a weekly cycle of Edgeworth-like pricing dynamics that persisted for
many years). Thus the Edgeworth cycle model arguably applies only to a relatively narrow range
of market contexts.

¹⁰ For more exhaustive surveys of theoretical explanations, see Meyer and von Cramon-Taubadel
(2004) and Brown and Yucel (2000).

2.2.3 APT and Experiments

Despite the many possible explanations that have been proposed, the empirical lit-

erature yields only mixed evidence that is often inconclusive due to identification

issues. This suggests there is room for further research to shed light on the phe-

nomenon. We consider the advantages of experimental methods in isolating and

studying causal determinants of APT.11 In this subsection, we summarize the
literature most relevant to our study.

There are two studies of which we are aware – in addition to Bayer and Ke

(2018) – that conduct market experiments with APT-related results. Deck and

Wilson (2008) investigate gasoline markets and find that retail prices adjust asym-

metrically to changes in station costs in zones with clustered stations, but not in

zones with stations that are relatively isolated from competitors. Cason and Fried-

man (2002) find weak evidence of APT in posted offer markets where customers

incur switching costs. While these studies examine their findings on APT, their

experimental designs are optimized to investigate questions regarding the structure

of gasoline markets (e.g., zone pricing, divorcement) and of consumer markets (e.g.,

switching costs), not to identify causes of APT. In particular, sellers’ costs in both

experiments follow random-walk shocks, which may not be salient enough to detect

APT. Our study distinguishes itself from this strand of the literature by examining APT
with larger, persistent shocks.12

Apart from studies that directly target APT, price competition experiments

that study the impact of group size on tacit collusion are also relevant to this chapter.
Dufwenberg and Gneezy (2000) provide early evidence of such a relation

through an oligopoly game that corresponds to a discrete version of the Bertrand

model. They find that winning prices tend to converge to NE levels in groups of

11The usage of experimental methods in macroeconomic research is becoming more and more

prevalent. See Duffy (2017) and Cornand and Heinemann (2019a) for recent surveys.

12Fehr and Tyran (2001) also employ large positive and negative shocks and report APT-like

behavior in a price-setting game. However, the authors do not analyze the phenomenon, nor do they

probe its implications. In another related experimental study, Duersch and Eife (2019) consider

Bertrand duopolies with zero marginal cost in either inflationary, deflationary or constant price

environments. They find that real prices are significantly lower in the inflationary environment

compared to non-inflationary environments.

three or four competitors, but stay consistently high in duopolies. Morgan et al.

(2006) find that increasing the number of sellers from 2 to 4 decreases the prices

paid by some consumers (the ones informed about the entire distribution of prices)

but not for others (the ones who buy with motives other than prices). Abbink and

Brandts (2008) also find that there is a negative relationship between the number

of competing firms and price levels.13 Nevertheless, as in Dufwenberg and Gneezy

(2000), they find that collusive pricing is the modal outcome in duopolies. Fonseca

and Normann (2012), Orzen (2008), Davis (2009) and Horstmann et al. (2018) pro-

vide further evidence that collusive prices are very likely to be observed in duopolies.

Average prices come considerably closer to the NE in the baseline condition of

these studies (fixed matching, no communication, symmetric sellers etc.) when the

number of sellers is 3 or greater.

A finding common to each of these studies is that persistent coordination over

collusive prices is unlikely in markets other than duopolies. This, however, does

not preclude the possibility that players might manage to coordinate temporarily

on high prices following negative shocks. Experiments also indicate that increasing

the number of sellers often leads to more competitive outcomes (in terms of price

and output), which in turn should make APT less likely. However, the meta-analyses
of Fiala and Suetens (2017) and Horstmann et al. (2018) on oligopoly

experiments indicate that there may not be a linear relationship between the number

of competing firms and the degree of tacit collusion. On the one hand, Horstmann

et al. (2018) argue that this result stems from the relatively small number of studies

that provide pairwise comparisons and the lack of statistical power in these studies.

On the other hand, Hanaki and Masili¯unas (2021) consider, as we do, the fact that

a change in group size simultaneously influences the difficulty of coordination (the
"pure number effect") and the incentives to collude. They find that
the pure effect of group size is small, if it exists at all. Our study contributes to
the literature by improving along these axes, in particular, by changing the

group size without changing the incentives provided to individual sellers in different

markets in a price competition game.

13Their results are particularly interesting since in their price competition setting, there exist

multiple equilibria.

2.3 Method

2.3.1 Pricing Game

We develop a price competition game with a linear demand model and employ this

in our experimental markets.14 In this setting, the demand facing seller $i \in N$ in
period $t \in T$ is equal to

$$q_{i,t}(p_{i,t}, p_{-i,t}; \delta, \gamma) = \begin{cases} \delta - \gamma \cdot (p_{i,t} - p_{-i,t}), & p_{i,t} \in [p^{min}, \bar{p}] \\ 0, & \text{otherwise} \end{cases} \qquad (2.1)$$

where $\delta$ and $\gamma$ are parameters of demand, $p_{i,t}$ is the price set by seller $i$ and $p_{-i,t}$
is the average price chosen by the rival sellers in the same market (i.e., $p_{-i,t} \equiv \frac{1}{N-1} \sum_{j \neq i} p_{j,t}$)
at period $t$, and $\bar{p} = \min\{p^{max},\; p_{-i,t} + \delta/\gamma\}$. Parameters $p^{min}$ and
$p^{max}$ refer to the price floor and the reservation price of the representative consumer,
respectively. Variable $\bar{p}$ regulates the maximum price level below which (conditional
on the average price of other sellers) $q_{i,t}$ takes on strictly positive values.15

The model of linear demand that we use, with own- and cross-price parameters

equal in magnitude, is micro-founded by the Spokes models of Chen and Riordan

(2007) and Bos and Vermeulen (2022), which are themselves adaptations of Salop’s

canonical Circular City address model (Salop, 1979). In addition, this specification

of demand can also be motivated by a limiting case of quasi-linear quadratic utility

(see Appendix B.1 for an exposition). Note that any demand specification with linear

parameters on the price of each competing good can be equivalently represented in

terms of own-price and the (weighted) average price of other sellers, as in (2.1).

Thus the presence of the average price $p_{-i,t}$ in this equation does not imply that
consumers, when contemplating their appetite for good $i$, consider the average
price set by the firms competing with firm $i$; rather, it implies that the aggregate effect
of all consumers' individual demand for good $i$ is based on the entire vector of prices,
but can nevertheless be represented concisely in terms of the own price
$p_i$ and the average price of all other sellers, $p_{-i}$. This fact will be helpful in

keeping both our analysis and our experimental design tractable, as we will shortly

see.
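The demand rule in (2.1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the three-seller price vector and the default price floor of 0.50 are hypothetical, while the values of δ, γ and the maximum price are taken from the calibration in Table 2.1.

```python
DELTA, GAMMA, P_MAX = 8.50, 7.275, 3.00  # calibration reported in Table 2.1


def rival_average(prices, i):
    """Average price of every seller in the market except seller i."""
    others = prices[:i] + prices[i + 1:]
    return sum(others) / len(others)


def demand(p_own, p_rivals, p_min=0.50):
    """Demand for one seller as in (2.1); p_min is an assumed price floor."""
    p_bar = min(P_MAX, p_rivals + DELTA / GAMMA)  # upper bound of the price range in (2.1)
    if p_min <= p_own <= p_bar:
        return DELTA - GAMMA * (p_own - p_rivals)
    return 0.0


# Hypothetical three-seller market (prices chosen for illustration only).
prices = [2.00, 2.20, 1.90]
quantities = [demand(p, rival_average(prices, i)) for i, p in enumerate(prices)]
total_demand = sum(quantities)  # deviations from rivals' averages cancel in aggregate
```

In this sketch each seller's quantity depends only on the own price and the rivals' average, so the same function applies unchanged to markets of any size.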

14Linear demand systems have a number of advantages over alternatives and have long been applied

in the industrial organization literature. Such systems lead to closed-form best-response functions

and Nash Equilibrium specifications, greatly enabling interpretation of empirical or experimental

results. To the extent that non-linear systems of demand (e.g. the Almost Ideal Demand System

of Deaton and Muellbauer (1980) or the Relative Love of Variation model of Zhelobodko et al.

(2012)) may be preferred on theoretic or other grounds, linear systems provide approximations

with first-order accuracy.

15In practice subjects chose prices that were revealed to be above this value in fewer than 0.2%

of all cases.

Given the own-demand specification in (2.1), seller profits are calculated as

$$\pi_{i,t} = (p_{i,t} - mc_t) \cdot q_{i,t} - f, \qquad (2.2)$$

where $q_{i,t}$ is the quantity demanded from seller $i$ as defined in (2.1), $mc_t$ is the marginal cost,
which shifts every $T$ periods that comprise a round (denoted $r \in R$), and $f$ is the fixed
cost. Sellers set their prices in each period simultaneously from a discrete set that
is bounded as $p_{i,t} \in [mc_t, p^{max}]$, such that the price floor is equal to the marginal
cost of that round.16

In the described finitely repeated game, we can express the maximization problem as

$$\max_{p_i} \sum_{h=0}^{T-t} \beta^{h} E_{i,t-1}\, \pi_{i,t+h} \quad \text{subject to} \quad p_{i,t+h} \in [mc_{t+h}, p^{max}]. \qquad (2.3)$$

Sellers thus maximize the expected discounted sum of profits over $T$ periods by
choosing a vector of prices $p_i$.17 This in turn leads to the best-response function

$$p^{BR}_{i,t} = \frac{1}{2}\left(mc_t + \frac{\delta}{\gamma} + E_{i,t-1}[p_{-i,t}]\right). \qquad (2.4)$$

The current period's marginal cost $mc_t$ is revealed to the sellers prior to their pricing
decisions and is thus outside the expectation operator. Sellers form expectations of the
prices their competitors will set during the current period ($E_{i,t-1}[p_{-i,t}]$) by conditioning
on all available information. The system of best responses for all sellers
implied by (2.4) solves for steady-state prices as:

$$p^{*}_{i,t} = p^{BR}_{i,t}\left(E_{i,t-1}[p^{*}_{-i,t}]\right) = mc_t + \frac{\delta}{\gamma} \equiv p^{NE}_t. \qquad (2.5)$$

This price level also corresponds to the unique stage-game NE ($p^{NE}_t$) and the unique
subgame perfect Nash equilibrium (SPNE). Sellers may achieve the joint profit maximum
(JPM) if they each set their prices to the maximum price $p^{max}$.
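To make the link between the best-response function (2.4) and the NE price (2.5) concrete, the following Python sketch iterates symmetric best responses and compares profits at the NE and at the JPM. It is an illustration only: the function names are hypothetical, the profit function ignores the upper price bound in (2.1) for simplicity, and the parameter values are those of Table 2.1.

```python
DELTA, GAMMA, FIXED_COST, P_MAX = 8.50, 7.275, 1.0, 3.00  # Table 2.1


def profit(p_own, p_rivals, mc):
    """Per-period profit as in (2.2), with demand floored at zero."""
    quantity = max(DELTA - GAMMA * (p_own - p_rivals), 0.0)
    return (p_own - mc) * quantity - FIXED_COST


def best_response(expected_rivals, mc):
    """Best response (2.4), restricted to the admissible range [mc, P_MAX]."""
    price = 0.5 * (mc + DELTA / GAMMA + expected_rivals)
    return min(max(price, mc), P_MAX)


def nash_price(mc):
    """Fixed point of the best-response mapping, as in (2.5)."""
    return mc + DELTA / GAMMA


def iterate_best_responses(p_start, mc, steps=50):
    """All sellers repeatedly best-respond to a common last-period price."""
    price = p_start
    for _ in range(steps):
        price = best_response(price, mc)
    return price


mc = 0.90                                    # marginal cost in Round 1
p_limit = iterate_best_responses(P_MAX, mc)  # start from the JPM price
profit_ne = profit(nash_price(mc), nash_price(mc), mc)
profit_jpm = profit(P_MAX, P_MAX, mc)
profit_deviation = profit(best_response(P_MAX, mc), P_MAX, mc)
```

Under these assumptions the iteration settles at the NE price, the JPM yields a higher profit than the NE, and a unilateral best response to the JPM price yields more still, which is the strategic tension the experiment builds on.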

In this pricing game, neither own-demand nor own-profit depends on the number
of sellers. These depend only on own-price and the average price of rival sellers.
The best-response action is also independent of $N$ for a wide range of expectation
models, including rational expectations and adaptive expectations (Evans and
Honkapohja, 2001). This feature ensures that the incentives given to the sellers of
different group sizes are matched and the market power of each seller, measured by
the size of the markup over marginal cost, is ex-ante equal. We consider this necessary
for ensuring a ceteris paribus comparison between the treatment conditions
and for capturing the pure number effect.

16As a practical matter we needed to set a price floor of $p_{i,t} \geq mc_t$ to ensure subjects would not
complete the experiment with a negative payoff.

17Although the model assumes that sellers choose a vector of prices for all periods, subjects only
submit a price decision for the current period in the experiment.

2.3.2 Experimental Design

Sellers interact repeatedly in the described pricing game for $R$ rounds, which are each
composed of $T$ periods. Marginal cost $mc_t$ fluctuates across rounds, modeling large

exogenous cost shocks, but remains invariant during a round. Our experimental

manipulations consist of varying the size of markets across sessions in a between-

subjects design, and of varying the size and direction of shocks across rounds in a

within-subjects design. We implement a fixed-matching protocol during a session.

The calibration of the experimental game is summarized in Table 2.1. The exper-

iment consists of 5 rounds of 15 periods each, with a new marginal cost announced

at the beginning of each round. The sequence of shocks is identical across all treat-

ments: marginal cost starts at $0.90 in Round 1, drops to $0.50 in Round 2, rises to

$1.30 in Round 3, falls again to $0.50 in Round 4, then rises to $0.90 for Round 5.

Table 2.1: Experimental design parameters

General parameters
  Number of periods per round      T = 15
  Number of rounds per session     R = 5
  Demand parameters                δ = 8.50, γ = 7.275
  Fixed cost                       f = 1
  Maximal/reservation price        p^max = 3

Varying parameters
  Group size across treatments     N ∈ {2, 3, 4, 6, 10}
  Marginal cost across rounds      mc: (0.90, 0.50, 1.30, 0.50, 0.90)
  Cost shock sequence              ∆mc ≡ η: (−0.40, +0.80, −0.80, +0.40)
  NE price across rounds           p^NE: (2.07, 1.67, 2.47, 1.67, 2.07)
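The last two rows of Table 2.1 follow directly from the marginal cost sequence and the demand parameters. A short Python check, with variable names chosen here for illustration, reproduces them from the NE price formula $p^{NE} = mc + \delta/\gamma$:

```python
DELTA, GAMMA = 8.50, 7.275                      # demand parameters (Table 2.1)
MARGINAL_COSTS = [0.90, 0.50, 1.30, 0.50, 0.90]  # one value per round

# NE price in each round: marginal cost plus the constant markup delta / gamma.
ne_prices = [round(mc + DELTA / GAMMA, 2) for mc in MARGINAL_COSTS]

# Cost shock at the start of rounds 2 to 5: change relative to the previous round.
shocks = [round(new - old, 2) for old, new in zip(MARGINAL_COSTS, MARGINAL_COSTS[1:])]
```

Because the markup δ/γ is constant, every cost shock shifts the NE price one-for-one.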

2.3.3 Procedures

Experimental sessions were conducted at the University of California, Santa Bar-

bara’s Experimental and Behavioral Economics Laboratory (EBEL) using the z-Tree

platform (Fischbacher, 2007), between September and December of 2018. A total

of 245 subjects were recruited from the experimental economics subject pool of the

same university, using the ORSEE tool (Greiner, 2015). Subjects were allocated

to markets of size 2, 3, 4, 6 and 10, with a total of 36, 39, 52, 48 and 70 subjects

assigned to each group size condition, respectively. This setup yields 59 independent

markets for the analysis.18

At the beginning of each experiment, subjects are provided written instructions

which are also read to them aloud by an experimenter. Subjects then proceed to

take a short comprehension quiz.19 In the main part of the experiment, each subject

plays the role of a seller and makes a series of 75 pricing decisions, whereas consumer
behavior is simulated by computer. We also elicit subjects' one-period-ahead expectations
about the average price chosen by rival sellers (i.e., $E_{i,t-1}[p_{-i,t}]$). These

expectations are not rewarded separately, to avoid creating hedging issues. Subjects

are able to set a price between the marginal cost and the maximum price (of $3.00),

in increments of $0.01. Once all subjects set their prices and expectations, they are

individually notified by the computer of the average price established by the others

in their market, reminded of their own price, and shown their own resulting payoff

for that period. Subjects are able to track the previous values of these outcomes

through a history box that is available on their screen (see Appendix B.2).

We notify subjects that a new cost shock will occur at the beginning of each

new round, either an increase or decrease, of either $0.40 or $0.80. We reveal the

magnitude and direction of each shock immediately prior to the first period of each

respective round. At that time, we also hand out copies of a printed payoff table

corresponding to the new marginal cost. These tables assist subjects in estimating

the profits they will receive, conditional on the hypothetical prices they and others

may set in each period of that round (see Appendix B.2).

Sessions lasted a total of 90 to 125 minutes. Subjects were paid $18.66 on average

(a minimum of $10.89 and a maximum of $28.50), which includes the $5.00 show-up

fee and $3.00 for the completion of the optional survey (no subject declined this

offer). The remaining payoff is determined as the average payoff of a randomly

chosen round of the game.

2.4 Hypotheses

As the experimental design specifically avoids any of the features outlined in Section

2.2.2 (e.g., frictions, information asymmetries), standard theory suggests that prices

react symmetrically to shocks. However, we may observe asymmetry if certain

conditions are satisfied even in the absence of these frictions. To unravel these

conditions, we need to investigate the strategic tension underlying our pricing game.

18In one session (20 subjects), the data from the final period is lost due to technical reasons. All

the analysis in the results section is performed based on all the available data.

19We reviewed answers for each subject and provided explanations where needed. See Appendix

B.2 for all experimental material.

On the one hand, individual incentives promote undercutting opponents’ prices until

the NE price is reached. On the other hand, sellers may generate higher profits if

they successfully sustain coordination at a price level above NE.

We first focus on the incentive each individual seller has to undercut other sellers.

Consider the following variable:

$$\text{Inc2Dev}(p_{-i,t}) \equiv \pi_{i,t}\left(p^{BR}_{i,t}(p_{-i,t-1}),\, p_{-i,t-1}\right) - \pi_{i,t}\left(p_{-i,t-1},\, p_{-i,t-1}\right), \qquad (2.6)$$

which expresses the incentive of a myopic seller to deviate from the coordinated

price level from the previous period, $p_{-i,t-1}$.20 This seller is myopic as s/he ignores

the possibility that the competitors might also engage in similar reasoning. By

substitution of (2.1), (2.2), and (2.4) we can rewrite (2.6) as

$$\text{Inc2Dev}(p_{-i,t}) = \frac{\gamma}{4}\left(p_{-i,t-1} - \delta/\gamma - mc_t\right)^2 = \frac{\gamma}{4}\left(p_{-i,t-1} - p^{NE}_t\right)^2 \qquad (2.7)$$

The incentive to deviate from price $p_{-i,t-1}$ during period $t$ thus increases quadratically
with the difference between the previous period's market price and the current
period's NE price. Thus, individual incentives promote convergence to the NE.
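The closed form in (2.7) can be checked numerically against the definition in (2.6). The Python sketch below does this for a few hypothetical price levels; the function names and the test prices are illustrative assumptions, the parameters come from Table 2.1, and price bounds are ignored for simplicity.

```python
DELTA, GAMMA, FIXED_COST = 8.50, 7.275, 1.0  # Table 2.1


def profit(p_own, p_rivals, mc):
    """Profit as in (2.2) with the linear demand of (2.1), ignoring price bounds."""
    return (p_own - mc) * (DELTA - GAMMA * (p_own - p_rivals)) - FIXED_COST


def best_response(p_rivals, mc):
    """Myopic best response (2.4) to last period's rival average."""
    return 0.5 * (mc + DELTA / GAMMA + p_rivals)


def inc2dev_direct(p_prev, mc):
    """Definition (2.6): gain from best-responding instead of matching p_prev."""
    return profit(best_response(p_prev, mc), p_prev, mc) - profit(p_prev, p_prev, mc)


def inc2dev_closed_form(p_prev, mc):
    """Closed form (2.7): quadratic in the gap between p_prev and the NE price."""
    p_ne = mc + DELTA / GAMMA
    return GAMMA / 4 * (p_prev - p_ne) ** 2


# Hypothetical pre-shock prices evaluated at the Round 2 marginal cost.
gaps = [abs(inc2dev_direct(p, 0.50) - inc2dev_closed_form(p, 0.50))
        for p in (1.50, 1.67, 2.07, 2.50, 3.00)]
```

The incentive is zero only at the NE price and grows with the squared distance from it, on either side.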

We now separately analyze Inc2Dev in reaction to positive and negative shocks.

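Consider the case where a shock of magnitude $\eta \equiv |mc_t - mc_{t-1}|$ occurs in period $t$.
Then, application of (2.5) allows us to express $p^{NE}_t = p^{NE}_{t-1} + \eta$ and $p^{NE}_t = p^{NE}_{t-1} - \eta$
for positive and negative shocks, respectively, and we thus define:

$$\begin{aligned} \text{Inc2Dev}^{+}_{i,t} &\equiv \frac{\gamma}{4}\left(p_{-i,t-1} - (p^{NE}_{t-1} + \eta)\right)^2 \\ \text{Inc2Dev}^{-}_{i,t} &\equiv \frac{\gamma}{4}\left(p_{-i,t-1} - (p^{NE}_{t-1} - \eta)\right)^2. \end{aligned} \qquad (2.8)$$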

Straightforward manipulation of (2.8) allows us to express the difference in the
incentives to deviate as:

$$\Delta\text{Inc2Dev}_{i,t} \equiv \text{Inc2Dev}^{+}_{i,t} - \text{Inc2Dev}^{-}_{i,t} = -\gamma\eta \cdot (p_{-i,t-1} - p^{NE}_{t-1}). \qquad (2.9)$$

This equation implies that when market prices are above (below) NE immediately
prior to a cost shock, the incentive to deviate following the shock will be greater
(lesser) if the shock is negative than if it is positive. When, however, $p_{-i,t-1} \to p^{NE}_{t-1}$,
or when sellers are not myopic, the difference in the incentives to deviate converges to zero.
We consequently consider the absence of APT as our first null hypothesis:

Hypothesis 1: Prices respond symmetrically to (equally sized) positive and

negative shocks.

20We are indebted to an anonymous reviewer for proposing analysis using this approach.

We can test this hypothesis by exploiting the exogenous within-subjects treatment

variations in marginal cost.
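The step from (2.8) to (2.9) can be spelled out by expanding the two squares. The shorthand $x \equiv p_{-i,t-1} - p^{NE}_{t-1}$ for the pre-shock gap to the NE price is introduced here for illustration only:

```latex
\begin{aligned}
\Delta\text{Inc2Dev}_{i,t}
  &= \frac{\gamma}{4}\left[(x - \eta)^2 - (x + \eta)^2\right] \\
  &= \frac{\gamma}{4}\left[-4x\eta\right] \\
  &= -\gamma\eta\, x .
\end{aligned}
```

As a numerical illustration with $\gamma = 7.275$ from Table 2.1 and a hypothetical pre-shock gap of $x = 0.10$, a large shock ($\eta = 0.80$) gives $\Delta\text{Inc2Dev} = -7.275 \cdot 0.80 \cdot 0.10 = -0.582$. The myopic incentive to deviate is therefore larger after the negative shock than after the positive one whenever prices start above the NE.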

An interesting implication of equation (2.9) is that if prices are already collusive

prior to the shock, the relatively greater temptation to deviate from the coordinated

price following negative shocks suggests we should observe negative APT as opposed

to the positive APT that Borenstein et al. (1997) and many others have observed.

The arguments of myopic best-response and of tacit collusion, explained in the
next hypothesis, therefore point in opposite directions.

Our second hypothesis concerns the second force underlying this strategic ten-

sion: the prospect of sustained higher profits through tacit collusion. We use market

power, measured through the size of markups over marginal cost, to study collusion

(see Section 2.5.2 for the description of the exact measure). Markups should be

invariant to the number of rival sellers, given that both the profit and best-response

functions are independent of group size. Moreover, in the absence of frictions and
of any means for competitors to communicate, the theory predicts a constant markup

for all levels of marginal cost. However, if tacit collusion occurs, we expect to ob-

serve higher market power (i) in markets with fewer sellers, and (ii) in the periods

occurring soon after negative shocks. For (i), we expect to observe persistent coor-

dination more often where there are fewer sellers to dampen the strength of price

signals. Moreover, it is arguably easier to sustain coordination above NE pricing

when one has fewer competitors, as any one of them can undermine joint coordi-

nation if they individually fail to cooperate. For (ii), we conjecture that negative

shocks boost the market power of sellers (at least temporarily), as such shocks may

play the role of a coordination device for attempts at collusion. We can test this

hypothesis by using the between-subjects treatment variations in group size, and

within-subject treatment variations in marginal cost.

Hypothesis 2: Sellers’ market power is invariant to the number of sellers in the

market, and is unaffected by the existence of periodic shocks.

Finally, our third hypothesis concerns individual pricing strategies. The Rational

Expectations Hypothesis (REH) of Muth (1961) admits the possibility of expectation
errors at the individual level, although these should tend to cancel out in aggregate.
Also, after observing $t-1$ periods of price history, a seller may learn that the others
do not behave consistently with the predictions of REH. Nevertheless, conditional on
expectations, sellers should select the best-response action as this maximizes their
profit. As we elicit subjects' guesses on the average price set by others, we can test
this hypothesis without assuming a specific expectation model. We can interpret
any intentional deviation from the profit-maximizing action as an attempt to reach
collusive prices.

Hypothesis 3: Conditional on expectations, pricing behavior follows the best-

response function.

2.5 Results

Figure 2.1: Average pricing behavior across periods and group sizes.
(a) Average of all market prices. Horizontal axis: periods (0 to 75); vertical axis: prices. Series: Average, JPM, MC, NE.
(b) Average market prices by group size. Horizontal axis: periods (0 to 75); vertical axis: price. Series: N=2, N=3, N=4, N=6, N=10, JPM, NE.

Figure 2.1 provides a depiction of the average price per period, as the average

of all market prices and as broken out by group size. Here, market price refers to
the average of all prices in market $m$ (i.e., $p_{m,t} = \frac{1}{N} \sum_{i=1}^{N} p_{i,t}$). The reader can

readily discern that for groups of size 3 and greater, average prices rise rapidly after

positive cost shocks, while they fall more slowly after negative cost shocks. By

contrast, for groups of size 2, it is not immediately obvious whether average pricing

behavior is affected by cost shocks. A second observation that is immediately clear

is that average prices are generally above the NE price, with deviations being higher

following negative shocks than when following positive shocks. Overall, the visual

inspection of the data suggests the presence of market behavior consistent with

APT.

2.5.1 Estimation of the Asymmetry

We follow Peltzman (2000) and estimate the coefficients of the distributed lag model

(DLM) to measure the magnitude of APT. This model can be expressed as:

$$\Delta p_{i,t} = \sum_{k=0}^{K} b_{t-k} \cdot \Delta mc_{t-k} + \sum_{k=0}^{K} c_{t-k} \cdot \left(I[\Delta mc_{t-k} > 0] \cdot \Delta mc_{t-k}\right) + \epsilon_{i,t} \qquad (2.10)$$

where the change in output price (i.e., $\Delta p_{i,t} = p_{i,t} - p_{i,t-1}$) is modelled as a function
of the lagged changes in marginal cost (i.e., $\Delta mc_{t-k}$). The indicator variable
$I[\Delta mc_{t-k} > 0]$ takes the value 1 if the change in marginal cost in period $t-k$ is
positive and 0 otherwise. The sum of interaction coefficients $\sum_{k=0}^{K} c_{t-k}$
reflects the magnitude of asymmetry and its persistence over $K$ periods.
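To illustrate how the regressors of (2.10) are built and how the asymmetry is read off the interaction coefficients, the following Python sketch estimates the model by plain least squares on a noise-free, synthetic price path. Everything here is an assumption made for illustration: the helper names, the shortened rounds of five periods, and the made-up coefficients. It omits the controls and the clustered standard errors used in the actual estimation.

```python
def dlm_rows(prices, costs, K):
    """Build (y, x) pairs for (2.10): x holds K+1 cost lags, then their positive parts."""
    d_price = [b - a for a, b in zip(prices, prices[1:])]
    d_cost = [b - a for a, b in zip(costs, costs[1:])]
    rows = []
    for t in range(K, len(d_cost)):
        lags = [d_cost[t - k] for k in range(K + 1)]
        positive = [max(x, 0.0) for x in lags]  # I[dmc > 0] * dmc
        rows.append((d_price[t], lags + positive))
    return rows


def ols(rows):
    """Solve the normal equations X'X b = X'y by Gauss-Jordan elimination (no intercept)."""
    n = len(rows[0][1])
    a = [[sum(x[i] * x[j] for _, x in rows) for j in range(n)]
         + [sum(x[i] * y for y, x in rows)] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def simulate_prices(costs, b, c, p0=2.0):
    """Hypothetical price path generated exactly by (2.10) without an error term."""
    d_cost = [y - x for x, y in zip(costs, costs[1:])]
    prices = [p0]
    for t in range(len(d_cost)):
        dp = 0.0
        for k in range(len(b)):
            if t - k >= 0:
                shock = d_cost[t - k]
                dp += b[k] * shock + c[k] * max(shock, 0.0)
        prices.append(prices[-1] + dp)
    return prices


# Cost path with the shock sequence of Table 2.1, shortened to 5 periods per round.
costs = [0.90] * 5 + [0.50] * 5 + [1.30] * 5 + [0.50] * 5 + [0.90] * 5
true_b, true_c = [0.3, 0.2], [0.6, 0.1]  # made-up coefficients, not estimates
prices = simulate_prices(costs, true_b, true_c)

estimates = ols(dlm_rows(prices, costs, K=1))
b_hat, c_hat = estimates[:2], estimates[2:]
cumulative_asymmetry = sum(c_hat)  # sum of interaction coefficients over K periods
```

Because the synthetic data contain no noise, the estimates recover the made-up coefficients, and the cumulative asymmetry equals the sum of the interaction terms.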

Figure 2.2: Cumulative response to shocks
Horizontal axis: periods after shock (0 to 4); vertical axis: cumulative asymmetry since the shock (0 to 1).
Notes: Cumulative response after $K$ periods. Dots refer to $\sum_{k=0}^{K} c_{t-k}$. Lines represent 95% confidence intervals.

We estimate model (2.10) with Ordinary Least Squares (OLS) regressions in

a step-wise manner. Figure 2.2 reports the estimated asymmetry for K= 4.21

21We report the full set of results in Appendix B.3. All estimations employ robust standard

errors that are clustered at market level. We also include a set of indicator variables that are

specific to each group size (i.e., $I[N = s]$), the lagged change in the average price of rival sellers
(i.e., $\Delta p_{-i,t-1}$), a three-way interaction term (between $I[\Delta mc_{t-k} > 0]$, $\Delta mc_{t-k}$ and group size
specific indicator variables) and autoregressive terms amongst the set of regressors to check the

robustness of estimates. The significance of asymmetry coefficients as well as their magnitude are

Estimates indicate that the APT is both strong and persistent. Immediate price

reactions are 32.9 cents greater in magnitude for positive than for negative 80-cent

shocks.

We now assess the reaction of prices to equally sized shocks between our treatment
groups with non-parametric tests. We compare the immediate pass-through rates
of shocks, $\beta^{+}_{0}$ and $\beta^{-}_{0}$, calculated from:

$$\begin{aligned} p^{+}_{i,t+\tau} &= p^{+}_{i,t-1} + \beta^{+}_{\tau}\, \eta^{+} \\ p^{-}_{i,t+\tau} &= p^{-}_{i,t-1} + \beta^{-}_{\tau}\, \eta^{-} \end{aligned} \qquad (2.11)$$

where $\eta^{+}$ ($\eta^{-}$) reflects either the large or small positive (negative) shock and $t-1$
corresponds to the period just before the shock. Note that aggregate market demand
implied by (2.1) is perfectly inelastic, so that shifts in the NE price following cost
shocks are exactly equal to the magnitude of the cost shock itself. Accordingly, we
can denote the cases $\beta^{+}_{0} = 1$ and/or $\beta^{-}_{0} = 1$ as incidences of "full pass-through" of
cost shocks. The ratio of $\beta^{+}_{0}$ and $\beta^{-}_{0}$ thus conveys information on the degree of APT
in immediate cost-shock responses. A ratio of 1 would indicate the absence of APT.
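The pass-through rates in (2.11) amount to the price change since the pre-shock period divided by the shock. A minimal Python sketch, assuming the shock is entered with its sign (so that a falling price after a negative shock yields a positive rate) and using hypothetical prices:

```python
def pass_through(price_before, price_after, shock):
    """Rate beta solving price_after = price_before + beta * shock, as in (2.11)."""
    return (price_after - price_before) / shock


# Hypothetical seller: price just before the shock and in the shock period (tau = 0).
beta_pos = pass_through(price_before=2.00, price_after=2.60, shock=+0.80)
beta_neg = pass_through(price_before=2.60, price_after=2.40, shock=-0.80)

# A ratio above 1 means prices react more strongly to the positive shock.
asymmetry_ratio = beta_pos / beta_neg
```

In this made-up example three quarters of the positive shock but only one quarter of the negative shock is passed through immediately, so the ratio is roughly 3.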

Table 2.2 provides the average values of pass-through rates for different aggregation
levels.22 First, we note that the hypothesis of full pass-through can generally be
rejected. Exceptions consist of the small positive shock (i.e., $\eta^{+} = 0.40$) and $N = 6$
for the large positive shock. Second, we test for APT in the immediate post-shock
responses by testing the equality of immediate pass-through rates for equally sized
shocks, $H_0: \beta^{+}_{0} = \beta^{-}_{0}$, via Wilcoxon signed-rank tests. The pooled data and the
data for groups of size greater than 2 suggest rejecting the null. For groups of size
3, we reject symmetry for the smaller but not for the larger shock.

robust to the inclusion of these variables.

22We report the results of an identical analysis for $\tau = 14$ in Appendix B.3. According to these results,
the asymmetry in pass-through rates diminishes but does not disappear entirely, even 14 periods after the
shock.

Table 2.2: Asymmetry in the immediate pass-through rates

                  All       N>2       N=2       N=3       N=4       N=6       N=10
Small shocks
  β_0^-          0.159     0.115     0.411     0.558     0.158    -0.144     0.0154
                (0.0676)  (0.0739)  (0.162)   (0.247)   (0.138)   (0.130)   (0.0985)
  β_0^+          1.119     1.270     0.244     1.305     1.209     1.233     1.322
                (0.0672)  (0.0659)  (0.196)   (0.172)   (0.121)   (0.121)   (0.123)
  p-value        0.000     0.000     0.411     0.028     0.000     0.000     0.000
Large shocks
  β_0^-          0.324     0.313     0.391     0.303     0.432     0.206     0.304
                (0.0362)  (0.0405)  (0.0734)  (0.107)   (0.0811)  (0.0706)  (0.0712)
  β_0^+          0.639     0.718     0.181     0.537     0.779     0.945     0.619
                (0.0396)  (0.0400)  (0.111)   (0.116)   (0.0636)  (0.0796)  (0.0643)
  p-value        0.000     0.000     0.035     0.202     0.013     0.000     0.006
Observations     245       209       36        39        52        48        70

Notes: The averages of pass-through rates by differing group sizes are reported. Standard errors
are reported in parentheses below the averages. p-values correspond to the result of the Wilcoxon signed-rank test on the equality of
pass-through rates for small or large shocks (i.e., $H_0: \beta^{+}_{0} = \beta^{-}_{0}$).

For duopolies, we see that the asymmetry is reversed; the average price response

following the larger cost shock is significantly greater for the negative than for the

positive cost shock. Reflecting on the differences in incentives to deviate calculated

in (2.9) from Section 2.4, and noting from Figure 2.1 that pre-shock prices were

generally much higher than the NE price immediately prior to shocks in duopoly

markets, we see that the relative incentive to deviate after negative shocks was much

stronger in duopoly markets than in other markets. We explore the possibility that

this relatively higher incentive to deviate in duopoly markets may have led to this

divergent outcome.

Taken together with the estimates of the DLM, we reach the first two results of

this chapter:

Result 1.1: Prices do not react symmetrically to equally sized positive and

negative shocks.

Result 1.2: Price reactions in triopoly and larger markets are consistent with

positive APT. Price reactions in duopoly markets are consistent with negative APT.

2.5.2 Market Power

We now turn to our second hypothesis. We follow the literature in applying the

Lerner index as the relevant measure of market power: $L_{i,t} = \frac{p_{i,t} - mc_t}{p_{i,t}}$ (Lerner, 1934).
We propose that the difference between the observed Lerner index (i.e., $L_{i,t}$) and the
"theoretical" Lerner index, that is, the index that would be relevant if behavior were
consistent with NE predictions (i.e., $L^{NE}_t = \frac{p^{NE}_t - mc_t}{p^{NE}_t}$), provides a measure of "excess"
market power due to collusion. We further propose this as an appropriate measure of
tacit collusion, as our price competition structure incorporates homogeneous goods
and we control marginal costs. Thus, we do not suffer the identification problem of
observational studies. Our measure of excess market power can then be expressed
as:

$$L^{x}_{i,t} = L_{i,t} - L^{NE}_t = mc_t \left(\frac{1}{p^{NE}_t} - \frac{1}{p_{i,t}}\right). \qquad (2.12)$$

Figure 2.3 depicts the average of our measure of excess market power, by pe-

riod and treatment. Upon visual examination one can immediately see that excess

market power generally lies above the theoretical "Nash" level, consistent with an

environment in which tacit collusion exists. Also, this average measure reaches its

highest levels during the second and fourth rounds, the two rounds that immediately

follow negative shocks. Following the large positive shock at the beginning of the

third round, excess market power falls so much that it turns negative for several pe-

riods. Following the small positive shock at the beginning of the fifth round, excess

market power does not notably react.

We test the veracity of these observations by performing OLS regressions.23 We

consider the following specification:

$$L^{x}_{i,t} = \alpha + \sum_{s \geq 2} \delta_s \cdot I[N = s] + \sum_{e \geq 1} \gamma_e \cdot I[r = e] + \epsilon_{i,t}, \qquad (2.13)$$

where the excess market power of seller $i$ in period $t$ is modeled as a function of

group size- and round-specific indicator variables. According to our null hypothesis,

the model with no independent variables should fit the data as well as this model.

23The non-parametric counterpart of this test is reported in Appendix B.3.

Figure 2.3: Excess market power across periods and group size
(a) Average of excess market power. Horizontal axis: periods (0 to 75); vertical axis: excess market power. Series: Excess, Max, Nash.
(b) Excess market power by group size. Horizontal axis: periods (0 to 75); vertical axis: excess market power. Series: N=2, N=3, N=4, N=6, N=10, Max, Nash.
Notes: Excess market power across periods and group sizes. In both subfigures, "Max" refers to the maximum
excess market power that can be observed (i.e., when $L_{i,t} = L^{max}_t = \frac{p^{max} - MC_r}{p^{max}}$) and "Nash" refers to the case
$L_{i,t} = L^{NE}_t$.

Table 2.3: Excess market power

                   (1)         (2)         (3)         (4)         (5)
Constant         0.032***    0.060***    0.025***    0.054***    0.004
                (0.004)     (0.011)     (0.004)     (0.011)     (0.007)
N = 3                       -0.039*                 -0.039*      0.018
                            (0.015)                 (0.015)     (0.020)
N = 4                       -0.031*                 -0.031*      0.028**
                            (0.014)                 (0.014)     (0.010)
N = 6                       -0.026                  -0.026       0.047***
                            (0.014)                 (0.014)     (0.010)
N = 10                      -0.038**                -0.038**     0.029*
                            (0.012)                 (0.012)     (0.011)
r = 2                                    0.017***    0.017***
                                        (0.003)     (0.003)
r = 3                                   -0.014***   -0.014***   -0.084***
                                        (0.004)     (0.004)     (0.009)
r = 4                                    0.023***    0.023***    0.028***
                                        (0.004)     (0.004)     (0.006)
r = 5                                    0.005       0.005      -0.020**
                                        (0.004)     (0.004)     (0.007)
Observations     18355       18355       18355       18355       980
Adjusted R2      -           0.052       0.054       0.106       0.225
F-statistic      -           2.607       31.289      19.288      27.003

Notes: Results of OLS regressions on specification (2.13) are reported. In model (5), the dependent
variable is the change in excess market power, $\Delta L^{x}_{i,t}$. Robust standard errors
that are clustered at the market level are reported in parentheses below the estimates. * p < 0.05, ** p < 0.01, ***
p < 0.001

Table 2.3 reports the estimates in a step-wise manner. In model (5), we truncate

the data to the periods where shocks shift the marginal cost (i.e., periods 16, 31,

46 and 61) and replace the dependent variable with the change in excess market
power, $\Delta L^{x}_{i,t}$. This allows us to interpret the estimates of round-specific indicator
variables as the immediate effect of cost shocks on tacit collusion in model (5).

First, we reject the null hypothesis in all specifications except (2) at a significance
level of 0.01 with the F-test. The fact that the constant $\alpha$ is positive and
significant in model (1) indicates the overall presence of tacit collusion. Second, the
coefficients of round-specific indicator variables in model (3) indicate that tacit collusion
is higher during the second and fourth rounds, and lower during the third
round relative to the first round. In model (5), where we truncate the data, the
coefficients of rounds 3 and 5 are negative and that of round 4 is positive. Furthermore,
we reject the hypothesis $H_0: \alpha + \delta_s + \gamma_e = 0$ at a significance level of 0.05
(i.e., $\alpha + \delta_{\{N=3,4,10\}} + \gamma_5 = 0$). We can thus say that immediately after a negative
(the large positive) shock, the excess market power increases (decreases). Third,
coefficients of group-size-specific indicator variables are negative and significant, although
only marginally so for $N = 6$ (p-value = 0.071), in model (2). Here, we also
reject the hypothesis $H_0: \alpha + \delta_s = 0$ for all $s$ (p-value < 0.01). This suggests that
tacit collusion is present in all markets but its magnitude is smaller when $N > 2$.
The sign of these coefficients in model (5) suggests that markets larger than size
3 increase their market power in response to the first negative shock. Lastly, we
generally reject the hypothesis $H_0: \alpha + \delta_s + \gamma_e = 0$ in the most unrestricted model
(4) (11 times out of 15 tests at p-value < 0.05). The overall interpretation of these
tests provides the basis of our second result:

Result 2: Excess market power (i.e., tacit collusion) is not invariant to shock

direction and group size. It is persistently higher in duopolies, and in larger-sized

markets it rises following negative cost shocks.
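To make the tested linear combinations concrete: under a specification with a constant, group-size indicators, and round indicators, the fitted mean of excess market power for a market of size $s$ in round $e$ is $\hat\alpha + \hat\delta_s + \hat\gamma_e$, with the duopoly and first-round terms normalized to zero. The following worked illustration uses the model (4) estimates as reported (rounded) in Table 2.3; the choice of cells is ours and is for illustration only.

```latex
\begin{align*}
N = 2,\ r = 1: &\quad \hat\alpha = 0.054 \\
N = 10,\ r = 4: &\quad \hat\alpha + \hat\delta_{10} + \hat\gamma_{4} = 0.054 - 0.038 + 0.023 = 0.039 \\
N = 6,\ r = 3: &\quad \hat\alpha + \hat\delta_{6} + \hat\gamma_{3} = 0.054 - 0.026 - 0.014 = 0.014
\end{align*}
```

Because model (4) contains no interaction terms between group size and round, these fitted values need not coincide with the raw cell averages.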

2.5.3 Deviations from Best-Response

Finally, we assess the deviation of subjects' prices from the profit-maximizing best-response action that is computed by using the submitted expectations/guesses (i.e., $p_{i,t} - p^{BR|E}_{i,t}$). Conditional on the subjects accurately reporting their expectations, they risk lower profits if they fail to best-respond to these reported expectations. Consequently, the observed deviations can be attributed either to error or, alternatively, to strategic motives (i.e., attempts to collude). To argue that the deviations we observe in our experiments are not entirely due to erroneous behavior, we compare the magnitudes and directions of such deviations to the average magnitude of expectation errors (i.e., $E_{i,t-1}[p_{-i,t}] - p_{-i,t}$) and the average of absolute expectation errors.24 If subjects deviate from their best-response action in a way that is different from their expectation errors, this suggests that deviations reflect intentional behavior.
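As a rough illustration of how a benchmark such as $p^{BR|E}_{i,t}$ can be computed, the sketch below assumes the limiting demand $q_i = \delta - \gamma(p_i - p_{-i})$ derived in Appendix B.1 and a per-period profit of $(p_i - MC)\,q_i$. The parameter values, the function names, and the clipping of prices to the interval between marginal cost and the maximum price of 3.00 are illustrative assumptions, not the calibration or code used in the experiment.

```python
# Hypothetical parameters for illustration only (not the experiment's calibration).
DELTA = 8.0   # demand intercept (assumed)
GAMMA = 8.0   # price sensitivity (assumed)
P_MAX = 3.00  # maximum admissible price, as stated in the instructions


def profit(price, expected_others, mc, delta=DELTA, gamma=GAMMA):
    """Profit (p - MC) * q under the assumed demand q = delta - gamma * (p - p_others)."""
    quantity = max(delta - gamma * (price - expected_others), 0.0)
    return (price - mc) * quantity


def best_response(expected_others, mc, delta=DELTA, gamma=GAMMA, p_max=P_MAX):
    """Price maximizing profit given the expected average price of the others.

    The first-order condition delta - gamma*(p - E) - gamma*(p - MC) = 0 gives
    p = (delta/gamma + E + MC) / 2, which is then clipped to [MC, p_max].
    """
    interior = (delta / gamma + expected_others + mc) / 2.0
    return min(max(interior, mc), p_max)


def deviation_from_best_response(price, expected_others, mc):
    """Positive values mean the seller priced above the best response to their own guess."""
    return price - best_response(expected_others, mc)


if __name__ == "__main__":
    guess, mc, chosen = 2.00, 0.90, 2.20
    print(f"best response: {best_response(guess, mc):.2f}")
    print(f"deviation:     {deviation_from_best_response(chosen, guess, mc):+.2f}")
```

A persistently positive deviation of this kind, combined with expectation errors that average out to zero, is the pattern that would point toward intentional pricing above the best response rather than toward mistaken beliefs.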

Figure 2.4(a) depicts the average value of these deviations over time. The average

expectation errors are remarkably close to zero, with no obvious trend across periods.

Although this suggests that beliefs are on average correct, it does not imply the

complete absence of errors: the average measure of absolute expectation errors lies

well above zero throughout the experiment. The latter peaks following both positive

24 We label these latter two as "errors" rather than as "deviations" as there is no strategic benefit to knowingly submitting inaccurate guesses/expectations in our experiment.

Figure 2.4: Deviations from best-response action and errors in expectations

(a) Average of different deviations across all markets (series: deviation from best-response, expectation errors, absolute expectation errors; vertical axis: deviation; horizontal axis: periods)

(b) The deviations from best-response action by group size (series: N=2, N=3, N=4, N=6, N=10; vertical axis: deviation; horizontal axis: periods)

and negative cost shocks but subsequently trends downward. The patterns of high

initial absolute expectation errors and slow convergence are consistent with those

of prior experiments in which prices are strategic complements (e.g., Hommes et al. 2005; Cooper et al. 2021). However, deviations from the best-response action reveal

a different and rather interesting pattern: they peak sharply following negative

shocks and remain high during these rounds, but do not peak similarly following

positive shocks. The second graph in Figure 2.4 depicts the average of deviations

from best-response action by group size. The same pattern can be traced across our

treatment groups.

We perform OLS regressions to study deviations from best-response. Consider

the following specification:

$$p_{i,t} - p^{BR|E}_{i,t} = \alpha + \sum_{s \neq 2} \delta_s \cdot I[N = s] + \sum_{e \neq 1} \gamma_e \cdot I[r = e] + \epsilon_{i,t} \qquad (2.14)$$

where the deviation of subject i’s price from the best-response action conditional

on the submitted guess is modeled as a function of group size- and round-specific

indicator variables. Our null hypothesis concerns the overall significance of this

model.
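One way to read hypotheses on sums such as $\alpha + \delta_s$: when the regressors are only a constant and a full set of group-size indicators with duopolies as the omitted category, OLS reproduces the group means, so $\alpha$ is the duopoly mean and $\alpha + \delta_s$ is the mean deviation in markets of size $s$. The sketch below illustrates this identity on made-up numbers. The data, the function names, and the restriction to group-size indicators are assumptions for illustration; it does not compute the market-clustered standard errors used in the tables.

```python
from collections import defaultdict


def dummy_regression_coefficients(observations, baseline):
    """OLS coefficients for y = alpha + sum_s delta_s * I[group = s] + error.

    With a constant and one indicator per non-baseline group, the least-squares
    solution is alpha = mean of the baseline group and
    delta_s = mean of group s minus the baseline mean.
    """
    values_by_group = defaultdict(list)
    for group, y in observations:
        values_by_group[group].append(y)
    means = {g: sum(v) / len(v) for g, v in values_by_group.items()}
    alpha = means[baseline]
    deltas = {g: m - alpha for g, m in means.items() if g != baseline}
    return alpha, deltas


def sum_of_squared_residuals(observations, alpha, deltas):
    """Residual sum of squares of the indicator model for given coefficients."""
    return sum((y - alpha - deltas.get(g, 0.0)) ** 2 for g, y in observations)


if __name__ == "__main__":
    # Made-up (group size, deviation) pairs; not data from the experiment.
    data = [(2, 0.30), (2, 0.20), (3, 0.10), (3, 0.06), (4, 0.12), (4, 0.10)]
    alpha, deltas = dummy_regression_coefficients(data, baseline=2)
    print(f"alpha = {alpha:.3f}")
    for s in sorted(deltas):
        print(f"N={s}: delta = {deltas[s]:+.3f}, alpha + delta = {alpha + deltas[s]:.3f}")
```

Testing $H_0: \alpha + \delta_s = 0$ in such a model therefore asks whether the average deviation in markets of size $s$ differs from zero, whereas testing $\delta_s = 0$ alone asks whether it differs from the duopoly average.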

Table 2.4 reports the estimates in a step-wise manner. In model (5), we truncate the data to the periods where shocks shift the marginal cost, as we did in the previous section, and replace the dependent variable with the change in deviation from the best-response action following a cost shock.

We reject the null hypothesis in all specifications. The hypotheses $H_0: \alpha + \gamma_e = 0$ in model (2) and $H_0: \alpha + \delta_s = 0$ in model (3) can be rejected at a significance level of 0.01. This points to the following two results: (i) sellers deviate more (less) from the associated best-response action following negative (positive) shocks, and (ii) deviations are lower when $N > 2$. The signs of the group-size indicator coefficients are flipped between models (4) and (5). In model (4), they reflect the fact that groups of size 3 and larger deviate less, on average, relative to duopolies. In model (5), they correspond to the immediate reaction of these groups to the first negative shock. These deviations rise further when a large negative shock shifts the marginal cost down ($\hat\gamma_4 = 0.131$), while they drop significantly in response to the large positive shock ($\hat\gamma_3 = -0.260$). In consequence, we reach the following results:

Result 3.1: Sellers deviate on average above their best-response action.

Result 3.2: Deviations from the best-response action grow (shrink) following

negative (positive) shocks.

Table 2.4: Deviations from best-response

              (1)         (2)         (3)         (4)         (5)
Constant       0.120***    0.083***    0.248***    0.212***    0.044
              (0.013)     (0.012)     (0.041)     (0.040)     (0.028)
N = 3                                 -0.159**    -0.159**     0.122*
                                      (0.049)     (0.049)     (0.055)
N = 4                                 -0.138**    -0.138**     0.163**
                                      (0.050)     (0.050)     (0.051)
N = 6                                 -0.128*     -0.128*      0.210***
                                      (0.051)     (0.051)     (0.041)
N = 10                                -0.170***   -0.170***    0.144**
                                      (0.044)     (0.044)     (0.046)
r = 2                      0.080***                0.080***
                          (0.010)                 (0.010)
r = 3                     -0.028**                -0.028**    -0.260***
                          (0.010)                 (0.010)     (0.031)
r = 4                      0.112***                0.112***    0.131***
                          (0.015)                 (0.015)     (0.025)
r = 5                      0.018                   0.018      -0.118***
                          (0.013)                 (0.013)     (0.024)
Observations   18355       18355       18355       18355       980
Adjusted R2    -           0.044       0.051       0.095       0.170
F-statistic    -           37.840      4.013       19.546      40.442

Notes: Results of OLS regressions on specification 2.14 are reported. In model (5), the dependent variable is the change in deviation from the best-response action following a cost shock, $\Delta(p_{i,t} - p^{BR|E}_{i,t})$. Robust standard errors are clustered at the market level and are reported in parentheses. * p < 0.05, ** p < 0.01, *** p < 0.001

2.6 Discussion

Our results point to the co-appearance of asymmetric price transmission and tacit

collusion. The latter seems to be the result of strategic behavior, as our analysis of

deviations from best-response action reveals. These findings are consistent with the-

ories that cast tacit collusion as having a significant role in the emergence of APT,

such as the trigger price model in Borenstein et al. (1997). Most of the other theo-

retical explanations of APT in the literature cannot account for the pricing behavior

observed in our results. We can reasonably exclude, for example, the influence of

explicit collusion (i.e., involving direct communication), capacity constraints, inven-

tory limitations, (a)symmetric menu costs, consumer loss aversion, (a)symmetric

search costs, contexts of alternating price moves and price lockup periods, and so

forth, as being necessary conditions for APT, since these features are excluded by

our design.

We cannot, however, claim a monotonic relation between the magnitude of APT and tacit collusion: the pricing behavior of duopolies in our experiment is revealed to be consistent with 1) a negative APT response in the immediate aftermath of shocks, and 2) elevated and relatively stable pricing behavior between shocks. Our proposed explanation for the exceptionality of the duopoly result follows, tackling the latter result first: in duopolies collusion is so strong that sellers are, by and large, able to maintain cooperative (tacitly collusive) pricing over a sustained period of time, with pricing showing no reversion to Nash. We therefore argue that APT requires significant, but imperfect, collusion.25

The success of duopoly markets in maintaining average prices well above equilibrium levels between shocks, and well above the average price levels of triopoly and larger markets, provides a plausible explanation for the negative APT result we observed with duopolies for large shocks: as is apparent in an examination of (2.9), the high price deviations above NE prices in the lead-up to shocks in duopoly markets imply a correspondingly greater incentive for firms to deviate downward following negative than positive shocks. Thus we have the interesting result that while the dynamics of our duopoly markets appear to have enabled more success in attempts to tacitly collude between shocks, this success (in the form of elevated prices) also made these markets relatively more prone to respond strongly to downward than upward cost shocks.26

If tacit collusion is indeed a significant causal force behind APT, then our work

has important implications for antitrust enforcement policy against collusion and

price-fixing. In particular, regulators may consider APT in a market as a signal for

25 The fact that our duopolies reached almost stable collusion, while larger markets did not, is consistent with the literature we review in Section 2.2. This may be attributable to a combination of two factors: first, coordination between market participants becomes increasingly difficult with each new seller, and three may well be the number from which the difficulties and costs involved in maintaining coordination start to exceed the marginal benefits; second, our duopolies are unique in that each participant can deduce the choices made by the other participant by observing aggregate market outcomes. In a triopoly or larger market, by contrast, it is not possible for sellers to detect whether an aggregate market outcome is due to the defection of a single competitor or to a broader but shallower defection by multiple competitors.

26 As we noted in the Results section, our duopoly markets exhibit pricing behavior consistent with negative APT, while larger markets' behavior was consistent with positive APT. We see these apparently contradictory results as indicating a tension between two phenomena operating in opposite directions: on the one hand, collusive dynamics encourage competitors to respond more sharply to upward than to downward cost shocks, as we have argued throughout the chapter; however, as prices rise above NE levels the differential incentives to deviate become stronger following negative shocks than positive shocks. At some point the success of tacit collusion may be so great as to reverse the direction of the resulting APT.

the presence of collusion between firms in that market. Since many real-world inter-

actions between competitive firms are repeated indefinitely, such collusion may even

be sustainable as a NE. Further research is needed to determine whether collusion

is an important cause of APT behavior in field settings, and if so, whether suit-

able forms of regulatory intervention might exist to reduce such collusion without

increases in inefficiency.

We propose that follow-on research may yield further insights into the mecha-

nisms through which tacit collusion leads to APT, as well as potential policy re-

sponses that might diminish its frequency and magnitude. In particular, future ex-

periments should address the impact of different levels of information transparency.

Most notably, testing the effects of providing feedback on individual prices and/or

payoffs of rivals on APT may provide particularly helpful insights. The latter is

shown to lead to more rivalistic outcomes in experimental oligopoly studies, as it

initiates imitation dynamics (Fiala and Suetens, 2017), while the former can lead

to more collusive levels. Nevertheless, both may reduce the degree of asymmetry

in price transmission. Another area of needed research is to explore the roles of

market power and market concentration in shaping APT pricing behavior. In our

experimental design we explicitly kept the incentives provided to sellers the same

across markets of varying sizes to study the pure number effect, similar to Hanaki and Masiliūnas (2021). Finally, future studies may benefit from testing the robustness

of our findings to alternative demand specifications. The parameters of demand in

our experimental markets are consistent with goods such as retail gasoline, which

are demanded inelastically but for which suppliers face elastic demand. Our results

are also more relevant to markets in which there are close substitutes.

Appendix B

Appendix

B.1 Quadratic Utility and Linear Demand

Here we derive a linear demand system from an assumption of quasi-linear quadratic utility.1 Following the notation and proof in Amir et al. (2017) we have:

$$U(x, y) = a'x - \frac{1}{2} x'Bx + y,$$

where $a$ is a strictly positive vector of size $n$, $B$ is a positive-definite $n \times n$ matrix, $x$ is an $n$-vector representing quantities of goods, and $y$ is the quantity of the numeraire good with price $p_y = 1$.

Since the matrix $B$ is positive definite, $B^{-1}$ exists and is also positive definite. Then, imposing the standard budget restriction $p'x + y \leq m$ with exogenous price vector $p$ and budget $m$, and assuming the interiority condition $B^{-1}(a - p) > 0$ and the feasibility condition $p'B^{-1}(a - p) \leq m$, we arrive at a system of linear demand functions:

$$q(p) = B^{-1}(a - p). \qquad (B.1)$$

Now, to match the market environment we model in our experiment, we impose restrictions on $a$ and $B$: we assume $a_i = a$, $b_{ii} = b$ and $b_{ij} = d$ $\forall i, j \in [1, n]$, $i \neq j$, for some strictly positive constants $a$, $b$, and $d$, with $b > d$ to ensure $B$ is positive definite. These assumptions are equivalent to assuming that the utility derived from consumption of each good $x$ is symmetric both in terms of own- and cross-product parameter values. Intuitively, and as we will see, this leads to symmetric (linear) demand functions for each good $x$.2

1 We thank an anonymous reviewer for comments that inspired the approach followed in this appendix.

2 As Amir et al. (2017) point out, our use of the term "linear demand function" is a slight abuse of terminology. More correctly we have an affine function whenever the implied result is positive and zero otherwise.

To explore this further and apply it to our specific demand specification, we make several definitions and impose the following additional restrictions on $a$, $b$, and $d$:

• Define parameters $\delta$ and $\gamma$: $\delta, \gamma \in \mathbb{R}_{++}$

• Define $p_i$ as the $i$th element of price vector $p$ and $p_{-i} \equiv \frac{1}{n-1} \sum_{j \neq i} p_j$ as the average of the other $n - 1$ elements in the price vector

• Assume $n \in \mathbb{Z}$, $n \geq 2$

• Restrict $a = \delta n d$

• Restrict $b = d + \frac{n-1}{n\gamma}$

• For compactness of notation and clarity, define $\Delta \equiv \frac{n-1}{n\gamma} \cdot \frac{1}{d} \Rightarrow b = d(1 + \Delta)$

Next, we impose these restrictions on (B.1) in order to show that:

$$\lim_{d \to \infty} q_i(p; d, n, \delta, \gamma) = \delta - \gamma(p_i - p_{-i})$$

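Next, we must rationalize $B^{-1}$. Noting that $b = d(1 + \Delta)$, we can rewrite $B$ as $d \cdot \bar{B}$, where the diagonal elements of $\bar{B}$ are all $1 + \Delta$ and the off-diagonal elements are all 1. For example, to illustrate with $n = 4$:

$$\bar{B} = \begin{pmatrix} 1 + \Delta & 1 & 1 & 1 \\ 1 & 1 + \Delta & 1 & 1 \\ 1 & 1 & 1 + \Delta & 1 \\ 1 & 1 & 1 & 1 + \Delta \end{pmatrix}$$

This leads to a specification of $B^{-1} = d^{-1} \cdot \bar{B}^{-1}$, with:

$$\bar{B}^{-1} = \frac{1}{\Delta(\Delta + n)} \begin{pmatrix} n - 1 + \Delta & -1 & -1 & -1 \\ -1 & n - 1 + \Delta & -1 & -1 \\ -1 & -1 & n - 1 + \Delta & -1 \\ -1 & -1 & -1 & n - 1 + \Delta \end{pmatrix}.$$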

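We then apply these restrictions to (B.1) and have:

$$q(p) = B^{-1}(a - p) = \frac{1}{d} \bar{B}^{-1}(a - p),$$

where $\bar{B} = B/d$ denotes the scaled matrix with diagonal elements $1 + \Delta$ and off-diagonal elements 1.

Then, the demand for an arbitrary good $i$ can be represented as:

$$\begin{aligned}
q_i(p; d, n, \delta, \gamma) &= \frac{1}{d} \cdot \frac{1}{\Delta(\Delta + n)} \Big[ (n - 1 + \Delta)(a - p_i) + (-1) \sum_{j \neq i} (a - p_j) \Big] \\
&= \frac{1}{d\Delta(\Delta + n)} \big[ \Delta(a - p_i) + (n - 1)(a - p_i) - (n - 1)(a - p_{-i}) \big] \\
&= \frac{a - p_i}{d(\Delta + n)} - \frac{(n - 1)(p_i - p_{-i})}{d\Delta(\Delta + n)}
\end{aligned}$$

Substituting in $a = \delta n d$ and $\Delta = \frac{n-1}{n\gamma d}$, we have:

$$q_i(p; d, n, \delta, \gamma) = \frac{\delta n d - p_i}{d\left(\frac{n-1}{n\gamma d} + n\right)} - \frac{(n - 1)(p_i - p_{-i})}{d \, \frac{n-1}{n\gamma d} \left(\frac{n-1}{n\gamma d} + n\right)}$$

$$\Rightarrow q_i(p; d, n, \delta, \gamma) = \frac{\delta - p_i/(nd) - \gamma(p_i - p_{-i})}{\frac{n-1}{n^2 \gamma d} + 1} \qquad (B.2)$$

Now, evaluating this expression as $d \to \infty$, we see that $\lim_{d \to \infty} \left(\frac{n-1}{n^2 \gamma d} + 1\right) = 1$ and $\lim_{d \to \infty} (p_i/(nd)) = 0$, which implies:

$$\lim_{d \to \infty} q_i(p; d, n, \delta, \gamma) = \delta - \gamma(p_i - p_{-i}), \quad \text{QED.} \qquad (B.3)$$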
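The limit can also be illustrated numerically. The sketch below evaluates demand in two ways for finite $d$: directly from $B^{-1}(a - p)$ using the closed-form inverse, and from expression (B.2). It then compares both with the limiting demand (B.3) as $d$ grows. The price vector and the values of $\delta$ and $\gamma$ are hypothetical and chosen only for illustration.

```python
def average_of_others(prices, i):
    """Average of all prices except the i-th one (p_{-i} in the text)."""
    return sum(p for j, p in enumerate(prices) if j != i) / (len(prices) - 1)


def demand_from_inverse(prices, i, d, delta, gamma):
    """q_i = [B^{-1}(a - p)]_i using the closed-form inverse, with a = delta * n * d."""
    n = len(prices)
    a = delta * n * d
    cap = (n - 1) / (n * gamma * d)  # Delta in the text
    scale = 1.0 / (d * cap * (cap + n))
    own = (n - 1 + cap) * (a - prices[i])
    others = sum(a - p for j, p in enumerate(prices) if j != i)
    return scale * (own - others)


def demand_b2(prices, i, d, delta, gamma):
    """Expression (B.2) for finite d."""
    n = len(prices)
    numerator = (delta - prices[i] / (n * d)
                 - gamma * (prices[i] - average_of_others(prices, i)))
    return numerator / ((n - 1) / (n ** 2 * gamma * d) + 1.0)


def demand_limit(prices, i, delta, gamma):
    """Limiting demand (B.3): delta - gamma * (p_i - p_{-i})."""
    return delta - gamma * (prices[i] - average_of_others(prices, i))


if __name__ == "__main__":
    prices = [2.0, 1.5, 1.8, 2.2]  # hypothetical prices in a market with n = 4
    delta, gamma = 8.0, 8.0        # hypothetical demand parameters
    print("limit:", demand_limit(prices, 0, delta, gamma))
    for d in (1.0, 10.0, 1e3, 1e6):
        print(d, demand_from_inverse(prices, 0, d, delta, gamma),
              demand_b2(prices, 0, d, delta, gamma))
```

For very large $d$ the direct computation subtracts two nearly equal large numbers, so (B.2) is the numerically safer expression to evaluate.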

We thus see that our model of linear demand, with own- and cross-price parameters equal in magnitude, is consistent with an assumption of quadratic, quasi-linear utility, albeit a special case taken in the limit as parameters $a$, $b$ and $d$ tend towards infinity in the pathway described above.

B.2 Experimental Material

B.2.1 Instructions

Good morning, and thank you for agreeing to participate in this economics experi-

ment.

Earnings

As compensation, you will be paid a show-up fee of $5. In addition to the show-up

fee, you will have the opportunity to earn additional money. We anticipate that this

experiment will run around 90-100 minutes. The experiment consists of 5 rounds of

15 periods each, a total of 75 rounds. The computer will randomly select a number

between 1 and 5, corresponding to one of the 5 rounds of the experiment. At the

end of the experiment, your additional fee will be equal to the average payout during

that particular round. You will then be paid the added total of the show-up fee plus

the additional fee. You are free to leave at any time; if you do so you will still receive

the $5 show-up fee, but if you leave before the experiment is complete you will not

receive the additional fee. In all cases, your earnings will be paid individually and

anonymously.

Market Setup

In this experiment we will simulate markets, in which you and the other partic-

ipants each play the role of the CEO of a company that produces and sells a single

product in your particular market. You will be randomly grouped with X (n−1)

other companies (participants), and together the Y (n) of you will form this market.

You will stay matched with the same participants in your market for the duration

of the entire experiment. Each company is largely identical, faces the same identical

costs to produce each unit, and has the same profit function. The only thing that

differs between companies is the price they set for the product. Demand for your

product will be simulated by the computer, according to a formula shown on the

payoff sheet we have left at your workstation. The higher your price, the fewer units

you will sell. The higher the average price of the other participants in your market,

the more units you will sell.

During each period of the experiment, all Y (n) of you will be asked to set a

price at which you will each sell the product. You will not know anything about

the price the other participants set, until after you have set your own price. We will

then ask you to guess the average price the other participants set during that same

round. Finally, after all Y participants have set their own prices, we will show you

the average price set by the other X participants, and calculate and show you your

payoff for that particular round. You will be able to see the history of your prices,

the average prices of the other participants, and your resulting payoff for each of the

previous periods within each round, to help you make future pricing decisions.

How to Set Your Price and Predict Your Payout each Period

In the first round of 15 periods, you will face an input cost of $0.90 per unit

produced. “Input cost” is shorthand for the total costs of raw materials, labor,

etc., required to produce one unit of product. The payoff sheet at your workstation

corresponds to this particular input cost. Based on what you guess the average of

others’ costs to be, shown along the columns, you can see how much you will earn

for each potential price you would set, shown along the rows. For example, if the

average price other participants set is $1.50, and you set your price at $2.00, your

payoff would be $4.35. As another example, if the average other participants set

is $2.90 and yours is $2.30, your payoff would be $17.01. Note that the maximum

price you can set for your product is $3.00, because consumers in this market are

not willing to pay more than $3.00 for the product. We only show prices that are

multiples of $0.10, because otherwise the payoff sheet would be too large to print on

a single piece of paper. The price you set can be anywhere between the input price

and $3.00, in increments of $0.01. Finally, note that negative numbers are shown in

the payoff sheet in parentheses, so for example ($1.00) means minus one dollar.

Changes in Input Cost

In each round, there will be a change to the input costs each company faces.

Costs will increase or decrease by either $0.40 or $0.80. You will not know the size

or direction of the change until it is announced at the beginning of each round. At

that time, we will hand out a new payoff sheet that corresponds to the new input

cost. Make sure you use only the payoff sheet that corresponds to the current input

cost. Again, input costs will remain the same for all 15 periods of each round, but

will change at the beginning of each new round.

Please raise your hand if you have a question, and one of us will come to you

at your workstation. Please do not talk or discuss the experiment or anything else

with your neighbors, until after the experiment is complete.

B.2.2 Comprehension Questions

Before we begin the experiment, please complete the following questions. Just write

your answer on this sheet of paper. We will walk by each workstation and look at

your answers, to ensure you understand. If you have questions please ask us when

we come by, but DO NOT discuss or ask questions of your neighbors. If there is

anything needing clarification, we will announce it to everyone in the group at the

same time.

1) True/False: I will be rematched with different participants at the beginning

of each round:

2) If I keep my price the same from one period to the next, but the average price

others in my market set falls, my payout in that period will: (hint – look at the

payout chart and see what happens as you go from right to left on any given row)

a. Rise

b. Fall

c. Stay the Same

d. May Rise or Fall, or Stay the Same

3) If the average price of others in my market stays the same between periods,

but if I increase my price, my payout that period will: (hint – look at the payout

chart, but this time see what happens as you go from top to bottom on any given

column)

a. Rise

b. Fall

c. Stay the Same

d. May Rise or Fall, or Stay the Same

4) Use the Payoff Chart to answer the following questions:

a. If the average price others set in my market is $2.80, what is the price I could

set that would maximize my own payout that period?

b. What would be the amount of that payout?

B.2.3 Payoff Table and Experimental Interface

Figure B.1: Payoff matrix

Notes: Exemplary payoff matrix/table. Provided to subjects when marginal cost is equal to $0.90.

Figure B.2: Screen: Price setting

Notes: An example for price setting screen.

Figure B.3: Screen: Guess setting

Notes: An example for guess decision screen.

Figure B.4: Screen: Feedback

Notes: An example for feedback screen.

B.3 Additional Analysis

B.3.1 Regression Results of Asymmetry

The regression estimates used in Figure 2.2 of the main text are reported in Table B.1 under column (1). The remaining columns introduce control variables in a step-wise manner. In all models, the dependent variable is the change in output price (i.e., $\Delta p_{i,t} = p_{i,t} - p_{i,t-1}$).

Table B.1: Estimation of asymmetry

                  (1)         (2)         (3)         (4)         (5)
Δmc_t              0.497***    0.487***    0.227**     0.224**     0.769***
                  (0.050)     (0.051)     (0.066)     (0.066)     (0.070)
Δmc_{t-1}         -0.540***   -0.520***   -0.169      -0.165      -1.265***
                  (0.076)     (0.076)     (0.088)     (0.088)     (0.125)
Δmc_{t-2}          0.566***    0.527***    0.362***    0.352**     1.257***
                  (0.099)     (0.099)     (0.104)     (0.104)     (0.134)
Δmc_{t-3}         -0.295***   -0.265***   -0.172*     -0.164*     -0.582***
                  (0.063)     (0.063)     (0.065)     (0.064)     (0.077)
Δmc_{t-4}          0.063***    0.053**     0.030       0.028       0.102***
                  (0.016)     (0.016)     (0.016)     (0.016)     (0.018)
k = 0 (c_t)        0.412***    0.432***    0.583***    0.043       0.478***
                  (0.106)     (0.109)     (0.140)     (0.144)     (0.118)
k = 1 (c_{t-1})    0.152       0.113      -0.026      -0.036       0.004
                  (0.098)     (0.100)     (0.124)     (0.123)     (0.118)
k = 2 (c_{t-2})   -0.230      -0.152      -0.134      -0.113      -0.084
                  (0.122)     (0.125)     (0.141)     (0.140)     (0.140)
k = 3 (c_{t-3})    0.161*      0.102       0.082       0.067       0.067
                  (0.077)     (0.078)     (0.082)     (0.081)     (0.087)
k = 4 (c_{t-4})   -0.051*     -0.031      -0.024      -0.018      -0.008
                  (0.021)     (0.021)     (0.020)     (0.020)     (0.022)
N = 3                         -0.007***   -0.005***   -0.005      -0.010***
                              (0.002)     (0.001)     (0.003)     (0.003)
N = 4                         -0.007***   -0.005***   -0.008***   -0.009***
                              (0.001)     (0.001)     (0.002)     (0.002)
N = 6                         -0.009***   -0.007***   -0.012***   -0.013***
                              (0.002)     (0.001)     (0.002)     (0.003)
N = 10                        -0.008***   -0.006***   -0.007**    -0.011***
                              (0.002)     (0.001)     (0.002)     (0.003)
Δp_{-i,t-1}                                0.287***    0.287***    0.347***
                                          (0.022)     (0.021)     (0.024)

Three-way interaction terms with immediate asymmetry
(N = 3) × c_t                                          0.492*
                                                      (0.192)
(N = 4) × c_t                                          0.680***
                                                      (0.142)
(N = 6) × c_t                                          0.828***
                                                      (0.154)
(N = 10) × c_t                                         0.561***
                                                      (0.143)

AR(4) included?    No          No          No          No          Yes
N                  17130       17130       17130       17130       17130
adj. R2            0.132       0.133       0.165       0.178       0.301
F-statistic        92.240      70.604      160.112     136.818     135.597

Notes: Robust standard errors are clustered at the market level and are reported in parentheses. * p < 0.05, ** p < 0.01, *** p < 0.001

B.3.2 Pass-through Rates

Table B.2 reports the pass-through rates when τ= 14 for different aggregation levels

of data.

Table B.2: Asymmetry in the pass-through rates after 14 periods

                   All        N > 2      N = 2      N = 3      N = 4      N = 6      N = 10
Small shocks
β^-_14             0.881      0.957      0.483      1.004      1.002      1.088      0.749
                  (0.0543)   (0.0518)   (0.192)    (0.145)    (0.0764)   (0.114)    (0.0816)
β^+_14             0.599      0.729     -0.0806     0.783      0.815      0.479      0.836
                  (0.0518)   (0.0432)   (0.197)    (0.0809)   (0.0818)   (0.110)    (0.0564)
p-value            0.000      0.001      0.108      0.145      0.066      0.000      0.475
Large shocks
β^-_14             0.821      0.902      0.398      0.811      0.961      0.878      0.936
                  (0.0251)   (0.0205)   (0.0849)   (0.0598)   (0.0393)   (0.0417)   (0.0205)
β^+_14             0.802      0.866      0.465      0.637      0.978      1.002      0.796
                  (0.0297)   (0.0270)   (0.105)    (0.0765)   (0.0450)   (0.0430)   (0.0399)
p-value            0.064      0.042      0.955      0.021      0.331      0.156      0.000
Observations       245        209        36         39         52         48         70

Notes: The averages of pass-through rates by differing group sizes are reported. Below averages, standard errors are reported in parentheses. p-values correspond to the result of the Wilcoxon signed-rank test on the equality of pass-through rates for small or large shocks (i.e., $H_0: \beta^{+}_{14} = \beta^{-}_{14}$).

B.3.3 Non-parametric Test on Excess Market Power

Table B.3 reports the average of excess market power broken out by round and by group size. Rows 7 and 8 report the p-values resulting from Wilcoxon signed-rank tests on the average of excess market power for small and large shocks, respectively. The null hypothesis is that the averages of excess market power during rounds 2 and 5 (or 3 and 4) are equal.
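For readers unfamiliar with the test, a minimal sketch of a two-sided Wilcoxon signed-rank test on paired observations follows. It assumes that each market contributes one pair (for example, its average excess market power in two rounds), drops zero differences, assigns average ranks to ties, and uses the normal approximation without continuity correction. The text does not state which software or which variant of the test was used, so these choices, the function name, and the example numbers are assumptions for illustration only.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test for paired samples (normal approximation).

    Zero differences are dropped and tied absolute differences receive average ranks.
    Returns (W_plus, W_minus, z, p_value).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda k: abs(diffs[k]))
    ranks = [0.0] * n
    tie_correction = 0.0
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + end) / 2.0 + 1.0  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        ties = end - start + 1
        tie_correction += ties ** 3 - ties
        start = end + 1
    w_plus = sum(r for r, dv in zip(ranks, diffs) if dv > 0)
    w_minus = sum(r for r, dv in zip(ranks, diffs) if dv < 0)
    mean = n * (n + 1) / 4.0
    variance = n * (n + 1) * (2 * n + 1) / 24.0 - tie_correction / 48.0
    z = (w_plus - mean) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return w_plus, w_minus, z, p_value


if __name__ == "__main__":
    # Made-up market-level averages in two rounds; not data from the experiment.
    round_a = [0.043, 0.051, 0.030, 0.047, 0.039, 0.044]
    round_b = [0.030, 0.029, 0.031, 0.025, 0.020, 0.033]
    print(wilcoxon_signed_rank(round_a, round_b))
```

With small samples an exact version of the test may be preferable to the normal approximation sketched here.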

Table B.3: Non-parametric test on excess market power

Rounds             All Markets  N > 2        N = 2        N = 3        N = 4        N = 6        N = 10
1                  0.025        0.021        0.051        0.019        0.020        0.029        0.017
2                  0.043        0.037        0.077        0.034        0.033        0.046        0.034
3                  0.011        0.008        0.029       -0.012        0.019        0.019        0.004
4                  0.048        0.042        0.083        0.038        0.044        0.052        0.038
5                  0.030        0.025        0.061        0.026        0.029        0.027        0.020
All Rounds         0.032        0.027        0.060        0.021        0.029        0.034        0.023
Small η (2 vs 5)   0.000        0.000        0.499        0.115        0.109        0.000        0.000
Large η (3 vs 4)   0.000        0.000        0.000        0.000        0.000        0.000        0.000
Observations       245          209          36           39           52           48           70

Chapter 3

Measuring Strategic-Uncertainty Attitudes

An earlier version of this chapter is published as: Bruttel, L., Bulutay, M., Cornand,

C., Heinemann, F. and Zylbersztejn, A., 2023. Measuring strategic-uncertainty

attitudes. Experimental Economics, 26(3), pp.522-549.

3.1 Introduction

Strategic uncertainty is the uncertainty that players face with respect to the pur-

poseful behavior of other players in an interactive decision situation. While economic

theory mostly applies equilibrium concepts like Nash or rational expectations equi-

libria that are based on the absence of strategic uncertainty, experiments show that

real decision makers are sensitive to strategic uncertainty. Laboratory experiments

have indicated that many humans exhibit strategic uncertainty aversion: they are

ready to waive a part of their expected payoff in order to avoid having their payoff depend on the decisions made by others.1 This behavioral phenomenon has

far-reaching consequences for economic efficiency, because it implies coordination

failures and suboptimal levels of investment and risk taking in markets.

From early experiments, we know that humans tend to prefer situations with

known probabilities of outcomes to "ambiguous" situations in which these probabilities are unknown (Camerer and Weber, 1992). This attitude is called ambiguity

aversion. Tests of ambiguity aversion traditionally compare choices between lotter-

ies with given probabilities and lotteries for which the probabilities are exogenously

given but unknown to subjects. Ambiguity aversion might also apply to strategic

interaction. However, the beliefs about the strategic behavior of other humans are

also affected by the theory of mind: agents may put themselves in the shoes of other

1 See, for example, Greiner (2016).

decision makers and form beliefs about their reasoning processes. This idea has been

taken to the extreme by the Nash equilibrium concept in which each player’s strat-

egy is a best response to the other players’ strategies. As a descriptive theory, Nash

equilibrium assumes that players are able to guess the strategies of others either by

simultaneously solving the others’ decision problems or by relying on experience (as

in repeated games). Such reasoning processes may reduce perceived strategic un-

certainty, so that strategic uncertainty aversion may have lower effects on behavior

than ambiguity aversion in lotteries with completely unknown probabilities. On the

other hand, strategic interactions are also more complex to analyze than lotteries.

Humans try to avoid complexity and may doubt the logical consistency of their own

reasoning processes or the logical consistency of other players’ reasoning processes

or decisions.

This chapter develops a method for measuring strategic-uncertainty attitudes

and distinguishing them from risk and ambiguity attitudes. The main idea is to

elicit and exploit the information contained in certainty equivalents (willingness to

accept) for lotteries under three different sources of uncertainty: strategic uncer-

tainty, risk and ambiguity. We provide a structural model of uncertainty attitudes

that allows us to measure two dimensions of uncertainty attitudes: a preference for,

or aversion against, the source of uncertainty, modelled by an additional [dis]utility

depending on the source, and optimism or pessimism2 regarding the outcome, which

we formalize as a shift of the subjective weight that is put on the higher outcome.

We conduct an experiment with interactive games and interaction-free lottery

tasks. Unlike previous experiments, our novel methodology allows for a variation

of the source of uncertainty (whether strategic or not) across conditions in a ceteris

paribus manner. This means that we keep the potential payoffs constant but consider

different mechanisms (random or strategic) that determine the realized payoff. Since

strategic uncertainty typically characterizes coordination problems, we focus on two

coordination games: one with strategic complementarities in agents’ actions and

one with strategic substitutability (anti-coordination). Following the literature on

strategic uncertainty (see below), we apply our methodology to two classic 2x2

games: stag-hunt and market-entry games.³

For the different sources of uncertainty – each of the two games, as well as the corresponding ambiguous lottery environments – we identify two subject-specific parameters of a model of uncertainty attitudes. We investigate two ways in which strategic uncertainty may affect behavior in a game. First, following Baillon et al. (2017), we define ambiguity as a situation where subjects have information about possible outcomes of a lottery but not about probabilities. Whether these unknown probabilities, which are exogenous to the decision maker, result from human decisions or from nature does not affect this definition of ambiguity. We investigate whether, all other things being equal, attitudes towards uncertainty differ between strategic uncertainty and ambiguity conditions. Second, strategic uncertainty is related to conscious behavior of human players whose interaction exhibits common or opposite interests, and as such involves decisions based on strategic thinking. We study how the nature of the game (strategic complements versus substitutes) affects these uncertainty attitudes.

² One might also interpret these as excitement or fear about the other player’s behavior.

³ Stag-hunt games provide a useful paradigm to analyze a wide range of economic phenomena, such as macroeconomic fluctuations (Cooper and John, 1988), bank runs, debt and liquidity crises, speculative attacks (Morris and Shin, 2003; Heinemann, 2012), and commercial production processes (Brandts et al., 2015). Market-entry games describe the prototypical situation of conflicting interests, such as Cournot competition or location choice.

We document systematic attitudes toward uncertainty. These attitudes vary

across contexts and across subjects. The median participant exhibits neither a

preference for, nor an aversion against ambiguity or strategic uncertainty. In the

game with strategic complements [substitutes], the median participant is found to be

pessimistic [optimistic] regarding the outcome that leads to a higher payoff given the

player’s own choice. Comparing uncertainty attitudes across treatments, we observe

more optimism in the entry game than in the stag-hunt game or under ambiguity

(both of which, in turn, generate similar results).

The next section describes our contribution to the literature. Section 3.3 presents

the experimental design and procedures. Section 3.4 lays out the theoretical under-

pinnings of our design. Section 3.5 shows the results and Section 3.6 concludes the

chapter.

3.2 Related Literature

Brandenburger (1996) defines strategic uncertainty as uncertainty about the pur-

poseful behavior of players in an interactive decision situation. Experimental evi-

dence reported in Beard and Beil (1994) can hardly be explained without assuming

that players dislike situations in which their payoffs depend on the decisions made

by other players. Camerer and Karjalainen (1994) attribute this behavior to ambi-

guity aversion, because there are no given probabilities for other players’ strategies.

They use non-additive probabilities as in Gilboa and Schmeidler (1989) to model

ambiguity aversion and argue that ambiguity aversion may be responsible for players

not reaching an efficient equilibrium in coordination games with strategic comple-

ments (like the median effort game). Camerer and Karjalainen (1994) conduct an

experiment on the median effort game, in which they elicit bounds on subjective

probabilities for complementary and exhaustive events defined on the outcomes of

the game. If the sum of these probabilities is smaller than one, a subject can be

said to be ambiguity averse. Unfortunately, their method of eliciting subjective

probabilities seems rather fragile as it may produce contradictory results and does

not allow a clear distinction between subjective beliefs about others’ behavior and

aversion against strategic uncertainty.

Greiner (2016) is the first to clearly identify aversion against strategic uncer-

tainty by comparing behavior in dictator, ultimatum, and impunity games. He

shows that behavior in these games indicates a substantial aversion against strate-

gic uncertainty that may be higher than ambiguity aversion. Subjects pay high

prices for avoiding that their payoff depends on the decisions of their partners, even

though they attribute high subjective probabilities to their partners’ decisions being

favorable for them.

Bohnet and Zeckhauser (2004) find similar evidence in a trust game, where the

second mover could either be another subject or a lottery. They attribute subjects’

reluctance to depend on human second movers to betrayal aversion, but strategic

uncertainty aversion might have played some role. Li et al. (2020) find that ambi-

guity preference affects the decision to trust a trustee. Note that the games used

by Greiner (2016) and Bohnet and Zeckhauser (2004) all have a unique equilibrium

and equilibrium choices of the second movers can be derived simply by eliminating

dominated strategies.

Kelsey and le Roux (2015) analyze behavior in an extended battle of the sexes

game and find further evidence indicating that strategic uncertainty aversion may

exceed ambiguity aversion in non-strategic games. They also conjecture that not

only strategic uncertainty, but also strategic uncertainty aversion may depend on

the nature of the game. However, they have no means to test this hypothesis.

Nevertheless, this conjecture has to be taken seriously, because Ivanov (2011) finds

that in a game that is solvable by iterative elimination of dominated strategies, 32

percent of subjects are strategic uncertainty loving, while only 22 percent are averse

to strategic uncertainty.

Nagel (1995) provides an experimental test of a game with strategic complements

and shows that behavior can be described by assuming that subjects follow distinct

levels of reasoning, where Level zero is defined as random choice of a strategy and

Level k is defined as best response to Level k-1. Camerer et al. (2004) develop a

cognitive hierarchy model based on levels of reasoning. Uncertainty about other

players’ strategies can be modelled as uncertainty about the levels of reasoning

applied by other players. In games with strategic complements, the number of

levels of reasoning is in a monotone relationship with actions and, thus, experiments

on such games can be used to measure the distribution of levels among players, but

also the beliefs about others’ levels of reasoning. In games with strategic substitutes,

however, the optimal strategy for a given number of levels of reasoning is non-

monotonic. In entry games, for example, the optimal decision is to enter for any odd

number of levels and to stay out for any even number of levels (or vice versa). This

raises the question whether perceived strategic uncertainty or strategic uncertainty

aversion differ between games with strategic complements and substitutes.
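The contrast between monotone and alternating level-k behavior can be illustrated with a minimal sketch. It assumes risk-neutral players and a level-0 player who randomizes 50:50, and it uses the row player's payoffs of the two games studied in this chapter (Tables 3.1 and 3.2). The function names are illustrative and not part of the experimental software.

```python
# Row player's payoffs (in euros), indexed [own action][other's action].
# Values follow Tables 3.1 and 3.2 of this chapter.
STAG_HUNT = {"L": {"L": 20, "R": 15}, "R": {"L": 5, "R": 25}}
ENTRY = {"L": {"L": 5, "R": 25}, "R": {"L": 20, "R": 15}}


def best_response(payoffs, prob_other_r):
    """Risk-neutral best response to the belief that the other player picks R.

    Risk neutrality is an assumption of this sketch, not of the chapter.
    """
    expected = {
        action: (1 - prob_other_r) * row["L"] + prob_other_r * row["R"]
        for action, row in payoffs.items()
    }
    return max(expected, key=expected.get)


def level_k_actions(payoffs, max_level):
    """Actions of levels 1..max_level when level 0 randomizes 50:50."""
    actions = []
    prob_other_r = 0.5  # belief of level 1 about a level-0 opponent
    for _ in range(max_level):
        action = best_response(payoffs, prob_other_r)
        actions.append(action)
        # The next level believes the opponent plays this action for sure.
        prob_other_r = 1.0 if action == "R" else 0.0
    return actions
```

Under these assumptions, `level_k_actions(STAG_HUNT, 4)` yields the same action at every level, whereas `level_k_actions(ENTRY, 4)` alternates between the two actions from one level to the next.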

Heinemann et al. (2009) propose a method to measure strategic uncertainty in

coordination games with strategic complements. They let subjects play a variety

of games, each consisting of a choice between two options A and B. Option A was

associated with a safe payoff X, while Option B paid 15€ if at least a fraction k

of the other subjects chose B in the same game and zero otherwise. The

safe payoff was varied from 1.50€ to 15€ and subjects typically switched from B to

A at some value of the safe payoff. The safe payoff at the switching point can be

interpreted as certainty equivalent for the uncertain option in this game and, thus,

be used as a measure for strategic uncertainty. Subjective probabilities for success of

Option B can be elicited directly or derived from comparing the certainty equivalent

of a strategic game with certainty equivalents of lotteries with given probabilities.

As the safe payoffs are part of the game and any pair A-B is a different game,

switching points only provide precise measures of strategic uncertainty for games in

which subjects are indifferent between A and B. Thus, this method can only give

upper or lower bounds for strategic uncertainty in games in which subjects reveal

their preference for one or the other option by choosing it.

Following the same method as Heinemann et al. (2009), recent work by Chierchia et al. (2018) elicits certainty equivalents for choosing the uncertain option in coordination games with strategic complements (stag-hunt games) and substitutes (entry games). They find that most subjects have a unique switching point in stag-hunt games, but multiple switching points for entry games, which is in line with higher levels of reasoning.⁴ The observed multiple switching points in entry games indicate, however, that levels of reasoning and strategic uncertainty may be related, for which reason we focused on games with strategic complements and substitutes to measure strategic uncertainty aversion. In addition, many simultaneous-move games are characterized by strategies being either complements or substitutes, and games with these characteristics are applied in many domains of economics to model competition, monetary policy, financial crises, network externalities in growth, and political economy issues, to name just a few.

⁴ Nagel et al. (2018) explain multiple switching points in entry games by the higher demand for strategic reasoning compared to a stag-hunt game. They analyze the brain activity of subjects during decision-making in an fMRI scanner. They show that strategic games activate the brain network that also mediates risk during lottery decisions (anterior insula, dorsomedial prefrontal cortex and parietal cortex), which indicates that strategic uncertainty is treated in a similar way as other forms of uncertainty. The activation of the risk-mediating network is highest when subjects chose the risky action in the entry game, which indicates that entry games are associated with a higher perceived strategic uncertainty. The level of strategic thinking is reflected in the activity of the dorsomedial and dorsolateral prefrontal cortex. These regions are more active among players with non-threshold strategies in the entry game, indicating higher levels of reasoning.

While multiple price lists used by Heinemann et al. (2009) and others allow for

measuring strategic uncertainty, the authors do not clearly distinguish strategic-

uncertainty attitudes from ambiguity attitudes.⁵,⁶ At best, the existing methods

suffice to distinguish whether a subject likes or dislikes strategic uncertainty. While

the general conclusion is that subjects dislike strategic uncertainty, Ivanov (2011)

provides evidence that strategic uncertainty may be preferred to risk. We thus

reckon that the literature lacks a clear methodology to measure strategic-uncertainty

attitudes. We fill this void by developing a method that can be used to measure

strategic-uncertainty attitudes for any strategic binary-choice game and distinguish

optimism or pessimism regarding the outcome of the game from a preference for or

aversion against the source of uncertainty.

3.3 Experimental Design and Procedures

We develop a method for measuring attitudes towards strategic uncertainty. We use

a within-subject design based on three distinct experimental conditions. The main

condition of interest is STRATEGICUNCERTAINTY, in which the uncertainty that

players face in the game stems from other players’ behavior. We also include two

control conditions: RISK (the aim of which is to establish a behavioral benchmark

for a pre-determined structure of uncertainty, where possible outcomes and associ-

ated probabilities are known) and AMBIGUITY (which captures behavior under

uncertainty, where possible outcomes are known but associated probabilities are

unknown).

Each subject acts in all of the three decision-making environments in the following order: RISK, AMBIGUITY, and finally STRATEGICUNCERTAINTY. The STRATEGICUNCERTAINTY treatment is played for two distinct 2-player, 2-strategy settings: one with strategic complements, the stag-hunt game (see Game 1 in Table 3.1 below), and one with strategic substitutes, the entry game (see Game 2 in Table 3.2 below). The order in which subjects face the two games varies. In half of the sessions, the STRATEGICUNCERTAINTY treatment starts with subjects facing Game 1 before Game 2, and conversely in the other half of the sessions. The payoff structure in Tables 3.1 and 3.2 is such that in each game each player decides between two “lotteries” (one lottery pays either 20€ or 15€, the other either 5€ or 25€) in which the outcome depends on the other player’s decision. We elicit the certainty equivalents for both of these “lotteries” along with subjective beliefs, and compare them with certainty equivalents of analogous binary lotteries with exogenously given probabilities.

⁵ Heinemann et al. (2009) compare strategic uncertainty to risk. Apart from the research question itself, many design features of our experiment differ from theirs (e.g., elicitation of certainty equivalent and subjective beliefs). In their experiment, subjects choose between a safe payoff and a risky payoff that they get if and only if a sufficient number of subjects choose the risky option. Thus, the safe payoff was not a certainty equivalent for the game, but part of the game itself. Hence, the method employed by Heinemann et al. (2009) cannot identify any attitudes towards or against strategic uncertainty. In contrast, we elicit certainty equivalents for each potential action in the game without the stated certainty equivalents affecting payoffs in the game.

⁶ A comparison of risk and ambiguity driven either by human behavior or computer is proposed by Farjam (2019). However, he focuses on non-strategic human-driven uncertainty and shows that computerized uncertainty is preferred.

Table 3.1: Game 1 and associated payoffs

                   The other player
                   L             R
You     L      20€, 20€      15€, 5€
        R      5€, 15€       25€, 25€

Table 3.2: Game 2 and associated payoffs

                   The other player
                   L             R
You     L      5€, 5€        25€, 20€
        R      20€, 25€      15€, 15€
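The reading of each action as a binary “lottery” can be made concrete with a small sketch. It takes the row player's payoffs from Tables 3.1 and 3.2 and, for a hypothetical belief about the other player choosing R, returns the high payoff, the low payoff, and the probability of the high payoff. The helper name and the belief value used in the usage note are illustrative assumptions.

```python
# Row player's payoffs (in euros), indexed [own action][other's action],
# as given in Tables 3.1 and 3.2.
GAME_1 = {"L": {"L": 20, "R": 15}, "R": {"L": 5, "R": 25}}
GAME_2 = {"L": {"L": 5, "R": 25}, "R": {"L": 20, "R": 15}}


def implied_lottery(game, own_action, prob_other_r):
    """Return (high payoff, low payoff, probability of the high payoff).

    prob_other_r is the subject's belief that the other player chooses R.
    """
    row = game[own_action]
    prob = {"L": 1 - prob_other_r, "R": prob_other_r}
    high_state = max(row, key=row.get)  # other's action giving the high payoff
    low_state = min(row, key=row.get)   # other's action giving the low payoff
    return row[high_state], row[low_state], prob[high_state]
```

For example, with a belief of 0.25 that the other player chooses R, action L in Game 1 corresponds to a lottery paying 20€ with probability 0.75 and 15€ otherwise. In both games, the probabilities of the high payoff under the two own actions add up to one.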

Prior to the RISK treatment, subjects take part in five unpaid lotteries under

the same design as the RISK treatment. The goal of this training part is to

familiarize subjects with the basic mechanisms at play, and especially to let them gain

experience with the Becker-DeGroot-Marschak (BDM) procedure (Becker et al., 1964).

Unlike the main part of the experiment that follows, in this preliminary part subjects

receive feedback after each lottery.

The three treatments are summarized in Section 3.3.1. The key feature of our

experimental design is that it varies the source of uncertainty, keeping the remain-

ing aspects of the decision-making process as identical as possible across treatments.

This, in turn, allows for isolating and measuring the behavioral effect of strategic un-

certainty as compared to other sources of uncertainty. The experimental procedure

is outlined in Section 3.3.2.

3.3.1 Treatments

While the STRATEGICUNCERTAINTY treatment is played last in our experiment, we present

it first because it is our main treatment. We then present the two control treatments,

which are played first.

Main treatment: STRATEGICUNCERTAINTY

This treatment consists of two consecutive parts, each involving a different game

(either Game 1 or Game 2). The order of games is balanced across sessions. Subjects

are randomly and anonymously matched into pairs for each game.

In each session there are 12 subjects. This allows us to consistently use frequency-

based framing (“how many times out of 10”) when eliciting beliefs about others’

behavior.

In the STRATEGICUNCERTAINTY treatment, each subject makes 4 decisions:

– Decision 1: The choice between L and R in the game.

– Decision 2: Subjective beliefs about the behavior of the other subjects. We ask the following question: Out of the 10 other participants (not including the own counterpart) in this session, how many would choose R? Beliefs are incentivized using a binarized quadratic scoring rule.⁷

– Decision 3: The certainty equivalent (WTA) for not playing the game if Decision 1 is implemented.

– Decision 4: The certainty equivalent (WTA) for not playing the game if the alternative of Decision 1 is implemented.

⁷ Note that quadratic scoring rules are incentive compatible only for expected-payoff maximizers. Biases may occur for non-risk-neutral subjects (Offerman et al., 2009). Schotter and Trevino (2014) provide a survey on experimental belief elicitation. The binarized quadratic scoring rule (BSR; Hossain and Okui, 2013) incentivizes truthful reporting of beliefs independently of risk preferences and the (non-linear) form of probability weighting. Danz et al. (2020) have recently shown that in practice subjects misreport their beliefs even with the BSR. However, they also show that “false reporting and pull-to-center effects arise only when participants are informed of the BSR’s quantitative incentives” (Danz et al., 2020, p. 2). For this reason, we apply the binarized quadratic scoring rule, but in the instructions, we present the details only on demand and solely tell subjects the principle of this mechanism and that it is in their own interest to state their true beliefs. Alternatively, we could correct the stated beliefs from a standard quadratic scoring rule using the estimated relative risk aversion along the lines laid out in Offerman et al. (2009). However, this exercise also requires structural assumptions that, if misspecified, may bias the findings even more than using the stated beliefs without correction. See the experimental instructions in Appendix C.1 for implementation details of the BSR in our study.

We allow WTAs in Decisions 3 and 4 to be stated on [0, 30€]. This exceeds the

range of potential payoffs so as to detect strong aversion against or strong preference

for strategic uncertainty. Payoffs are determined as follows:

A. With 1/3 probability, the game is played and payoffs are determined by both

subjects’ Decision 1.

B. With 1/3 probability, subjects are paid for their stated beliefs.

C. With 1/3 probability, a subject’s own payoff depends on her own stated WTAs

and on the other subject’s Decision 1. Here, each subject’s payoffs are deter-

mined as follows:

1. One of two possible actions – either L or R – is drawn at random (with

50% chance for the own preferred action) and replaces the subject’s own

Decision 1.

2. For that action, the BDM procedure takes place. The computer draws

a random integer from 1 to 30€. All integers are equally likely. If the

drawn integer is larger than or equal to the stated WTA for that action,

then the subject’s payoff equals the randomly drawn number.

3. If the drawn integer is smaller than the stated WTA for that action, the

subject’s payoffs are determined by that action and by Decision 1 of the

other subject.
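Steps 2 and 3 of the BDM procedure can be sketched as follows. This is an illustrative simulation, not the z-Tree code used in the experiment: the function names are hypothetical, and the second function treats the subject's valuation of the game outcome as a fixed monetary amount, which is a simplifying assumption.

```python
import random


def bdm_payoff(stated_wta, lottery_payoff, rng):
    """Payoff under the BDM rule of steps 2 and 3.

    lottery_payoff is what the randomly selected action yields given the
    other subject's Decision 1; it is treated as already determined here.
    """
    draw = rng.randint(1, 30)  # integers 1..30, all equally likely
    if draw >= stated_wta:
        return draw  # the subject receives the drawn amount
    return lottery_payoff  # otherwise the game outcome applies


def expected_payoff(stated_wta, lottery_value):
    """Expected payoff when the subject values the game outcome at
    lottery_value (an assumed money-equivalent valuation)."""
    total = 0.0
    for draw in range(1, 31):
        total += draw if draw >= stated_wta else lottery_value
    return total / 30
```

With an assumed valuation of 17.5, scanning integer WTAs from 0 to 30 with `expected_payoff` shows that the expected payoff peaks at a stated WTA of 18, the smallest integer above the valuation. This is the usual reason why the BDM procedure gives subjects an incentive to state their valuation truthfully.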

With this design, a subject’s own Decision 1 is only payoff-relevant for her if

the game is actually played (Situation A). Thus, each subject’s Decision 1 is not

affected by her choice of WTAs. Hence, beliefs about the other’s Decision 1 are not

affected by beliefs about the other’s WTA either. Thereby, we provide the highest

incentive for subjects to activate their theory of mind as intended for Game 1 or

Game 2. The decision on WTAs depends solely on beliefs about the Decision 1 of

the other subject and it requires the same considerations. Our procedure elicits the

WTAs for the action that the subject would have chosen herself and also for the

counterfactual non-preferred decision. This allows us to identify two parameters of

a model of strategic uncertainty that can be interpreted as uncertainty aversion and

optimism (see Section 3.5). Theoretically, the higher of the two stated WTAs is the

WTA for the entire game.

For comparability purposes, we design the two control treatments in a similar

frame as the STRATEGICUNCERTAINTY treatment. These two treatments vary

the source of uncertainty. In the RISK treatment, uncertainty is generated by a

random process with known probabilities. In the AMBIGUITY treatment, the out-

come is determined by an unknown probability distribution.

Control treatment 1: RISK

In this treatment, each subject is faced with 11 pairs of lotteries (lotteries

15€/20€ or 5€/25€ associated with 11 given probabilities p). Here, we only ask

for 22 WTAs for the respective 22 lotteries.

A subject’s own payoff depends on her own stated WTAs and is determined as

follows. The computer determines which of the 22 lotteries is carried out. Each

lottery is equally likely to be selected. Then, the BDM procedure takes place. The

computer draws a random amount from 0 to 30€ with 2 decimals. If the drawn

amount is larger than or equal to the stated WTA for the selected lottery, then the

subject’s payoff is equal to the randomly drawn amount. Otherwise, the lottery

is played. Altogether, each subject makes 22 decisions using a table of contingent

choices similar to Table 3.3.

Table 3.3: Decision table in the RISK treatment

Probability with which the       WTA for lottery that pays      WTA for lottery that pays
computer selects the higher      either 15€ or 20€              either 5€ or 25€
payoff
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%

The 11 lotteries on the left-hand side of the table pay either 15€ or 20€, the 11

lotteries on the right-hand side pay either 5€ or 25€. In any lottery, the computer

determines randomly which of the two possible payments is made. Subjects receive

information about which part of the experiment and eventually which lottery is se-

lected for payoffs only at the end of the experiment after all decisions are completed.

Control treatment 2: AMBIGUITY

In this treatment, each subject is faced with one pair of lotteries that are pre-

sented in the same way as potential payoffs in the previous treatment, but this time,

subjects are not told the likelihood that the higher payoff is chosen. Subjects are

informed that the computer selects one of the 11 distributions from the RISK treat-

ment before their own decision. We inform them that the 11 distributions are not

equally likely to be selected.⁸ As in the RISK treatment, each subject states WTAs.

Here, we ask for two WTAs, one for each lottery. In addition, each subject states her

belief about the selected probability distribution. The computer randomly decides

whether subjects get paid according to the BDM procedure, or according to their

stated beliefs (with 1/2 probability each). The computer selects the probability for

the higher payoff and, if the BDM procedure is payoff-relevant, one of the lotteries

(L/R) is selected with 50% chance. As a next step of the BDM procedure, the computer

draws a random amount from 0 to 30€. If the random amount is larger than

or equal to the stated WTA for that lottery, then the subject’s payoff is equal to

the randomly drawn amount. Otherwise, the lottery is played with the probability

distribution selected by the computer.

3.3.2 Implementation Details

The design of the experiment was approved by the local GATE-Lab (Lyon) ethics committee. We ran 19 sessions (including the pilot session) with 12 participants each (maximal capacity during the COVID pandemic) at the Experimental Economics Laboratory of the Technische Universität Berlin, Germany, between September and October 2021.⁹,¹⁰ Participants were recruited through ORSEE (Greiner, 2015) and 95% of them were students from various disciplines, with engineering (41.7%), economics (8.8%), and business administration (6.6%) representing the largest groups. The experiment was programmed using z-Tree (Fischbacher, 2007).

⁸ For the sake of implementation, the random process generating probability distributions in lotteries played under ambiguity is based on 2019 weather data from Berlin.

⁹ In the pre-results reviewed report, we planned to run sessions with a minimum of 14 participants at the GATE-Lab, Lyon, France. This initial plan could not be implemented due to the pandemic conditions.

¹⁰ The minimal sample size determined by the power analysis is N = 208. Our power calculations (G*Power software, version 3.1) are based on the nonparametric two-sided Wilcoxon signed rank test. We assume normal parent distribution. We apply the following criteria. First, a test attains the statistical power of at least 0.8 (which is a commonplace reference value in the literature) with the conventional threshold for rejecting a null hypothesis of 5%. Second, the minimal effect size (as measured by Cohen’s d) a test can pick up on is small (d = 0.2). The resulting actual power equals 0.801. Given our initial sample of N = 228 (i.e., prior to applying both the pre-registered and the ex post data selection criteria, as explained in Section 3.5.1) and d = 0.2, the resulting statistical power is even higher and equals 0.836 at the 5% significance level. Conversely, with a reference minimal power of 0.8 (the actual one being 0.801), this sample size is enough to pick up on a treatment effect of magnitude d = 0.191.

Participants were randomly seated in front of PCs. Throughout the sessions,

they were not allowed to communicate with one another and could not see each

other’s screens. All questions were answered in private.

Only one of the four parts (risk, ambiguity, stag-hunt game, entry game) was

chosen for final payoffs. The probability was 0.25 for each part. Within the selected

part, the payoff was determined as specified in Section 3.3.1. This means that

only one decision of a player was payoff relevant, but each decision could be the one.

This procedure rules out incentives for hedging and provides the highest incentive to

consider the uncertainty of the outcome associated with each decision. The average

payoff was about 21.80€ (minimum 6.60€, maximum 34.80€) including the fixed

show-up fee of 5€. Sessions lasted for around 70 minutes on average. Examples of

instructions, questionnaires, and screens are given in Appendix C.1.

3.4 Theoretical Framework

We start our theoretical considerations by observing that any choice in a simultaneous-move game may be interpreted as a choice between lotteries whose outcomes depend on the choices of other players. Our 2x2 games involve the choice between a lottery L with payoff 20€ or 15€ and lottery R with payoff 5€ or 25€. Note that the probability of receiving 15€ after choosing L is the same as the probability of receiving 25€ when choosing R. It is the probability that the other player chooses R. In the stag-hunt game (Game 1), a player gets the higher payoff of her chosen lottery if her partner chooses the same lottery. In the entry game (Game 2), a player gets the higher payoff if her partner chooses the other lottery.

The value of a lottery k for a subject i can be written as

$$W_i^k(\bar{x} \mid \pi_i) = E(u_i(x) \mid \pi_i) + \Delta_i^k(\bar{x} \mid \pi_i),$$

where $u_i(x)$ is subject i’s utility function, $\bar{x}$ is the vector of potential monetary payoffs and $\pi_i$ is the subject’s probability distribution over outcomes. For an expected-utility maximizer, $\Delta_i^k(x \mid \cdot) = 0$ for all $x$. If we assume that subjects evaluate lotteries with exogenously given probabilities by expected utility, the attitude towards or against ambiguity or strategic uncertainty can be written as a deviation of the evaluation from expected utility, denoted by $\Delta_i^k(x \mid \cdot)$. A theory of ambiguity attitudes specifies this deviation.

We propose to model ambiguity attitudes for binary lotteries and strategic-uncertainty attitudes for a 2x2 game by two parameters $\alpha_i^k$ and $\delta_i^k$ such that the utility value that subject i attaches to the possible outcomes from her own choice is

$$W_i^k(x_1, x_2, \pi_i) = (\pi_i + \alpha_i^k)\,u_i(x_1) + (1 - \pi_i - \alpha_i^k)\,u_i(x_2) - \delta_i^k, \tag{3.1}$$

where $x_1 \geq x_2$ are the potential monetary payoffs and $\pi_i$ is the subjective probability for receiving $x_1$. The parameter $\delta_i^k$ may be interpreted as an aversion against strategic uncertainty if it is positive, or as a preference for strategic uncertainty if it is negative. The higher $\delta_i^k$, the lower is the value of the lottery, in line with the interpretation of an increasing aversion against uncertainty. The parameter $\delta_i^k$ affects the value of a lottery independently of the perceived risk that is associated with it. The second parameter, $\alpha_i^k$, establishes the weight that the subject puts on the higher outcome given her own choice. If $\alpha_i^k > 0$, the subject puts a weight on the higher payoff that exceeds her subjective probability for this outcome. If $\alpha_i^k < 0$, the subject puts a weight on the lower payoff that exceeds her subjective probability for this outcome. We may interpret this parameter as optimism, where $\alpha_i^k = 0$ is the unemotional Bayesian view on the lottery, while subjects with $\alpha_i^k > 0$ may be called optimists and subjects with $\alpha_i^k < 0$ pessimists. Optimism [pessimism] may arise from the excitement [fear] about the prospect of getting the high [low] amount when it is determined by another human playing strategically (strategic uncertainty) or by an unknown process (ambiguity). Note that the value of the lottery rises with increasing optimism. Thereby, our model allows for a clear interpretation of both parameters.

The value of an ambiguous lottery or a game may be higher [lower] than the

value of the highest [lowest] possible realization under certainty. From the standard

economic perspective, it may seem odd that the value of an uncertain situation could

be higher than the highest possible payoff or lower than the lowest one. However,

this may reflect particular attitudes towards strategic interactions with other human

players: a person may be generally uncomfortable with depending on other humans,

or may derive utility from playing a game with somebody else on top of the utility

generated by the monetary payoffs in this game.

In our experiment, we also elicit the certainty equivalent of participating in the game, if the player’s chosen action is replaced by the opposing action. If the subject is optimistic about getting $x_1 = 25$€ in the game with her chosen action, she must be pessimistic about receiving $\bar{x}_1 = 20$€ under the replaced choice. Thus, for this counterfactual choice, the value of the implied lottery is

$$W_i^k(\bar{x}_1, \bar{x}_2, 1 - \pi_i) = (\pi_i + \alpha_i^k)\,u_i(\bar{x}_2) + (1 - \pi_i - \alpha_i^k)\,u_i(\bar{x}_1) - \delta_i^k, \tag{3.2}$$

where $\bar{x}_1 > \bar{x}_2$ are the payoffs implied by the counterfactual choice.
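The two valuations in Equations 3.1 and 3.2 can be evaluated numerically with a minimal sketch. The default linear utility and all parameter values in the usage note are illustrative assumptions; any increasing utility function can be passed instead.

```python
def value_chosen(x1, x2, pi, alpha, delta, u=lambda x: x):
    """Equation (3.1): value of the lottery implied by the chosen action.

    x1 >= x2 are the payoffs, pi is the subjective probability of x1.
    The default utility u is linear, which is an assumption of this sketch.
    """
    weight = pi + alpha  # decision weight on the higher payoff
    return weight * u(x1) + (1 - weight) * u(x2) - delta


def value_counterfactual(xbar1, xbar2, pi, alpha, delta, u=lambda x: x):
    """Equation (3.2): value of the lottery implied by the other action.

    xbar1 > xbar2 are its payoffs; the higher one has probability 1 - pi,
    so the weight pi + alpha now falls on the lower payoff xbar2.
    """
    weight = pi + alpha
    return weight * u(xbar2) + (1 - weight) * u(xbar1) - delta
```

For instance, with payoffs 25€ and 5€ for the chosen action, 20€ and 15€ for the counterfactual action, π = 0.7, α = 0.1 and δ = 2, the linear-utility values are 19 and 14. Raising α increases the first value and lowers the second, while raising δ lowers both by the same amount.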

An alternative theory of ambiguity attitudes is the Choquet-expected utility with

neo-additive capacities that specifies a value function (Chateauneuf et al.,2007)

Vi(¯x, πi) = X

(1 −δi)πi(x)ui(x) + δi[αiui(xmax) + (1 −αi)ui(xmin)].

For a lottery with only two possible outcomes x1≥x2,

Vi(x1, x2, πi) = (πi+δi(αi−πi))ui(x1) + ((1 −πi)−δi(αi−πi))ui(x2)

=E(ui(x)|πi) + δi(αi−πi)[ui(x1)−ui(x2)].

The interpretation given in the literature (see e.g., Greiner 2016) is that $\delta_i$ is the ambiguity of a player ($1-\delta_i$ is her trust in her own beliefs) and $\alpha_i$ is optimism. By this interpretation, increasing ambiguity may raise or lower the value of the lottery, depending on whether optimism exceeds or stays below the subjective probability for the higher payoff. The interpretation of $\alpha_i$ may also cause a problem. For $\delta_i > 0$, the evaluation rises in $\alpha_i$, but for $\delta_i < 0$, increasing “optimism” reduces the value of the lottery. Restricting $\alpha_i$ and $\delta_i$ to be in $[0,1]$ avoids this, but may be inconsistent with large deviations of the value of a lottery from the expected utility that it implies. Finally, the parameters are not identified from the evaluations of the two lotteries that a subject can choose in a 2x2 game, if she assigns $\pi_i = 0.5$ to the other player’s choices. For these reasons, we use the model described by Equations 3.1 and 3.2 for further analysis.
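The identification problem at $\pi_i = 0.5$ can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: the function name `neo_additive_value`, the linear default utility, and the parameter values are assumptions chosen for illustration. It evaluates the two-outcome neo-additive formula above and shows that two different parameter pairs with the same product $\delta_i(\alpha_i - 0.5)$ produce identical valuations of both lotteries.

```python
import math

def neo_additive_value(x1, x2, pi, alpha, delta, u=lambda x: x):
    """Two-outcome neo-additive value for payoffs x1 >= x2.

    pi is the subjective probability of x1; alpha and delta follow the
    notation of the text. The linear default utility is an assumption
    made only to keep the illustration simple.
    """
    shift = delta * (alpha - pi)
    return (pi + shift) * u(x1) + ((1 - pi) - shift) * u(x2)

# Two hypothetical parameter pairs with the same product delta * (alpha - 0.5).
pair_a = {"alpha": 0.7, "delta": 0.5}
pair_b = {"alpha": 0.9, "delta": 0.25}

for payoffs in [(25, 5), (20, 15)]:
    va = neo_additive_value(*payoffs, pi=0.5, **pair_a)
    vb = neo_additive_value(*payoffs, pi=0.5, **pair_b)
    print(payoffs, va, vb)  # the two valuations coincide up to rounding
```

With $\pi_i = 0.5$, both lotteries of the game carry the same term $\delta_i(\alpha_i - 0.5)$, so the two observed valuations pin down only this product and not the two parameters separately.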

3.4.1 Identification and Uncertainty Attitudes

For identification, we assume that $\alpha_i^k$ and $\delta_i^k$ are the same for all lotteries with the same source of uncertainty. With the data from our experiment, we compare these parameters for three sources of uncertainty: we denote $k = A$ in the AMBIGUITY treatment, $k = S$ in the stag-hunt game, and $k = E$ in the entry game.

Utility function and risk aversion

In order to estimate uncertainty attitudes, we assume that subjects have CRRA utility functions, $u_i(x) = x^{1-r_i}/(1-r_i)$ for $r_i \neq 1$ and $u_i(x) = \ln(x)$ for $r_i = 1$, where $r_i$ is the Arrow-Pratt measure of relative risk aversion (RRA). We use the 22 stated WTAs in the RISK treatment to estimate $r_i$ for each subject $i$. If all 22 WTAs are equal to the expected monetary payments of the respective lotteries, we set $r_i = 0$. For further details, see Section 3.5.3.
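As a minimal sketch of the utility specification above (the function names are illustrative assumptions, not taken from the original analysis), the CRRA function, its inverse, and the implied certainty equivalent of a binary lottery can be written as follows.

```python
import math

def crra_utility(x, r):
    """CRRA utility of a positive payoff x; log utility when r equals 1."""
    if r == 1:
        return math.log(x)
    return x ** (1 - r) / (1 - r)

def crra_inverse(v, r):
    """Payoff whose CRRA utility equals v (inverse of crra_utility)."""
    if r == 1:
        return math.exp(v)
    return ((1 - r) * v) ** (1 / (1 - r))

def certainty_equivalent(x1, x2, p, r):
    """Certainty equivalent of a lottery paying x1 with probability p, else x2."""
    eu = p * crra_utility(x1, r) + (1 - p) * crra_utility(x2, r)
    return crra_inverse(eu, r)

# Risk-neutral benchmark (r = 0): the certainty equivalent is the expected payment.
print(certainty_equivalent(20, 15, 0.5, 0))
# Log utility (r = 1): the same lottery is valued below its expected payment.
print(certainty_equivalent(20, 15, 0.5, 1))
```

The risk-neutral case corresponds to the rule in the text that a subject whose WTAs all equal the expected monetary payments is assigned $r_i = 0$.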

Identification of parameters

Let $\pi_i$ be subject $i$’s probability to receive $x_1$ in a binary lottery $k$ with payoffs $x_1 > x_2$. Then, the subject’s WTA for an ambiguous lottery or for the chosen lottery in a game is given by the value $W_i^k(x_1, x_2, \pi_i)$.

Our 2x2 games involve the choice between a lottery L with payoff 20€ or 15€ and lottery R with payoff 5€ or 25€. In the stag-hunt game (Game 1), a player gets the higher payoff of her chosen lottery if her partner chooses the same lottery. In the entry game (Game 2), a player gets the higher payoff if her partner chooses the other lottery. Thus, in both games, we observe the values of two lotteries where the probability $\pi_i$ to win the higher payoff in the chosen lottery equals the probability of getting the lower payoff in the counterfactual lottery. In the stag-hunt game, $\pi_i$ is the subject’s probability that her partner chooses the same action. In the entry game, $\pi_i$ is the subject’s probability that her partner chooses the opposite action.

Setting the utility of the stated WTA for the chosen strategy in game $k$ equal to $W_i^k(x_1, x_2, \pi_i)$ and the utility of the stated WTA for the opposing strategy in game $k$ equal to $\bar{W}_i^k(\bar{x}_1, \bar{x}_2, 1-\pi_i)$, while assuming a CRRA utility function with RRA $r_i$ as estimated from decisions in the RISK treatment, yields two equations that identify $\alpha_i^k$ and $\delta_i^k$.

As we assumed that subjects evaluate lotteries as expected-utility maximizers, the WTA for a lottery with payoffs 20€ or 15€ and a probability $p$ for the higher payoff should yield a utility that equals $Eu_i(20, 15 \mid p)$.

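If a subject chooses the strategy that leads to potential payoffs $(x_1, x_2)$ in a game with $x_1 > x_2$, and for a subjective probability $\pi_i$ of getting $x_1$, the value of this game is given by Equation 3.1. Using this,

\[
\begin{aligned}
W_i^k(x_1, x_2, \pi_i) &= (\alpha_i^k + \pi_i)\,u_i(x_1) + (1 - \alpha_i^k - \pi_i)\,u_i(x_2) - \delta_i^k \\
&= E(u_i(x_1, x_2 \mid \pi_i)) + \alpha_i^k\,(u_i(x_1) - u_i(x_2)) - \delta_i^k,
\end{aligned}
\]

and replacing expected utility by utility from stated WTA in the risky lottery ($\mathrm{WTA}_i^R$), we get¹¹

\[
\frac{(\mathrm{WTA}_i^k(x_1, x_2, \pi_i))^{1-r_i} - (\mathrm{WTA}_i^R(x_1, x_2, p=\pi_i))^{1-r_i}}{1-r_i} = \alpha_i^k\,[u_i(x_1) - u_i(x_2)] - \delta_i^k
\]

if and only if

\[
\delta_i^k = \frac{(\mathrm{WTA}_i^R(x_1, x_2, p=\pi_i))^{1-r_i} - (\mathrm{WTA}_i^k(x_1, x_2, \pi_i))^{1-r_i} + \alpha_i^k\,[x_1^{1-r_i} - x_2^{1-r_i}]}{1-r_i}. \tag{3.3}
\]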

¹¹ Note that, alternatively, we could calculate the expected utility of this lottery by inserting monetary payments in the estimated CRRA utility function. We prefer the more direct comparison between stated WTAs, because this is less affected by assumptions on the utility function.

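For the lottery with the alternative payoffs $(\bar{x}_1, \bar{x}_2)$, the probability of achieving the higher payoff is $1 - \pi_i$. Thus,

\[
\begin{aligned}
\bar{W}_i^k(\bar{x}_1, \bar{x}_2, 1-\pi_i) &= (1 - \pi_i - \alpha_i^k)\,u_i(\bar{x}_1) + (\pi_i + \alpha_i^k)\,u_i(\bar{x}_2) - \delta_i^k \\
&= Eu_i(\bar{x}_1, \bar{x}_2 \mid 1-\pi_i) - \alpha_i^k\,(u_i(\bar{x}_1) - u_i(\bar{x}_2)) - \delta_i^k.
\end{aligned}
\]

Replacing the value of the lottery by the utility of the certainty equivalent, $\mathrm{WTA}_i^k$, and EU by WTA under risk, we get:

\[
\frac{(\mathrm{WTA}_i^k(\bar{x}_1, \bar{x}_2, 1-\pi_i))^{1-r_i} - (\mathrm{WTA}_i^R(\bar{x}_1, \bar{x}_2, p=1-\pi_i))^{1-r_i}}{1-r_i} = -\alpha_i^k\,[u_i(\bar{x}_1) - u_i(\bar{x}_2)] - \delta_i^k
\]

if and only if

\[
\delta_i^k = \frac{(\mathrm{WTA}_i^R(\bar{x}_1, \bar{x}_2, 1-\pi_i))^{1-r_i} - (\mathrm{WTA}_i^k(\bar{x}_1, \bar{x}_2, 1-\pi_i))^{1-r_i} - \alpha_i^k\,[\bar{x}_1^{1-r_i} - \bar{x}_2^{1-r_i}]}{1-r_i}. \tag{3.4}
\]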

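Setting 3.3 equal to 3.4 yields

\[
\begin{aligned}
&(\mathrm{WTA}_i^R(x_1, x_2, p=\pi_i))^{1-r_i} - (\mathrm{WTA}_i^k(x_1, x_2, \pi_i))^{1-r_i} + \alpha_i^k\,[x_1^{1-r_i} - x_2^{1-r_i}] \\
&\quad = (\mathrm{WTA}_i^R(\bar{x}_1, \bar{x}_2, 1-\pi_i))^{1-r_i} - (\mathrm{WTA}_i^k(\bar{x}_1, \bar{x}_2, 1-\pi_i))^{1-r_i} - \alpha_i^k\,[\bar{x}_1^{1-r_i} - \bar{x}_2^{1-r_i}].
\end{aligned}
\]

Therefore

\[
\begin{aligned}
&(\mathrm{WTA}_i^R(x_1, x_2, p=\pi_i))^{1-r_i} - (\mathrm{WTA}_i^R(\bar{x}_1, \bar{x}_2, 1-\pi_i))^{1-r_i} \\
&\quad + (\mathrm{WTA}_i^k(\bar{x}_1, \bar{x}_2, 1-\pi_i))^{1-r_i} - (\mathrm{WTA}_i^k(x_1, x_2, \pi_i))^{1-r_i} \\
&\quad = -\alpha_i^k\,[x_1^{1-r_i} + \bar{x}_1^{1-r_i} - x_2^{1-r_i} - \bar{x}_2^{1-r_i}].
\end{aligned}
\]

Hence,

\[
\alpha_i^k = \frac{A}{B} \quad \text{for } k = S, E, \tag{3.5}
\]

with

\[
\begin{aligned}
A &= (\mathrm{WTA}_i^R(x_1, x_2, \pi_i))^{1-r_i} - (\mathrm{WTA}_i^R(\bar{x}_1, \bar{x}_2, 1-\pi_i))^{1-r_i} \\
&\quad + (\mathrm{WTA}_i^k(\bar{x}_1, \bar{x}_2, 1-\pi_i))^{1-r_i} - (\mathrm{WTA}_i^k(x_1, x_2, \pi_i))^{1-r_i}
\end{aligned}
\]

and

\[
B = -[x_1^{1-r_i} + \bar{x}_1^{1-r_i} - x_2^{1-r_i} - \bar{x}_2^{1-r_i}] < 0 \quad \text{for } k = S, E.
\]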

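For subjects with $r_i = 1$,

\[
\begin{aligned}
A &= \ln(\mathrm{WTA}_i^R(x_1, x_2, \pi_i)) - \ln(\mathrm{WTA}_i^R(\bar{x}_1, \bar{x}_2, 1-\pi_i)) \\
&\quad + \ln(\mathrm{WTA}_i^k(\bar{x}_1, \bar{x}_2, 1-\pi_i)) - \ln(\mathrm{WTA}_i^k(x_1, x_2, \pi_i)), \\
B &= -[\ln x_1 + \ln \bar{x}_1 - \ln x_2 - \ln \bar{x}_2],
\end{aligned}
\]

and

\[
\delta_i^k = \ln(\mathrm{WTA}_i^R(x_1, x_2, p=\pi_i)) - \ln(\mathrm{WTA}_i^k(x_1, x_2, \pi_i)) + \alpha_i^k\,[\ln(x_1) - \ln(x_2)].
\]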

Note that $(x_1, x_2) = (20, 15) \Leftrightarrow (\bar{x}_1, \bar{x}_2) = (25, 5)$ and $(x_1, x_2) = (25, 5) \Leftrightarrow (\bar{x}_1, \bar{x}_2) = (20, 15)$. In both games, if $(x_1, x_2) = (25, 5)$, $\pi_i$ is the probability that the other player chooses R. If $(x_1, x_2) = (20, 15)$, $\pi_i$ is the probability that the other player chooses L.
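The identification in Equations 3.3 to 3.5 can be checked with a small round trip: generate the four WTAs implied by the model for known parameters, then recover $\alpha_i^k$ and $\delta_i^k$ from them. The sketch below is illustrative only; the function names and the parameter values are assumptions. It works in utility units, which is equivalent to the power form in the text because the common factor $1/(1-r_i)$ cancels in the ratio $A/B$, and it also covers the case $r_i = 1$.

```python
import math

def crra(x, r):
    """CRRA utility; log utility at r = 1."""
    return math.log(x) if r == 1 else x ** (1 - r) / (1 - r)

def crra_inv(v, r):
    """Inverse of the CRRA utility function."""
    return math.exp(v) if r == 1 else ((1 - r) * v) ** (1 / (1 - r))

def identify_game(wta_k, wta_r, wta_k_bar, wta_r_bar, x, x_bar, r):
    """Recover (alpha, delta) from the four WTAs of one game.

    wta_k / wta_r: WTA in the game and under risk for the chosen lottery x.
    wta_k_bar / wta_r_bar: the same for the counterfactual lottery x_bar.
    """
    u = lambda y: crra(y, r)
    a = u(wta_r) - u(wta_r_bar) + u(wta_k_bar) - u(wta_k)      # A in utility units
    b = -(u(x[0]) + u(x_bar[0]) - u(x[1]) - u(x_bar[1]))        # B in utility units
    alpha = a / b                                               # Equation 3.5
    delta = u(wta_r) - u(wta_k) + alpha * (u(x[0]) - u(x[1]))   # Equation 3.3
    return alpha, delta

def simulate_wtas(x, x_bar, pi, alpha, delta, r):
    """WTAs implied by Equations 3.1 and 3.2 for hypothetical parameters."""
    u = lambda y: crra(y, r)
    eu = pi * u(x[0]) + (1 - pi) * u(x[1])
    eu_bar = (1 - pi) * u(x_bar[0]) + pi * u(x_bar[1])
    w = eu + alpha * (u(x[0]) - u(x[1])) - delta
    w_bar = eu_bar - alpha * (u(x_bar[0]) - u(x_bar[1])) - delta
    return crra_inv(w, r), crra_inv(eu, r), crra_inv(w_bar, r), crra_inv(eu_bar, r)

# Hypothetical subject: chooses R, i.e. (25, 5), with pi = 0.6 and r = 0.5.
wtas = simulate_wtas((25, 5), (20, 15), pi=0.6, alpha=0.1, delta=0.05, r=0.5)
print(identify_game(*wtas, x=(25, 5), x_bar=(20, 15), r=0.5))
```

Because the recovered values depend only on differences between WTAs, a constant added to all four utilities would leave them unchanged, which is the property the text relies on when comparing WTAs across treatments.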

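Under ambiguity ($k = A$), we elicit the WTAs for two lotteries with payoffs (25,5) and (20,15) along with a subjective probability $\pi_i$ for receiving the higher payoff in both of these lotteries. Setting utility of stated WTAs equal to $W_i^A(25, 5, \pi_i)$ and $W_i^A(20, 15, \pi_i)$, respectively, identifies parameters $(\alpha_i^A, \delta_i^A)$. To see this, define $(x_1, x_2) = (20, 15)$ in Equation 3.3 and use the same equation also for $(x_1', x_2') = (25, 5)$. Then, by setting the right-hand sides of these equations equal to each other, we get

\[
\begin{aligned}
&(\mathrm{WTA}_i^R(20, 15, p=\pi_i))^{1-r_i} - (\mathrm{WTA}_i^A(20, 15, \pi_i))^{1-r_i} + \alpha_i^A\,[20^{1-r_i} - 15^{1-r_i}] \\
&\quad = (\mathrm{WTA}_i^R(25, 5, p=\pi_i))^{1-r_i} - (\mathrm{WTA}_i^A(25, 5, \pi_i))^{1-r_i} + \alpha_i^A\,[25^{1-r_i} - 5^{1-r_i}]
\end{aligned}
\]

\[
\Leftrightarrow \quad \alpha_i^A = \frac{A'}{B'}, \tag{3.6}
\]

with

\[
\begin{aligned}
A' &= (\mathrm{WTA}_i^R(20, 15, \pi_i))^{1-r_i} - (\mathrm{WTA}_i^R(25, 5, \pi_i))^{1-r_i} \\
&\quad + (\mathrm{WTA}_i^A(25, 5, \pi_i))^{1-r_i} - (\mathrm{WTA}_i^A(20, 15, \pi_i))^{1-r_i}
\end{aligned}
\]

and

\[
B' = 25^{1-r_i} - 20^{1-r_i} + 15^{1-r_i} - 5^{1-r_i} > 0.
\]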

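Plugging the result of Equation 3.6 into Equation 3.3 for either of the two lotteries also yields $\delta_i^A$. For subjects with $r_i = 1$,

\[
\begin{aligned}
A' &= \ln \mathrm{WTA}_i^R(20, 15, \pi_i) - \ln \mathrm{WTA}_i^R(25, 5, \pi_i) \\
&\quad + \ln \mathrm{WTA}_i^A(25, 5, \pi_i) - \ln \mathrm{WTA}_i^A(20, 15, \pi_i), \\
B' &= \ln 25 - \ln 20 + \ln 15 - \ln 5,
\end{aligned}
\]

and

\[
\delta_i^k = \ln \mathrm{WTA}_i^R(x_1, x_2, p=\pi_i) - \ln \mathrm{WTA}_i^k(x_1, x_2, \pi_i) + \alpha_i^k\,(\ln(x_1) - \ln(x_2)).
\]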

These calculations show that both parameters are identified through comparing

WTAs between treatments. By calculating our parameters from differences between

WTAs, we avoid the possibility that any systematic bias stemming from the BDM

mechanism affects our parameter estimates.
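For the AMBIGUITY condition, Equation 3.6 can be sketched in the same spirit. The example below is an illustration under stated assumptions: the function name `identify_ambiguity`, the dictionary layout keyed by payoff pair, and the simulated subject are hypothetical. It uses the log-utility case $r_i = 1$ for the demonstration and otherwise works in utility units, where the ratio $A'/B'$ is the same as in the power form.

```python
import math

def crra(x, r):
    """CRRA utility; log utility at r = 1."""
    return math.log(x) if r == 1 else x ** (1 - r) / (1 - r)

LOW_SPREAD, HIGH_SPREAD = (20, 15), (25, 5)

def identify_ambiguity(wta_a, wta_r, r):
    """Recover (alpha_A, delta_A) from WTAs keyed by payoff pair.

    wta_a: WTAs for the two ambiguous lotteries.
    wta_r: WTAs for the matching risky lotteries with p equal to the stated pi.
    """
    u = lambda y: crra(y, r)
    a_prime = (u(wta_r[LOW_SPREAD]) - u(wta_r[HIGH_SPREAD])
               + u(wta_a[HIGH_SPREAD]) - u(wta_a[LOW_SPREAD]))
    b_prime = u(25) - u(20) + u(15) - u(5)
    alpha = a_prime / b_prime                                   # Equation 3.6
    delta = (u(wta_r[LOW_SPREAD]) - u(wta_a[LOW_SPREAD])
             + alpha * (u(20) - u(15)))                         # Equation 3.3
    return alpha, delta

# Hypothetical subject with r = 1, pi = 0.4, alpha = -0.05, delta = 0.02.
r, pi, alpha, delta = 1, 0.4, -0.05, 0.02
wta_r, wta_a = {}, {}
for x1, x2 in (LOW_SPREAD, HIGH_SPREAD):
    eu = pi * crra(x1, r) + (1 - pi) * crra(x2, r)
    wta_r[(x1, x2)] = math.exp(eu)  # inverse of log utility
    wta_a[(x1, x2)] = math.exp(eu + alpha * (crra(x1, r) - crra(x2, r)) - delta)

print(identify_ambiguity(wta_a, wta_r, r))
```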

3.4.2 Hypotheses

Our goal is to find out whether the source of uncertainty affects uncertainty attitudes. Based on the theoretical model, our numerical predictions for the model parameters are given by Bayesian behavior:

Hypothesis 1: There are no systematic attitudes towards or against ambiguity or strategic uncertainty. Parameters $\alpha_i^k$ and $\delta_i^k$ are distributed around 0.

Here, we test for each condition $k \in \{A, S, E\}$ whether the parameters $\alpha_i^k$ and $\delta_i^k$ from different subjects $i$ are distributed around zero. As the literature generally found average subjects to be ambiguity averse, we expect that Hypothesis 1 will be rejected.

Subjects are likely to differ in their uncertainty attitudes and our experiment is designed to capture how individual attitudes are affected by the source of uncertainty being another human’s action and by the nature of strategic interaction. Here, we exploit the within-subject design and use as null hypothesis:

Hypothesis 2: Subjects do not make any distinction between the sources of uncertainty and between the considered strategic situation (strategic complementarity versus substitutability): $\alpha_i^A = \alpha_i^S = \alpha_i^E$ and $\delta_i^A = \delta_i^S = \delta_i^E$.

A positive (negative) $\delta_i^k$ is interpreted as a general aversion against (preference for) ambiguity or strategic uncertainty. A positive (negative) $\alpha_i^k$ is interpreted as optimism (pessimism) for receiving the higher payoff under ambiguity or strategic uncertainty.

3.5 Results

This section outlines the main empirical results based on the pre-registered procedures of sample selection and data analysis. They can be summarized as follows. Subjects react to the presence of uncertainty (contrary to Hypothesis 1), but also make a systematic distinction between the different sources of uncertainty (contrary to Hypothesis 2). Importantly, the magnitude of that last effect depends on the strategic context. Regarding the two parameters of our structural model, we find that the majority of subjects exhibit pessimism [optimism] in the stag-hunt [entry] game, while the median subject has neither a preference for nor an aversion against strategic uncertainty.

3.5.1 Data Selection

We begin by applying the data selection criteria to the initial sample of 228 subjects. The elicitation of both WTAs, for the preferred and the non-preferred action, provides us with a consistency measure, since it should be that $\mathrm{WTA}_{\text{preferred}} \geq \mathrm{WTA}_{\text{not preferred}}$. If a participant violates this criterion such that her WTA for participating with her preferred action is lower than the WTA for the not preferred action in at least one of the games, we exclude this participant from our main data analysis. The reason is that such a reversal indicates a systematic misunderstanding of the BDM procedure that could affect all stated WTAs, and data from these subjects might just introduce noise. For the same reason, we exclude subjects whose stated WTA for a lottery that pays the higher payoff with probability 1 is lower than the stated WTA for an otherwise equal lottery that pays the higher payoff with probability 0. These criteria were pre-specified. We also pre-specified a robustness check using the full sample.
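The two exclusion criteria can be expressed as a simple filter. The sketch below is illustrative only: the record layout (keys such as `games`, `wta_preferred`, `wta_p1`) is a hypothetical data format, not the one used in the original analysis.

```python
def is_consistent(subject):
    """Return True if a subject passes both pre-specified criteria.

    The dictionary keys are hypothetical field names chosen for this sketch.
    """
    # Criterion 1: in every game, the WTA for the preferred action must not
    # be lower than the WTA for the not preferred action.
    for game in subject["games"].values():
        if game["wta_preferred"] < game["wta_not_preferred"]:
            return False
    # Criterion 2: a lottery paying the higher payoff for sure must not be
    # valued below the otherwise equal lottery that never pays it.
    for lottery in subject["lotteries"]:
        if lottery["wta_p1"] < lottery["wta_p0"]:
            return False
    return True

def restricted_sample(subjects):
    """Keep only the subjects to whom no exclusion criterion applies."""
    return [s for s in subjects if is_consistent(s)]

# Three made-up subjects: the second fails criterion 1, the third criterion 2.
subjects = [
    {"id": 1,
     "games": {"S": {"wta_preferred": 18, "wta_not_preferred": 12},
               "E": {"wta_preferred": 16, "wta_not_preferred": 16}},
     "lotteries": [{"wta_p1": 20, "wta_p0": 15}]},
    {"id": 2,
     "games": {"S": {"wta_preferred": 10, "wta_not_preferred": 14},
               "E": {"wta_preferred": 17, "wta_not_preferred": 9}},
     "lotteries": [{"wta_p1": 20, "wta_p0": 15}]},
    {"id": 3,
     "games": {"S": {"wta_preferred": 18, "wta_not_preferred": 12},
               "E": {"wta_preferred": 16, "wta_not_preferred": 11}},
     "lotteries": [{"wta_p1": 14, "wta_p0": 15}]},
]
print([s["id"] for s in restricted_sample(subjects)])
```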

In 45 [63] cases we observe a violation of choice consistency in the stag-hunt [entry] game: the stated WTA for the preferred action is lower than the WTA for the not preferred one.¹² Nineteen subjects violate our rationality criterion in the lotteries: the stated WTA for a lottery that surely pays a high payoff is lower than the stated WTA for an otherwise equal lottery that never pays a high payoff. Jointly put, these criteria turn out to be stringent.¹³ In total, there are 102 subjects to whom at least one of these exclusion criteria applies. We call the remaining 126 subjects the restricted sample.

¹² For 5 subjects, the selected action in one of the games was not recorded due to a minor software glitch. One of them failed to comply with the inclusion criterion for lottery choices, the remaining 4 are included in the restricted sample.

¹³ While lack of understanding of the BDM mechanism may partially account for deviations from expected utility, we also note that the BDM performs no worse than the alternative elicitation methods for certainty equivalents in terms of bias and noise (Hey et al., 2009).

Ex post, after conducting the experiments, we detected that certain combinations of choices lead to extreme values of estimated relative risk aversion (beyond +/-100) and thereby also to estimated values for $\alpha$ and $\delta$ in astronomical dimensions. In total, there are 15 subjects with an estimated RRA outside [-100,+100], 7 in the restricted sample. We exclude them from the parametric analysis. 5 other subjects (1 from the restricted sample) have an estimated RRA > 1, but stated a WTA of 0 for at least one of the games or lotteries needed to calculate uncertainty parameters. For these subjects, some or all pairs $(\alpha_i^k, \delta_i^k)$, $k = A, S, E$, cannot be calculated. So, we exclude these subjects from all analyses of uncertainty parameters.

3.5.2 Comparison of Certainty Equivalents

In the experiment, we elicit the WTAs for two lotteries with outcomes depending

on the strategy of another player or on ambiguity simultaneously with subjective

probabilities for the possible outcomes. As an initial descriptive step of our analyses,

we can directly compare the WTA of a lottery in a game (where the outcome is de-

termined by another player’s action) with the WTA of a lottery that yields the same

payoffs with exogenously given probabilities that match the subjective probabilities

in the game. Similarly, the WTA for an ambiguous lottery with unknown probabil-

ities can be compared to the WTA of a lottery yielding the same payoffs with given

probabilities that match the subject’s stated probabilities for the ambiguous lottery.

Note that in theory, the WTA for a game is the higher of the two WTAs for the

two possible actions. As a first step in analyzing uncertainty attitudes, we count

the number of subjects whose WTA for a game or for an ambiguous situation is

higher than, equal to or lower than the WTA for the analogous lottery played under

risk. This informs us about the average preference for, or aversion against, a given

source of uncertainty. Note that the size of these deviations may depend on payoffs

associated with the chosen strategy, but also on the subjective probabilities.

Table 3.4 presents the results of this comparison separately for the two lotteries

under ambiguity, for the lottery implied by the actually chosen action in each game,

but also for the counterfactual lottery implied by “replacing” the subject’s actual

choice with the alternative action.

Table 3.4: Comparison of certainty equivalents

                                              Ambiguity: k = A      Stag hunt: k = S      Entry: k = E
(x1, x2)                                      (20,15)    (25,5)     chosen    replaced    chosen    replaced
WTA^k_i(x1, x2, πi) > WTA^R_i(x1, x2, πi)       43         49         45        60          80        30
WTA^k_i(x1, x2, πi) = WTA^R_i(x1, x2, πi)       26         29         29        23          12        23
WTA^k_i(x1, x2, πi) < WTA^R_i(x1, x2, πi)       57         48         52        43          34        73

Notes: In the stag-hunt [entry] game, 81 [101] out of 126 subjects choose the action R.

The WTAs under ambiguity and for the stag-hunt game are not significantly


different from the WTAs under risk. A Wilcoxon signed rank test yields p-values

above 0.2. For the entry game, however, we find that subjects have a higher WTA

for the lottery implied by their own choice in the game than for the respective lottery

with exogenously given probability (p-value <0.001). The opposite effect occurs

once we look at the WTA for the lottery implied by replacing the actual choice with

the opposite action: it is lower than the WTA for the respective lottery under risk

(p-value <0.001). This indicates that the median subject tends to be optimistic

about the behavior of her partner in the entry game. The weight a player puts on

the payoff corresponding to her partner choosing a different action than her own

exceeds her stated probability of that outcome.

Direct comparisons between WTAs of different games or between a strategic game and an ambiguous situation could only be possible if a subject stated the same probability for getting the higher payoff in both contexts. Unfortunately, restricting analysis to these observations would leave us with just a few matched pairs and possibly introduce a selection bias. Thus, for further econometric analysis, we use strategic-uncertainty attitudes as characterized by the parameters $\alpha_i^k$ and $\delta_i^k$ of our structural model. In order to identify these parameters, we need to estimate a utility function for each subject.

3.5.3 Main Results

Our identification strategy relies on a two-step procedure. As a first step, we use the individual certainty equivalents (WTA) elicited in 22 lotteries to estimate the individual parameter $r_i$ of the CRRA utility function. We adopt a parametric procedure from Hey et al. (2009). For a given lottery $(x_1, x_2, \pi_i)$, the observed $\mathrm{WTA}_i(x_1, x_2, \pi_i)$ corresponds to the latent expected value $Eu_i(x_1, x_2, \pi_i)$, but is also subject to an i.i.d. error $e_i \sim N(0, s_i^2)$:

\[
\mathrm{WTA}_i(x_1, x_2, \pi_i) = u_i^{-1}(Eu_i(x_1, x_2, \pi_i)) + e_i.
\]

For each individual $i$, the pair of parameters $(r_i, s_i)$ is estimated through standard maximum likelihood (ML) estimation.¹⁴ As a second step, the estimated coefficient $\hat{r}_i$ is used to compute two individual parameters $(\alpha_i^k, \delta_i^k)$ for each context of uncertainty $k = A, S, E$ following Equations 3.3, 3.5, and 3.6.

¹⁴ For 14 subjects (among which 6 appear in the restricted sample) the ML procedure cannot converge since their parameter $r$ is unbounded and takes extreme values: it either tends to plus infinity or to minus infinity. For the sake of nonparametric tests, these subjects are assigned extreme realizations of $r$ going beyond values observed in the remainder of the sample: either 200 or -200, respectively. In parametric analyses, we only consider cases where the estimated $r \in [-100, 100]$, which requires removing all the subjects mentioned above as well as another one with the estimated $r$ of -112; this subject appears in both the restricted and the unrestricted sample. The resulting range of estimated values of $r$ is (-33, 4) in a sample of 207 observations.
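The first estimation step can be sketched as follows. With i.i.d. normal errors added to the certainty equivalent, maximizing the likelihood over $(r_i, s_i)$ is equivalent to choosing $r_i$ to minimize the sum of squared deviations between stated and predicted WTAs and then setting $s_i^2$ to the mean squared residual. The code is a minimal illustration and not the estimation routine used in the original analysis: the grid search, the grid bounds, and the 22 example lotteries (two payoff pairs at eleven probabilities) are assumptions.

```python
import math

def crra(x, r):
    """CRRA utility; log utility at r = 1."""
    return math.log(x) if r == 1 else x ** (1 - r) / (1 - r)

def crra_inv(v, r):
    """Inverse of the CRRA utility function."""
    return math.exp(v) if r == 1 else ((1 - r) * v) ** (1 / (1 - r))

def certainty_equivalent(x1, x2, p, r):
    """Predicted WTA of a lottery paying x1 with probability p, else x2."""
    return crra_inv(p * crra(x1, r) + (1 - p) * crra(x2, r), r)

def fit_crra(observations, r_grid):
    """Grid-search ML estimate (r_hat, s_hat) from (x1, x2, p, wta) tuples."""
    best_r, best_ssr = None, float("inf")
    for r in r_grid:
        ssr = sum((wta - certainty_equivalent(x1, x2, p, r)) ** 2
                  for x1, x2, p, wta in observations)
        if ssr < best_ssr:
            best_r, best_ssr = r, ssr
    return best_r, math.sqrt(best_ssr / len(observations))

# Assumed search grid for r (step 0.01); the text reports far wider estimates.
R_GRID = [i / 100 for i in range(-300, 301)]

# Assumed design: payoffs (20, 15) and (25, 5) at p = 0, 0.1, ..., 1.
LOTTERIES = [(x1, x2, j / 10) for (x1, x2) in [(20, 15), (25, 5)] for j in range(11)]

# Noise-free WTAs of a hypothetical subject with r = 0.5.
data = [(x1, x2, p, certainty_equivalent(x1, x2, p, 0.5)) for x1, x2, p in LOTTERIES]
print(fit_crra(data, R_GRID))
```

A subject whose stated WTAs push the best-fitting $r$ to the edge of any finite grid corresponds to the non-convergence case described in the footnote above.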


Table 3.5: Summary of estimated uncertainty attitudes

Parameters    Median    #N > 0    #N = 0    #N < 0    Sign test

Restricted sample
r̂              0          60        11        54        -
σ̂              2.356     108        11        -         -
α^A            0          53        14        58        0.704
α^S           -0.065      41        11        73        0.003
α^E            0.214      92         7        26       <0.001
δ^A            0          58        14        53        0.704
δ^S            0          51        12        62        0.347
δ^E           -0.073      51         4        70        0.101

Unrestricted sample
r̂             -0.191      88        16       119        -
σ̂              2.535     193        16        -         -
α^A           -0.005      89        20       114        0.092
α^S           -0.133      72        11       140       <0.001
α^E            0.104     131        10        82        0.001
δ^A            0         103        21        99        0.833
δ^S            0          92        16       115        0.126
δ^E           -0.018      96         6       121        0.103

Notes: Columns 3-5 summarize the absolute frequencies of estimated parameter values (as listed in column 1) being positive, null, or negative, respectively. The last column provides p-values from a sign test of nullity of the median value of the respective parameter. Top (bottom) part of the table: N = 125, restricted sample (N = 223, unrestricted sample).
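The sign test in the last column uses only the counts of positive and negative values; zeros are dropped. A minimal sketch of an exact two-sided version is given below. The function name is an assumption, and the original analysis may have relied on a statistical package with slightly different conventions.

```python
from math import comb

def sign_test_p(n_pos, n_neg):
    """Exact two-sided sign test of a zero median, ignoring zero values."""
    n = n_pos + n_neg
    k = min(n_pos, n_neg)
    # Probability of k or fewer successes under Binomial(n, 0.5).
    tail = sum(comb(n, j) for j in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Counts for alpha^A in the restricted sample of Table 3.5: 53 positive, 58 negative.
print(round(sign_test_p(53, 58), 3))
```

Applied to the restricted-sample counts for $\alpha^A$, this should come out close to the value reported in the table.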

Accordingly, the top part of Table 3.5 summarizes the first-step risk attitudes and the second-step uncertainty attitude parameters, as estimated in the restricted sample. Most subjects are found to be either risk seeking or risk averse, both types of preferences emerging in similar proportions. Moving to the domain of uncertainty, we find that, in our benchmark AMBIGUITY condition, most subjects are either pessimistic ($\alpha^A < 0$) or optimistic ($\alpha^A > 0$), both of which again happen in equal proportions. In a similar vein, most subjects are found to exhibit either aversion against ($\delta^A > 0$) or preference for ($\delta^A < 0$) ambiguity. In purely descriptive terms, the parameter of uncertainty aversion is not significantly different from zero in any of the conditions. However, we observe that the median subject is pessimistic about the behavior of the other player in the stag-hunt game ($\alpha^S < 0$) and optimistic in the entry game ($\alpha^E > 0$).¹⁵

¹⁵ Wilcoxon signed rank tests also reject $\alpha^S = 0$ and $\alpha^E = 0$ at the 1% level and across samples. Since the distribution of these parameters is asymmetric, we prefer to report the outcomes of a more conservative sign test which does not require the symmetry assumption.

Statistical evidence provided in the last column in Table 3.5 does not corroborate Hypothesis 1 stating that across all conditions, both parameters are located at zero. The nonparametric sign test strongly rejects the nullity of the median of $\alpha$ in both games; the nullity of the median cannot be rejected at the 5% level for any other parameter.

Next, we provide a complementary parametric analysis using a Seemingly Unrelated Regression (SUR) estimation. For the $i$th subject, parameters $\alpha_i^k$ and $\delta_i^k$ are assumed to depend on the experimental condition $k \in \{A, S, E\}$ in the following way:

\[
\alpha_i^k = a_0 + a_S \times \mathbf{1}[k=S] + a_E \times \mathbf{1}[k=E] + u_i^k, \tag{3.7}
\]

\[
\delta_i^k = d_0 + d_S \times \mathbf{1}[k=S] + d_E \times \mathbf{1}[k=E] + v_i^k, \tag{3.8}
\]

where $\mathbf{1}[k=X] = 1$ if a decision is made in condition $X$, and $\mathbf{1}[k=X] = 0$ otherwise. The AMBIGUITY condition $A$ is set as the reference condition. Hence, $E(\alpha_i^A) = a_0$, $E(\alpha_i^S) = a_0 + a_S$, $E(\alpha_i^E) = a_0 + a_E$, and $E(\delta_i^A) = d_0$, $E(\delta_i^S) = d_0 + d_S$, $E(\delta_i^E) = d_0 + d_E$. In each of the two equations, errors are clustered at the individual level due to the within-subject implementation of the experimental conditions.

The main virtue (and relative advantage with respect to the nonparametric methods) of this approach is that it provides a one-size-fits-all framework for fitting our experimental data that fully accounts for the within-subject treatment variation and the presence of two distinct preference parameters, $\alpha_i^k$ and $\delta_i^k$, that simultaneously arise as dependent variables. Furthermore, it allows us to go beyond single-parameter tests, and instead test for the joint hypothesis that a group of parameters has zero mean through a standard Wald test. It also allows us to test for order effects.¹⁶ However, the challenge here is to account for the presence of outliers arising for two reasons. First, extreme risk preferences can drive the estimated uncertainty parameters to astronomical values. Second, due to the cardinality of the value function in Equation 3.1, the parameter $\delta_i^k$ is expressed in units of subjective utility. For both reasons, $\delta_i^k$ can take extreme values, whether positive or negative. We tackle this issue in two ways. First, as explained above, parametric analyses consider only subjects whose estimated RRA lies in $[-100, 100]$. For this sample, we apply the negative logarithm transformation to $\delta_i^k$, i.e. we replace $\delta_i^k$ in Equation 3.8 by $\mathrm{sign}(\delta_i^k)\log(1 + |\delta_i^k|)$ in order to de-emphasize extreme realizations. Second, we estimate the SUR without logarithmic transformation, by only looking at individuals whose estimated RRA lies in $[-3, +3]$, a range that should be considered reasonable in the light of existing literature (see Charness et al. 2020). Applied to the restricted and unrestricted samples in turn, this procedure delivers four regression specifications that are reported in Table 3.6. Table 3.7 further summarizes additional parametric mean tests based on the estimated coefficients.

¹⁶ Order effects may arise since the order of S and E treatments is random, yet balanced across sessions. To check for possible order effects, regression models 3.7 and 3.8 can be extended by including an indicator variable for the order of the experimental conditions along with its interactions with both independent treatment indicator variables. This specification allows us to compare outcomes across treatments for a given order (in analogy to comparisons made in models 3.7 and 3.8). It also allows for a formal statistical test of order effects in the data through a Chow test that we run simultaneously for both extended regressions to check whether the order-related coefficients are jointly insignificant. This exercise points to the lack of order effects at the conventional 5% significance level. See Table C.1 in Appendix C.2 for details.
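The point estimates behind Equations 3.7 and 3.8 have a simple interpretation that can be sketched without a regression package. With an intercept and two condition indicators, least squares reproduces the condition means: the constant is the mean in the AMBIGUITY condition and each indicator coefficient is a difference in means. Since both equations share the same regressors, equation-by-equation least squares gives the same coefficients as SUR. The code below is an illustration under these assumptions; it does not reproduce the clustered standard errors or the Wald tests, and the function names and data are hypothetical.

```python
import math

def neglog(d):
    """Neglog transformation sign(d) * log(1 + |d|) used for delta."""
    return math.copysign(math.log1p(abs(d)), d)

def condition_effects(values_by_condition):
    """Coefficients (constant, S effect, E effect) of the dummy regression.

    values_by_condition maps "A", "S", "E" to lists of parameter values.
    """
    mean = lambda xs: sum(xs) / len(xs)
    base = mean(values_by_condition["A"])
    return (base,
            mean(values_by_condition["S"]) - base,
            mean(values_by_condition["E"]) - base)

# Made-up alpha and delta values for two subjects in each condition.
alphas = {"A": [-0.1, 0.1], "S": [-0.2, 0.0], "E": [0.3, 0.5]}
deltas = {"A": [0.5, -900.0], "S": [2.0, -1.0], "E": [-0.3, 40.0]}

print(condition_effects(alphas))
# Specifications with the neglog transformation apply it to delta first.
print(condition_effects({k: [neglog(d) for d in v] for k, v in deltas.items()}))
```

The transformation is monotone and odd, so it preserves the sign and ordering of $\delta_i^k$ while compressing values such as -900 to roughly -6.8.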

Table 3.6: Uncertainty attitudes across treatments: parametric estimates from seemingly unrelated regressions

                      (1)                     (2)                      (3)                     (4)
Dep. variable:    α^k_i      δ^k_i        α^k_i      δ^k_i         α^k_i      δ^k_i        α^k_i      δ^k_i

1[k=S]           -0.038     -1.194       -0.123*    946.12**       0.027     -1.957       -0.090     214.61
                 (0.078)    (1.429)      (0.065)   (413.71)       (0.125)    (2.048)      (0.101)   (216.37)
1[k=E]            0.347**   -1.557        0.232**   672.49*        0.575**   -2.353        0.385**   154.93
                 (0.154)    (1.434)      (0.108)   (345.11)       (0.259)    (2.114)      (0.173)   (123.52)
Constant         -0.120***   0.364       -0.107*** -819.18*       -0.103**    0.789       -0.093*   -162.32
                 (0.040)    (0.853)      (0.041)   (435.00)       (0.047)    (1.251)      (0.048)   (106.58)

Observations          624                     561                      354                     321
Clusters              208                     187                      118                     107

Notes: Standard errors are clustered at the subject level and reported in parentheses. 1[k=T] is a binary variable set to 1 for condition T, and to 0 otherwise. In all models, we exclude cases with indefinite $\delta_i^k$ as well as those with estimated $r_i$ outside the range $[-100, 100]$. Specifications (1) and (3) use the neglog transformation of $\delta_i$. In specifications (2) and (4), we consider only subjects with an estimated $r_i$ in the range $[-3, 3]$. Specifications (1) and (2) use the unrestricted sample, (3) and (4) the restricted sample. Significance levels: * p<0.1 ** p<0.05 *** p<0.01.


Table 3.7: Results of mean testing across specifications

Tests                                      (1)       (2)       (3)       (4)

E(α^A_i) = 0                               0.003     0.009     0.030     0.053
E(α^S_i) = 0                               0.031    <0.001     0.536     0.067
E(α^E_i) = 0                               0.111     0.204     0.054     0.077
E(δ^A_i) = 0                               0.670     0.060     0.528     0.128
E(δ^S_i) = 0                               0.345     0.551     0.366     0.763
E(δ^E_i) = 0                               0.176     0.389     0.242     0.949
E(α^A_i) = E(α^S_i) = E(α^E_i)             0.016    <0.001     0.055     0.001
E(δ^A_i) = E(δ^S_i) = E(δ^E_i)             0.283     0.075     0.404     0.443
E(α^A_i) = E(α^S_i) = E(α^E_i) = 0         0.001    <0.001     0.045     0.001
E(δ^A_i) = E(δ^S_i) = E(δ^E_i) = 0         0.401     0.151     0.580     0.375
Nullity of all parameters                  0.001    <0.001     0.026     0.003

Notes: p-values corresponding to the stated mean test in column 1 are based on the coefficient estimates from the four specifications reported in Table 3.6. Respective samples contain 208, 187, 118 and 107 subjects for specifications (1), (2), (3) and (4).

Overall, results reported in Tables 3.5 (nonparametric median tests) and 3.7 (parametric mean tests) lead us to reject Hypothesis 1 on the absence of attitudes towards uncertainty.¹⁷ These attitudes strongly vary across contexts. The nonparametric tests (Table 3.5) indicate that the median of $\alpha_i^S$ is negative while the median of $\alpha_i^E$ is positive. The parametric tests (Table 3.7) indicate that the mean of $\alpha_i^A$ differs from zero, while we cannot reject (at the 5% level) that the means of $\alpha_i^E$ and $\alpha_i^S$ are zero. Note, however, that the estimated values of $\alpha_i^k$ and $\delta_i^k$ are not normally distributed. The p-values from the Shapiro-Wilk W test are all below 0.001. Thus, the main empirical rationale for rejecting Hypothesis 1 comes from the rejection of the joint nullity of all parameters, from pessimism by the median subject in the stag-hunt game and optimism by the median subject in the entry game. The estimates for the AMBIGUITY condition alone would not be sufficient for rejecting Hypothesis 1, because the median of $\alpha_i^A$ equals zero (Table 3.5). The nullity of parameter $\delta_i^k$, in turn, comes as a persistent empirical finding across all tests and all treatments.

¹⁷ Strictly speaking, joint tests reported at the bottom of Table 3.7 constitute the target testbed for Hypothesis 1, although it should also be noted that they remain mute on the precise reasons (i.e., which parameters are non-null) for the potential rejection. From this perspective, single-parameter tests reported in Table 3.5 and the first six lines of Table 3.7 provide complementary information.


Result 1 – We document systematic attitudes toward uncertainty. Parameters $\alpha_i^k$ are not distributed around 0 under strategic uncertainty, pointing to pessimism regarding the behavior of the other player under strategic complementarity, and to optimism under strategic substitutability. Besides this, we do not find a systematic preference for or aversion against strategic uncertainty.

Building on this result, we now turn to the formal comparisons of $\alpha$ and $\delta$ between the three experimental conditions of uncertainty (ambiguity, stag-hunt, and entry game) and test Hypothesis 2. Table 3.8 summarizes pairwise median comparisons based on the Wilcoxon signed rank test, as estimated on either the restricted or the unrestricted sample. Once again, the general finding goes against our initial hypothesis: we observe more optimism in the entry game than in the stag-hunt game or in the benchmark AMBIGUITY condition.¹⁸ Figure 3.1 provides additional visual support of this result: the cumulative distribution function of $\alpha$ in the entry game first-order stochastically dominates the remaining ones, while no such differences arise for $\delta$.
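First-order stochastic dominance between two samples can be checked directly on their empirical distribution functions: sample a dominates sample b if the distribution function of a lies weakly below that of b everywhere and strictly below somewhere. The following is a minimal sketch with made-up numbers; it is a descriptive check, not a statistical test, and the function names are assumptions.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return the empirical distribution function of a sample."""
    xs = sorted(sample)
    n = len(xs)
    return lambda t: bisect_right(xs, t) / n

def first_order_dominates(a, b):
    """True if the empirical distribution of a first-order dominates that of b."""
    fa, fb = ecdf(a), ecdf(b)
    grid = sorted(set(a) | set(b))  # both step functions only change at these points
    weakly_below = all(fa(t) <= fb(t) for t in grid)
    strictly_below = any(fa(t) < fb(t) for t in grid)
    return weakly_below and strictly_below

# Hypothetical alpha values in two conditions.
alpha_entry = [0.2, 0.3, 0.5]
alpha_stag = [-0.1, 0.0, 0.3]
print(first_order_dominates(alpha_entry, alpha_stag))
print(first_order_dominates(alpha_stag, alpha_entry))
```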

Table 3.8: Nonparametric comparisons of uncertainty attitudes across treatments

                             Restricted (N = 125)      Unrestricted (N = 223)
Comparison                   α^k_i       δ^k_i         α^k_i       δ^k_i

Ambiguity – Stag hunt         0.226      0.478          0.021      0.571
Ambiguity – Entry            <0.001      0.407         <0.001      0.893
Stag hunt – Entry            <0.001      0.671         <0.001      0.410

Notes: Columns 2-5 provide p-values from two-sided Wilcoxon signed rank tests.

Parametric estimates presented in Table 3.6 point to similar conclusions: the entry game induces significantly stronger optimism as compared to both AMBIGUITY and the stag-hunt game (p < 0.05 in all comparisons).¹⁹ A parametric comparison of $\delta$ across treatments does not yield significant results at the 5% level.

¹⁸ Echoing Footnote 14, one caveat here is that the symmetry assumption required by Wilcoxon signed rank test may not hold in our data. An alternative nonparametric sign test yields the same results with one exception: $\alpha_i^k$ is significantly different between the stag-hunt game and the AMBIGUITY condition (see Appendix C.2 for details).

¹⁹ These comparisons require testing for the equality of $E(\alpha_i^E)$ with $E(\alpha_i^A)$ and $E(\alpha_i^S)$.


Figure 3.1: Cumulative distribution functions of uncertainty attitude parameters across conditions. Panels: (a) Optimism, (b) Aversion.

Notes: Data from the restricted sample trimmed to $r \in [-3, 3]$ ($N = 107$). The x axis in the second graph contains the neglog transformation of $\delta_i^k$: $\mathrm{sign}(\delta_i^k)\log(1 + |\delta_i^k|)$, to account for a wide range of values taken by this variable.

Result 2 – Subjects distinguish between the different sources of uncertainty.

Uncertainty coming from interaction under strategic substitutability gives rise to

more optimism as compared to both ambiguity and interaction under strategic com-

plementarity. Strategic complementarity does not induce significant changes in atti-

tudes towards uncertainty as compared to ambiguity. We do not find significant and

systematic differences across the three treatments in terms of preferences towards

the source of uncertainty.

In Appendix C.3, we provide additional analyses on the individual underpinnings

of attitudes towards uncertainty based on the individual characteristics described

in our pre-results reviewed report. We do not find any systematic association of

individual characteristics with the six parameters of interest.

3.6 Conclusion

We have developed a method for measuring strategic-uncertainty attitudes and dis-

tinguishing them from risk and ambiguity attitudes. We elicit certainty equivalents

of participating in two strategic 2x2 games (stag-hunt and market-entry games) as

well as certainty equivalents of related lotteries that yield the same possible payoffs

with exogenously given probabilities (risk) and lotteries with unknown probabilities

(ambiguity). We use this information to identify for each game and for the ambigu-

ous environment two parameters of a structural model of uncertainty attitudes. The

parameters of this model capture subject-specific uncertainty aversion and optimism

regarding the subject’s subjective probability for the desired outcome. We then test


whether there are significant differences in the distribution of uncertainty attitudes

between games with strategic complements, games with strategic substitutes, and

ambiguous lotteries.

We find systematic attitudes towards uncertainty that vary across contexts.

While there is no evidence for a preference for, nor for an aversion against, ambigu-

ity or strategic uncertainty (in the sense of a fixed effect of the source of uncertainty

on utility), the median subject seems to be pessimistic about the behavior of the

other player in the stag-hunt game, and optimistic in the entry game, where opti-

mism/pessimism are proportional to the difference between the utility expressed by

stated WTAs in a given game and the subjective expected utility derived from the

stated probability for the other player’s choice.

In the entry game, optimism means that the median subject’s evaluation of the

game is shifted from her expected utility in the direction of the higher payoff that arises

if the other player chooses the action opposing her own. In the stag-hunt game, the

median subject’s evaluation is shifted from her expected utility towards the lower

payoff. In stag hunt, the lower payoff arises if both players choose opposing actions.

Thus, the median subject evaluates both games with an extra weight on the other

player choosing the action opposed to her own.

Our results also show that the entry game stands out, because the distribution of

optimism in the entry game stochastically dominates the distribution of optimism in

stag-hunt and ambiguity treatments. This reflects the results by Nagel et al. (2018)

that indicate a higher degree of strategic uncertainty and higher levels of reasoning

in entry games than in stag-hunt games and lotteries.

Stag-hunt and entry games differ in the reasoning process leading to a decision. If
a player has an initial preference for one action, say L, and considers what she should
do if the other player had also chosen L, then her initial preference is confirmed in the
stag-hunt game. If her partner reasons in the same way, it is optimal for both to
choose L. In the entry game, however, if the other player thinks like her and chooses
L, then she should choose action R instead; but if the other player follows the
same reasoning as her, then she should switch back to L. This inconclusive reasoning
process may be the underlying reason for the higher brain activity reported by Nagel
et al. (2018) and for the deviation between the stated value (WTA) of a game and its
subjective expected utility. Ultimately, the extra weight associated with an opposing
action, expressed by optimism in entry games and pessimism in stag-hunt games, is a
precaution against the other player applying a different reasoning process that leads
to a different action.
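
The difference between the two reasoning processes can be illustrated with a short Python sketch. It takes the row player's payoffs from the instructions for Parts 3 and 4 in Appendix C.1 and repeats the step "assume the other player reasons as I do, then best-respond". The function names and the number of iterations are illustrative assumptions and not part of the experimental design.

```python
# Row player's payoffs, indexed as payoff[own_action][other_action].
# Values follow the Part 3 (stag-hunt) and Part 4 (entry) payoff tables.
STAG_HUNT = {"L": {"L": 20, "R": 15}, "R": {"L": 5, "R": 25}}
ENTRY = {"L": {"L": 5, "R": 25}, "R": {"L": 20, "R": 15}}


def best_response(payoff, other_action):
    """Return the own action with the highest payoff against other_action."""
    return max(payoff, key=lambda own: payoff[own][other_action])


def reasoning_path(payoff, initial, steps=4):
    """Start from an initial action and repeatedly best-respond to a partner
    who is assumed to have reached the same action as oneself."""
    path = [initial]
    for _ in range(steps):
        path.append(best_response(payoff, path[-1]))
    return path


print("stag hunt:", reasoning_path(STAG_HUNT, "L"))  # the initial choice is confirmed
print("entry:    ", reasoning_path(ENTRY, "L"))      # the choice keeps switching
```

In the stag-hunt game the path settles immediately on the initial action, whereas in the entry game it alternates between L and R and never converges.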

Our findings further complement the literature on the choice/preference rela-

tionship in games. For instance, Chark and Chew (2015) compare choices in a

coordination game with strategic complementarities when the opponent is another

human or a die in the presence of a safe opt-out option. Their findings indicate that the

source of uncertainty does not significantly alter choices in this game.20 In another

related experiment, Calford (2020) finds that uncertainty aversion measured in one
game can account for choices in another game. Our results are consistent with the
literature finding source-dependence in uncertainty attitudes (e.g., Abdellaoui et al.,
2011; Chark and Chew, 2015). We add to this literature by developing a general

method for identifying and comparing attitudes towards strategic uncertainty. We

focus on attitude measurement in two prototypical games, but the method can be

easily applied to other settings.

Finally, our empirical evidence highlights the general importance of individual

probability distortion (rather than a domain-specific utility function) for under-

standing decision-making under uncertainty. This finding corroborates some of the

previous research on modelling uncertainty in individual (i.e., non-strategic) choices

(see, e.g., Abdellaoui et al., 2011; Attema et al., 2013) and further extends it by

showing that the relative importance of probability weighting also applies to strate-

gic contexts.

20 However, they find indicative evidence for a role of the source of uncertainty in another game
(i.e., matching pennies) relative to the coordination game. Their accompanying evidence from
neuroimaging data favors ambiguity attitudes over social preferences in explaining these results.

Appendix C

Appendix

C.1 Instructions

Whether Game 1 or 2 is played first is randomly chosen by the computer. Here we

only present instructions where Game 1 is played first.

Welcome!

You are about to take part in an economic experiment. You are not allowed to

talk to other participants during the experiment. If you have a cell phone, please

switch it off. If you have a question at any time, please raise your hand, and someone

will come to help you. Please do not ask your question aloud. If the question is

relevant for all participants, we will repeat it and answer it aloud. If you violate

these rules, we must exclude you from the experiment and from payment.

All the information you provide, the decisions you make, as well as the amount

of your gains from this experiment will remain strictly confidential and anonymous.

Participation in this experiment will earn you money. Your earnings will depend

on your decisions and may also be affected by the decisions made by others.

The experiment consists of five parts. You will receive specific instructions for

each part as the experiment goes on. At the end of the experiment, only one part

out of Parts 1 to 4 will be chosen at random to determine your final payoff for the

experiment, where each of these four parts has the same chance to be randomly

drawn. Within each part, you make several decisions. If a part is randomly chosen

for payment, one of those decisions will be drawn for payment by another random

mechanism of the computer, where each decision has the same chance to be ran-

domly drawn. Hence, only one of your decisions will affect your final payoff, but it

could be any one of your decisions. For showing up in time, you additionally obtain

5 Euros. The fifth part does not offer you the chance to earn money.

Specific Instructions for Part 1

In this part of the experiment, you will face 22 lotteries. 11 of them pay either

15 or 20 Euros. The others pay either 5 or 25 Euros. For both payoff types, the

probability to get the higher of the two possible payoffs varies from 0 to 100% in

steps of 10%.

For each of the 22 lotteries, we ask you the following question:

• Which amount (in Euros) would you prefer to receive with certainty instead

of letting the lottery determine your payoff?

You need to enter your answers for these questions in the columns “Opt-out

value for lottery that pays either 15 or 20 Euros” and “Opt-out value for lottery

that pays either 5 or 25 Euros”, respectively. You can state any value from 0 to

30 Euros, with up to two decimals. Your answers to these questions will determine

your candidate payoff for this part of the experiment with the following two-step

procedure:

If this part is selected for payoff, the computer will randomly select one of the 22

lotteries. Second, the computer will randomly draw an amount from 0.00 to 30.00

Euros with two decimals (each value in the interval is equally likely).

• If the randomly drawn amount is larger than or equal to your stated “Opt-out
value” for the selected lottery, your payoff is the amount drawn by the
computer.

• If the amount drawn by the computer is smaller than your stated “Opt-out
value” for the selected lottery, your payoff will be determined by the rules of
this lottery. This means, you will get the higher of the two possible payoffs
with the probability p stated in the left column. You will get the lower of the
two possible payoffs with the remaining probability 1 − p.

Example:

• Suppose that the computer selects the lottery that pays either 15 or 20 Euros
with a probability of receiving the higher payoff p = 90%. Suppose that your
stated “Opt-out value” for this lottery is equal to 17.50.

• If the amount drawn by the computer is at least 17.50, you will receive this
amount. So, if the drawn amount is 26.09, you receive 26.09 Euros.

• If the number drawn by the computer is smaller than your opt-out-value, say
9.79, your payoff for this part is determined by the selected lottery. Here, you
will receive 20 Euros with probability p = 90%. With probability 1 − p = 10%,
you will receive 15 Euros.
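
The two-step procedure can be sketched in Python as follows. This is an illustration only and not the software used in the experiment. The function names are hypothetical, and `expected_payoff` assumes a participant who values the lottery at a fixed amount in cents (for a risk-neutral participant, the expected value of the lottery). Under that assumption, stating one's true valuation as the opt-out value maximizes the expected payoff.

```python
import random

MAX_CENTS = 3000  # draws range from 0.00 to 30.00 Euros in steps of one cent


def two_step_payoff(opt_out, p_high, high, low, rng):
    """Simulate the two-step procedure for one selected lottery."""
    drawn = rng.randint(0, MAX_CENTS) / 100  # each cent amount equally likely
    if drawn >= opt_out:
        return drawn  # the drawn amount is paid out
    # Otherwise the lottery itself determines the payoff.
    return high if rng.random() < p_high else low


def expected_payoff(opt_out_cents, lottery_value_cents):
    """Expected payoff in Euros when the lottery is worth lottery_value_cents
    to the participant (illustrative assumption: a fixed monetary value)."""
    paid_draws = sum(range(opt_out_cents, MAX_CENTS + 1))  # draws >= opt-out value
    lottery_draws = opt_out_cents                          # draws 0 .. opt_out_cents - 1
    total_cents = paid_draws + lottery_draws * lottery_value_cents
    return total_cents / (100 * (MAX_CENTS + 1))


rng = random.Random(0)
print(two_step_payoff(17.50, 0.9, 20, 15, rng))

# Example lottery: 20 Euros with p = 90%, 15 Euros otherwise.
# A risk-neutral valuation is 0.9 * 20 + 0.1 * 15 = 19.50 Euros = 1950 cents.
best = max(range(MAX_CENTS + 1), key=lambda v: expected_payoff(v, 1950))
print(best / 100)
```

Stating a lower value than one's valuation means accepting some drawn amounts that are worth less than the lottery, and stating a higher value means forgoing some drawn amounts that are worth more.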

You will see those 22 lotteries listed on your screen as described in the Table

below. Once you state both of your “Opt-out values” for each of the 22 lotteries

given in the Table below, you need to confirm these answers by clicking on the

“CONFIRM” button. You can change these “Opt-out values” as long as you have

not confirmed them.

Probability with which the computer selects the higher payoff | Opt-out values for lottery that pays either 15 € or 20 € | Opt-out values for lottery that pays either 5 € or 25 €
10%  |  |
20%  |  |
30%  |  |
40%  |  |
50%  |  |
60%  |  |
70%  |  |
80%  |  |
90%  |  |
100% |  |

Before beginning the actual Part 1, you will perform the same task with five

different lotteries. This phase is for practice purposes and will not influence your

payoff. You will also receive feedback about the random selections of the computer

in this practice round and about the consequence of the two-step procedure using

your stated opt-out values. Note that in the real experiment, you will not be in-

formed about these outcomes before the end of the experiment.

Specific Instructions for Part 2

In this part of the experiment, you will face 2 lotteries. One of them pays either

15 or 20 Euros. The other pays either 5 or 25 Euros. Note that these are the same

payoffs offered by the lotteries as in the previous part. But now, you will not be

informed about the probability with which the computer chooses the higher payoff.

The computer is programmed in such a way that the probability with which the

higher payoff is paid is one of the probabilities stated in Part 1, i.e.: 0, 10%, 20%,

. . . , 100%. The computer selects this probability before you submit your decision

for this part. Each of these 11 probabilities might be the one applied to the lotteries

in this part, but they are not equally likely. This means, some probabilities are

more likely to be drawn than others. However, you will not receive any further

information about the precise random mechanism.

For each of the two lotteries, we ask you the following question:

• Which amount (in Euros) would you prefer to receive with certainty instead

of letting the lottery determine your payoff?

You need to enter your answers for these questions in the boxes “Opt-out value

for lottery that pays either 15 or 20 Euros” and “Opt-out value for lottery that

pays either 5 or 25 Euros”, respectively. You can state any value from 0.00 to 30.00

Euros, with up to two decimals. Your answers to these questions will determine

your payoff for this part of the experiment with the following two-step procedure:

If this part is selected for payoffs, the computer will randomly select one of the

two lotteries. Second, the computer will randomly draw an amount from 0.00 to

30.00 (each amount in the interval is equally likely).

• If the randomly drawn amount is larger than or equal to your stated “Opt-out
value” for the selected lottery, your payoff is the amount drawn by the
computer.

• If the amount drawn by the computer is smaller than your stated “Opt-out
value” for the selected lottery, your payoff will be determined by the rules of
the chosen lottery. This means, you will get either of the two possible payoffs
15 or 20 Euros if the lottery that pays either 15 or 20 is selected and 5 or 25
Euros if the lottery that pays either 5 or 25 is selected.

Once you stated the two “Opt-out values”, we will ask you about your guess how

likely it is that the computer selects the higher payoff. We are asking your guess for

the following question:

• Out of 10 draws, how many times does the computer select the higher payoff?

If your guess exactly matches the true number of draws that the computer selects

the higher payoff, your payoff from this decision will be 20 Euros. If your guess is

not exactly accurate, then you may receive 20 or 10 Euros. The likelihood to receive

the high payoff (20 €) is higher, the closer your guess is to the expected number

of draws. This means the more accurate your guess is, the higher your payoff from

this decision will be. You can look up the precise mechanism rewarding your stated

beliefs by clicking on the button “more information.” The mechanism makes sure

that it is in your best interest to state your true belief about the expected number

of draws.1

Summary and Payoff Procedure for Part 2:

In this part of the experiment, you will first state your “Opt-out values” for

the two lotteries. Second, you will state your guess about the likelihood that the

computer selects the higher payoff.

A random mechanism will decide how your candidate payoff for this part of the

experiment will be determined. 2 out of 3 times, it will be determined based on the

two-step procedure which uses your stated “Opt-out values” as described in Part 1.

1 out of 3 times, it will be determined based on the accuracy of your stated guess.

C.1.1 Instructions for Games

Specific Instructions for Part 3

In this part, you are randomly matched with another participant in this session.

We will never inform you about the identity of this other participant. You and this

other participant will each choose between two Actions L and R.

The payoffs (in Euro) for you and the other participant are presented in the

Table below: in each cell, the first amount is your payoff, and the second amount is

the other participant’s payoff. These payoffs can be summarized as follows:

• If you and the participant you are matched with both choose L, you both
receive 20 Euros;

• If you choose L and the participant you are matched with chooses R, then you
receive 15 Euros and the other participant receives 5 Euros;

• If you and the participant you are matched with both choose R, you both
receive 25 Euros;

• If you choose R and the participant you are matched with chooses L, then you
receive 5 Euros and the other participant receives 15 Euros.

1 The computer interface contains a button opening a pop-up window with specific description
of this procedure:

If this decision is selected for final payment, your gain will be determined according to the following
procedure. First, the computer calculates DIFF: the difference between your answer and the correct
answer, and then computes its square value: DIFF² = DIFF * DIFF. Second, the computer
randomly draws an integer number between 0 and 100 (each realization being equally likely).
If the value of DIFF² is below that random integer, your payoff equals 20 euros; otherwise, your
payoff equals 10 euros.
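
The rewarding mechanism described in this footnote can be sketched in Python as follows. The sketch is a hedged illustration, not the experimental software: the function names are hypothetical, "below" is read as a strict inequality, and the random integer is assumed to be drawn uniformly from 0 to 100 with both endpoints included.

```python
import random


def belief_payoff(guess, correct, rng):
    """Simulate the payoff (in euros) for one stated guess."""
    diff_sq = (guess - correct) ** 2   # DIFF squared
    threshold = rng.randint(0, 100)    # assumed: 101 equally likely integers
    return 20 if diff_sq < threshold else 10


def prob_high_payoff(guess, correct):
    """Probability of receiving 20 euros under the assumptions above."""
    diff_sq = (guess - correct) ** 2
    # Count the integers in 0..100 that are strictly larger than diff_sq.
    return max(0, 100 - diff_sq) / 101


# The chance of the high payoff falls as the guess moves away from the correct answer.
for guess in range(0, 11):
    print(guess, round(prob_high_payoff(guess, correct=6), 3))
```

Because only the probability of the high payoff depends on the squared error, the chance of earning 20 euros is highest for a guess that equals the correct answer and declines with the distance from it. If "below" were instead read as a weak inequality, an exact guess would earn 20 euros with certainty.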

Decision situation in Part 3 and associated payoffs in Euro.

                 | The other participant’s decision: L | The other participant’s decision: R
Your decision: L | 20 €, 20 €                          | 15 €, 5 €
Your decision: R | 5 €, 15 €                           | 25 €, 25 €

First, you and the other participant will decide between Actions L and R. We

call this “Decision 1”.

If this part is selected for payoffs, with 1/3 probability, your payoff as well as

the other participant’s payoff are determined by your and the other participant’s

Decision 1 as described above.

Once you made your Decision 1 (and before payoffs are determined), we ask you

to state two “Opt-out values” similar to the ones in Parts 1 and 2. The precise

questions are the following:

• If the computer replaces your decision with Action L, which amount (in Euro)
would you prefer to receive with certainty instead of continuing with Action L?

• If the computer replaces your decision with Action R, which amount (in Euro)
would you prefer to receive with certainty instead of continuing with Action R?

Just as in Parts 1 and 2, you need to state an amount from 0.00 to 30.00 Euros

for both questions above. You need to enter your answers for these questions in the

columns “Opt-out value for Action L” and “Opt-out value for Action R”, respec-

tively. You can state any value from 0.00 to 30.00 Euros, up to two decimals. Your

answers to these questions will determine your payoff for this part of the experiment

with the following two-step procedure: If Part 3 is selected for payoffs, with 1/3

probability, your payoff will be determined based on the two-step procedure which

uses your stated “Opt-out values”. In this case, the computer will randomly select

one of the two actions L or R for you. Second, the computer will randomly draw an

amount from 0.00 to 30.00 Euros (each amount in the interval is equally likely).

• If the randomly drawn amount is larger than or equal to your stated “Opt-out
value” for the action selected by the computer, your payoff is the amount
drawn by the computer.

• If the amount drawn by the computer is smaller than your stated “Opt-out
value” for the action selected by the computer, your payoff will be determined
by this action and the action chosen in “Decision 1” by the participant you
are matched with.

Example:

Suppose the computer replaces your action by R and draws the amount 21.24.

If your opt-out value for Action R is smaller than 21.24, you receive 21.24 Euros. If

your opt-out value is larger, your payoff depends on the other participant’s Decision

1. If the other participant has chosen L, you receive 5 Euros. If the other participant

has chosen R, you receive 25 Euros.

Once you stated the two “Opt-out values”, we will ask you about your guess how

likely it is that the other participants in this room choose Action R. We are asking

your guess for the following question:

• How many of the other 10 participants in this session choose Action R?

The payoff for your guess will be determined in the same way as in Part 2.

If your guess exactly matches the true number of choices for Action R, your

payoff from this decision will be 20 Euros. If your guess is not exactly accurate,

then you may receive 20 or 10 Euros. The likelihood to receive the high payoff (20 €)

is higher, the closer your guess is to the expected number of draws. This means

the more accurate your guess is, the higher your payoff from this decision will be.

You can look up the precise mechanism rewarding your stated beliefs by clicking

on the button “more information.” The mechanism makes sure that it is in your

interest to state your true belief about the expected number of draws.

Finally, you need to confirm your decisions by clicking on the “CONFIRM” but-

ton. You can change your decisions as long as you have not confirmed them.

Summary and Payoff Procedure for Part 3:

In this part of the experiment, you will answer four questions. First, you will

state your preferred action (either L or R) for Decision 1. Second, you will state the

two “Opt-out values” in case the computer replaces your decision by L or R. Third,

you will state your guess on how many out of 10 randomly drawn other participants

would choose Action R as their preferred action.

Another random mechanism will decide how your candidate payoff for this part

of the experiment will be determined. 1 out of 3 times, it will be determined based

on yours and the other participant’s preferred action. 1 out of 3 times, it will be

determined based on the two-step procedure that uses your stated “Opt-out values”.

1 out of 3 times, it will be determined based on the accuracy of your stated guess.

Specific Instructions for Part 4

In Part 4, you will make exactly the same decisions as in Part 3. You are matched

with another participant (possibly different from Part 3). The only difference compared

to Part 3 is in the payoffs that you and the other participant receive depending

on your choices between Action L and Action R.

The payoffs (in Euro) for you and the other participant are presented in the table

below: in each cell, the first amount is your payoff, and the second amount is the

other participant’s payoff. These payoffs can be summarized as follows:

• If you and the participant you are matched with both choose L, you both
receive 5 Euros;

• If you choose L and the participant you are matched with chooses R, then you
receive 25 Euros and the other participant receives 20 Euros;

• If you and the participant you are matched with both choose R, you both
receive 15 Euros;

• If you choose R and the participant you are matched with chooses L, then you
receive 20 Euros and the other participant receives 25 Euros.

Decision situation in Part 4 and associated payoffs in Euro.

                 | The other participant’s decision: L | The other participant’s decision: R
Your decision: L | 5 €, 5 €                            | 25 €, 20 €
Your decision: R | 20 €, 25 €                          | 15 €, 15 €

Summary and Payoff Procedure:

You will answer the same four questions as in Part 3 and your candidate payoff

for this part of the experiment will be determined based on the same mechanism.

Completion and Questionnaires

You have completed the first four parts of the experiment. In each part, the

payoff resulting from one of your decisions is chosen as the candidate payoff for that

part of the experiment. One of these four candidate payoffs will be selected as your

final payoff by a random mechanism (each candidate payoff is equally likely to be

your final payoff). Before announcing your final payoff, we ask you to answer a series

of questions (Part 5). You will answer these questions using the interface on your

computer screen. Please follow the specific instructions on your screen to answer

these questions.

C.1.2 Example of Comprehension Quiz for the Treatment

(inserted on screens before Parts 3 and 4; information will be adapted to the games

used in the respective parts)

Before making your decisions for Part 3, please answer the following questions:

1. You will interact with another, randomly matched, participant;

• True
• False

Answer: True

2. In the decision situation of Part 3, if you choose L and the other participant
chooses R, your associated payoff is

• 5 €
• 15 €
• 20 €
• 25 €

Answer (Game 1): 15 €
Answer (Game 2): 25 €

3. In the decision situation of Part 3, if you choose R and the other participant
chooses L, your associated payoff is

• 5 €
• 15 €
• 20 €
• 25 €

Answer (Game 1): 5 €
Answer (Game 2): 20 €

C.2 Additional Tables and Figures

Table C.1: Seemingly unrelated regressions with treatment order effects

Sample:              | (1)     | (1)     | (2)     | (2)      | (3)    | (3)     | (4)    | (4)
Dep. variable:       | α_i^k   | δ_i^k   | α_i^k   | δ_i^k    | α_i^k  | δ_i^k   | α_i^k  | δ_i^k
Indep. variable      |         |         |         |          |        |         |        |
1[k=S]               | .031    | -.661   | -.087   | 880.34*  | .113   | -2.089  | -.048  | 88.012
                     | (.099)  | (1.190) | (.057)  | (531.73) | (.146) | (1.681) | (.067) | (79.483)
1[k=E]               | .363**  | -.623   | .349**  | 742.04   | .539** | -1.850  | .541** | 22.871
                     | (.151)  | (1.190) | (.147)  | (533.36) | (.228) | (1.707) | (.218) | (27.070)
StagFirst            | -.014   | .730    | -.020   | 663.03   | -.093  | -.429   | -.097  | 11.172
                     | (.079)  | (1.861) | (.082)  | (766.11) | (.099) | (2.892) | (.100) | (198.11)
StagFirst × 1[k=S]   | -.158   | -1.216  | -.086   | 147.57   | -.211  | .324    | -.110  | 330.39
                     | (.159)  | (3.120) | (.148)  | (845.60) | (.266) | (4.707) | (.250) | (554.20)
StagFirst × 1[k=E]   | -.037   | -2.136  | -.284   | -168.900 | .089   | -1.236  | -.406  | 344.64
                     | (.329)  | (3.129) | (.215)  | (637.31) | (.591) | (4.871) | (.354) | (317.72)
Constant             | -.114** | .045    | -.099*  | -1092.19 | -.064  | .964    | -.056  | -166.60
                     | (.055)  | (.718)  | (.056)  | (710.73) | (.057) | (.984)  | (.059) | (154.77)
Chow test            | 0.702   | 0.448   | 0.466   | 0.232    | 0.552  | 0.483   | 0.317  | 0.493

Observations (clusters): (1) 624 (208); (2) 561 (187); (3) 354 (118); (4) 321 (107)
Joint Chow test: (1) 0.627; (2) 0.269; (3) 0.483; (4) 0.299

Notes: StagFirst is a binary variable set to 1 if the stag-hunt game is played before the entry game, and to 0 otherwise. 1[k=T] is a
binary variable set to 1 for condition T, and to 0 otherwise. Standard errors are clustered at the subject level and reported in parentheses.
In all models, we exclude cases with indefinite δ_i^k as well as those with estimated r_i outside the range (-100, 100). Specifications (1) and
(3) use the neglog transformation of δ_i^k. In specifications (2) and (4), estimated r_i is trimmed to the range [−3, 3]. Specifications (1) and
(2) use the unrestricted sample; specifications (3) and (4) use the restricted sample. The rows labelled Chow test and Joint Chow test
provide the p-values from Chow tests of the joint insignificance of all the coefficients involving the dummy, for the specific parameter
and for the entire SUR model, respectively. Significance levels: * p<0.1 ** p<0.05 *** p<0.01.
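
The notes refer to a neglog transformation of δ. As a reading aid, the short Python sketch below assumes the common definition neglog(x) = sign(x) · ln(1 + |x|). This definition is an assumption here, since it is not restated in this appendix. The transformation compresses large positive and negative values symmetrically while keeping the sign and leaving zero unchanged.

```python
import math


def neglog(x):
    """Assumed definition: sign(x) * ln(1 + |x|)."""
    return math.copysign(math.log1p(abs(x)), x)


# Large magnitudes are compressed, the sign is preserved, and zero maps to zero.
for value in (-1000.0, -1.0, 0.0, 1.0, 1000.0):
    print(value, round(neglog(value), 3))
```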

Table C.2: Nonparametric comparisons of strategic uncertainty attitudes
across treatments

Sample:               | Restricted (N = 125) | Restricted (N = 125) | Unrestricted (N = 223) | Unrestricted (N = 223)
Comparison            | α_i^k                | δ_i^k                | α_i^k                  | δ_i^k
Ambiguity – Stag hunt | 0.005                | 0.275                | <0.001                 | 0.734
Ambiguity – Entry     | <0.001               | 0.203                | 0.007                  | 0.497
Stag hunt – Entry     | <0.001               | 0.779                | <0.001                 | 0.441

Notes: Columns 2-5 provide p-values from two-sided Wilcoxon signed rank tests.

C.3 Individual Underpinnings of Attitudes towards

Uncertainty

In this appendix, we explore individual underpinnings of attitudes towards uncer-

tainty. We use a seemingly unrelated regression model to estimate six simultaneous

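equations. Each of the six individual preference parameters $y_i \in \{\alpha_i^A, \alpha_i^S, \alpha_i^E, \delta_i^A, \delta_i^S, \delta_i^E\}$
is regressed on a set of individual characteristics:

$$y_i = b_{y,0} + b_{y,1}\,\hat{s}_i + b_{y,2}\,\text{Raven Score}_i + b_{y,3}\,\text{RMET Score}_i + b_{y,4}\,\text{SSS Score}_i + \sum_k c_{y,k}\,\text{SocDemInf}_i^k + w_i,$$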

where:

• $\hat{s}_i$ is the individual noise parameter estimated by ML from the RISK treatment
data;

• Raven Score$_i$ is the Raven test score;

• RMET Score$_i$ is the Reading the Mind in the Eyes Test score;

• SSS Score$_i$ is the total score on the Sensation Seeking Scale (SSS);

• SocDemInf$_i^k$ is a set of k basic socio-demographic variables: age, gender (Female
is an indicator variable that takes the value one for female subjects)
and major (Econ Buss and Engineer are also indicator variables that take
the value one when subjects’ major is economics or business and engineering,
respectively);

• $w_i$ is the residual.

Table C.3 reports the estimated results. Although there is no systematic association
between any of the explanatory variables and the six parameters of interest,
we do reject the joint hypothesis of coefficient nullity across the three α regressions
with p < 0.001; we do not, however, reject it for the three δ regressions. This suggests
that the heterogeneity in pessimism (α) observed in our (restricted) experimental
sample is partially accounted for by individual differences which, however, cannot
account for the heterogeneity in the general preferences towards uncertainty (δ).
However, we also note that this result should be handled with care, since it is not
entirely confirmed in unrestricted sample estimations. Estimates provided in Table
C.4 point to a weak statistical link between our set of explanatory variables and the
six parameters of interest.

Table C.3: Seemingly unrelated regressions with individual characteristics: restricted
sample

                                | α_i^A  | α_i^S  | α_i^E   | δ_i^A   | δ_i^S    | δ_i^E
σ̂_i                             | -.031  | -.147  | -.068   | .630    | -.028    | -.202
                                | (.029) | (.127) | (.168)  | (.589)  | (.640)   | (.669)
Raven Score                     | -.019  | -.123  | -.068   | -.332   | .214     | .219
                                | (.015) | (.086) | (.063)  | (.744)  | (.739)   | (.762)
RMET Score                      | .005   | -.003  | -.020   | -.453   | .724     | .820
                                | (.016) | (.016) | (.023)  | (.517)  | (.513)   | (.546)
SSS total                       | .015   | .035   | -.010   | .505*   | -.292    | -.324
                                | (.010) | (.031) | (.059)  | (.277)  | (.284)   | (.286)
Female                          | -.163* | .418   | .704*   | -.202   | -2.595   | -2.604
                                | (.098) | (.314) | (.419)  | (2.283) | (2.322)  | (2.362)
Age                             | .002   | -.006  | -.015   | .001    | -.062    | -.081
                                | (.005) | (.014) | (.018)  | (.192)  | (.191)   | (.199)
Econ Buss                       | -.030  | -.032  | .757    | -.636   | -3.485   | -4.612
                                | (.146) | (.172) | (1.174) | (4.264) | (4.568)  | (4.478)
Engineer                        | .097   | .202   | -.323   | 1.726   | -4.353** | -4.614**
                                | (.090) | (.274) | (.406)  | (2.181) | (2.215)  | (2.286)
Constant                        | -.284  | 1.037  | 2.307*  | 1.668   | -8.631   | -9.287
                                | (.294) | (.941) | (1.339) | (7.534) | (7.870)  | (8.216)
Joint insignificance (p-value): | 0.336  | 0.716  | 0.005   | 0.359   | 0.690    | 0.672

Notes: Standard errors are clustered at the subject level and reported in parentheses. Data correspond to specification
(3) in Table 3.6 (N = 118). Parameter δ_i^k is neglog-transformed. Significance levels: * p < 0.1, ** p < 0.05, ***
p < 0.01. Joint insignificance of coefficients for the three α (δ) regressions: p < 0.001 (p = 0.707). Joint insignificance
of coefficients across the six models: p < 0.001.

Table C.4: Seemingly unrelated regressions with individual characteristics: unrestricted
sample

                                | α_i^A  | α_i^S  | α_i^E  | δ_i^A   | δ_i^S   | δ_i^E
σ̂_i                             | -.035* | -.057  | -.049  | -.078   | -.178   | -.229
                                | (.021) | (.065) | (.083) | (.465)  | (.477)  | (.476)
Raven Score                     | -.021  | -.051  | -.016  | -.160   | -.109   | -.080
                                | (.013) | (.049) | (.037) | (.477)  | (.477)  | (.474)
RMET Score                      | .010   | -.008  | -.025* | -.174   | .456    | .543
                                | (.013) | (.012) | (.015) | (.346)  | (.334)  | (.351)
SSS total                       | .014*  | .023   | -.019  | .138    | .044    | .008
                                | (.007) | (.020) | (.036) | (.217)  | (.225)  | (.214)
Female                          | -.091  | .257   | .454*  | -1.250  | -1.668  | -1.663
                                | (.082) | (.194) | (.266) | (1.824) | (1.867) | (1.822)
Age                             | .000   | .002   | -.009  | .022    | -.099   | -.127
                                | (.005) | (.010) | (.013) | (.166)  | (.163)  | (.170)
Econ Buss                       | .070   | .014   | .386   | -1.533  | -2.293  | -3.399
                                | (.110) | (.137) | (.679) | (2.790) | (2.875) | (2.861)
Engineer                        | .112   | .039   | -.297  | -.438   | -1.100  | -2.335
                                | (.083) | (.177) | (.249) | (1.622) | (1.639) | (1.636)
Constant                        | -.350  | .088   | 1.667* | 3.591   | -6.278  | -6.582
                                | (.298) | (.590) | (.901) | (6.233) | (6.369) | (6.248)
Joint insignificance (p-value): | 0.087  | 0.001  | 0.023  | 0.729   | 0.870   | 0.839

Notes: Standard errors are clustered at the subject level and reported in parentheses. Data correspond to
specification (1) in Table 3.6 (N = 208). Parameter δ_i^k is neglog-transformed. Significance levels: * p < 0.1, **
p < 0.05, *** p < 0.01. Joint insignificance of coefficients for the three α (δ) regressions: p < 0.001 (p = 0.606).
Joint insignificance of coefficients across the six models: p < 0.001.

C.4 Screenshots

Figure C.1: Screen used in RISK treatment

Figure C.2: Screen used in STRATEGIC UNCERTAINTY treatment (stag-hunt
game)

Chapter 4

Better than Perceived? Correcting

Misperceptions about Central

Bank Inflation Forecasts

An earlier version of this paper was circulated as: Bulutay, M. (2024). Better

than Perceived? Correcting Misperceptions about Central Bank Inflation Forecasts.

Berlin School of Economics Discussion Paper, No 34.

4.1 Introduction

Central banks publish their macroeconomic forecasts not only to inform the pub-

lic about the future of the economy, but also to manage expectations. However,

disagreements about the future persist between central banks and private agents.

For central banks, disagreement is particularly troubling when it comes to future

inflation, because inflation expectations can translate into inflation and deanchoring

can hinder the transmission of monetary policy. For private agents, it is inefficient

not to adopt the central bank’s inflation forecast because forming personal forecasts

is costly in terms of time and resources.1

Could the inflation disagreement between central banks and private agents be

explained by the latter underestimating the former’s forecasting ability and therefore

relying on their own assessments? Recent research shows that the public pays more

attention to inflation news when inflation is high and volatile (Weber et al., 2023;

Pfäuti, 2023; Korenok et al., 2023). Thus, larger forecast errors may be overweighted

when people try to think about forecast accuracy. If so, the public’s perception of

the central bank’s accuracy may be biased toward an underestimation of accuracy.

1 Furthermore, previous research shows that central banks have an information advantage over
private agents, especially in times of high uncertainty (Gavin and Mandal, 2003; El-Shagi et al.,
2016). See Binder and Sekkel (2023) for a review of the literature on central bank forecasts.

Central banks should correct such possible misperceptions in order to better influ-

ence inflation expectations. Public perception is also crucial for the independence

of central banks (Ehrmann and Fratzscher, 2011).

This article uses two novel survey modules to (i) measure German households’

beliefs about the ECB’s inflation forecasting accuracy and (ii) test the causal effect of

central bank communication on any misperceptions. These modules are integrated

into the Deutsche Bundesbank’s Survey on Consumer Expectations in two waves

in 2022 and 2023, when inflation was high and volatile. Both modules include pre-

registered experiments that exogenously vary information sets and thus show the

causal effect of information on private expectations.

The first experiment, conducted in September 2022, elicits beliefs about the over-

all accuracy of the ECB’s inflation forecasts up to the time of the survey. Partici-

pants are then randomly assigned to receive treatment-specific information. While

all treatment groups (except the control group) are informed about the ECB’s most

recent medium-term inflation forecast, some are also informed about the average

accuracy of past inflation forecasts. Thanks to the random assignment, one can

estimate the causal effect of being exposed to correct information about the ECB’s

inflation forecast accuracy on variables measured later in the survey, such as inflation

expectations, consumption plans, and trust in the ECB.

The results show that only 13% of German households believe that the ECB’s

forecasts are as accurate as they actually are. A larger share (21%) believes that

the average absolute forecast error is larger than the largest forecast error ever

made by the ECB. Cross-sectional correlations show that the underestimation of

the ECB’s forecasting accuracy is negatively related to self-reported trust in the

ECB, even after controlling for a rich set of covariates, including education. Thus,

these misperceptions cannot be explained by illiteracy or misunderstanding alone

and seem to reflect trust in the ECB.

Information about the accuracy of past forecasts lowers inflation expectations,

reduces uncertainty about future inflation, promotes trust in the ECB, and discour-

ages the consumption of certain goods, such as major items (e.g., cars, furniture)

and clothing. Using instrumental variable estimation to account for endogeneity,

I identify the causal relationship between trust in the ECB and inflation expecta-

tions. This analysis shows that the information shifts inflation expectations through

its effect on trust in the ECB. In terms of marginal effects, a one standard deviation

increase in trust in the ECB reduces inflation expectations by 5.3% to 8.5% and

inflation uncertainty by 2.4% to 7.1%.

The second experiment takes place one year after the first, in September 2023.

This time, respondents are asked to report their short-term inflation expectations

and to guess the ECB’s one-year-ahead forecast for the same inflation. These two

responses make it possible not only to document possible misperceptions, but also

to identify the source and expected direction of the error. Later, respondents are

provided with information on the current inflation rate in the euro area and/or the

ECB’s inflation forecast. After the information phase, the survey measures trust in

the ECB, but in an indirect way. Instead of asking how much the respondent trusts

the ECB on a scale of 0 to 10, as in the previous experiment and as is common in the

literature, the question elicits the degree of agreement with six statements. These

statements capture different facets of central banking related to trust, including

the honesty of the ECB’s communication, the credibility of the inflation target,

the inclusiveness of monetary policy, and the adherence to the mandate. Thus,

the question measures trust without actually using the term trust and with more

granularity.2

A majority of respondents (62%) believe that the ECB’s inflation forecast will

undershoot actual inflation, reflecting a belief in the ECB’s optimism. While 19%

believe in the opposite deviation (i.e. pessimism), another 19% think that the ECB’s

forecast will be exactly right. Cross-sectional correlations again show that these be-

liefs reflect trust in the ECB. Using self-reported trust in the ECB from an earlier

question in the same wave of the survey, I show that those who believe the ECB

is optimistic report 0.24 standard deviations less trust in the ECB. However, be-

liefs about pessimism are not significantly correlated with self-reported trust in the

ECB. Thus, optimism in forecasts seems to be more dangerous for a central bank’s

reputation than pessimism.

Information treatments change public opinion about the ECB. Respondents who

receive information about the actual inflation forecast report higher trust in the

ECB, as measured by the average agreement with the six statements. However,

information about the current inflation rate does not significantly affect agreement

with statements and even neutralizes the positive effect of information about the

forecast.3 Regarding the subscales, the results show that the positive effect of information on public opinion is strongest for the credibility of the inflation target.

Respondents report stronger agreement with the statement that the ECB will en-

sure price stability within three years. In addition to credibility, information also

improves the perceived honesty of the ECB’s communication, and better convinces

2 This approach also provides a way to harmonize the results in this literature, as it is not clear what is reported in response to “trust in the central bank” questions. Different studies also use different wording. While most studies ask respondents to indicate their trust in the central bank on a scale of 0 to 10 (Christelis et al., 2020), some mention specific characteristics such as “trust in the ability to achieve price stability” (Hoffmann et al., 2022) or trust in the central bank “to care about the economic well-being of all (citizens)” (D’Acunto et al., 2021).

3 At the time of the experiment, inflation was already below the ECB’s one-year-ahead inflation forecast of 6.3%. This information may have revealed significant forecast errors, albeit in the opposite direction of respondents’ initial beliefs.

respondents of the benefits of the ECB’s policy for their household.

First, this study contributes to the literature that examines the effects of macroeconomic information on households’ inflation expectations (Armantier et al., 2016; Cavallo et al., 2017; Binder and Rodrigue, 2018; Coibion et al., 2022; Kostyshyna and Petersen, 2024), consumption decisions (Roth and Wohlfart, 2020; Dräger et al., 2022; Coibion et al., 2023), and attitudes toward the central bank (Bholat et al., 2019; D’Acunto et al., 2021; Brouwer and de Haan, 2022; Dräger and Nghiem, 2023; Ehrmann et al., 2023; Méon and Hayo, 2023; Ash et al., 2024). In summary, trust in

(the credibility of) the central bank turns out to be a very sticky variable in terms

of the response to information interventions. In a closely related study, McMahon

and Rholes (2024) conduct an online experiment and introduce exogenous variation

in the forecast accuracy of a hypothetical central bank. They find that forecast ac-

curacy systematically affects the credibility of a central bank. The results presented

here complement the findings of this literature by demonstrating the benefits of in-

forming the public about forecast accuracy and by highlighting the caveats of such

a communication campaign. To the best of my knowledge, this is the first study to

document beliefs about a central bank’s forecasts.

Second, my results contribute to the literature that investigates the implications

of trust in central banks for monetary policy. Using dynamic general equilibrium

models, the previous literature shows that trust in central banks matters for the

transmission of monetary policy through its influence on expectations and risk attitudes (Bursian and Faia, 2018; Hommes and Lustenhouwer, 2019; Haldane et al., 2020). Besides these models, several studies use household surveys to show the causal effect of central bank trust on inflation expectations through instrumental variable estimation (Mellina and Schmidt, 2018; Christelis et al., 2020). Using a

similar analysis, I find a very similar quantitative relationship between trust and

inflation expectations. Monetary authorities can use these measures to assess the

importance of trust for inflation expectations.

Finally, the results presented here are related to the growing literature on information interventions that target misperceptions about economic facts. Recent evidence covers misperceptions about outside options in the labor market (Jäger et al., 2022), public debt (Roth et al., 2022), returns to active investment (Haaland and Naess, 2023), and the gender wage gap (Settele, 2022). These studies typically show that large segments

of society are uninformed or misinformed. Information interventions show that cor-

recting these misperceptions on seemingly niche topics has a significant return in

terms of beliefs, decisions, and policy demand.

4.2 First Experiment

This section describes an information provision experiment that aims to generate

exogenous variation in information sets. The experiment is implemented in the

September 2022 wave of the Bundesbank Online Panel - Households (BOP-HH)

survey. This survey, which has been running monthly since 2019, elicits the per-

ceptions and expectations of about 2000-6000 households in Germany on variables

such as inflation, house prices, and income, as well as consumption plans, policy

preferences, and so on. Table D.1 in the appendix summarizes the demographic

profile of the sample.

4.2.1 Design and Implementation

Figure 4.1: Flow of the first experiment

[Flow diagram: Q1: prior inf → random assignment to T1 = Placebo, T2 = Forecast, T3 = T2 + Accuracy, or T4 = T3 (with Q2: perc inaccuracy asked only in T4) → Q3: post max, post min → Q4: ecb trust]

Notes: The graph shows the timeline of the first experiment. The blue boxes show the questions with their labels. The green boxes show the information treatments with their labels.

Design

The experiment consists of four stages. These stages are shown in Figure 4.1. In

the first stage, inflation expectations in the euro area for the calendar year 2024 are

elicited by the following question:

”What do you think the rate of inflation or deflation in the euro area

will roughly be in 2024?”

If the respondent expects deflation, they enter a negative value. Answers to this

question are coded as prior inf.4 After this stage, respondents answer a series of

4 The original German texts, along with all other experimental material, can be found in Appendix D.1.

questions about their income and house prices in the core of the survey. At the

end of these questions, respondents enter the second stage to receive information.

They are randomly assigned to one of four treatment conditions. These treatments

are referred to as T1, T2, T3, and T4. There is a higher probability of assignment

(30%) for T2 and T4 because they are the main treatments.

The first group (T1) serves as an active control group. Respondents are given placebo information, namely the population growth rate in Germany (2% between 2010 and 2021). This guards against numerical anchoring bias, as the information in the other groups is numerically similar.

The remaining groups receive a text with information about the ECB’s inflation

forecast. All start with an introductory text explaining the frequency of the fore-

casts and their relevance for the Governing Council’s decision-making process. In

addition, the groups receive treatment-specific information.

The second group (T2) is informed of the ECB’s inflation forecast for 2024, as

announced in the September 2022 press release, with the following text:

”This September, the ECB projected a decline in annual euro area

inflation to 2.3% by the end of 2024”.

The last two groups (T3 and T4) are also informed of the ECB’s inflation forecast,

but the same text includes the following additional information:

”The ECB’s projections for the euro area inflation rate deviated by

less than one percentage point on average from the actual inflation rates

in the period from 2001 (when the projections began) to 2021.”

So there is no difference between T3 and T4 in terms of the information provided.

The difference is in the implementation. Respondents in T4 are asked to indicate

their beliefs about past forecast accuracy immediately before being shown the true

answer, while respondents in T3 are only shown the information. The following

question is used to elicit beliefs about past forecast accuracy:

”By how much do you think the ECB’s projections deviated on av-

erage from the actual inflation rates in the period from 2001 (when the

projections began) to 2021? Please give your best estimate.”

The response options are four ranges: “0-1 percentage point (pp)”, “1-2 pp”, “2-3 pp”, and “3 pp or more”.5 Responses are coded as perc inaccuracy.

5 The choice of ranges is motivated by two factors. First, households typically round their responses to whole numbers, especially when they are very uncertain (Binder, 2017). The closest whole numbers to the correct answer are 0 and 1, hence “0-1 pp”. Second, there are four ranges so that respondents cannot heuristically choose the middle range.

Immediately after reading the treatment-specific information in the second stage,

respondents are again asked to form expectations about the euro area inflation rate

in 2024. The question asks:

”What are the minimum and maximum values you think the inflation

rate or deflation rate could have in the euro area in 2024?”

The answers are coded as post min and post max and used to infer two key moments. The difference post max − post min is used as a proxy for the variance of expectations (i.e., inflation uncertainty) and is coded as inf unc. The midpoint of post max and post min is used as a proxy for the point expectation and is coded as post inf.6
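As a minimal sketch of this coding step, the following Python function derives the two moments from a reported range. The function name `range_moments` and the example values are illustrative assumptions and are not taken from the survey code.

```python
def range_moments(post_min: float, post_max: float) -> tuple[float, float]:
    """Return (post_inf, inf_unc) from a reported minimum and maximum.

    post_inf is the midpoint of the range (proxy for the point expectation);
    inf_unc is the width of the range (proxy for inflation uncertainty).
    """
    post_inf = (post_min + post_max) / 2.0
    inf_unc = post_max - post_min
    return post_inf, inf_unc


# Hypothetical respondent who expects inflation between 2% and 6% in 2024.
post_inf, inf_unc = range_moments(2.0, 6.0)
print(post_inf, inf_unc)
```

A negative minimum (expected deflation) is handled in the same way, since both moments depend only on the two endpoints.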

The final stage of the module elicits respondents’ trust in institutions on a scale

of 0-10, where 0 indicates ”no trust at all” and 10 indicates ”absolute trust”. I elicit

trust in five institutions. These are the ECB, the Federal Government, the Court

of Justice of the European Union (CJEU), the Bundesbank, and media enterprises

(presented in random order and coded name trust). Such a formulation reduces the

experimental demand effect through obfuscation. In addition, two of the institutions

(the CJEU and media enterprises) will serve as instruments in an instrumental

variable estimation (see section 4.2.2).

Procedures and Data Selection

A total of 5527 respondents participated in the September 2022 wave of the survey.

Invitations are sent between 15 and 29 September 2022, shortly after the release of the ECB’s inflation forecasts on 8 and 9 September. Euro area inflation stood at 9.1% in August 2022 and had not yet peaked.

Using pre-registered exclusion criteria (see AsPredicted #107388), I drop respon-

dents who chose not to answer (N=63) or chose the ”Do not know” option (N=300)

to at least one question in the module (except the perc inaccuracy question). I also

exclude respondents who expect the maximum inflation rate to be the same as the

minimum (N=198). Finally, 185 respondents who spent more than 2 hours in stages

(2)-(3) or less than 3 seconds in the information provision stage are excluded from

the sample to ensure data quality. This leaves 4863 respondents for the analysis.

Inflation expectations are further winsorized at the 5th/95th percentiles to reduce

the impact of outliers.7
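The winsorization step can be sketched as follows. This is an illustration under stated assumptions: the thesis does not specify the percentile definition, so the sketch uses linear interpolation between sample points (`method="inclusive"` in Python's `statistics.quantiles`), and the data are hypothetical.

```python
from statistics import quantiles


def winsorize(values, lower_pct=5, upper_pct=95):
    """Clip values to the [lower_pct, upper_pct] percentile range."""
    # quantiles(..., n=100) returns the 1st..99th percentile cut points;
    # method="inclusive" interpolates linearly between sample points
    # (an assumption, the thesis does not state the definition used).
    cuts = quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, low), high) for v in values]


# Hypothetical expectations (in %): one deflation answer, one extreme answer.
expectations = [-2.0] + [float(x) for x in range(1, 20)] + [100.0]
winsorized = winsorize(expectations)
print(min(winsorized), max(winsorized))
```

Winsorizing replaces extreme answers with the percentile bounds instead of dropping them, so the sample size is unchanged.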

6 Although the midpoint is at best a noisy proxy for the point expectation, this assumption has no downside for treatment effect analyses as long as the measurement error is orthogonal to the treatment assignment.

7 To illustrate, there are 65 respondents (1.3% of the final sample) who expect deflation or an inflation rate above 40% pre-information.

4.2.2 Results

This section presents the results of the first experiment. Two predictions are pre-

registered:

1. Correcting misperceptions about the ECB’s forecast accuracy increases trust

in the ECB.

2. Trust mediates the impact of information on inflation expectations.

Appendix D.2 presents a simple model framework that can be used to organize

these predictions.

Perceived vs. Objective Forecast Accuracy

Figure 4.2: Beliefs about inaccuracy and the actual distribution of forecast errors

Notes: The bars show the distribution of forecast errors, as measured by the absolute difference between the ECB staff forecasts (i.e. excluding the Eurosystem forecasts) and realizations. The numbers above the braces refer to the proportion of respondents in the T4 group who believe that the average absolute forecast error is within the range covered by the brace (N = 1440). The last brace covers >3. Data for forecasts cover the period from March 2001 to September 2021. Source: Author’s calculation based on Eurostat data.

The perc inaccuracy data from the fourth treatment can be used to document

the extent of misperception in the entire sample, since the question is asked before the accuracy information is shown and assignment to T4 is random. Figure 4.2 compares perc inaccuracy with actual data, as

shown by the distribution of absolute forecast errors over all forecasts. On the one

hand, only 13.5% of respondents think that the average absolute forecast error is

in the range ”0-1 pp”, while 77 of the calculated absolute errors out of 94 inflation

forecasts are actually in this range. On the other hand, at least 21% of the sample

believe that the average absolute forecast error is larger than the maximum error

made by the ECB over this period (2.3 pp). Taken together, these facts suggest that

German households significantly underestimate the ECB’s forecast accuracy.

Could it be that the responses to perc inaccuracy are due to misunderstanding or

noise? Three observations suggest that this is not the case. First, only 3 respondents

choose the ”do not know” option in response to this question, significantly fewer than

the usual number in the survey for the other questions. Second, the distribution of

perc inaccuracy is asymmetric, which is difficult to reconcile with random responses.

Third, the correlation between responses to perc inaccuracy and ecb trust is strong

even after controlling for demographic covariates including education (see Table D.2

in the Appendix). Taken together, confusion does not appear to be the main source

of variation in perc inaccuracy.

Treatment Effects on Inflation Expectations

How does correcting misperceptions with information affect the subjective distribu-

tion of inflation expectations? To measure the treatment effects on the revision of

point expectations, I run the following regression:

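\[
\underbrace{\text{post inf}_i - \text{prior inf}_i}_{\text{revision}_i} = \beta_0 + \beta_1 \times (2\% - \text{prior inf}_i) + \sum_{j=2}^{4} \beta_j \times \text{treat}_{j,i} \times (2.3\% - \text{prior inf}_i) + e_i \tag{4.1}
\]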

where the revision in inflation expectations is explained by the difference between

the signal (2% in T1, 2.3% in others) and the prior and its interaction with the

treatment. The regressors $\text{treat}_{j,i}$ are dummy variables that take the value one

if the observation is in treatment j. This specification follows from the Bayesian

updating framework, where the posterior belief is the weighted average of the prior

and the signal:

\[
\text{post} = \omega \times \text{signal} + (1 - \omega) \times \text{prior}.
\]

With respect to equation (4.1), $\omega_j = \beta_1 + \beta_j$. Thus, the parameters $\beta_j$ measure treatment-specific learning rates, while $\beta_0$ and $\beta_1$ reflect mismeasurement due to different question formats, experimenter demand, anchoring, etc. The main tests are $H_0: \hat{\beta}_4 - \hat{\beta}_2 = 0$ and $H_0: \hat{\beta}_3 - \hat{\beta}_2 = 0$.
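To see why the slope on the gap between signal and prior can be read as a learning rate, note that the updating rule implies post − prior = ω × (signal − prior). The following sketch checks this on noise-free, hypothetical data. The learning rate of 0.4, the priors, and the helper `ols_slope` are assumptions made for illustration, not estimates or code from this chapter.

```python
from statistics import fmean


def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


signal = 2.3   # ECB forecast shown in T2-T4, in %
omega = 0.4    # hypothetical learning rate (weight on the signal)
priors = [1.0, 3.0, 5.0, 8.0, 10.0]  # hypothetical prior expectations, in %

# Bayesian updating: the posterior is a weighted average of signal and prior.
posts = [omega * signal + (1 - omega) * p for p in priors]

# The revision equals omega * (signal - prior), so the slope recovers omega.
revisions = [post - p for post, p in zip(posts, priors)]
gaps = [signal - p for p in priors]
estimated_omega = ols_slope(gaps, revisions)
print(round(estimated_omega, 6))
```

With real survey data the relationship is noisy, which is why the regression includes an intercept and an error term.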

To test the effect of the treatments on inflation uncertainty, I estimate the fol-

lowing regression:

\[
\text{inf unc}_i = \alpha_1 + \sum_{j=2}^{4} \alpha_j \times \text{treat}_{j,i} + \varepsilon_i \tag{4.2}
\]

where inflation uncertainty is explained by the treatment indicators. Similarly, the main tests are $H_0: \hat{\alpha}_4 - \hat{\alpha}_2 = 0$ and $H_0: \hat{\alpha}_3 - \hat{\alpha}_2 = 0$.

Table 4.1 reports the regression estimates. Three results emerge. First, respon-

dents in all treatment conditions learn from the information they are given and

express lower uncertainty relative to the control condition. Second, T4 works better than T2: it leads to more revision (t-stat = 1.74, p-value = 0.082) and to lower uncertainty (t-stat = −3.46, p-value = 0.001). Third,

T3 does not significantly affect either the revision of inflation expectations or infla-

tion uncertainty compared to T2. Thus, the forecast accuracy information is only

effective when the misperceptions are ”explicitly” revealed.8

Table 4.1: Treatment effects on learning and uncertainty

                          (1)              (2)
                          revision (βj)    inf unc (αj)
T2 (= Forecast)           0.219***         -1.152***
                          (0.019)          (0.148)
T3 (= T2 + Accuracy)      0.209***         -1.293***
                          (0.022)          (0.158)
T4 (= T3 + Question)      0.253***         -1.567***
                          (0.019)          (0.145)
β1, α1                    0.352***         6.418***
                          (0.017)          (0.119)
β0                        1.706***         –
                          (0.067)          (–)
N                         4863             4863
R2                        0.35             0.03

Notes: Regression results based on equations (4.1) and (4.2) are re-

ported in columns (1) and (2), respectively. Robust standard errors are

reported in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.

Information, Trust, and Inflation Expectations

In this section, I examine the effect of information treatments on trust in the ECB and the quantitative relationship between trust and inflation expectations. Figure 4.3 illustrates the proposed causal chain between these three variables using a directed acyclic graph, which amounts to a mediation framework.

8 These results are robust to the inclusion of demographic control variables and to non-winsorization of the data (see Table D.4 in the Appendix).

Figure 4.3: Directed acyclic graph showing the causal mechanisms

[Diagram: T → M ($\phi_1$), M → Y ($\phi_2$), T → Y ($\phi_3$)]

Notes: T: information (treatment), M: ecb trust (mediator), and Y: post inf or inf unc (outcome). Solid lines refer to the causal mechanisms of interest, the dotted line refers to the direct effect and the dashed line refers to the possible presence of post-treatment confounders.

Causal identification of the parameters $\phi_1$ and $\phi_2$ shown in Figure 4.3 with the standard mediation analysis of Baron and Kenny (1986) is problematic in the presence of confounders and endogeneity (Bullock et al., 2010; Imai et al., 2011; Acharya et al., 2016). Instrumental variables (IVs) can be used to identify the causal effects (Imai et al., 2011). The following two-stage least squares (2SLS) estimation can be used to estimate $\phi_1$, $\phi_2$, and $\phi_3$:

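\begin{align}
M_i &= \zeta + \phi_1 T + \rho Z_i + X B + \epsilon_{i2}, \tag{4.3} \\
Y_i &= \psi + \phi_3 T + \phi_2 \widehat{M}_i + X B + \epsilon_{i1} \tag{4.4}
\end{align}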

where $M$ is ecb trust, $Z$ refers to the instruments, and $X$ is a vector of controls. The treatment dummy $T$ takes the value one if the observation is from either T3 or T4, and zero if it is from T2. The outcome variables in the second stage ($Y$) are either post inf or inf unc. I propose trust in two non-economic institutions as instruments for trust in the ECB. These institutions are the CJEU and media enterprises. The main identifying assumption is that trust in these institutions is related to trust in the ECB, while they are exogenous to post inf and inf unc.9
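The logic of the two stages can be illustrated with a deliberately simplified sketch: one instrument, no treatment dummy, no controls, and simulated data in which an unobserved confounder drives both the mediator and the outcome. All numbers, the data-generating process, and the helper names are assumptions made for illustration. They do not reproduce the specification or the estimates of this chapter.

```python
import random
from statistics import fmean


def ols(x, y):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage_least_squares(z, m, y):
    """2SLS point estimate of the effect of m on y, using instrument z."""
    a, b = ols(z, m)                  # first stage: mediator on instrument
    m_hat = [a + b * zi for zi in z]  # fitted values of the mediator
    _, phi2 = ols(m_hat, y)           # second stage: outcome on fitted values
    # Note: standard errors from this naive second stage would be wrong;
    # only the point estimate is used here.
    return phi2


rng = random.Random(0)
n = 20_000
true_phi2 = -0.5                                # hypothetical causal effect
z = [rng.gauss(0, 1) for _ in range(n)]         # instrument
u = [rng.gauss(0, 1) for _ in range(n)]         # unobserved confounder
m = [0.6 * zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [true_phi2 * mi + ui + rng.gauss(0, 1) for mi, ui in zip(m, u)]

phi2_iv = two_stage_least_squares(z, m, y)
_, phi2_ols = ols(m, y)
print(round(phi2_iv, 2), round(phi2_ols, 2))
```

In this simulation the naive OLS slope is pulled toward zero by the confounder, while the 2SLS estimate stays close to the assumed effect because the instrument is unrelated to the confounder.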

9 In a similar analysis, Mellina and Schmidt (2018) use trust in three European institutions including the CJEU and three German institutions as instruments. Christelis et al. (2020) use trust in other people and the frequency of being cheated by a repair person in the past as instruments for trust in the central bank.

Table 4.2: Causal Mechanisms with 2SLS

                         (1)          (2)          (3)
Dep. var.:               ecb trust    post inf     inf unc
treatment (T)            0.049**      -0.035       -0.274***
                         (0.023)      (0.083)      (0.100)
ecb trust ($\hat{M}$)                 -0.556***    -0.294***
                                      (0.065)      (0.075)
cjeu trust (Z1)          0.572***
                         (0.014)
media trust (Z2)         0.197***
                         (0.015)
constant                 0.111        4.034***     7.004***
                         (0.077)      (0.275)      (0.335)
N                        3886         3886         3886
adjusted R2              0.526        0.282        0.165
F-statistic              2147.27      87.037       50.866
J-statistic              -            0.575        0.341
p-value                  -            0.448        0.559

Notes: The table reports the results of 2SLS regressions described in

equations (4.3) and (4.4). Trust-related variables are standardized

(mean zero, sd one). The p-value shows the results of the overiden-

tification test of all instruments (based on Hansen’s J-stat). The

following control variables are included: age (discrete), female (bi-

nary), university graduate (binary), personal income below 1500

Euro (binary), single-person household (binary), born in the GDR

before 1989 (binary), terms for the regions (binary), and prior infla-

tion expectations (continuous). Robust standard errors are reported

in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.

Table 4.2 shows the results. I begin by assessing the quality of the instruments.

First, the first-stage F-statistic of 2147 is well above the conventional threshold used to detect weak instruments. Second, both instruments are positively correlated

with the mediator. Third, the test for overidentification of the instruments (the

Hansen J statistic) does not reject the hypothesis of joint validity of the instruments.

Overall, these diagnostics do not indicate a lack of validity of the instruments.

The results of the first stage show that exposing households to information about

forecast accuracy (as was done in T3 and T4) increases trust in the ECB ($\hat{\phi}_1 = 0.049$, t-stat = 2.12, p-value = 0.034). The results of the second stage show that trust in the

ECB is negatively related to inflation expectations and uncertainty. A one standard

deviation increase in trust is associated with a 0.56 percentage point decrease in

post inf (95% CI [-0.68, -0.43]) and a 0.29 percentage point decrease in inf unc (95% CI [-0.44, -0.15]). However, the treatment indicator is only significant for inf unc, which indicates the absence of a direct effect on post inf.10
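The reported intervals can be checked against the coefficients and robust standard errors in Table 4.2. The sketch below assumes that the intervals are normal-approximation intervals with a critical value of 1.96, which the text does not state explicitly. The function name `normal_ci` is hypothetical.

```python
def normal_ci(coef, se, z=1.96):
    """Two-sided 95% interval under a normal approximation: coef +/- z * se."""
    return coef - z * se, coef + z * se


# Second-stage coefficients and standard errors on ecb trust (Table 4.2).
ci_post_inf = normal_ci(-0.556, 0.065)
ci_inf_unc = normal_ci(-0.294, 0.075)

print([round(bound, 2) for bound in ci_post_inf])
print([round(bound, 2) for bound in ci_inf_unc])
```

Under this assumption, the rounded bounds coincide with the intervals quoted in the text.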

Persistence, Consumption, and Attention

A natural question to ask is whether the information interventions have lasting

effects on trust and translate into economic behavior. This section addresses such

questions using the panel dimension of the survey.11 In the month following the

experiment (October 2022), the following variables are measured: (i) trust in the

ECB, (ii) changes in attention to inflation news, and (iii) changes in consumption

plans. Trust in the ECB is measured as before on a scale from 0 to 10. Changes in

attention are measured by the following question:

”Has your interest in inflation developments changed in recent weeks?”

Respondents could indicate that they pay more, less or the same attention to infla-

tion. Consumption plans are measured by the following question:

”Are you likely to spend more or less on the following items over the

next twelve months than in the last twelve months?”

Respondents answer for nine different categories. I focus on consumption of durable

goods, such as major purchases (e.g., cars, furniture) and clothing, and on savings.

Table 4.3 shows the results of linear regressions. For ecb trust, the effects are

already attenuated after one month. For the other outcome variables, T4 shows persistent effects. Respondents in this group are 2.5 percentage points less likely to report having paid less attention to inflation news in recent weeks (t-stat = −1.75, p-value = 0.081), 3.1 percentage points less likely to report an increase in spending intentions for major purchases (t-stat = −1.69, p-value = 0.091), and 2.7 percentage points less likely to report more spending on clothing and footwear (t-stat = −1.91, p-value = 0.056). Relative to T2, information in T4 also reduces the probability of reporting an increase in consumption for major purchases by 3.7 percentage points (t-stat = −2.21, p-value = 0.027). For savings plans and

paying more attention to inflation news, none of the treatments have a significant

coefficient. In summary, information about accuracy has discernible effects on some

behaviors that persist for at least four weeks when delivered after a question, as was

done in T4.

10 These results are robust to excluding T3 from the sample and using only cjeu trust as the instrument (see Tables D.5 and D.6 in the Appendix).

11 57% of respondents who participated in the experiment remain in the panel the following month (October 2022). This number drops abruptly to 27% in November 2022. The scope of this exercise is therefore limited to testing the effects that last four weeks.

Table 4.3: Treatment effects on trust, attention, and consumption one month later

                        (1)         (2)        (3)        (4)        (5)        (6)
                                    Attention             Spend more
Dep. var.               ecb trust   More       Less       Major      Clothing   Savings
T2 (= Forecast)         -0.093      0.025      -0.026*    0.006      -0.007     -0.003
                        (0.138)     (0.027)    (0.014)    (0.019)    (0.015)    (0.016)
T3 (= T2 + Accuracy)    -0.002      0.007      -0.002     -0.019     -0.005     -0.002
                        (0.144)     (0.029)    (0.017)    (0.020)    (0.016)    (0.018)
T4 (= T3 + Question)    -0.190      0.011      -0.025*    -0.031*    -0.027*    0.004
                        (0.137)     (0.027)    (0.014)    (0.018)    (0.014)    (0.016)
constant                4.275***    0.390***   0.107***   0.243***   0.171***   0.239***
                        (0.298)     (0.059)    (0.029)    (0.043)    (0.035)    (0.040)
N                       2772        2794       2794       2738       2738       2738
R2                      0.03        0.00       0.01       0.02       0.02       0.03

Notes: The table reports OLS regressions on six outcome variables measured in the October wave of the survey (one

month after the experiment). The following control variables are included: age (discrete), female (binary), university

graduate (binary), personal income below 1500 Euro (binary), single-person household (binary), born in the GDR

before 1989 (binary), and terms for regions (binary). Regressions where the dependent variable is consumption plan

also control for expected changes in income (continuous). Robust standard errors are reported in parentheses. *

p < 0.1, ** p < 0.05, *** p < 0.01.

4.3 Second Experiment

This section presents a second complementary experiment implemented in BOP-

HH Wave 45 (September 2023). There are two main design changes. The first is

that households’ quantitative beliefs about short-term inflation and beliefs about

the ECB’s short-term inflation forecast are elicited. From the answers to these

questions, it is possible to infer the expected forecast error and its direction. Second,

post-information trust in the ECB is measured indirectly via a questionnaire and

without using the term trust in any part of the question. This approach allows

for more granularity and provides information on the effects of such information

treatments.

4.3.1 Design and Implementation

Design

The experiment consists of four stages. These stages are shown in Figure 4.4. In

the first stage, trust in the ECB is measured on a scale from 0 to 10, using the same

Figure 4.4: Flow of the second experiment

[Flow diagram: Q1: prior trust → Q2: exp inf, perc forecast → random assignment to G1 = Control, G2 = Inflation, G3 = Forecast, or G4 = G2 + G3 → Q3: post trust]

Notes: The graph shows the timeline of the second experiment. The blue boxes show the questions with their labels. The green boxes show the information treatments with their labels.

question as in stage (4) of the first experiment. This is coded as prior trust. In the

second stage, respondents are asked to indicate their expectation for inflation in the euro area in the current calendar year and their belief about the ECB’s one-year-ahead forecast for the same calendar year. The following question is used:

”What do you think the inflation rate in the euro area will be in 2023

overall, i.e. between December 2022 and December 2023? And what

inflation rate do you think the ECB forecasted in its projections for 2023

back in December 2022?”

The answers are coded as exp inflation and perc forecast.
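The two answers can be combined into an expected forecast error and its direction. The following sketch is one way to do this, assuming the error is defined as expected inflation minus the perceived forecast. The function name and the category labels are hypothetical and serve only to illustrate the classification described in the text.

```python
def forecast_belief(exp_inflation: float, perc_forecast: float) -> str:
    """Classify a respondent's belief about the ECB's forecast.

    expected_error > 0: the forecast is believed to undershoot inflation,
    i.e. the ECB is seen as optimistic; expected_error < 0: the forecast is
    believed to overshoot, i.e. the ECB is seen as pessimistic.
    """
    expected_error = exp_inflation - perc_forecast
    if expected_error > 0:
        return "optimistic"
    if expected_error < 0:
        return "pessimistic"
    return "accurate"


# Hypothetical respondent: expects 6% inflation, thinks the ECB forecast 4%.
print(forecast_belief(6.0, 4.0))
```

A respondent who reports the same number for both questions is classified as believing that the forecast will be exactly right.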

After this question, respondents move to the third stage where they receive

information. They are randomly assigned to one of four groups. The first group

(G1) is a pure control group that receives no information.12 The group G2 receives

the most recent annual inflation rate as information, along with the following text:

”You will now be shown up-to-date information on the inflation rate

in the euro area. According to the latest statistics, the inflation rate in

the euro area between July 2022 and July 2023 was 5.3%.”

The group G3 receives the ECB’s one-year ahead inflation forecast for calendar year

2023 with the following text:

”You will now be shown up-to-date information on the inflation rate

in the euro area. In December 2022, the ECB forecasted that the inflation

rate in the euro area would be 6.3% by December 2023.”

12 Because the information and the beliefs elicited after the information are on different scales, there is no need to account for the numerical anchoring bias.

The group G4 receives both sets of information.

After receiving the information, respondents indicate their level of agreement

(on a scale of 1-7) with six statements. These statements are shown in Table 4.4.

They are designed to capture different facets of trust in the ECB.13 The average

agreement across items is used as a measure of trust in the ECB and is coded as

post trust.

Table 4.4: Statements used to indirectly measure trust in the ECB

No    Statement                                                                          Label
(a)   The ECB will ensure price stability in the euro area over the next three years.    credibility
(b)   The ECB looks after the economic well-being of everyone in the euro area.          inclusivity
(d)   The ECB communicates with the public in a transparent and honest manner.           honesty
(e)   The ECB has sufficient expertise to understand general economic developments.      expertise
(f)   The ECB makes decisions that benefit people like me.                               interest

Notes: Presented in random order. Scale of answers is 1 (completely disagree) to 7 (completely agree).

Procedures and Data Selection

In total, 3999 respondents participated in the September 2023 wave of the BOP-HH. Using
pre-registered exclusion criteria (see AsPredicted #145978), I drop respondents
who stated either ”No answer” or ”Don’t know” to one of the three questions before
the treatments. Respondents who do not know the ECB or who expect inflation (deflation)
to be above 25% (below -5%) are also excluded. This leaves 3643 respondents
for the analysis and amounts to an exclusion rate of less than 10%. Demographic

variables do not vary much across the initial and the final samples (see Table D.7).

Exclusion is also balanced across treatments.

4.3.2 Results

This section presents the results of the second experiment. Three directional pre-

dictions are pre-registered:

1. Most households believe that the ECB forecast will undershoot actual inflation.

2. This belief (of optimism) is associated with lower trust in the ECB.

13 These items are drawn and adapted from a variety of studies in the literature. For example,
statement (a) is used by Ehrmann et al. (2023) as a measure of the ECB’s credibility. A combination
of statements (b) and (f) is used by D’Acunto et al. (2021) to infer trust in the Fed. Similarly, Kril
et al. (2016) use a 17-item questionnaire to measure trust in and credibility of the Bank of Israel.

3. Information interventions have a positive impact on trust in the ECB.

Expected Inflation and Perceived Forecast

Figure 4.5: Expected inflation and perceived inflation forecast

[Scatter plot. x-axis: Euro area inflation expectations for 2023 (as of September 2023); y-axis: Perceived one-year ahead ECB forecast for 2023. Both axes range from 0 to 30.]

Notes: Circle size indicates frequency of observations. The solid line has a slope of 45 degrees. Two observations
where perc forecast was greater than 30% are removed from the graph for visual illustration (N = 3641).

Figure 4.5 shows respondents’ one-quarter ahead inflation expectations and their

beliefs of the ECB’s one-year ahead inflation forecast. A majority of respondents

believe that the ECB’s forecast will be below actual inflation (N= 2255, 62%). This

reflects the belief that the ECB’s inflation forecast is undershooting and therefore too

optimistic.14 While some believe that the ECB will overshoot inflation (N= 689,

19%), there are also many who believe that the ECB is right on target (N= 699,

19%). Among those who think the ECB will undershoot, 82% perceive the forecast

to be below what it actually is (perc forecast <6.3%) and 78% expect inflation to

be higher than the current rate (exp inflation >5.3%).
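The three-way classification described above can be sketched in a few lines. This is a minimal illustration rather than the code used for the analysis: the function name `classify_forecast_belief`, the label strings, and the toy responses are hypothetical, and only the comparison between exp inflation and perc forecast is taken from the text.

```python
from collections import Counter


def classify_forecast_belief(exp_inflation: float, perc_forecast: float) -> str:
    """Label a respondent by the perceived direction of the ECB's forecast error."""
    if perc_forecast < exp_inflation:
        # The forecast is believed to lie below actual inflation (undershooting).
        return "ecb_optimistic"
    if perc_forecast > exp_inflation:
        # The forecast is believed to lie above actual inflation (overshooting).
        return "ecb_pessimistic"
    return "ecb_on_target"


# Hypothetical (exp_inflation, perc_forecast) pairs in percent, for illustration only.
responses = [(7.0, 5.0), (6.0, 6.0), (5.0, 8.0), (10.0, 4.0)]
labels = [classify_forecast_belief(e, f) for e, f in responses]
shares = {label: count / len(labels) for label, count in Counter(labels).items()}
print(shares)
```

Applied to the actual sample, the same comparison would yield the three groups reported in the text (N = 2255, 689, and 699).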

14 Rather than optimism, this difference may reflect the effect of having more information, as
households make their forecast at a later date than the ECB. In Figure D.1 in Appendix D.3, the
x-axis variable is replaced by 12-month ahead inflation expectations for Germany in January 2023
in order to equalize the forecasting horizons. The results show a similar share of respondents who
believe that the ECB will undershoot.

Table 4.5: Perceived direction of the ECB’s forecast error and trust in the ECB

Dep. var.: prior trust    (1)          (2)
ecb optimistic            -0.243***    -0.236***
                          (0.043)      (0.042)
ecb pessimistic           0.095*       0.082
                          (0.052)      (0.051)
constant                  0.133***     0.044
                          (0.038)      (0.087)
Controls included         No           Yes
N                         3643         3643
R²                        0.02         0.06

Notes: The dependent variable (prior trust) is standardized to have a mean of zero and a standard
deviation of one. The following control variables are included in regression (2): age (discrete),
female (binary), university graduate (binary), personal income below 1500 Euro (binary),
single-person household (binary), born in the GDR before 1989 (binary), and terms for regions
(binary). Robust standard errors are reported in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.

Could respondents’ beliefs about the direction of the forecast error be an in-

dicator of (lack of) trust? Correlational evidence supports this insight. Table 4.5

shows the results of linear regressions in which prior trust is explained by dummy

variables that take the value one if a respondent believes that the ECB will under-

shoot (ecb optimistic) or overshoot (ecb pessimistic) inflation. The reference group

believes that the ECB is exactly right. The results show an asymmetric relationship.

While beliefs that the central bank will undershoot inflation are negatively related

to self-reported trust in the ECB, beliefs about overshooting inflation are not. Thus,

the perception of optimistic forecasts hurts the ECB more.

Effects of Information Treatments

How do the information treatments affect public opinion about the ECB? Table 4.6

shows the results of linear regressions explaining either the average agreement across

six statements (i.e., post trust) or individual responses to each of the six statements


with treatment dummies.15 A positive coefficient reflects an improvement in the

public’s perception of the institution. All regressions control for prior trust and the

initial deviations of perc forecast and exp inflation from 6.3% and 5.3%, respectively.

Table 4.6: Effects of information treatments on the public’s perception of the ECB

                   (1)         (2)          (3)          (4)         (5)       (6)        (7)
Dep. var.:         post trust  credibility  inclusivity  legitimacy  honesty   expertise  interest
G2 (= Inflation)   0.041       0.045        0.028        0.028       0.052     0.020      0.047
                   (0.030)     (0.036)      (0.035)      (0.036)     (0.035)   (0.036)    (0.035)
G3 (= Forecast)    0.081***    0.106***     0.065*       0.048       0.074**   0.049      0.072**
                   (0.030)     (0.037)      (0.035)      (0.036)     (0.035)   (0.037)    (0.035)
G4 (= G2 + G3)     0.035       0.060*       0.046        0.001       0.046     -0.003     0.039
                   (0.030)     (0.036)      (0.035)      (0.036)     (0.035)   (0.036)    (0.035)
constant           -0.010      -0.009       -0.011       -0.001      -0.014    -0.007     -0.012
                   (0.023)     (0.028)      (0.026)      (0.028)     (0.027)   (0.028)    (0.027)
N                  3629        3629         3629         3629        3629      3629       3629
R²                 0.59        0.41         0.44         0.40        0.43      0.41       0.46

Notes: Results of linear regressions in which agreement on six statements is explained by treatment dummies. All regressions control
for prior trust and the initial deviations of perc forecast and exp inflation from 6.3% and 5.3%, respectively. The dependent variables
are standardized to have a mean of zero and a standard deviation of one. Robust standard errors are reported in parentheses. * p < 0.1,
** p < 0.05, *** p < 0.01.

The results show that informing respondents about the ECB’s inflation forecast,

as in the G3 group, has a positive effect on trust in the ECB. The breakdown by

statement shows a similar result. The main effect of the information in G3 is on the
credibility of the ECB’s inflation target (t-stat= 2.87, p-value= 0.004).

Treated respondents are also more likely to agree with the statement about the

ECB’s honesty (t-stat= 2.09, p-value= 0.037), to perceive more personal benefits

from monetary policy (t-stat= 2.04, p-value= 0.042), and to view the ECB’s policy

as more inclusive in the euro area (t-stat= 1.85, p-value= 0.065). In contrast, current

inflation information does not significantly affect public opinion in any category.

Interestingly, adding current inflation information to inflation forecast information,

as in G4, neutralizes the positive effect of the latter.
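The reported test statistics can be approximately reproduced from the rounded entries of Table 4.6. The sketch below is an illustration under stated assumptions: it uses a standard normal approximation for the p-value (reasonable with N = 3629, but not necessarily the distribution used in the original estimation), and the helper name `two_sided_p_value` is hypothetical. Because the table entries are rounded to three decimals, the recomputed t-statistic can differ from the reported one in the second decimal.

```python
import math


def two_sided_p_value(t_stat: float) -> float:
    """Two-sided p-value under a standard normal approximation."""
    # erfc(|t| / sqrt(2)) equals 2 * (1 - Phi(|t|)) for the standard normal CDF Phi.
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


# G3 effect on the credibility statement, column (2) of Table 4.6 (rounded values).
coef, se = 0.106, 0.037
t_from_table = coef / se

# p-value implied by the t-statistic reported in the text.
p_reported_t = two_sided_p_value(2.87)

print(round(t_from_table, 2), round(p_reported_t, 3))
```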

4.4 Conclusion

Using two surveys of German households conducted over two years, this article

shows that public perceptions of the ECB’s forecast errors matter for monetary

15 Fourteen respondents who chose not to answer at least one of the statements are excluded from this
analysis.

policy. The first survey documents that households significantly underestimate the

ECB’s forecast accuracy. The second survey shows that this underestimation reflects

beliefs about overly optimistic forecasts. In both cases, information that challenges

these perceptions has the intended effects on inflation expectations, uncertainty,

consumption plans, and trust in the central bank.

The policy implications of these results are subject to several caveats. First, the first
experiment demonstrates the importance of correcting misperceptions in a salient
way, as one of the treatments did by asking a question before presenting the
belief-challenging information. Second, these experiments were conducted under special

circumstances, when inflation was high and volatile, and may not be replicable under

the opposite conditions of low and stable inflation. Third, information may have
countervailing effects when it is delivered in a bundle, as in one of the treatments
in the second experiment. Information experiments provide an inexpensive way

to study these caveats before implementing communication policies for the general

population.


Appendix D

Appendix

D.1 Experimental Materials

All the material used in the experiment (e.g., questions, treatment texts) can be
found on the Bundesbank’s website, both in German and in English:

https://www.bundesbank.de/en/bundesbank/research/survey-on-consumer-expectations

Waves 33 and 45 are relevant for the treatment texts and questions.

D.2 A Model of Belief Updating with Trust

This section presents a model of Bayesian belief updating that can be used to gen-

erate predictions in the first experiment. In the model, the quality of the so-called

primary signal (i.e., information about the state) is unknown, but can be partially

inferred from a secondary signal (i.e., information about the quality of the primary

signal). I conceptualize the framework of the model based on the expectation for-

mation process of a household, where the central bank’s inflation forecast is the

signal.

D.2.1 Model

Household $i$ has the prior belief that future inflation $\pi$ is normally distributed with
mean $\pi_0(i)$ and variance $\sigma_0^2(i)$. The central bank publishes its inflation forecast
$\pi_f$, an unbiased but noisy signal of future inflation distributed as $N(\pi, \sigma_f^2)$. The
variance $\sigma_f^2$ is referred to as the forecast uncertainty. For convenience, $\sigma_0^2$ and
$\sigma_f^2$ are assumed to be orthogonal.

Using Bayes’ rule, I can express the posterior of the household for inflation with

$$\pi_1(i) = \pi_0(i) + \alpha(i)\,\bigl(\pi_f - \pi_0(i)\bigr), \tag{D.1}$$


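where $\alpha(i)$ is the weight household $i$ gives to the inflation forecast. It equals

$$\alpha(i) = \begin{cases} \dfrac{\sigma_0^2(i)}{\sigma_0^2(i) + \sigma_f^2}, & \text{if } \{\pi_f, \sigma_f^2\} \in \Omega(i) \\[2ex] 0, & \text{otherwise} \end{cases} \tag{D.2}$$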

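where $\Omega(i)$ is the information set of household $i$. The variance of the posterior
(hereafter referred to as the posterior inflation uncertainty) equals

$$\sigma_1^2(i) = \begin{cases} \dfrac{\sigma_0^2(i)\,\sigma_f^2}{\sigma_0^2(i) + \sigma_f^2}, & \text{if } \{\pi_f, \sigma_f^2\} \in \Omega(i) \\[2ex] \sigma_0^2(i), & \text{otherwise.} \end{cases} \tag{D.3}$$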
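A small numerical sketch may help fix ideas for equations (D.1) to (D.3). It is an illustration under stated assumptions rather than part of the original analysis: the function name `bayesian_update`, the flag `forecast_observed` (standing in for whether the forecast and its variance are in the information set), and all numbers are hypothetical.

```python
def bayesian_update(pi_0, var_0, pi_f, var_f, forecast_observed=True):
    """Return (posterior mean, posterior variance, weight alpha) as in (D.1)-(D.3)."""
    if not forecast_observed:
        # Forecast not in the information set: alpha = 0 and the prior is unchanged.
        return pi_0, var_0, 0.0
    alpha = var_0 / (var_0 + var_f)          # (D.2)
    pi_1 = pi_0 + alpha * (pi_f - pi_0)      # (D.1)
    var_1 = var_0 * var_f / (var_0 + var_f)  # (D.3)
    return pi_1, var_1, alpha


# Hypothetical household: prior mean 8% with variance 4; forecast 6% with variance 1.
pi_1, var_1, alpha = bayesian_update(8.0, 4.0, 6.0, 1.0)
print(pi_1, var_1, alpha)
```

With these illustrative numbers the household places a weight of 0.8 on the forecast, so the posterior mean moves most of the way from the prior towards the forecast and the posterior variance falls below both the prior variance and the forecast variance.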

The setup shown so far is a standard Bayesian updating model. From now on, I
assume that the household does not know the forecast uncertainty, i.e., $\sigma_f^2$ is not
in $\Omega(i)$. The household has the prior that the forecast uncertainty is normally distributed with
mean $\sigma_{f,0}^2(i)$. To convey the quality of the inflation forecast $\pi_f$, the central bank
can publish its past forecast performance as a secondary signal. Statistically, this
corresponds to the variance of past forecast errors $\sigma_h^2$, where the subscript $h$ refers
to history. Past forecast errors are unbiased and normally distributed. Using this
setup, one can express the posterior about forecast uncertainty with

$$\sigma_{f,1}^2(i) = \begin{cases} \sigma_{f,0}^2(i) + \omega(i)\,\bigl(\sigma_h^2 - \sigma_{f,0}^2(i)\bigr), & \text{if } \{\pi_f, \sigma_h^2\} \in \Omega(i) \\[1ex] \sigma_{f,0}^2(i), & \text{otherwise} \end{cases} \tag{D.4}$$

where $\omega(i)$ is the weight $i$ assigns to the past performance.

Trust is introduced into the model in a reduced-form manner through the household’s
belief on forecast uncertainty $\sigma_{f,0}^2(i)$ (i.e., perceived performance). This belief
can potentially deviate from the objective performance $\sigma_h^2$, as in

$$\kappa_0(i) \equiv \frac{\sigma_{f,0}^2(i)}{\sigma_h^2} = \frac{g(\tau_0(i))}{\sigma_h^2} \tag{D.5}$$

where $\kappa_0(i)$ is a coefficient that reflects the magnitude and direction of these deviations
prior to the central bank communication. If $\kappa_0(i)$ is greater (smaller) than
1, then $i$ underestimates (overestimates) the central bank’s forecast performance.
Trust in the central bank $\tau_0(i) \in [0, 1]$ enters the framework through its relation to
$\kappa_0(i)$. It is assumed that $g' < 0$, so that trust and underestimation are negatively
related, ceteris paribus.
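The second-stage update in (D.4) and the misperception coefficient in (D.5) can be illustrated with a short sketch. It rests on assumptions that are not spelled out in the text: the household is taken to plug its perceived forecast uncertainty into the weight formula (D.2) in place of the true value, and the function names and numbers are hypothetical.

```python
def update_forecast_uncertainty(var_f0, var_h, omega, history_observed=True):
    """Posterior belief about forecast uncertainty, as in (D.4)."""
    if not history_observed:
        # Past performance not communicated: the belief stays at its prior.
        return var_f0
    return var_f0 + omega * (var_h - var_f0)


def kappa(var_f_perceived, var_h):
    """Ratio of perceived to objective forecast error variance, as in (D.5)."""
    return var_f_perceived / var_h


# Hypothetical values: the household initially perceives forecast errors to be
# four times as variable as they have been historically (kappa_0 = 4).
var_0, var_f0, var_h, omega = 4.0, 4.0, 1.0, 0.5

var_f1 = update_forecast_uncertainty(var_f0, var_h, omega)
kappa_0, kappa_1 = kappa(var_f0, var_h), kappa(var_f1, var_h)

# Assumption: the perceived forecast variance replaces sigma_f^2 in (D.2).
alpha_before = var_0 / (var_0 + var_f0)
alpha_after = var_0 / (var_0 + var_f1)
print(kappa_0, kappa_1, alpha_before, alpha_after)
```

In this example, communicating past performance lowers the misperception coefficient from 4 to 2.5 and raises the weight on the forecast from 0.5 to roughly 0.62.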


D.2.2 Hypotheses

The model makes a number of predictions. The following predictions require that the
public initially underestimates the central bank’s performance ($\kappa_0 > 1$). Let us call
the central bank’s decision to communicate past forecast performance ”correcting
misperceptions”.

Hypothesis 1: Correcting misperceptions increases the weight that households
place on the inflation forecast (i.e., $\alpha(i)$).

Hypothesis 2: Correcting misperceptions reduces inflation uncertainty (i.e.,
$\sigma_1^2(i)$).

These hypotheses can be verified by assuming that information reduces
underestimation ($\kappa_1 < \kappa_0$). Thus, the household notices that the signal is of higher quality.
These simple predictions hold even in the absence of the concept of trust (i.e.,
$\tau_0 \perp \sigma_{f,0}^2$).
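The comparative statics behind the first two hypotheses can be written out explicitly. The following is a sketch under the assumption that the household uses its perceived forecast uncertainty in place of $\sigma_f^2$ in (D.2) and (D.3); the household index $i$ is dropped for readability. Differentiating with respect to the (perceived) forecast uncertainty gives

$$\frac{\partial \alpha}{\partial \sigma_f^2} = -\frac{\sigma_0^2}{\bigl(\sigma_0^2 + \sigma_f^2\bigr)^2} < 0, \qquad \frac{\partial \sigma_1^2}{\partial \sigma_f^2} = \frac{\sigma_0^2\bigl(\sigma_0^2 + \sigma_f^2\bigr) - \sigma_0^2\,\sigma_f^2}{\bigl(\sigma_0^2 + \sigma_f^2\bigr)^2} = \frac{\sigma_0^4}{\bigl(\sigma_0^2 + \sigma_f^2\bigr)^2} > 0.$$

A lower perceived forecast uncertainty, as implied by $\kappa_1 < \kappa_0$, therefore raises the weight on the forecast and lowers the posterior inflation uncertainty.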

Hypothesis 3: Correcting misperceptions increases trust in the central bank
(i.e., $\tau_1 > \tau_0$).

This hypothesis is not based on the model, but finds its justification in the litera-

ture. Information interventions on factual misperceptions typically change people’s

attitudes towards the misperceived topic.

Hypothesis 4: Trust in the central bank mediates the effects of correcting

misperceptions on inflation expectations.

The justification for this hypothesis is as follows. First, the third hypothesis
must be corroborated (i.e., $\tau_1 > \tau_0$). Higher trust leads to a weaker underestimation
by the public due to the assumption of $g' < 0$. This leads to the same effects as in the
first two hypotheses. Thus, there is a dual effect of communication policy to correct
misperceptions when it also influences trust.


D.3 Supporting Material

D.3.1 First Experiment

Table D.1: Demographic profile

Pre-exclusion Post-exclusion Refreshers

Age 57.99 58.16 54.93

(15.10) (15.03) (15.27)

Female 0.410 0.387 0.456

(0.492) (0.487) (0.499)

University graduate 0.499 0.523 0.473

(0.500) (0.500) (0.500)

Personal income <1500 euros 0.291 0.280 0.277

(0.454) (0.449) (0.448)

Single-person household 0.257 0.254 0.260

(0.437) (0.436) (0.439)

Born in the GDR before 1990 0.155 0.153 0.181

(0.362) (0.360) (0.385)

Northern Germany 0.168 0.170 0.150

(0.374) (0.376) (0.357)

Western Germany 0.258 0.258 0.231

(0.438) (0.438) (0.422)

Southern Germany 0.405 0.405 0.410

(0.491) (0.491) (0.492)

Eastern Germany 0.168 0.167 0.210

(0.374) (0.373) (0.407)

N 5527 4863 520

Notes: Descriptives for demographic variables are reported for the full and restricted (post-exclusion)

samples. Standard deviations are reported in parentheses below means.


Table D.2: Ordered logistic regression on the ECB perception

(1) (2)

Dep. var.: perc accuracy

ecb trust -0.109∗∗∗ -0.109∗∗∗

(0.023) (0.031)

Age 0.010∗∗∗ 0.010∗

(0.004) (0.005)

Female 0.434∗∗∗ 0.381∗∗

(0.120) (0.185)

University graduate -0.323∗∗∗ -

(0.118) (.)

Personal income <1500 euros 0.294 0.247

(0.304) (0.394)

Single-person household -0.385 -0.420

(0.321) (0.429)

Born in the GDR before 1990 0.362 0.269

(0.300) (0.406)

Northern Germany -0.057 -0.063

(0.309) (0.415)

Western Germany -0.121 -0.209

(0.285) (0.384)

Southern Germany -0.094 -0.008

(0.275) (0.371)

N 1022 517

Pseudo R² 0.0237 0.0186

Notes: The dependent variable perc inaccuracy quantifies the perceived value of average
forecast errors. Trust in the ECB is measured post-treatment. Column (2) restricts the
sample to university graduates. Robust standard errors are reported in parentheses.
* p < 0.1, ** p < 0.05, *** p < 0.01.


Table D.3: Demographic characteristics of respondents by the level of

misperception

perc accuracy 0-1 pp. 1-2 pp. 2-3 pp. ≥3 pp.

Age 56.17 58.21 59.58 61.66

(15.10) (15.22) (14.89) (13.55)

Female 0.297 0.370 0.433 0.436

(0.458) (0.483) (0.496) (0.497)

University degree 0.626 0.552 0.421 0.475

(0.485) (0.498) (0.494) (0.500)

Personal income <1500 euros 0.297 0.251 0.277 0.304

(0.458) (0.434) (0.448) (0.461)

Single-person household 0.277 0.228 0.247 0.264

(0.449) (0.420) (0.432) (0.442)

Born in the GDR before 1989 0.128 0.111 0.165 0.178

(0.335) (0.314) (0.371) (0.383)

Northern Germany 0.221 0.137 0.159 0.185

(0.416) (0.344) (0.366) (0.389)

Western Germany 0.200 0.309 0.216 0.231

(0.401) (0.463) (0.412) (0.422)

Southern Germany 0.415 0.414 0.439 0.389

(0.494) (0.493) (0.497) (0.488)

Eastern Germany 0.164 0.140 0.186 0.195

(0.371) (0.347) (0.390) (0.397)

N 195 614 328 303

Notes: The table reports the mean demographic characteristic of respondents by response

categories to the ECB perception question. Standard deviations are reported in parentheses

below means.


Table D.4: Robustness regressions for reduced-form treatment effects: With controls and without winsorization

                     (1)         (2)         (3)         (4)
Dep. var.:           revision    revision    inf unc     inf unc
β2                   0.266***    0.217***
                     (0.067)     (0.019)
β3                   0.166**     0.208***
                     (0.083)     (0.022)
β4                   0.292***    0.250***
                     (0.076)     (0.019)
α2                                           -1.170***   -1.081***
                                             (0.369)     (0.141)
α3                                           -1.519***   -1.241***
                                             (0.354)     (0.151)
α4                                           -2.023***   -1.447***
                                             (0.325)     (0.137)
β1                   0.390***    0.344***
                     (0.068)     (0.017)
β0 or α1             2.201***    2.753***    7.378***    9.899***
                     (0.223)     (0.234)     (0.255)     (0.320)
Winsorized           No          Yes         No          Yes
Controls included    No          Yes         No          Yes
N                    4863        4863        4863        4863
R²                   0.40        0.36        0.01        0.11

Notes: The table reports regression estimates from equations (4.1) and (4.2).
Columns (1) and (3) use non-winsorized data, while columns (2) and (4) include
control variables. The following control variables are included: age (discrete),
female (binary), university graduate (binary), personal income below 1500 Euro
(binary), single-person household (binary), born in the GDR before 1989 (binary),
and terms for the regions (binary). Robust standard errors are reported in
parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.


Table D.5: Robustness regressions for the causal mechanisms: Only T4

                             (1)          (2)          (3)
Dep. var.:                   ecb trust    post inf     inf unc
treatment (T)                0.051**      -0.113       -0.390***
                             (0.026)      (0.093)      (0.110)
ecb trust ($\widehat{M}$)                 -0.596***    -0.321***
                                          (0.073)      (0.085)
cjeu trust (Z1)              0.567***
                             (0.016)
media trust (Z2)             0.208***
                             (0.017)
constant                     0.142        4.141***     6.867***
                             (0.087)      (0.316)      (0.375)
N                            2905         2905         2905
adjusted R²                  0.534        0.283        0.169
F-stat                       1631.93      67.938       41.180
J-stat                       –            0.246        0.059
p-value                      –            0.620        0.808

Notes: The table reports the results of 2SLS regressions where the treatment indicator
takes a value of one only for T4. Trust-related variables are standardized. The following
control variables are included: age (discrete), female (binary), university graduate
(binary), personal income below 1500 Euro (binary), single-person household (binary),
born in the GDR before 1989 (binary), terms for the regions (binary), and prior inflation
expectations (continuous). Robust standard errors are reported in parentheses. * p < 0.1,
** p < 0.05, *** p < 0.01.


Table D.6: Robustness regressions for the causal mechanisms: Only cjeu trust as instrument

                             (1)          (2)          (3)
Dep. var.:                   ecb trust    post inf     inf unc
treatment (T)                0.050**      -0.035       -0.274***
                             (0.024)      (0.083)      (0.100)
ecb trust ($\widehat{M}$)                 -0.544***    -0.283***
                                          (0.067)      (0.077)
cjeu trust (Z1)              0.670***
                             (0.011)
constant                     0.168**      4.028***     6.999***
                             (0.079)      (0.275)      (0.336)
N                            3886         3886         3886
adjusted R²                  0.498        0.283        0.165
F-statistic                  3586.79      86.002       50.644

Notes: The table reports the results of 2SLS regressions with a single instrument
(cjeu trust). Trust-related variables are standardized. The following control variables
are included: age (discrete), female (binary), university graduate (binary), personal
income below 1500 Euro (binary), single-person household (binary), born in the GDR
before 1989 (binary), terms for the regions (binary), and prior inflation expectations
(continuous). Robust standard errors are reported in parentheses. * p < 0.1, **
p < 0.05, *** p < 0.01.


D.3.2 Second Experiment

Table D.7: Demographic profile

Pre-exclusion Post-exclusion Refreshers

Age 56.87 56.86 49.94

(15.92) (15.84) (17.04)

Female 0.410 0.387 0.454

(0.492) (0.487) (0.498)

University graduate 0.506 0.518 0.550

(0.500) (0.500) (0.498)

Personal income <1500 euros 0.301 0.290 0.285

(0.459) (0.454) (0.452)

Single-person household 0.266 0.263 0.247

(0.442) (0.440) (0.432)

Born in the GDR before 1990 0.152 0.151 0.155

(0.359) (0.358) (0.363)

Northern Germany 0.169 0.164 0.165

(0.374) (0.370) (0.372)

Western Germany 0.260 0.261 0.235

(0.439) (0.439) (0.424)

Southern Germany 0.406 0.410 0.420

(0.491) (0.492) (0.494)

Eastern Germany 0.166 0.165 0.179

(0.372) (0.371) (0.384)

N 3999 3643 502

Notes: Descriptives for demographic variables are reported for the full and restricted (post-exclusion)

samples. Standard deviations are reported in parentheses below means.


Table D.8: Correlation matrix for different facets of public trust in the ECB

prior trust tCredibility inclusivity legitimacy honesty expertise self interest

prior trust 1.0000

tCredibility 0.6176 1.0000

inclusivity 0.6705 0.6563 1.0000

legitimacy 0.6405 0.5798 0.5998 1.0000

honesty 0.6505 0.6234 0.6975 0.6278 1.0000

expertise 0.6222 0.5699 0.6035 0.6439 0.6017 1.0000

self interest 0.6673 0.6460 0.7099 0.5800 0.6933 0.5793 1.0000

Notes: The table reports correlations only for the control group G1.

Figure D.1: Expected inflation (for Germany) and perceived inflation forecast

[Scatter plot. x-axis: Inflation expectations for Germany (as of January 2023); y-axis: Perceived one-year ahead ECB forecast for 2023. Both axes range from 0 to 30.]

Notes: Circle size indicates frequency of observations. The solid line has a slope of 45 degrees. Five observations
where perc forecast or inflation expectations were greater than 30% are removed from the graph for visual illustration
(N = 744).


Chapter 5

Conclusion

The chapters of this dissertation aim at demonstrating the usefulness of controlled

experiments in addressing issues relevant to macroeconomics. This chapter provides

a brief discussion of the results, highlighting the main implications and limitations.

Chapter 1 provides evidence for the robustness of the experience effects to a

particular real-world complication of heterogeneous shocks. Such a demonstra-

tion would not be clean outside the laboratory, because each economy experiences

a unique sequence of shocks with non-comparable institutions/agents across and

within time. In the laboratory, one can repeat the same decision situation as many

times as necessary, with control over the environment. The main limitation of this

experiment is that we have only one treatment condition, so we cannot generalize

to all shock sequences. We leave this to future work. In terms of expectations, we

find that the Heuristic Switching Model (HSM) fits the aggregate data well. The

best-fitting HSM model suggests that there is no increase in the share of funda-

mentalist/rational expectations. Adaptive learning seems to be responsible for the

acceleration of convergence. Overall, these results support the view that gradual

convergence to equilibrium is a good description of economies with inexperienced

agents, but may fall short when agents have a lot of experience with large shocks.

Chapter 2 finds that prices react asymmetrically to positive and negative cost

shocks, even in the absence of the frictions often claimed to cause the phenomenon.

With experiments, we can solve the identification problem of market power, intro-

duce arbitrarily large shocks into the equilibria, and drop real-world complications

such as capacity constraints or search costs to isolate the root cause. Our results

suggest that imperfect competition in the form of tacit collusion is a sufficient cause

of the phenomenon. However, the usual antidote to tacit collusion, the introduc-

tion of more competitors, has no discernible effect. Future work can use alternative

competition policies as treatments. Also, our oligopolists do not have perfect in-

formation about each other’s prices, which may have facilitated collusion. From

a macroeconomic modeling perspective, these results support the view that asymmetric
price transmission is the rule rather than the exception. Hence, the game

theoretic foundations of DSGE models need to be reconsidered.1

Chapter 3 uses laboratory experiments to identify deep parameters of the utility
function. Notably, we ask whether strategic uncertainty is treated differently

from normal ambiguity, and if so, whether this is due to aversion to this source

or pessimism about other players’ decisions. This identification requires measuring

beliefs and certainty equivalents of comparable lotteries and games, which is un-

likely to occur in the field data. We find that pessimism is significantly higher in the

stag-hunt game than in the market entry game and a comparable ambiguous lottery.

Thus, strategic-uncertainty attitudes may not be stable across games. Applying this

method to other games can inform us about the characteristics of a game that affect

stability. Future studies can also compare our parameters with other methods, such
as the belief-hedges method of Baillon et al. (2018, 2021). The implications of these
attitudes for behavior such as equilibrium selection are also an open avenue.

Chapter 4 employs survey experiments to test the implications of central bank
communication about its inflation forecasts and accuracy on the general population.

There is a shorter path between the results of these experiments and policy because

the subjects are drawn from a representative sample and are unaware of the exper-

iment. The results show that communicating forecast accuracy in a short message

with a neutral tone is largely beneficial for the European Central Bank. However,

there are caveats. The salience of the intervention is found to be critical, as well as

the bundling of information in a text. Moreover, these results come from a period

of high and volatile inflation. Future studies could test alternative formulations of

the communication and look more closely at longer-term effects. Forecast accuracy

information partially affects consumption plans in our experiment, although these

effects are modest. It is worth replicating these findings and extending their scope

to a wide range of decisions that households make based on future inflation.

1 Recently, Wang and Werning (2022) introduce a dynamic oligopoly game into an otherwise
standard New Keynesian DSGE. Also, Rebelo et al. (2024) propose a macro model that generates
such an asymmetry using behavioral foundations (e.g., dual-system theory).

Bibliography

Abbink, K. and J. Brandts (2008): “Pricing in Bertrand competition with

increasing marginal costs,” Games and Economic Behavior, 63, 1–31.

Abdellaoui, M., A. Baillon, L. Placido, and P. P. Wakker (2011): “The

rich domain of uncertainty: Source functions and their experimental implementa-

tion,” American Economic Review, 101, 695–723.

Acharya, A., M. Blackwell, and M. Sen (2016): “Explaining causal findings

without bias: Detecting and assessing direct effects,” American Political Science

Review, 110, 512–529.

Ahrens, S., I. Pirschel, and D. J. Snower (2017): “A theory of price adjust-

ment under loss aversion,” Journal of Economic Behavior and Organization, 134,

78–95.

Amir, R., P. Erickson, and J. Jin (2017): “On the microeconomic foundations

for linear demand for differentiated products,” Journal of Economic Theory, 169,

641–665.

Angeletos, G.-M. and J. La’o (2010): “Noisy business cycles,” NBER Macroe-

conomics Annual, 24, 319–378.

Anufriev, M. and C. Hommes (2012): “Evolutionary selection of individual

expectations and aggregate outcomes in asset pricing experiments,” American

Economic Journal: Microeconomics, 4, 35–64.

Anufriev, M., C. H. Hommes, and R. H. Philipse (2013): “Evolutionary

selection of expectations in positive and negative feedback markets,” Journal of

Evolutionary Economics, 23, 663–688.

Arifovic, J. and J. Duffy (2018): “Heterogeneous agent modeling: Experimen-

tal evidence,” in Handbook of Computational Economics, ed. by C. Hommes and

B. LeBaron, Elsevier, vol. 4, 491–540.


Armantier, O., S. Nelson, G. Topa, W. Van der Klaauw, and B. Zafar

(2016): “The price is right: Updating inflation expectations in a randomized price

information experiment,” Review of Economics and Statistics, 98, 503–523.

Ash, E., H. Mikosch, A. Perakis, and S. Sarferaz (2024): “Seeing and

hearing is believing: The role of audiovisual communication in shaping inflation

expectations,” Working Paper no. 515, KOF Swiss Economic Institute.

Assenza, T., P. Heemeijer, C. H. Hommes, and D. Massaro (2021): “Man-

aging self-organization of expectations through monetary policy: A macro exper-

iment,” Journal of Monetary Economics, 117, 170–186.

Attema, A. E., W. B. Brouwer, and O. L’Haridon (2013): “Prospect theory

in the health domain: A quantitative assessment,” Journal of Health Economics,

32, 1057–1065.

Bacon, R. W. (1991): “Rockets and feathers: the asymmetric speed of adjustment

of UK retail gasoline prices to cost changes,” Energy Economics, 13, 211–218.

Baillon, A., H. Bleichrodt, C. Li, and P. P. Wakker (2021): “Belief

hedges: Measuring ambiguity for all events and all models,” Journal of Economic

Theory, 198, 105353.

Baillon, A., Z. Huang, A. Selim, and P. P. Wakker (2018): “Measuring

ambiguity attitudes for all (natural) events,” Econometrica, 86, 1839–1858.

Baillon, A., N. Liu, and D. van Dolder (2017): “Comparing uncertainty

aversion towards different sources,” Theory and Decision, 83, 1–18.

Ball, L. and N. G. Mankiw (1994): “Asymmetric price adjustment and eco-

nomic fluctuations,” Economic Journal, 104, 247–261.

Bao, T., J. Duffy, and C. Hommes (2013): “Learning, forecasting and opti-

mizing: An experimental study,” European Economic Review, 61, 186–204.

Bao, T., C. Hommes, J. Sonnemans, and J. Tuinstra (2012): “Individual

expectations, limited rationality and aggregate outcomes,” Journal of Economic

Dynamics and Control, 36, 1101–1120.

Baron, R. M. and D. A. Kenny (1986): “The moderator–mediator variable

distinction in social psychological research: Conceptual, strategic, and statistical

considerations,” Journal of Personality and Social Psychology, 51, 1173–1182.


Baron-Cohen, S., S. Wheelwright, J. Hill, Y. Raste, and I. Plumb

(2001): “The “Reading the Mind in the Eyes” Test revised version: a study with

normal adults, and adults with Asperger syndrome or high-functioning autism,”

Journal of Child Psychology and Psychiatry and Allied Disciplines, 42, 241–251.

Bayer, R.-C. and C. Ke (2018): “What causes rockets and feathers? An ex-

perimental investigation,” Journal of Economic Behavior and Organization, 153,

223–237.

Beard, T. R. and R. O. Beil (1994): “Do people rely on the self-interested

maximization of others? An experimental test,” Management Science, 40, 252–

262.

Becker, G. M., M. H. DeGroot, and J. Marschak (1964): “Measuring

utility by a single-response sequential method,” Behavioral Science, 9, 226–232.

Benabou, R. and R. Gertner (1993): “Search with learning from prices: does

increased inflationary uncertainty lead to higher markups?” Review of Economic

Studies, 60, 69–93.

Bholat, D., N. Broughton, J. Ter Meer, and E. Walczak (2019): “En-

hancing central bank communications using simple and relatable information,”

Journal of Monetary Economics, 108, 1–15.

Binder, C. and A. Rodrigue (2018): “Household informedness and long-run

inflation expectations: Experimental evidence,” Southern Economic Journal, 85,

580–598.

Binder, C. C. (2017): “Measuring uncertainty based on rounding: New method

and application to inflation expectations,” Journal of Monetary Economics, 90,

1–12.

Binder, C. C. and R. Sekkel (2023): “Central bank forecasting: A survey,”

Journal of Economic Surveys, 1–23.

Bohnet, I. and R. Zeckhauser (2004): “Trust, risk and betrayal,” Journal of

Economic Behavior and Organization, 55, 467–484.

Borenstein, S., A. C. Cameron, and R. Gilbert (1997): “Do gasoline prices

respond asymmetrically to crude oil price changes?” Quarterly Journal of Eco-

nomics, 112, 305–339.

Borenstein, S. and A. Shepard (1996): “Sticky Prices, Inventories, and Market

Power in Wholesale Gasoline Markets,” Working Paper 5468, National Bureau of

Economic Research.

Bos, I. and D. Vermeulen (2022): “On the microfoundation of linear oligopoly

demand,” The BE Journal of Theoretical Economics, 22, 1–15.

Brandenburger, A. (1996): “Strategic and structural uncertainty in games,”

in Wise Choices: Decisions, Games, and Negotiations, ed. by R. Zeckhauser,

R. Keeney, and J. Sebenius, Harvard Business School Press, 221–232.

Brandts, J., D. J. Cooper, and R. A. Weber (2015): “Legitimacy, com-

munication and leadership in the turnaround game,” Management Science, 61,

2627–2645.

Bray, M. (1982): “Learning, estimation, and the stability of rational expectations,”

Journal of Economic Theory, 26, 318–339.

——— (1983): “Convergence to rational expectations equilibrium,” in Individual

Forecasting and Aggregate Outcomes, ed. by R. Frydman and E. S. Phelps, Cam-

bridge: Cambridge University Press, 123–137.

Brock, W. A. and C. H. Hommes (1997): “A rational route to randomness,”

Econometrica, 65, 1059–1095.

Brouwer, N. and J. de Haan (2022): “The impact of providing information

about the ECB’s instruments on inflation expectations and trust in the ECB:

Experimental evidence,” Journal of Macroeconomics, 73, 103430.

Brown, S. P. and M. K. Yucel (2000): “Gasoline and crude oil prices: why the

asymmetry?” Economic & Financial Review, 23–29.

Bullock, J. G., D. P. Green, and S. E. Ha (2010): “Yes, but what’s the

mechanism? (don’t expect an easy answer),” Journal of Personality and Social

Psychology, 98, 550–558.

Bursian, D. and E. Faia (2018): “Trust in the monetary authority,” Journal of

Monetary Economics, 98, 66–79.

Byrne, D. P. and N. De Roos (2019): “Learning to coordinate: A study in

retail gasoline,” American Economic Review, 109, 591–619.

Calford, E. (2020): “Uncertainty Aversion in Game Theory: Experimental Evi-

dence,” Journal of Economic Behavior and Organization, 176, 720–734.

Camerer, C. F., T.-H. Ho, and J.-K. Chong (2004): “A cognitive hierarchy

model of games,” Quarterly Journal of Economics, 119, 861–898.

Camerer, C. F. and R. Karjalainen (1994): “Ambiguity aversion and non-

additive beliefs in non-cooperative games: Experimental evidence,” in Models

and Experiments in Risk and Rationality, ed. by B. Munier and M. J. Machina,

Springer.

Camerer, C. F. and M. Weber (1992): “Recent developments in modeling

preferences: Uncertainty and ambiguity,” Journal of Risk and Uncertainty, 5,

325–370.

Cason, T. N. and D. Friedman (2002): “A laboratory study of customer mar-

kets,” BE Journal of Economic Analysis & Policy, 2, 1–43.

Cavallo, A., G. Cruces, and R. Perez-Truglia (2017): “Inflation expec-

tations, learning, and supermarket prices: Evidence from survey experiments,”

American Economic Journal: Macroeconomics, 9, 1–35.

Chark, R. and S.-H. Chew (2015): “A neuroimaging study of preference for

strategic uncertainty,” Journal of Risk and Uncertainty, 50, 209–227.

Charness, G., T. Garcia, T. Offerman, and M. C. Villeval (2020): “Do

measures of risk attitude in the laboratory predict behavior under risk in and

outside of the laboratory?” Journal of Risk and Uncertainty, 60, 99–123.

Chateauneuf, A., J. Eichberger, and S. Grant (2007): “Choice under un-

certainty with the best and worst in mind: Neo-additive capacities,” Journal of

Economic Theory, 137, 538–567.

Chen, Y. and M. Riordan (2007): “Price and variety in the spokes model,”

Economic Journal, 117, 897–921.

Chierchia, G., R. Nagel, and G. Coricelli (2018): “Betting “on nature” or

“betting on others”: Anti-coordination induces uniquely high levels of entropy,”

Scientific Reports, 8, 1–11.

Christelis, D., D. Georgarakos, T. Jappelli, and M. van Rooij (2020):

“Trust in the central bank and inflation expectations,” International Journal of

Central Banking, 16, 1–37.

Coibion, O., D. Georgarakos, Y. Gorodnichenko, and M. Van Rooij

(2023): “How does consumption respond to news about inflation? Field evidence

from a randomized control trial,” American Economic Journal: Macroeconomics,

15, 109–152.

Coibion, O., Y. Gorodnichenko, and M. Weber (2022): “Monetary policy

communications and their effects on household inflation expectations,” Journal of

Political Economy, 130, 1537–1584.

Cooper, K., H. Schneider, and M. Waldman (2021): “Limited rationality

and the strategic environment: Further evidence from a pricing game,” Journal

of Behavioral and Experimental Economics, 90, 101632.

Cooper, K. B., H. S. Schneider, and M. Waldman (2017): “Limited ratio-

nality and the strategic environment: Further theory and experimental evidence,”

Games and Economic Behavior, 106, 188–208.

Cooper, R. and A. John (1988): “Coordinating coordination failures in Keyne-

sian models,” Quarterly Journal of Economics, 103, 441–463.

Cornand, C. and F. Heinemann (2019a): “Experiments in macroeconomics:

methods and applications,” in Handbook of Research Methods and Applications in

Experimental Economics, Edward Elgar Publishing, 269–294.

——— (2019b): “Monetary policy obeying the Taylor principle turns prices into

strategic substitutes,” Journal of Economic Behavior and Organization, forth-

coming.

Cornea-Madeira, A., C. Hommes, and D. Massaro (2019): “Behavioral het-

erogeneity in US inflation dynamics,” Journal of Business & Economic Statistics,

37, 288–300.

Costa-Gomes, M. A. and G. Weizsäcker (2008): “Stated beliefs and play in

normal-form games,” Review of Economic Studies, 75, 729–762.

D’Acunto, F., A. Fuster, and M. Weber (2021): “Diverse policy committees

can reach underrepresented groups,” Working Paper no. 29275, National Bureau

of Economic Research.

Danz, D., L. Vesterlund, and A. J. Wilson (2020): “Belief Elicitation: Limit-

ing Truth Telling with Information on Incentives,” Working Paper 27327, National

Bureau of Economic Research.

Davidson, R. and J. G. MacKinnon (2000): “Bootstrap tests: How many

bootstraps?” Econometric Reviews, 19, 55–68.

Davis, D. (2009): “Pure numbers effects, market power, and tacit collusion in

posted offer markets,” Journal of Economic Behavior and Organization, 72, 475–

488.

Davis, D. and O. Korenok (2011): “Nominal shocks in monopolistically com-

petitive markets: An experiment,” Journal of Monetary Economics, 58, 578–589.

Deaton, A. and J. Muellbauer (1980): “An Almost Ideal Demand System,”

American Economic Review, 70, 312–326.

Deck, C. A. and B. J. Wilson (2008): “Experimental gasoline markets,” Journal

of Economic Behavior and Organization, 67, 134–149.

Dräger, L., M. Lamla, and D. Pfajfar (2022): “How to limit the spillover

from the 2021 inflation surge to inflation expectations?” Working Paper Series in

Economics 407, Lüneburg.

Dräger, L. and G. Nghiem (2023): “Inflation literacy, inflation expectations,

and trust in the central bank: A Survey experiment,” Working Paper no. 10539,

Center for Economic Studies & Ifo Institute.

Duersch, P. and T. A. Eife (2019): “Price competition in an inflationary envi-

ronment,” Journal of Monetary Economics, 104, 48–66.

Duffy, J. (2017): “Macroeconomics: A Survey of Laboratory Research,” in The

Handbook of Experimental Economics, Volume 2, ed. by J. H. Kagel and A. E.

Roth, Princeton: Princeton University Press, 1–90.

Dufwenberg, M. and U. Gneezy (2000): “Price competition and market con-

centration: an experimental study,” International Journal of Industrial Organi-

zation, 18, 7–22.

Dufwenberg, M., T. Lindqvist, and E. Moore (2005): “Bubbles and expe-

rience: An experiment,” American Economic Review, 95, 1731–1737.

Ehrmann, M. and M. Fratzscher (2011): “Politics and monetary policy,”

Review of Economics and Statistics, 93, 941–960.

Ehrmann, M., D. Georgarakos, and G. Kenny (2023): “Credibility gains

from communicating with the public: Evidence from the ECB’s new monetary

policy strategy,” Working Paper no. 2785, European Central Bank.

El-Shagi, M., S. Giesen, and A. Jung (2016): “Revisiting the relative fore-

cast performances of Fed staff and private forecasters: A dynamic approach,”

International Journal of Forecasting, 32, 313–323.

Evans, G. W. and S. Honkapohja (2001): Learning and expectations in macroe-

conomics, Princeton University Press.

Farjam, M. (2019): “On whom would I want to depend; humans or computers?”

Journal of Economic Psychology, 72, 219–228.

Fehr, E. and J.-R. Tyran (2001): “Does money illusion matter?” American

Economic Review, 91, 1239–1262.

——— (2005): “Individual irrationality and aggregate outcomes,” Journal of Eco-

nomic Perspectives, 19, 43–66.

——— (2008): “Limited rationality and strategic interaction: the impact of the

strategic environment on nominal inertia,” Econometrica, 76, 353–394.

Fiala, L. and S. Suetens (2017): “Transparency and cooperation in repeated

dilemma games: a meta study,” Experimental Economics, 20, 755–771.

Fischbacher, U. (2007): “z-Tree: Zurich toolbox for ready-made economic exper-

iments,” Experimental Economics, 10, 171–178.

Fischhoff, B., P. Slovic, and S. Lichtenstein (1977): “Knowing with cer-

tainty: The appropriateness of extreme confidence.” Journal of Experimental Psy-

chology: Human Perception and Performance, 3, 552.

Fonseca, M. A. and H.-T. Normann (2012): “Explicit vs. tacit collusion—The

impact of communication in oligopoly experiments,” European Economic Review,

56, 1759–1772.

Frederick, S. (2005): “Cognitive reflection and decision making,” Journal of

Economic Perspectives, 19, 25–42.

Freeman, K. (1948): Ancilla to the pre-Socratic philosophers: A complete transla-

tion of the fragments in Diels, Fragmente der Vorsokratiker, Harvard University

Press.

Frey, G. and M. Manera (2007): “Econometric models of asymmetric price

transmission,” Journal of Economic Surveys, 21, 349–415.

Gavin, W. T. and R. J. Mandal (2003): “Evaluating FOMC forecasts,” Inter-

national Journal of Forecasting, 19, 655–667.

Gilboa, I. and D. Schmeidler (1989): “Maxmin expected utility with a non-

unique prior,” Journal of Mathematical Economics, 18, 141–153.

——— (1995): “Case-based decision theory,” Quarterly Journal of Economics, 110,

605–639.

Gode, D. K. and S. Sunder (1993): “Allocative efficiency of markets with zero-

intelligence traders: Market as a partial substitute for individual rationality,”

Journal of Political Economy, 101, 119–137.

Green, E. J. and R. H. Porter (1984): “Noncooperative collusion under im-

perfect price information,” Econometrica, 52, 87–100.

Greiner, B. (2015): “Subject pool recruitment procedures: organizing experi-

ments with ORSEE,” Journal of the Economic Science Association, 1, 114–125.

——— (2016): “Strategic uncertainty aversion in bargaining: Experimental evi-

dence,” Mimeo.

Haaland, I. and O.-A. E. Naess (2023): “Misperceived Returns to Active In-

vesting,” Working Paper no. 10257, Center for Economic Studies & Ifo Institute.

Haldane, A., A. Macaulay, and M. McMahon (2020): “The 3 E’s of central

bank communication with the public,” Staff Working Paper no. 847, Bank of

England.

Haltiwanger, J. and M. Waldman (1985): “Rational expectations and the

limits of rationality: An analysis of heterogeneity,” American Economic Review,

75, 326–340.

——— (1989): “Limited rationality and strategic complements: the implications for

macroeconomics,” Quarterly Journal of Economics, 104, 463–483.

Hanaki, N., E. Akiyama, and R. Ishikawa (2018): “Effects of different ways of

incentivizing price forecasts on market dynamics and individual decisions in asset

market experiments,” Journal of Economic Dynamics and Control, 88, 51–69.

Hanaki, N., Y. Koriyama, A. Sutan, and M. Willinger (2019): “The strate-

gic environment effect in beauty contest games,” Games and Economic Behavior,

113, 587–610.

Hanaki, N. and A. Masiliūnas (2021): “Market concentration and incentives

to collude in Cournot oligopoly experiments,” Discussion paper 1131, ISER.

Haruvy, E., Y. Lahav, and C. N. Noussair (2007): “Traders’ expectations

in asset markets: experimental evidence,” American Economic Review, 97, 1901–

1920.

Heemeijer, P., C. Hommes, J. Sonnemans, and J. Tuinstra (2009): “Price

stability and volatility in markets with positive and negative expectations feed-

back: An experimental investigation,” Journal of Economic Dynamics and Con-

trol, 33, 1052–1072.

Heinemann, F. (2012): “Understanding financial crises: The contribution of ex-

perimental economics,” Annales d’Economie et de Statistique, 107-108, 7–29.

Heinemann, F., R. Nagel, and P. Ockenfels (2009): “Measuring strategic

uncertainty in coordination games,” Review of Economic Studies, 76, 181–221.

Hey, J. D., A. Morone, and U. Schmidt (2009): “Noise and bias in eliciting

preferences,” Journal of Risk and Uncertainty, 39, 213–235.

Hoffmann, M., E. Moench, L. Pavlova, and G. Schultefrankenfeld

(2022): “Would households understand average inflation targeting?” Journal of

Monetary Economics, 129, S52–S66.

Hommes, C. (2011): “The heterogeneous expectations hypothesis: Some evidence

from the lab,” Journal of Economic Dynamics and Control, 35, 1–24.

——— (2021): “Behavioral and experimental macroeconomics and policy analysis:

A complex systems approach,” Journal of Economic Literature, 59, 149–219.

Hommes, C. and J. Lustenhouwer (2019): “Inflation targeting and liquidity

traps under endogenous credibility,” Journal of Monetary Economics, 107, 48–62.

Hommes, C., J. Sonnemans, J. Tuinstra, and H. Van de Velden (2005):

“Coordination of expectations in asset pricing experiments,” Review of Financial

Studies, 18, 955–980.

Hommes, C. H. (2006): “Heterogeneous agent models in economics and finance,”

in Handbook of Computational Economics, ed. by L. Tesfatsion and K. Judd,

Elsevier, vol. 2, 1109–1186.

——— (2018): “Behavioral & experimental macroeconomics and policy analysis: a

complex systems approach,” Working Paper No. 2201, European Central Bank.

Horstmann, N., J. Krämer, and D. Schnurr (2018): “Number effects and

tacit collusion in experimental oligopolies,” Journal of Industrial Economics, 66,

650–700.

Hossain, T. and R. Okui (2013): “The binarized scoring rule,” Review of Eco-

nomic Studies, 80, 984–1001.

Huck, S., H.-T. Normann, and J. Oechssler (2004): “Two are few and

four are many: number effects in experimental oligopolies,” Journal of Economic

Behavior and Organization, 53, 435–446.

Hussam, R. N., D. Porter, and V. L. Smith (2008): “Thar she blows: Can

bubbles be rekindled with experienced subjects?” American Economic Review,

98, 924–37.

Hyndman, K., E. Y. Ozbay, A. Schotter, and W. Z. Ehrblatt (2012):

“Convergence: an experimental study of teaching and learning in repeated

games,” Journal of the European Economic Association, 10, 573–604.

Imai, K., L. Keele, D. Tingley, and T. Yamamoto (2011): “Unpacking the

black box of causality: Learning about causal mechanisms from experimental and

observational studies,” American Political Science Review, 105, 765–789.

Isaac, R. M. and J. M. Walker (1988): “Group size effects in public goods

provision: The voluntary contributions mechanism,” The Quarterly Journal of

Economics, 103, 179–199.

Ivanov, A. (2011): “Attitudes to ambiguity in one-shot normal-form games: An

experimental study,” Games and Economic Behavior, 71, 366–394.

Jäger, S., C. Roth, N. Roussille, and B. Schoefer (2022): “Worker beliefs

about outside options,” Working paper no. 29623, National Bureau of Economic

Research.

Kahneman, D. and A. Tversky (1972): “Subjective probability: A judgment

of representativeness,” Cognitive Psychology, 3, 430–454.

Kelsey, D. and S. le Roux (2015): “An experimental study on the effect of

ambiguity in a coordination game,” Theory and Decision, 79, 667–688.

Kocher, M. G. and M. Sutter (2006): “Time is money—Time pressure, in-

centives, and the quality of decision-making,” Journal of Economic Behavior and

Organization, 61, 375–392.

Kopányi-Peuker, A. and M. Weber (2021): “Experience does not eliminate

bubbles: Experimental evidence,” The Review of Financial Studies, 34, 4450–

4485.

Korenok, O., D. Munro, and J. Chen (2023): “Inflation and attention thresh-

olds,” Review of Economics and Statistics, 1–28.

Kostyshyna, O. and L. Petersen (2024): “Communicating inflation uncer-

tainty and household expectations,” Working Paper no. 2023–63, Bank of Canada.

Kril, Z., D. Leiser, and A. Spivak (2016): “What determines the credibility

of the central bank of Israel in the public eye,” International Journal of Central

Banking, 12, 67–93.

Lerner, A. P. (1934): “The concept of monopoly and the measurement of

monopoly power,” Review of Economic Studies, 1, 157–175.

Li, C., U. Turmunkh, and P. P. Wakker (2020): “Social and strategic ambi-

guity versus betrayal aversion,” Games and Economic Behavior, 123, 272–287.

Loy, J.-P., C. R. Weiss, and T. Glauben (2016): “Asymmetric cost pass-

through? Empirical evidence on the role of market power, search and menu costs,”

Journal of Economic Behavior and Organization, 123, 184–192.

Marquardt, P., C. N. Noussair, and M. Weber (2019): “Rational expecta-

tions in an experimental asset market with shocks to market trends,” European

Economic Review, 114, 116–140.

Maskin, E. and J. Tirole (1988): “A theory of dynamic oligopoly, II: Price

competition, kinked demand curves, and Edgeworth cycles,” Econometrica, 56,

571–599.

Mauersberger, F. and R. Nagel (2018): “Levels of reasoning in Keynesian

Beauty Contests: a generative framework,” in Handbook of computational eco-

nomics, Elsevier, vol. 4, 541–634.

McMahon, M. and R. Rholes (2024): “Building Central Bank Credibility: The

Role of Forecast Performance.”

Mellina, S. and T. Schmidt (2018): “The role of central bank knowledge

and trust for the public’s inflation expectations,” Discussion Paper no. 32/2018,

Deutsche Bundesbank.

Méon, P.-G. and B. Hayo (2023): “Preaching to the agnostic: Inflation reporting

can increase trust in the central bank but only among people with weak priors,”

Working Paper no. 10636, Center for Economic Studies & Ifo Institute.

Meyer, J. and S. von Cramon-Taubadel (2004): “Asymmetric price trans-

mission: a survey,” Journal of Agricultural Economics, 55, 581–611.

Milgrom, P. and J. Roberts (1990): “Rationalizability, learning, and equilib-

rium in games with strategic complementarities,” Econometrica, 58, 1255–1277.

Morgan, J., H. Orzen, and M. Sefton (2006): “An experimental study of

price dispersion,” Games and Economic Behavior, 54, 134–158.

Morris, S. and H. S. Shin (2003): “Global games: Theory and applications,”

in Advances in Economics and Econometrics, ed. by M. Dewatripont, L. Hansen,

and S. Turnovsky, Cambridge University Press, vol. 1, 56–114.

Muth, J. F. (1961): “Rational expectations and the theory of price movements,”

Econometrica, 29, 315–335.

Nagel, R. (1995): “Unraveling in guessing games: An experimental study,” Amer-

ican Economic Review, 85, 1313–1326.

Nagel, R., A. Brovelli, F. Heinemann, and G. Coricelli (2018): “Neu-

ral mechanisms mediating degrees of strategic uncertainty,” Social Cognitive and

Affective Neuroscience, 13, 52–62.

Offerman, T., J. Sonnemans, G. Van De Kuilen, and P. P. Wakker

(2009): “A truth serum for non-bayesians: Correcting proper scoring rules for

risk attitudes,” Review of Economic Studies, 76, 1461–1489.

Orzen, H. (2008): “Counterintuitive number effects in experimental oligopolies,”

Experimental Economics, 11, 390–401.

Peltzman, S. (2000): “Prices rise faster than they fall,” Journal of Political Econ-

omy, 108, 466–502.

Perdiguero-García, J. (2013): “Symmetric or asymmetric oil prices? A

meta-analysis approach,” Energy Policy, 57, 389–397.

Petersen, L. and A. Winn (2014): “Does money illusion matter? Comment,”

American Economic Review, 104, 1047–62.

Pfajfar, D. and B. Žakelj (2014): “Experimental evidence on inflation

expectation formation,” Journal of Economic Dynamics and Control, 44, 147–168.

Pfäuti, O. (2023): “The inflation attention threshold and inflation surges,” arXiv

preprint arXiv:2308.09480.

Plonsky, O., K. Teodorescu, and I. Erev (2015): “Reliance on small samples,

the wavy recency effect, and similarity-based learning.” Psychological Review, 122,

621–647.

Potters, J. and S. Suetens (2009): “Cooperation in experimental games of

strategic complements and substitutes,” Review of Economic Studies, 76, 1125–

1147.

Prevost, M., M.-E. Carrier, G. Chowne, P. Zelkowitz, L. Joseph, and

I. Gold (2014): “The Reading the Mind in the Eyes test: Validation of a French

version and exploration of cultural variations in a multi-ethnic city,” Cognitive

Neuropsychiatry, 19, 189–204.

Reagan, P. B. and M. L. Weitzman (1982): “Asymmetries in price and quan-

tity adjustments by the competitive firm,” Journal of Economic Theory, 27, 410–

420.

Rebelo, S., M. Santana, and P. Teles (2024): “Behavioral Sticky Prices,”

Working Paper 32214, National Bureau of Economic Research.

Roth, C., S. Settele, and J. Wohlfart (2022): “Beliefs about public debt and

the demand for government spending,” Journal of Econometrics, 231, 165–187.

Roth, C. and J. Wohlfart (2020): “How do expectations about the macroe-

conomy affect personal expectations and behavior?” Review of Economics and

Statistics, 102, 731–748.

Salop, S. C. (1979): “Monopolistic competition with outside goods,” Bell Journal

of Economics, 10, 141–156.

Schotter, A. and I. Trevino (2014): “Belief elicitation in the laboratory,”

Annual Review of Economics, 6, 103–128.

Settele, S. (2022): “How do beliefs about the gender wage gap affect the demand

for public policy?” American Economic Journal: Economic Policy, 14, 475–508.

Shestakova, N., O. Powell, and D. Gladyrev (2019): “Bubbles, experience

and success,” Journal of Behavioral and Experimental Finance, 22, 206–213.

Smith, V. L. (1962): “An experimental study of competitive market behavior,”

Journal of Political Economy, 70, 111–137.

Smith, V. L., G. L. Suchanek, and A. W. Williams (1988): “Bubbles,

crashes, and endogenous expectations in experimental spot asset markets,” Econo-

metrica, 56, 1119–1151.

Sonnemans, J. and J. Tuinstra (2010): “Positive expectations feedback exper-

iments and number guessing games as models of financial markets,” Journal of

Economic Psychology, 31, 964–984.

Stahl, D. O. (1996): “Boundedly rational rule learning in a guessing game,”

Games and Economic Behavior, 16, 303–330.

Sutan, A. and M. Willinger (2009): “Guessing with negative feedback: An

experiment,” Journal of Economic Dynamics and Control, 33, 1123–1133.

Tversky, A. and D. Kahneman (1973): “Availability: A heuristic for judging

frequency and probability,” Cognitive Psychology, 5, 207–232.

Wang, O. and I. Werning (2022): “Dynamic oligopoly and price stickiness,”

American Economic Review, 112, 2815–2849.

Weber, M., B. Candia, T. Ropele, R. Lluberas, S. Frache, B. H. Meyer,

S. Kumar, Y. Gorodnichenko, D. Georgarakos, O. Coibion, et al.

(2023): “Tell me something I don’t already know: Learning in low and high-

inflation settings,” Working Paper no. 31485, National Bureau of Economic Re-

search.

World Health Organization (2018): “WHO methods and data sources for

country-level causes of death 2000–2016,” Tech. rep., Global Health Estimates

Technical Paper WHO/HMM/IER/GHE/2018.1, WHO, Geneva.

Yang, H. and L. Ye (2008): “Search with learning: understanding asymmetric

price adjustments,” RAND Journal of Economics, 39, 547–564.

Zhelobodko, E., S. Kokovin, M. Parenti, and J.-F. Thisse (2012): “Mo-

nopolistic competition: beyond the constant elasticity of substitution,” Econo-

metrica, 80, 2765–2784.

Publications

The following publications have been used as chapters of this dissertation:

1. Accepted manuscript: Bulutay, M., C. Cornand, and A. Zylbersztejn (2022).

“Learning to deal with repeated shocks under strategic complementarity: An

experiment,” Journal of Economic Behavior and Organization, 200, 1318–1343.

https://doi.org/10.1016/j.jebo.2020.05.023

2. Accepted manuscript: Bulutay, M., D. Hales, P. Julius, and W. Tasch (2021).

“Imperfect tacit collusion and asymmetric price transmission,” Journal of Eco-

nomic Behavior and Organization, 192, 584–599.

https://doi.org/10.1016/j.jebo.2021.10.018

3. Accepted manuscript: Bruttel, L., M. Bulutay, C. Cornand, F. Heinemann,

and A. Zylbersztejn (2023). “Measuring strategic-uncertainty attitudes,” Ex-

perimental Economics, 26, 522–549.

https://doi.org/10.1007/s10683-022-09779-2

4. Preprint: Bulutay, M. (2024). “Better than Perceived? Correcting Mispercep-

tions about Central Bank Inflation Forecasts,” Working Paper no. 34, Berlin

School of Economics.

https://doi.org/10.48462/opus4-5355