A decision maker (DM) makes choices from different sets of alternatives. The DM is initially ignorant of the payoff associated with each alternative, and learns these payoffs only after a large number of choices have been made. We show that, in the presence of an outside option, once payoffs are learned, the optimal choice rule from sets of alternatives can be rationalized by a DM with strict preferences over all alternatives. Under this model, the DM has preferences for preferences while being ignorant of what preferences are “right”.

Discussion Paper

Publication year: 2018

We consider an agent who acquires information on a state of nature from an information structure before facing a decision problem. How much information is worth depends jointly on the decision problem and on the information structure. We represent the decision problem by the set of possible payoffs indexed by states of nature. We establish and exploit the duality between this set on one hand and the value of information function, which maps beliefs to expected payoffs under optimal actions at these beliefs, on the other. We then derive global estimates of the value of information of any information structure from local properties of the value function and of the set of optimal actions taken at the prior belief only.
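As a rough illustration of the value function described above, the following minimal sketch computes v(p), the expected payoff under an optimal action at belief p, and the value of an information structure as the expected gain from acting on the posterior rather than the prior. The function names, the finite state and signal spaces, and the two-state example are all assumptions made for illustration; they are not taken from the paper.

```python
def value(belief, payoffs):
    """v(p): best expected payoff at belief p.

    payoffs[a][k] is the payoff of action a in state k (illustrative layout).
    """
    return max(sum(p * u for p, u in zip(belief, row)) for row in payoffs)


def value_of_information(prior, signal_probs, payoffs):
    """Expected value after observing a signal, minus the value at the prior.

    signal_probs[k][s] is the probability of signal s in state k.
    """
    n_signals = len(signal_probs[0])
    expected_value = 0.0
    for s in range(n_signals):
        # Joint probability of (state k, signal s) and the signal's marginal.
        joint = [prior[k] * signal_probs[k][s] for k in range(len(prior))]
        p_s = sum(joint)
        if p_s > 0:
            posterior = [j / p_s for j in joint]
            expected_value += p_s * value(posterior, payoffs)
    return expected_value - value(prior, payoffs)


# Hypothetical example: two states, the agent is paid 1 for matching the state.
payoffs = [[1.0, 0.0], [0.0, 1.0]]
prior = [0.5, 0.5]
noisy_signal = [[0.8, 0.2], [0.2, 0.8]]
voi = value_of_information(prior, noisy_signal, payoffs)
```

In this hypothetical example the agent earns 0.5 in expectation at the prior and 0.8 after observing the signal, so the value of the information structure is 0.3.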

Consider agents who are heterogeneous in their preferences and wealth levels. These agents may acquire information prior to choosing an investment that has a property of no-arbitrage, and each piece of information bears a corresponding cost. We associate a numeric index to each information purchase (information-cost pair). This index describes the normalized value of the information purchase: it is the risk-aversion level of the unique CARA agent who is indifferent between accepting and rejecting the purchase, and it is characterized by a “duality” principle that states that agents with a stronger preference for information should engage more often in information purchases. No agent more risk-averse than the index finds it profitable to acquire the information, whereas all agents less risk-averse than the index do. Given an empirically measured range of degrees of risk aversion in a competitive economy with no-arbitrage investments, our model therefore comes close to describing an inverse demand for information, by predicting what pieces of information are acquired by agents and which ones are not. Among several desirable properties, the normalized value formula induces a complete ranking of information structures that extends Blackwell’s classic ordering.

*m*. We characterise statistical properties satisfied by random plays generated by a correlated pair of automata with *m* states each. We show that in some respect the pair of automata can be identified with a more complex automaton of size comparable to *m* log *m*. We investigate implications of these results on the correlated min–max value of repeated games played by automata.

We introduce tests for finite-sample linear regressions with heteroskedastic errors. The tests are exact, i.e., they have guaranteed type I error probabilities when bounds are known on the range of the dependent variable, without any assumptions about the noise structure. We provide upper bounds on probability of type II errors, and apply the tests to empirical data.

Consider any investor who fears ruin when facing any set of investments that satisfy no-arbitrage. Before investing, he can purchase information about the state of nature in the form of an information structure. Given his prior, information structure *α* investment dominates information structure *β* if, whenever he is willing to buy *β* at some price, he is also willing to buy *α* at that price. We show that this informativeness ordering is complete and is represented by the decrease in entropy of his beliefs, regardless of his preferences, initial wealth, or investment problem. We also show that no prior-independent informativeness ordering based on similar premises exists.
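The entropy-based index mentioned above can be sketched numerically as the prior entropy minus the expected posterior entropy. This is a minimal sketch under assumptions: finite states and signals, natural logarithms, and hypothetical function names and example structures. The paper's exact definition and normalization may differ.

```python
from math import log


def entropy(dist):
    """Shannon entropy in nats (natural log is an assumption here)."""
    return -sum(p * log(p) for p in dist if p > 0)


def entropy_reduction(prior, signal_probs):
    """H(prior) minus the expected entropy of the posterior.

    signal_probs[k][s] is the probability of signal s in state k.
    """
    n_signals = len(signal_probs[0])
    expected_posterior_entropy = 0.0
    for s in range(n_signals):
        joint = [prior[k] * signal_probs[k][s] for k in range(len(prior))]
        p_s = sum(joint)
        if p_s > 0:
            posterior = [j / p_s for j in joint]
            expected_posterior_entropy += p_s * entropy(posterior)
    return entropy(prior) - expected_posterior_entropy


# Hypothetical information structures over two states.
prior = [0.5, 0.5]
alpha = [[0.9, 0.1], [0.1, 0.9]]  # more accurate signal
beta = [[0.6, 0.4], [0.4, 0.6]]   # less accurate signal

reduction_alpha = entropy_reduction(prior, alpha)
reduction_beta = entropy_reduction(prior, beta)
```

With this prior, the more accurate structure `alpha` yields the larger entropy reduction, so under the ordering described above it would rank above `beta`.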

We show that if an agent reasons according to standard inference rules, the axioms of truth and introspection extend from the set of non-epistemic propositions to the whole set of propositions. This implies that the usual axiomatization of the partitional possibility correspondence, which describes an agent who processes information rationally, is redundant.

We study the impact of unobservable stochastic replacements for the long-run player in the classical reputation model with a long-run player and a series of short-run players. We provide explicit lower bounds on the Nash equilibrium payoffs of a long-run player, both ex-ante and following any positive probability history. Under general conditions on the convergence rates of the discount factor to one and of the rate of replacement to zero, both bounds converge to the Stackelberg payoff if the type space is sufficiently rich. These limiting conditions hold in particular if the game is played very frequently.

We introduce entropy techniques to study the classical reputation model in which a long-run player faces a series of short-run players. The long-run player’s actions are possibly imperfectly observed. We derive explicit lower and upper bounds on the equilibrium payoffs to the long-run player.

In games with incomplete information, more information to a player implies a broader strategy set for this player in the normal form game, hence more knowledge implies more ability. We prove that, on the other hand, given two normal form games G and G’ such that players in a subset J of the set of players possess more strategies in G’ than in G, there exist two games with incomplete information with normal forms G and G’ such that players in J are more informed in the second than in the first. More ability can then be rationalized by more knowledge, and our result thus establishes the formal equivalence between ability and knowledge.

This paper studies an optimization problem under entropy constraints arising from repeated games with signals. We provide general properties of solutions and a full characterization of optimal solutions for 2 × 2 sets of actions. As an application we compute the min max values of some repeated games with signals.

In Bayesian environments with private information, as described by the types of Harsanyi, how can types of agents be (statistically) disassociated from each other and how are such disassociations reflected in the agents’ knowledge structure? Conditions studied are (i) subjective independence (the opponents’ types are independent conditional on one’s own) and (ii) type disassociation under common knowledge (the agents’ types are independent, conditional on some common-knowledge variable). Subjective independence is motivated by its implications in Bayesian games and in studies of equilibrium concepts. We find that a variable that disassociates types is more informative than any common-knowledge variable. With three or more agents, conditions (i) and (ii) are equivalent. They also imply that any variable which is common knowledge to two agents is common knowledge to all, and imply the existence of a unique common-knowledge variable that disassociates types, which is the one defined by Aumann.

An observer of a process (x_t) believes the process is governed by Q whereas the true law is P. We bound the expected average distance between P(x_t|x_1,…,x_{t−1}) and Q(x_t|x_1,…,x_{t−1}) for t = 1,…,n by a function of the relative entropy between the marginals of P and Q on the first n realizations. We apply this bound to the cost of learning in sequential decision problems and to the merging of Q to P.
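One standard route to a bound of this kind combines the chain rule for relative entropy, Pinsker's inequality, and Jensen's inequality. The derivation below is a sketch under assumptions (L1 distance, natural logarithms, P_n and Q_n denoting the marginals of P and Q on the first n realizations, with P_n absolutely continuous with respect to Q_n); the norm and constants used in the paper itself may differ.

```latex
% Chain rule for relative entropy along the process:
\[
d(P_n \,\|\, Q_n)
  = \sum_{t=1}^{n} \mathbb{E}_P\,
    d\bigl(P(\cdot \mid x_1,\dots,x_{t-1}) \,\big\|\, Q(\cdot \mid x_1,\dots,x_{t-1})\bigr).
\]
% Pinsker's inequality for distributions p, q on a finite set:
\[
\| p - q \|_1 \le \sqrt{2\, d(p \,\|\, q)}.
\]
% Applying Pinsker stage by stage, then Jensen (concavity of the square root):
\[
\frac{1}{n} \sum_{t=1}^{n} \mathbb{E}_P
  \bigl\| P(\cdot \mid x_1,\dots,x_{t-1}) - Q(\cdot \mid x_1,\dots,x_{t-1}) \bigr\|_1
  \le \sqrt{\frac{2}{n}\, d(P_n \,\|\, Q_n)}.
\]
```

In particular, if the relative entropy of the marginals grows more slowly than n, the expected average distance between the two conditional predictions vanishes as n grows.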

We characterize the maximum payoff that a team can guarantee against another in a class of repeated games with imperfect monitoring. Our result relies on the optimal tradeoff for the team between optimization of stage payoffs and generation of signals for future correlation.

We study a repeated game with asymmetric information about a dynamic state of nature. In the course of the game, the better-informed player can communicate some or all of his information to the other. Our model covers costly and/or bounded communication. We characterize the set of equilibrium payoffs and contrast these with the communication equilibrium payoffs, which by definition entail no communication costs.

We introduce cheap talk in a dynamic investment model with information externalities. We first show how social learning adversely affects the credibility of cheap talk messages. Next, we show how an informational cascade makes truthtelling incentive compatible. A separating equilibrium only exists for high-surplus projects. Both an investment subsidy and an investment tax can increase welfare. The more precise the sender’s information, the higher her incentives to truthfully reveal her private information.

Let (x_n) be a process with values in a finite set X and law P, and let y_n = f(x_n) be a function of the process. At stage n, the conditional distribution p_n = P[x_n | x_1,…, x_{n−1}], element of Pi = Delta(X), is the belief that a perfect observer, who observes the process online, holds on its realization at stage n. A statistician observing the signals y_1,…, y_n holds a belief e_n = P[p_n | y_1,…, y_n] ∈ Delta(Pi) on the possible predictions of the perfect observer. Given X and f, we characterize the set of limits of expected empirical distributions of the process (e_n) when P ranges over all possible laws of (x_n).

It is sometimes argued that road safety measures or automobile safety standards fail to save lives because safer highways or safer cars induce more dangerous driving. A similar but less extreme view is that ignoring the behavioral adaptation of drivers would bias the cost–benefit analysis of a traffic safety measure. This article derives cost–benefit rules for automobile safety regulation when drivers may adapt their risk-taking behavior in response to changes in the quality of the road network. The focus is on the financial externalities induced by accidents because of the insurance system as well as on the consequences of drivers’ risk aversion. We establish that road safety measures are Pareto improving if their monetary cost is lower than the difference between their (adjusted for risk aversion) direct welfare gain with unchanged behavior and the induced variation in insured losses due to drivers’ behavioral adaptation. The article also shows how this rule can be extended to take other accident external costs into account.

We introduce a model of communication with a dynamic state of nature. Using entropy as a measure of information, we characterize the expected empirical distributions over actions that are achievable. We present applications to games with and without common interests.

This article studies situations in which agents do not initially know the effect of their decisions, but learn from experience the payoffs induced by their choices and their opponents’. We characterize equilibrium payoffs in terms of simple strategies in which an exploration phase is followed by a payoff acquisition phase.

We exhibit a general class of interactive decision situations in which all the agents benefit from more information. This class includes as a special case the classical comparison of statistical experiments à la Blackwell. More specifically, we consider pairs consisting of a game with incomplete information G and an information structure S such that the extended game Gamma(G;S) has a unique Pareto payoff profile u. We prove that u is a Nash payoff profile of Gamma(G;S), and that for any information structure T that is coarser than S, all Nash payoff profiles of Gamma(G;T) are dominated by u.

We then prove that our condition is also necessary in the following sense: Given any convex compact polyhedron of payoff profiles, whose Pareto frontier is not a singleton, there exists an extended game Gamma(G;S) with that polyhedron as the convex hull of feasible payoffs, an information structure T coarser than S and a player i who strictly prefers a Nash equilibrium in Gamma(G;T) to any Nash equilibrium in Gamma(G;S).

Many results on repeated games played by finite automata rely on the complexity of the exact implementation of a coordinated play of length n. For a large proportion of sequences, this complexity appears to be no less than n. We study the complexity of a coordinated play when allowing for a few mismatches. We prove the existence of a constant C such that if (m ln(m))/n ≥ C, for almost any sequence of length n, there exists an automaton of size m that achieves a coordination ratio close to 1 with it. Moreover, we show that one can take any constant C such that C > e|X| ln|X|, where |X| is the size of the alphabet from which the sequence is drawn. Our result contrasts with Neyman (1997), which shows that when (m ln(m))/n is close to 0, for almost no sequence of length n does there exist an automaton of size m that achieves a coordination ratio significantly larger than 1/|X| with it.

We characterize the max min of repeated zero-sum games in which player one plays in pure strategies conditional on the private observation of a fixed sequence of random variables. We also introduce a definition of a strategic distance between probability measures, and relate it to the standard Kullback distance.

We consider the “and” communication device that receives input from two players and outputs the public signal *yes* if both inputs are *yes* and outputs *no* otherwise. We prove that no correlation can securely be implemented through this device, even if an infinite number of communication rounds are allowed.

We introduce the notion of an information structure I as being richer than another J when for every game G, all correlated equilibrium distributions of G induced by J are also induced by I. In particular, if I is richer than J then I can make all agents as well off as J in any game. We also define J to be faithfully reproducible from I when all the players can compute from their information in I “new information” that reproduces what they could have received from J. Our main result is that I is richer than J if and only if J is faithfully reproducible from I.