We study the description and value of information in zero-sum games. We define a series of informational relations between information schemes, and show that informational equivalence classes are captured by canonical information structures. Moreover, two information schemes induce the same value in every game if and only if they are informationally equivalent. We prove the existence of a revealing game in which unique optimal strategies are homeomorphic to canonical types.
We study the robustness of equilibria with regard to small payoff perturbations of the dynamic game. We find that complete penal codes, which specify players’ strategies after every history, have only limited robustness. For some generic games, no complete codes exist that are robust to even arbitrarily small perturbations. We define incomplete penal codes as partial descriptions of equilibrium strategies and introduce a notion of robustness for incomplete penal codes. We prove a Folk Theorem in robust incomplete codes that generates a Folk Theorem in a class of stochastic games.
In decision problems under incomplete information, actions (identified with payoff vectors indexed by states of nature) and beliefs are naturally paired by bilinear duality. We exploit this duality to analyze the value of information, using concepts and tools from convex analysis. We define the value function as the support function of the set of available actions: the subdifferential at a belief is the set of optimal actions at this belief; the set of beliefs at which an action is optimal is the normal cone of the set of available actions at this point. Our main results are 1) a necessary and sufficient condition for positive value of information, and 2) global estimates of the value of information of any information structure from local properties of the value function and of the set of optimal actions taken at the prior belief only. We apply our results to the marginal value of information at the null, that is, when the agent is close to receiving no information at all, and we provide conditions under which the marginal value of information is infinite, null, or positive and finite.
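As an illustration of the duality described above, the following minimal sketch computes the value function as the support function of a finite action set, and the value of an information structure as the expected gain over acting at the prior. The finite action set, the function names, and the numerical example are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def value(belief, actions):
    """Support function of the action set: best expected payoff at `belief`.

    actions[a, k] is the payoff of action a in state k (hypothetical layout).
    """
    return float(np.max(actions @ belief))

def value_of_information(prior, signal_given_state, actions):
    """Expected payoff gain from observing a signal before acting.

    signal_given_state[s, k] = P(signal s | state k).
    """
    joint = signal_given_state * prior      # joint[s, k] = P(signal s, state k)
    signal_prob = joint.sum(axis=1)         # marginal probability of each signal
    informed = 0.0
    for s, ps in enumerate(signal_prob):
        if ps > 0:
            posterior = joint[s] / ps       # Bayes' rule
            informed += ps * value(posterior, actions)
    return informed - value(prior, actions)

# Hypothetical example: two states, a safe action and a risky bet.
actions = np.array([[0.0, 0.0],     # safe: pays 0 in both states
                    [1.0, -1.0]])   # bet: pays 1 in state 0, -1 in state 1
prior = np.array([0.5, 0.5])
noisy = np.array([[0.8, 0.2],
                  [0.2, 0.8]])      # signal matches the state with probability 0.8
uninformative = np.array([[0.5, 0.5],
                          [0.5, 0.5]])

voi_noisy = value_of_information(prior, noisy, actions)
voi_null = value_of_information(prior, uninformative, actions)
```

In this example the agent is indifferent between both actions at the prior, so even a noisy signal has strictly positive value, while the uninformative structure has value zero.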
We show how group testing can be used in three applications to multiply the efficiency of tests: estimating virus prevalence, releasing groups into the workforce, and testing for individual infectious status. For an infection level around 2%, group testing could save up to 94% of tests in the first application, 95% in the second, and 85% in the third.
We show how group testing can be used in three applications to multiply the efficiency of tests against COVID-19: estimating virus prevalence, releasing groups into the workforce, and testing for individual infectious status. For an infection level around 2%, group testing could save up to 94% of tests in the first application, 95% in the second, and 85% in the third.
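To give a sense of where such savings come from, here is a minimal sketch of classical two-stage (Dorfman) pooling for individual testing: each pool is tested once, and every member of a positive pool is then retested individually. This is a textbook baseline and an assumption of this example, not the designs studied in the paper; it also assumes error-free tests and independent infections.

```python
def dorfman_tests_per_person(prevalence, pool_size):
    """Expected number of tests per person under two-stage Dorfman pooling."""
    if pool_size == 1:
        return 1.0  # no pooling: one test per person
    # One pooled test shared by the pool, plus one individual retest
    # per member whenever the pool contains at least one infection.
    prob_pool_positive = 1.0 - (1.0 - prevalence) ** pool_size
    return 1.0 / pool_size + prob_pool_positive

def best_pool_size(prevalence, max_size=50):
    """Pool size minimizing expected tests per person (brute-force search)."""
    return min(range(1, max_size + 1),
               key=lambda k: dorfman_tests_per_person(prevalence, k))

prevalence = 0.02
k_star = best_pool_size(prevalence)
savings = 1.0 - dorfman_tests_per_person(prevalence, k_star)
```

Under these assumptions, a 2% prevalence gives an optimal pool size of 8 and saves roughly 73% of tests. This is below the figure quoted for individual testing, which comes from the paper's own designs and is not reproduced by this baseline.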
We study the impact of manipulating the attention of a decision-maker who learns sequentially about a number of items before making a choice. Under natural assumptions on the decision-maker’s strategy, directing attention toward one item increases its likelihood of being chosen regardless of its value. This result applies when the decision-maker can reject all items in favor of an outside option with known value; if no outside option is available, the direction of the effect of manipulation depends on the value of the item. A similar result applies to manipulation of choices in bandit problems.
A decision maker (DM) makes choices from different sets of alternatives. The DM is initially ignorant of the payoff associated with each alternative, and learns these payoffs only after a large number of choices have been made. We show that, in the presence of an outside option, once payoffs are learned, the optimal choice rule from sets of alternatives can be rationalized by a DM with strict preferences over all alternatives. Under this model, the DM has preferences for preferences while being ignorant of what preferences are “right”.
In a choice model, we characterize the loss induced by misperceptions of payoff-relevant parameters across a distribution of decision problems. When the agent cannot avoid misperceptions but has some control over the distribution of errors, we show that strategies that minimize loss from misperception exhibit systematic biases, akin to some documented in the behavioural and psychological literatures. We include illusion of control, order effect, overprecision, and overweighting of small probabilities as illustrative examples.
We study the impact of manipulating the attention of a decision-maker who learns sequentially about a number of items before making a choice. Under natural assumptions on the decision-maker’s strategy, forcing attention toward one item increases the likelihood of its being chosen.
We consider an agent who acquires information on a state of nature from an information structure before facing a decision problem. How much information is worth depends jointly on the decision problem and on the information structure. We represent the decision problem by the set of possible payoffs indexed by states of nature. We establish and exploit the duality between this set on one hand and the value of information function, which maps beliefs to expected payoffs under optimal actions at these beliefs, on the other. We then derive global estimates of the value of information of any information structure from local properties of the value function and of the set of optimal actions taken at the prior belief only.
Consider agents who are heterogeneous in their preferences and wealth levels. These agents may acquire information prior to choosing an investment that has a property of no-arbitrage, and each piece of information bears a corresponding cost. We associate a numeric index to each information purchase (information-cost pair). This index describes the normalized value of the information purchase: it is the risk-aversion level of the unique CARA agent who is indifferent between accepting and rejecting the purchase, and it is characterized by a “duality” principle that states that agents with a stronger preference for information should engage more often in information purchases. No agent more risk-averse than the index finds it profitable to acquire the information, whereas all agents less risk-averse than the index do. Given an empirically measured range of degrees of risk aversion in a competitive economy with no-arbitrage investments, our model therefore comes close to describing an inverse demand for information, by predicting what pieces of information are acquired by agents and which ones are not. Among several desirable properties, the normalized value formula induces a complete ranking of information structures that extends Blackwell’s classic ordering.
This paper studies the interaction of automata of size m. We characterise statistical properties satisfied by random plays generated by a correlated pair of automata with m states each. We show that in some respect the pair of automata can be identified with a more complex automaton of size comparable to m log m. We investigate implications of these results on the correlated min–max value of repeated games played by automata.
Stakes affect aggregate performance in a wide variety of settings. At the individual level, we define the critical ability as an agent’s ability to adapt performance to the importance of the situation. We identify individual critical abilities of professional tennis players, relying on point-level data from twelve years of the US Open tournament. We establish persistent heterogeneity in critical abilities. We find a significant statistical relationship between identified critical abilities and overall career success, which validates the identification procedure and suggests that response to pressure is a significant factor for success.
We study the impact of unobservable stochastic replacements for the long-run player in the classical reputation model with a long-run player and a series of short-run players. We provide explicit lower bounds on the Nash equilibrium payoffs of a long-run player, both ex-ante and following any positive probability history. Under general conditions on the convergence rates of the discount factor to one and of the rate of replacement to zero, both bounds converge to the Stackelberg payoff if the type space is sufficiently rich. These limiting conditions hold in particular if the game is played very frequently.
We introduce entropy techniques to study the classical reputation model in which a long-run player faces a series of short-run players. The long-run player’s actions are possibly imperfectly observed. We derive explicit lower and upper bounds on the equilibrium payoffs to the long-run player.
We study the relationship between a player’s lowest equilibrium payoff in a repeated game with imperfect monitoring and this player’s min max payoff in the corresponding one-shot game. We characterize the signal structures under which these two payoffs coincide for any payoff matrix. Under an identifiability assumption, we further show that, if the monitoring structure of an infinitely repeated game “nearly” satisfies this condition, then these two payoffs are approximately equal, independently of the discount factor. This provides conditions under which existing folk theorems exactly characterize the limiting payoff set.
In games with incomplete information, more information to a player implies a broader strategy set for this player in the normal form game, hence more knowledge implies more ability. We prove that, on the other hand, given two normal form games G and G’ such that players in a subset J of the set of players possess more strategies in G’ than in G, there exist two games with incomplete information with normal forms G and G’ such that players in J are more informed in the second than in the first. More ability can then be rationalized by more knowledge, and our result thus establishes the formal equivalence between ability and knowledge.
This paper studies an optimization problem under entropy constraints arising from repeated games with signals. We provide general properties of solutions and a full characterization of optimal solutions for 2 × 2 sets of actions. As an application we compute the min max values of some repeated games with signals.
In Bayesian environments with private information, as described by the types of Harsanyi, how can types of agents be (statistically) disassociated from each other and how are such disassociations reflected in the agents’ knowledge structure? Conditions studied are (i) subjective independence (the opponents’ types are independent conditional on one’s own) and (ii) type disassociation under common knowledge (the agents’ types are independent, conditional on some common-knowledge variable). Subjective independence is motivated by its implications in Bayesian games and in studies of equilibrium concepts. We find that a variable that disassociates types is more informative than any common-knowledge variable. With three or more agents, conditions (i) and (ii) are equivalent. They also imply that any variable which is common knowledge to two agents is common knowledge to all, and imply the existence of a unique common-knowledge variable that disassociates types, which is the one defined by Aumann.
An observer of a process (x_t) believes the process is governed by Q whereas the true law is P. We bound the expected average distance between P(x_t|x_1,…,x_{t−1}) and Q(x_t|x_1,…,x_{t−1}) for t = 1,…,n by a function of the relative entropy between the marginals of P and Q on the first n realizations. We apply this bound to the cost of learning in sequential decision problems and to the merging of Q to P.
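The abstract does not state the exact bounding function. As a hedged illustration of how a bound of this type can arise, the sketch below checks numerically the standard estimate obtained from the chain rule for relative entropy, Pinsker's inequality, and Jensen's inequality: the expected average L1 distance between the one-step predictions is at most sqrt(2 D / n), where D is the relative entropy (in nats) between the laws of the first n realizations. The binary process, the function names, and the example laws are assumptions made for illustration.

```python
import itertools
import math

def kl_bernoulli(p, q):
    """Relative entropy in nats between Bernoulli(p) and Bernoulli(q), with 0 < q < 1."""
    total = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0:
            total += a * math.log(a / b)
    return total

def distance_and_bound(cond_p, cond_q, n):
    """Return (expected average L1 distance, sqrt(2 D / n)) for binary processes.

    cond_p(history) and cond_q(history) give the probability that the next
    symbol is 1 after `history`, a tuple of past symbols.
    """
    total_distance = 0.0
    total_kl = 0.0  # by the chain rule, this sums to D(P_n || Q_n)
    for t in range(n):
        for history in itertools.product((0, 1), repeat=t):
            # Probability of this history under the true law P.
            prob = 1.0
            for i, x in enumerate(history):
                p_one = cond_p(history[:i])
                prob *= p_one if x == 1 else 1.0 - p_one
            if prob == 0.0:
                continue
            p, q = cond_p(history), cond_q(history)
            total_distance += prob * 2.0 * abs(p - q)  # L1 distance of Bernoullis
            total_kl += prob * kl_bernoulli(p, q)
    return total_distance / n, math.sqrt(2.0 * total_kl / n)

# Hypothetical example: the truth is a sticky Markov chain, the observer believes i.i.d. coin flips.
def sticky(history):
    if not history:
        return 0.5
    return 0.9 if history[-1] == 1 else 0.1

def fair_coin(history):
    return 0.5

avg_distance, bound = distance_and_bound(sticky, fair_coin, 6)
```

The enumeration is exponential in n, so this is only meant as a check on short horizons.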
We characterize the maximum payoff that a team can guarantee against another in a class of repeated games with imperfect monitoring. Our result relies on the optimal tradeoff for the team between optimization of stage payoffs and generation of signals for future correlation.
We study a repeated game with asymmetric information about a dynamic state of nature. In the course of the game, the better-informed player can communicate some or all of his information to the other. Our model covers costly and/or bounded communication. We characterize the set of equilibrium payoffs and contrast these with the communication equilibrium payoffs, which by definition entail no communication costs.
We introduce cheap talk in a dynamic investment model with information externalities. We first show how social learning adversely affects the credibility of cheap talk messages. Next, we show how an informational cascade makes truthtelling incentive compatible. A separating equilibrium only exists for high-surplus projects. Both an investment subsidy and an investment tax can increase welfare. The more precise the sender’s information, the higher her incentives to truthfully reveal her private information.
Let (x_n) be a process with values in a finite set X and law P, and let y_n = f(x_n) be a function of the process. At stage n, the conditional distribution p_n = P[x_n | x_1, …, x_{n−1}], an element of Pi = Delta(X), is the belief that a perfect observer, who observes the process online, holds on its realization at stage n. A statistician observing the signals y_1, …, y_n holds a belief e_n = P[p_n | y_1, …, y_n] ∈ Delta(Pi) on the possible predictions of the perfect observer. Given X and f, we characterize the set of limits of expected empirical distributions of the process (e_n) when P ranges over all possible laws of (x_n).
It is sometimes argued that road safety measures or automobile safety standards fail to save lives because safer highways or safer cars induce more dangerous driving. A similar but less extreme view is that ignoring the behavioral adaptation of drivers would bias the cost–benefit analysis of a traffic safety measure. This article derives cost–benefit rules for automobile safety regulation when drivers may adapt their risk-taking behavior in response to changes in the quality of the road network. The focus is on the financial externalities induced by accidents because of the insurance system as well as on the consequences of drivers’ risk aversion. We establish that road safety measures are Pareto improving if their monetary cost is lower than the difference between their (adjusted for risk aversion) direct welfare gain with unchanged behavior and the induced variation in insured losses due to drivers’ behavioral adaptation. The article also shows how this rule can be extended to take other accident external costs into account.
We introduce a model of communication with a dynamic state of nature. Using entropy as a measure of information, we characterize the expected empirical distributions over actions that are achievable. We present applications to games with and without common interests.