Approachability, regret and calibration: Implications and equivalences
Pages: 181  254,
Issue 2,
April
2014
doi:10.3934/jdg.2014.1.181 Abstract
References
Full text (897.5K)
Related Articles
Vianney Perchet  Université ParisDiderot, Laboratoire de Probabilités et Modèles Aléatoires, UMR 7599, 8 place FM/13, Paris, France (email)
1 
J. Abernethy, P. Bartlett and E. Hazan, Blackwell approachability and lowregret learning are equivalent, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 2746. 

2 
S. As Soulaimani, M. Quincampoix and S. Sorin, Repeated games and qualitative differential games: Approachability and comparison of strategies, SIAM J. Control Optim., 48 (2009), 24612479. 

3 
P. Auer, N. CesaBianchi and C. Gentile, Adaptive and selfconfident online learning algorithms, J. Comput. System Sci., 64 (2002), 4875, Special issue on COLT 2000 (Palo Alto, CA). 

4 
R. J. Aumann, Subjectivity and correlation in randomized strategies, J. Math. Econom., 1 (1974), 6796. 

5 
R. J. Aumann and M. B. Maschler, Repeated Games with Incomplete Information, MIT Press, Cambridge, MA, 1995, With the collaboration of Richard E. Stearns. 

6 
M. Benaïm and M. Faure, Consistency of vanishingly smooth fictitious play, Math. Oper. Res., 38 (2013), 437450. 

7 
M. Benaïm, J. Hofbauer and S. Sorin, Stochastic approximations and differential inclusions. II. Applications, Math. Oper. Res., 31 (2006), 673695. 

8 
A. Bernstein, S. Mannor and N. Shimkin, Opportunistic strategies for generalized noregret problems, J. Mach. Learn. Res.: Workshop Conf. Proc., 30 (2013), 158171. 

9 
A. Bernstein and N. Shimkin, Responsebased approachability and its application to generalized noregret algorithms, Manuscript. 

10 
L. J. Billera and B. Sturmfels, Fiber polytopes, The Annals of Mathematics, 135 (1992), 527549. 

11 
D. Blackwell, An analog of the minimax theorem for vector payoffs, Pacific J. Math., 6 (1956), 18. 

12 
D. Blackwell, Controlled random walks, in Proceedings of the International Congress of Mathematicians, 1954, Amsterdam, vol. III, 1956, 336338. 

13 
D. Blackwell and M. A. Girshick, Theory of Games and Statistical Decisions, John Wiley and Sons, Inc., New York, 1954. 

14 
A. Blum and Y. Mansour, From external to internal regret, in Learning theory, vol. 3559 of Lecture Notes in Comput. Sci., Springer, Berlin, (2005), 621636. 

15 
S. Bubeck, Introduction to online optimization, Manuscript. 

16 
N. CesaBianchi and G. Lugosi, Potentialbased algorithms in online prediction and game theory, Machine Learning, 51 (2003), 239261. 

17 
N. CesaBianchi and G. Lugosi, Prediction, Learning, and Games, Cambridge University Press, Cambridge, 2006. 

18 
X. Chen and H. White, Laws of large numbers for Hilbert spacevalued mixingales with applications, Econometric Theory, 12 (1996), 284304. 

19 
A. P. Dawid, The wellcalibrated Bayesian, J. Amer. Statist. Assoc., 77 (1982), 605613. 

20 
A. P. Dawid, Selfcalibrating priors do not exist: Comment, J. Amer. Statist. Assoc., 80 (1985), 339342. 

21 
L. Evans and P. E. Souganidis, Differential games and representation formulas for solutions of HamiltonJacobi equations, Indiana Univ. Math. J., 33 (1984), 773797. 

22 
K. Fan, Minimax theorems, Proc. Nat. Acad. Sci. U. S. A., 39 (1953), 4247. 

23 
K. Fan, A minimax inequality and applications, in Inequalities, III (Proc. Third Sympos., Univ. California, Los Angeles, Calif., 1969; dedicated to the memory of Theodore S. Motzkin), Academic Press, New York, (1972), 103113. 

24 
W. Feller, An Introduction to Probability Theory and its Applications. Vol. I, Third edition, John Wiley & Sons Inc., New York, 1968. 

25 
D. P. Foster, A proof of calibration via blackwell's approachability theorem, Games and Economic Behavior, 29 (1999), 7378. 

26 
D. P. Foster, A. Rakhlin, K. Sridharan and A. Tewari, Complexitybased approach to calibration with checking rules, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 293314. 

27 
D. P. Foster and R. V. Vohra, Calibrated learning and correlated equilibrium, Games Econom. Behav., 21 (1997), 4055. 

28 
D. P. Foster and R. V. Vohra, Asymptotic calibration, Biometrika, 85 (1998), 379390. 

29 
D. P. Foster and R. V. Vohra, Regret in the online decision problem, Games Econom. Behav., 29 (1999), 735. 

30 
D. P. Foster and R. V. Vohra, Calibration: Respice, adspice, prospice, Advances in Economics and Econometrics, 1 (2013), 423442. 

31 
D. Fudenberg and D. M. Kreps, Learning mixed equilibria, Games Econom. Behav., 5 (1993), 320367. 

32 
D. Fudenberg and D. K. Levine, Conditional universal consistency, Games Econom. Behav., 29 (1999), 104130. 

33 
D. Fudenberg and D. K. Levine, An easier way to calibrate, Games Econom. Behav., 29 (1999), 131137. 

34 
F. Gul, D. Pearce and E. Stachetti, A bound on the proportion of pure strategy equilibria in generic games, Math. Oper. Res., 18 (1993), 548552. 

35 
P. Hall and C. C. Heyde, Martingale Limit Theory and its Application, Academic Press Inc., New York, 1980, Probability and Mathematical Statistics. 

36 
J. Hannan, Approximation to bayes risk in repeated play, in Contributions to the Theory of Games, vol. III of Annals of Mathematics Studies 39, Princeton University Press, Princeton, N. J., 1957, 97139. 

37 
S. Hart and A. MasColell, A simple adaptive procedure leading to correlated equilibrium, Econometrica, 68 (2000), 11271150. 

38 
S. Hart and A. MasColell, A general class of adaptive strategies, J. Econom. Theory, 98 (2001), 2654. 

39 
S. Hart and A. MasColell, Regretbased continuoustime dynamics, Games Econom. Behav., 45 (2003), 375394. 

40 
E. Hazan and S. M. Kakade, (weak) calibration is computationaly hard, J. Mach. Learn. Res.: Workshop Conf. Proc., 23 (2012), 3.13.10. 

41 
J. Hofbauer and W. H. Sandholm, On the global convergence of stochastic fictitious play, Econometrica, 70 (2002), 22652294. 

42 
J. Hofbauer, S. Sorin and Y. Viossat, Time average replicator and bestreply dynamics, Math. Oper. Res., 34 (2009), 263269. 

43 
S. M. Kakade and D. P. Foster, Deterministic calibration and Nash equilibrium, in Learning theory, vol. 3120 of Lecture Notes in Comput. Sci., Springer, Berlin, (2004), 3348. 

44 
O. Kallenberg and R. Sztencel, Some dimensionfree features of vectorvalued martingales, Probability Theory and Related Fields, 88 (1991), 215247. 

45 
E. Kohlberg, Optimal strategies in repeated games with incomplete information, Internat. J. Game Theory, 4 (1975), 724. 

46 
J. Kwon, Hilbert Distance, Bounded Convex Functions, and Application to the Exponential Weight Algorithm, Master's thesis, ENS Lyon, 2012. 

47 
E. Lehrer, Any inspection is manipulable, Econometrica, 69 (2001), 13331347. 

48 
E. Lehrer, Approachability in infinite dimensional spaces, Internat. J. Game Theory, 31 (2002), 253268. 

49 
E. Lehrer, A wide range noregret theorem, Games Econom. Behav., 42 (2003), 101115. 

50 
E. Lehrer and E. Solan, Excludability and bounded computational capacity, Math. Oper. Res., 31 (2006), 637648. 

51 
E. Lehrer and E. Solan, Learning to play partiallyspecified equilibrium, Manuscript. 

52 
E. Lehrer and E. Solan, Approachability with bounded memory, Games Econom. Behav., 66 (2009), 9951004. 

53 
E. Lehrer and S. Sorin, Minmax via differential inclusion, J. Conv. Analysis, 14 (2007), 271273. 

54 
N. Littlestone and M. Warmuth, The weighted majority algorithm, Information and Computation, 108 (1994), 212261. 

55 
R. D. Luce and H. Raiffa, Games and Decisions: Introduction and Critical Survey, John Wiley & Sons Inc., New York, N. Y., 1957. 

56 
S. Mannor and N. Shimkin, Regret minimization in repeated matrix games with variable stage duration, Games Econom. Behav., 63 (2008), 227258. 

57 
S. Mannor and G. Stoltz, A geometric proof of calibration, Math. Oper. Res., 35 (2010), 721727. 

58 
S. Mannor, G. Stoltz and V. Perchet, Robust approachability and regret minimization in games with partial monitoring, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 515536. 

59 
S. Mannor, G. Stoltz and V. Perchet, Setvalued approachability, with applications to regret minimization in games with partial monitoring, Manuscript. 

60 
S. Mannor and J. N. Tsitsiklis, Approachability in repeated games: Computational aspects and a Stackelberg variant, Games Econom. Behav., 66 (2009), 315325. 

61 
D. McFadden, Conditional logit analysis of qualitative choice behavior, Frontiers in econometrics, Academic Press: New York, 1974, 105142. 

62 
J.F. Mertens, S. Sorin and S. Zamir, Repeated games, CORE discussion paper, (1994), 94209422. 

63 
J. Neveu, Bases Mathématiques du Calcul des Probabilités, Masson et Cie, éditeurs, Paris, 1970. 

64 
J. Neveu, Martingales à Temps Discret, Masson et Cie, éditeurs, Paris, 1972. 

65 
D. Oakes, Selfcalibrating priors do not exist, J. Amer. Statist. Assoc., 80 (1985), 339342, With comments by A. P. Dawid and Mark J. Schervish. 

66 
W. Olszewski, Calibration and expert testing, in Handbook of Game Theory (eds. P. Young and S. Zamir), vol. IV, to appear. 

67 
V. Perchet, Calibration and internal noregret with random signals, Algorithmic learning theory, 6882, Lecture Notes in Comput. Sci., 5809, Springer, Berlin, 2009. 

68 
V. Perchet, Approchabilité, Calibration et Regret dans les Jeux à Observations Partielles (in French), PhD thesis, Université Pierre et Marie Curie, 2010. 

69 
V. Perchet, Approachability of convex sets in games with partial monitoring, J. Optim. Theory Appl., 149 (2011), 665677. 

70 
V. Perchet, Noregret with partial monitoring: Calibrationbased optimal algorithms, J. Mach. Learn. Res., 12 (2011), 18931921. 

71 
V. Perchet, Exponential weights approachability, applications to regret minimization and calibration, Manuscript. 

72 
V. Perchet and M. Quincampoix, Purely informative game: Application to approachability with partial monitoring, Manuscript. 

73 
A. Rakhlin, Lecture notes on online learning, Manuscript. 

74 
A. Rakhlin, K. Sridharan and A. Tewari, Online Learning: Random Averages, Combinatorial Parameters, and Learnability, in Neural Information Processing Systems, 2010. 

75 
A. Rakhlin, K. Sridharan and A. Tewari, Online learning: Beyond regret, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 559594. 

76 
W. Rudin, Real and Complex Analysis, McGrawHill Series in Higher Mathematics, New York, 1974. 

77 
A. Sandroni, R. Smorodinsky and R. V. Vohra, Calibration with many checking rules, Math. Oper. Res., 28 (2003), 141153. 

78 
E. Seneta, Nonnegative Matrices and Markov Chains, 2nd edition, Springer Series in Statistics, SpringerVerlag, New York, 1981. 

79 
S. Sorin, A First Course on ZeroSum Repeated Games, SpringerVerlag, 2002. 

80 
S. Sorin, Lectures on Dynamics in Games, Unpublished Lecture Notes, 2008. 

81 
S. Sorin, Exponential weight algorithm in continuous time, Math. Program., 116 (2009), 513528. 

82 
X. Spinat, A necessary and sufficient condition for approachability, Math. Oper. Res., 27 (2002), 3144. 

83 
G. Stoltz, Incomplete Information and Internal Regret in Prediction of Individual Sequences, PhD thesis, Université ParisSud, 2005. 

84 
G. Stoltz and G. Lugosi, Internal regret in online portfolio selection, Mach. Learn., 59 (2005), 125159. 

85 
G. Stoltz and G. Lugosi, Learning correlated equilibria in games with compact sets of strategies, Games Econom. Behav., 59 (2007), 187208. 

86 
N. Vieille, Weak approachability, Math. Oper. Res., 17 (1992), 781791. 

87 
Y. Viossat and A. Zapechelnyuk, Noregret dynamics and fictitious play, J. Econom. Theory, 148 (2013), 825842. 

88 
V. Vovk, Aggregating strategies, in Proceedings of the 3rd Annual Workshop on Computational Learning Theory, 1990, 372383. 

89 
V. Vovk, I. Nouretdinov, A. Takemura and G. Shafer, Defensive forecasting for linear protocols, in Algorithmic Learning Theory, vol. 3734 of Lecture Notes in Comput. Sci., Springer, Berlin, (2005), 459473. 

90 
D. Walkup and R. J.B. Wets, A lipschitzian characterization of convex polyhedra, Proceedings of the American Mathematical Society, 23 (1969), 167173. 

91 
A. Zapechelnyuk, Betterreply dynamics with bounded recall, Math. Oper. Res., 33 (2008), 869879. 

92 
M. Zinkevich, Online convex programming and generalized infinitesimal gradient ascent, in Proceedings of the Twentieth International Conference on Machine Learning (ICML), 2003. 

Go to top
