2014, 1(2): 181-254. doi: 10.3934/jdg.2014.1.181

Approachability, regret and calibration: Implications and equivalences

1. 

Université Paris-Diderot, Laboratoire de Probabilités et Modèles Aléatoires, UMR 7599, 8 place FM/13, Paris, France

Received  January 2013 Revised  January 2014 Published  March 2014

Blackwell approachability, regret minimization and calibration are three criteria used to evaluate a strategy (or an algorithm) in sequential decision problems, described as repeated games between a player and Nature. Although they have at first sight not much in common, links between them have been discovered: for instance, both consistent and calibrated strategies can be constructed by following, in some auxiliary game, an approachability strategy.
    We gather seminal and recent results, develop and generalize Blackwell's elegant theory in several directions. The final objectives is to show how approachability can be used as a basic powerful tool to exhibit a new class of intuitive algorithms, based on simple geometric properties. In order to be complete, we also prove that approachability can be seen as a byproduct of the very existence of consistent or calibrated strategies.
Citation: Vianney Perchet. Approachability, regret and calibration: Implications and equivalences. Journal of Dynamics & Games, 2014, 1 (2) : 181-254. doi: 10.3934/jdg.2014.1.181
References:
[1]

J. Abernethy, P. Bartlett and E. Hazan, Blackwell approachability and low-regret learning are equivalent,, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 27.

[2]

S. As Soulaimani, M. Quincampoix and S. Sorin, Repeated games and qualitative differential games: Approachability and comparison of strategies,, SIAM J. Control Optim., 48 (2009), 2461. doi: 10.1137/090749098.

[3]

P. Auer, N. Cesa-Bianchi and C. Gentile, Adaptive and self-confident on-line learning algorithms,, J. Comput. System Sci., 64 (2002), 48. doi: 10.1006/jcss.2001.1795.

[4]

R. J. Aumann, Subjectivity and correlation in randomized strategies,, J. Math. Econom., 1 (1974), 67. doi: 10.1016/0304-4068(74)90037-8.

[5]

R. J. Aumann and M. B. Maschler, Repeated Games with Incomplete Information,, MIT Press, (1995).

[6]

M. Benaïm and M. Faure, Consistency of vanishingly smooth fictitious play,, Math. Oper. Res., 38 (2013), 437. doi: 10.1287/moor.1120.0568.

[7]

M. Benaïm, J. Hofbauer and S. Sorin, Stochastic approximations and differential inclusions. II. Applications,, Math. Oper. Res., 31 (2006), 673. doi: 10.1287/moor.1060.0213.

[8]

A. Bernstein, S. Mannor and N. Shimkin, Opportunistic strategies for generalized no-regret problems,, J. Mach. Learn. Res.: Workshop Conf. Proc., 30 (2013), 158.

[9]

A. Bernstein and N. Shimkin, Response-based approachability and its application to generalized no-regret algorithms,, Manuscript., ().

[10]

L. J. Billera and B. Sturmfels, Fiber polytopes,, The Annals of Mathematics, 135 (1992), 527. doi: 10.2307/2946575.

[11]

D. Blackwell, An analog of the minimax theorem for vector payoffs,, Pacific J. Math., 6 (1956), 1. doi: 10.2140/pjm.1956.6.1.

[12]

D. Blackwell, Controlled random walks,, in Proceedings of the International Congress of Mathematicians, (1954), 336.

[13]

D. Blackwell and M. A. Girshick, Theory of Games and Statistical Decisions,, John Wiley and Sons, (1954).

[14]

A. Blum and Y. Mansour, From external to internal regret,, in Learning theory, (2005), 621. doi: 10.1007/11503415_42.

[15]

S. Bubeck, Introduction to online optimization,, Manuscript., ().

[16]

N. Cesa-Bianchi and G. Lugosi, Potential-based algorithms in on-line prediction and game theory,, Machine Learning, 51 (2003), 239.

[17]

N. Cesa-Bianchi and G. Lugosi, Prediction, Learning, and Games,, Cambridge University Press, (2006). doi: 10.1017/CBO9780511546921.

[18]

X. Chen and H. White, Laws of large numbers for Hilbert space-valued mixingales with applications,, Econometric Theory, 12 (1996), 284. doi: 10.1017/S0266466600006599.

[19]

A. P. Dawid, The well-calibrated Bayesian,, J. Amer. Statist. Assoc., 77 (1982), 605. doi: 10.1080/01621459.1982.10477856.

[20]

A. P. Dawid, Self-calibrating priors do not exist: Comment,, J. Amer. Statist. Assoc., 80 (1985), 339. doi: 10.1080/01621459.1985.10478117.

[21]

L. Evans and P. E. Souganidis, Differential games and representation formulas for solutions of Hamilton-Jacobi equations,, Indiana Univ. Math. J., 33 (1984), 773. doi: 10.1512/iumj.1984.33.33040.

[22]

K. Fan, Minimax theorems,, Proc. Nat. Acad. Sci. U. S. A., 39 (1953), 42. doi: 10.1073/pnas.39.1.42.

[23]

K. Fan, A minimax inequality and applications,, in Inequalities, (1972), 103.

[24]

W. Feller, An Introduction to Probability Theory and its Applications. Vol. I,, Third edition, (1968).

[25]

D. P. Foster, A proof of calibration via blackwell's approachability theorem,, Games and Economic Behavior, 29 (1999), 73. doi: 10.1006/game.1999.0719.

[26]

D. P. Foster, A. Rakhlin, K. Sridharan and A. Tewari, Complexity-based approach to calibration with checking rules,, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 293.

[27]

D. P. Foster and R. V. Vohra, Calibrated learning and correlated equilibrium,, Games Econom. Behav., 21 (1997), 40. doi: 10.1006/game.1997.0595.

[28]

D. P. Foster and R. V. Vohra, Asymptotic calibration,, Biometrika, 85 (1998), 379. doi: 10.1093/biomet/85.2.379.

[29]

D. P. Foster and R. V. Vohra, Regret in the on-line decision problem,, Games Econom. Behav., 29 (1999), 7. doi: 10.1006/game.1999.0740.

[30]

D. P. Foster and R. V. Vohra, Calibration: Respice, adspice, prospice,, Advances in Economics and Econometrics, 1 (2013), 423. doi: 10.1017/CBO9781139060011.014.

[31]

D. Fudenberg and D. M. Kreps, Learning mixed equilibria,, Games Econom. Behav., 5 (1993), 320. doi: 10.1006/game.1993.1021.

[32]

D. Fudenberg and D. K. Levine, Conditional universal consistency,, Games Econom. Behav., 29 (1999), 104. doi: 10.1006/game.1998.0705.

[33]

D. Fudenberg and D. K. Levine, An easier way to calibrate,, Games Econom. Behav., 29 (1999), 131. doi: 10.1006/game.1999.0726.

[34]

F. Gul, D. Pearce and E. Stachetti, A bound on the proportion of pure strategy equilibria in generic games,, Math. Oper. Res., 18 (1993), 548. doi: 10.1287/moor.18.3.548.

[35]

P. Hall and C. C. Heyde, Martingale Limit Theory and its Application,, Academic Press Inc., (1980).

[36]

J. Hannan, Approximation to bayes risk in repeated play,, in Contributions to the Theory of Games, (1957), 97.

[37]

S. Hart and A. Mas-Colell, A simple adaptive procedure leading to correlated equilibrium,, Econometrica, 68 (2000), 1127. doi: 10.1111/1468-0262.00153.

[38]

S. Hart and A. Mas-Colell, A general class of adaptive strategies,, J. Econom. Theory, 98 (2001), 26. doi: 10.1006/jeth.2000.2746.

[39]

S. Hart and A. Mas-Colell, Regret-based continuous-time dynamics,, Games Econom. Behav., 45 (2003), 375. doi: 10.1016/S0899-8256(03)00178-7.

[40]

E. Hazan and S. M. Kakade, (weak) calibration is computationaly hard,, J. Mach. Learn. Res.: Workshop Conf. Proc., 23 (2012), 1.

[41]

J. Hofbauer and W. H. Sandholm, On the global convergence of stochastic fictitious play,, Econometrica, 70 (2002), 2265. doi: 10.1111/1468-0262.00376.

[42]

J. Hofbauer, S. Sorin and Y. Viossat, Time average replicator and best-reply dynamics,, Math. Oper. Res., 34 (2009), 263. doi: 10.1287/moor.1080.0359.

[43]

S. M. Kakade and D. P. Foster, Deterministic calibration and Nash equilibrium,, in Learning theory, (2004), 33. doi: 10.1007/978-3-540-27819-1_3.

[44]

O. Kallenberg and R. Sztencel, Some dimension-free features of vector-valued martingales,, Probability Theory and Related Fields, 88 (1991), 215. doi: 10.1007/BF01212560.

[45]

E. Kohlberg, Optimal strategies in repeated games with incomplete information,, Internat. J. Game Theory, 4 (1975), 7. doi: 10.1007/BF01766399.

[46]

J. Kwon, Hilbert Distance, Bounded Convex Functions, and Application to the Exponential Weight Algorithm,, Master's thesis, (2012).

[47]

E. Lehrer, Any inspection is manipulable,, Econometrica, 69 (2001), 1333. doi: 10.1111/1468-0262.00244.

[48]

E. Lehrer, Approachability in infinite dimensional spaces,, Internat. J. Game Theory, 31 (2002), 253. doi: 10.1007/s001820200115.

[49]

E. Lehrer, A wide range no-regret theorem,, Games Econom. Behav., 42 (2003), 101. doi: 10.1016/S0899-8256(03)00032-0.

[50]

E. Lehrer and E. Solan, Excludability and bounded computational capacity,, Math. Oper. Res., 31 (2006), 637. doi: 10.1287/moor.1060.0211.

[51]

E. Lehrer and E. Solan, Learning to play partially-specified equilibrium,, Manuscript., ().

[52]

E. Lehrer and E. Solan, Approachability with bounded memory,, Games Econom. Behav., 66 (2009), 995. doi: 10.1016/j.geb.2007.07.011.

[53]

E. Lehrer and S. Sorin, Minmax via differential inclusion,, J. Conv. Analysis, 14 (2007), 271.

[54]

N. Littlestone and M. Warmuth, The weighted majority algorithm,, Information and Computation, 108 (1994), 212. doi: 10.1006/inco.1994.1009.

[55]

R. D. Luce and H. Raiffa, Games and Decisions: Introduction and Critical Survey,, John Wiley & Sons Inc., (1957).

[56]

S. Mannor and N. Shimkin, Regret minimization in repeated matrix games with variable stage duration,, Games Econom. Behav., 63 (2008), 227. doi: 10.1016/j.geb.2007.07.006.

[57]

S. Mannor and G. Stoltz, A geometric proof of calibration,, Math. Oper. Res., 35 (2010), 721. doi: 10.1287/moor.1100.0465.

[58]

S. Mannor, G. Stoltz and V. Perchet, Robust approachability and regret minimization in games with partial monitoring,, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 515.

[59]

S. Mannor, G. Stoltz and V. Perchet, Set-valued approachability, with applications to regret minimization in games with partial monitoring,, Manuscript., ().

[60]

S. Mannor and J. N. Tsitsiklis, Approachability in repeated games: Computational aspects and a Stackelberg variant,, Games Econom. Behav., 66 (2009), 315. doi: 10.1016/j.geb.2008.03.008.

[61]

D. McFadden, Conditional logit analysis of qualitative choice behavior,, Frontiers in econometrics, (1974), 105.

[62]

J.-F. Mertens, S. Sorin and S. Zamir, Repeated games,, CORE discussion paper, (1994), 9420. doi: 10.1057/9780230226203.3424.

[63]

J. Neveu, Bases Mathématiques du Calcul des Probabilités,, Masson et Cie, (1970).

[64]

J. Neveu, Martingales à Temps Discret,, Masson et Cie, (1972).

[65]

D. Oakes, Self-calibrating priors do not exist,, J. Amer. Statist. Assoc., 80 (1985), 339. doi: 10.1080/01621459.1985.10478117.

[66]

W. Olszewski, Calibration and expert testing,, in Handbook of Game Theory (eds. P. Young and S. Zamir), ().

[67]

V. Perchet, Calibration and internal no-regret with random signals,, Algorithmic learning theory, (5809), 68. doi: 10.1007/978-3-642-04414-4_10.

[68]

V. Perchet, Approchabilité, Calibration et Regret dans les Jeux à Observations Partielles (in French),, PhD thesis, (2010).

[69]

V. Perchet, Approachability of convex sets in games with partial monitoring,, J. Optim. Theory Appl., 149 (2011), 665. doi: 10.1007/s10957-011-9797-3.

[70]

V. Perchet, No-regret with partial monitoring: Calibration-based optimal algorithms,, J. Mach. Learn. Res., 12 (2011), 1893.

[71]

V. Perchet, Exponential weights approachability, applications to regret minimization and calibration,, Manuscript., ().

[72]

V. Perchet and M. Quincampoix, Purely informative game: Application to approachability with partial monitoring,, Manuscript., ().

[73]

A. Rakhlin, Lecture notes on online learning,, Manuscript., ().

[74]

A. Rakhlin, K. Sridharan and A. Tewari, Online Learning: Random Averages, Combinatorial Parameters, and Learnability,, in Neural Information Processing Systems, (2010).

[75]

A. Rakhlin, K. Sridharan and A. Tewari, Online learning: Beyond regret,, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 559.

[76]

W. Rudin, Real and Complex Analysis,, McGraw-Hill Series in Higher Mathematics, (1974).

[77]

A. Sandroni, R. Smorodinsky and R. V. Vohra, Calibration with many checking rules,, Math. Oper. Res., 28 (2003), 141. doi: 10.1287/moor.28.1.141.14264.

[78]

E. Seneta, Nonnegative Matrices and Markov Chains,, 2nd edition, (1981).

[79]

S. Sorin, A First Course on Zero-Sum Repeated Games,, Springer-Verlag, (2002).

[80]

S. Sorin, Lectures on Dynamics in Games,, Unpublished Lecture Notes, (2008).

[81]

S. Sorin, Exponential weight algorithm in continuous time,, Math. Program., 116 (2009), 513. doi: 10.1007/s10107-007-0111-y.

[82]

X. Spinat, A necessary and sufficient condition for approachability,, Math. Oper. Res., 27 (2002), 31. doi: 10.1287/moor.27.1.31.333.

[83]

G. Stoltz, Incomplete Information and Internal Regret in Prediction of Individual Sequences,, PhD thesis, (2005).

[84]

G. Stoltz and G. Lugosi, Internal regret in on-line portfolio selection,, Mach. Learn., 59 (2005), 125. doi: 10.1007/s10994-005-0465-4.

[85]

G. Stoltz and G. Lugosi, Learning correlated equilibria in games with compact sets of strategies,, Games Econom. Behav., 59 (2007), 187. doi: 10.1016/j.geb.2006.04.007.

[86]

N. Vieille, Weak approachability,, Math. Oper. Res., 17 (1992), 781. doi: 10.1287/moor.17.4.781.

[87]

Y. Viossat and A. Zapechelnyuk, No-regret dynamics and fictitious play,, J. Econom. Theory, 148 (2013), 825. doi: 10.1016/j.jet.2012.07.003.

[88]

V. Vovk, Aggregating strategies,, in Proceedings of the 3rd Annual Workshop on Computational Learning Theory, (1990), 372.

[89]

V. Vovk, I. Nouretdinov, A. Takemura and G. Shafer, Defensive forecasting for linear protocols,, in Algorithmic Learning Theory, (2005), 459. doi: 10.1007/11564089_35.

[90]

D. Walkup and R. J.-B. Wets, A lipschitzian characterization of convex polyhedra,, Proceedings of the American Mathematical Society, 23 (1969), 167. doi: 10.1090/S0002-9939-1969-0246200-8.

[91]

A. Zapechelnyuk, Better-reply dynamics with bounded recall,, Math. Oper. Res., 33 (2008), 869. doi: 10.1287/moor.1080.0323.

[92]

M. Zinkevich, Online convex programming and generalized infinitesimal gradient ascent,, in Proceedings of the Twentieth International Conference on Machine Learning (ICML), (2003).

show all references

References:
[1]

J. Abernethy, P. Bartlett and E. Hazan, Blackwell approachability and low-regret learning are equivalent,, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 27.

[2]

S. As Soulaimani, M. Quincampoix and S. Sorin, Repeated games and qualitative differential games: Approachability and comparison of strategies,, SIAM J. Control Optim., 48 (2009), 2461. doi: 10.1137/090749098.

[3]

P. Auer, N. Cesa-Bianchi and C. Gentile, Adaptive and self-confident on-line learning algorithms,, J. Comput. System Sci., 64 (2002), 48. doi: 10.1006/jcss.2001.1795.

[4]

R. J. Aumann, Subjectivity and correlation in randomized strategies,, J. Math. Econom., 1 (1974), 67. doi: 10.1016/0304-4068(74)90037-8.

[5]

R. J. Aumann and M. B. Maschler, Repeated Games with Incomplete Information,, MIT Press, (1995).

[6]

M. Benaïm and M. Faure, Consistency of vanishingly smooth fictitious play,, Math. Oper. Res., 38 (2013), 437. doi: 10.1287/moor.1120.0568.

[7]

M. Benaïm, J. Hofbauer and S. Sorin, Stochastic approximations and differential inclusions. II. Applications,, Math. Oper. Res., 31 (2006), 673. doi: 10.1287/moor.1060.0213.

[8]

A. Bernstein, S. Mannor and N. Shimkin, Opportunistic strategies for generalized no-regret problems,, J. Mach. Learn. Res.: Workshop Conf. Proc., 30 (2013), 158.

[9]

A. Bernstein and N. Shimkin, Response-based approachability and its application to generalized no-regret algorithms,, Manuscript., ().

[10]

L. J. Billera and B. Sturmfels, Fiber polytopes,, The Annals of Mathematics, 135 (1992), 527. doi: 10.2307/2946575.

[11]

D. Blackwell, An analog of the minimax theorem for vector payoffs,, Pacific J. Math., 6 (1956), 1. doi: 10.2140/pjm.1956.6.1.

[12]

D. Blackwell, Controlled random walks,, in Proceedings of the International Congress of Mathematicians, (1954), 336.

[13]

D. Blackwell and M. A. Girshick, Theory of Games and Statistical Decisions,, John Wiley and Sons, (1954).

[14]

A. Blum and Y. Mansour, From external to internal regret,, in Learning theory, (2005), 621. doi: 10.1007/11503415_42.

[15]

S. Bubeck, Introduction to online optimization,, Manuscript., ().

[16]

N. Cesa-Bianchi and G. Lugosi, Potential-based algorithms in on-line prediction and game theory,, Machine Learning, 51 (2003), 239.

[17]

N. Cesa-Bianchi and G. Lugosi, Prediction, Learning, and Games,, Cambridge University Press, (2006). doi: 10.1017/CBO9780511546921.

[18]

X. Chen and H. White, Laws of large numbers for Hilbert space-valued mixingales with applications,, Econometric Theory, 12 (1996), 284. doi: 10.1017/S0266466600006599.

[19]

A. P. Dawid, The well-calibrated Bayesian,, J. Amer. Statist. Assoc., 77 (1982), 605. doi: 10.1080/01621459.1982.10477856.

[20]

A. P. Dawid, Self-calibrating priors do not exist: Comment,, J. Amer. Statist. Assoc., 80 (1985), 339. doi: 10.1080/01621459.1985.10478117.

[21]

L. Evans and P. E. Souganidis, Differential games and representation formulas for solutions of Hamilton-Jacobi equations,, Indiana Univ. Math. J., 33 (1984), 773. doi: 10.1512/iumj.1984.33.33040.

[22]

K. Fan, Minimax theorems,, Proc. Nat. Acad. Sci. U. S. A., 39 (1953), 42. doi: 10.1073/pnas.39.1.42.

[23]

K. Fan, A minimax inequality and applications,, in Inequalities, (1972), 103.

[24]

W. Feller, An Introduction to Probability Theory and its Applications. Vol. I,, Third edition, (1968).

[25]

D. P. Foster, A proof of calibration via blackwell's approachability theorem,, Games and Economic Behavior, 29 (1999), 73. doi: 10.1006/game.1999.0719.

[26]

D. P. Foster, A. Rakhlin, K. Sridharan and A. Tewari, Complexity-based approach to calibration with checking rules,, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 293.

[27]

D. P. Foster and R. V. Vohra, Calibrated learning and correlated equilibrium,, Games Econom. Behav., 21 (1997), 40. doi: 10.1006/game.1997.0595.

[28]

D. P. Foster and R. V. Vohra, Asymptotic calibration,, Biometrika, 85 (1998), 379. doi: 10.1093/biomet/85.2.379.

[29]

D. P. Foster and R. V. Vohra, Regret in the on-line decision problem,, Games Econom. Behav., 29 (1999), 7. doi: 10.1006/game.1999.0740.

[30]

D. P. Foster and R. V. Vohra, Calibration: Respice, adspice, prospice,, Advances in Economics and Econometrics, 1 (2013), 423. doi: 10.1017/CBO9781139060011.014.

[31]

D. Fudenberg and D. M. Kreps, Learning mixed equilibria,, Games Econom. Behav., 5 (1993), 320. doi: 10.1006/game.1993.1021.

[32]

D. Fudenberg and D. K. Levine, Conditional universal consistency,, Games Econom. Behav., 29 (1999), 104. doi: 10.1006/game.1998.0705.

[33]

D. Fudenberg and D. K. Levine, An easier way to calibrate,, Games Econom. Behav., 29 (1999), 131. doi: 10.1006/game.1999.0726.

[34]

F. Gul, D. Pearce and E. Stachetti, A bound on the proportion of pure strategy equilibria in generic games,, Math. Oper. Res., 18 (1993), 548. doi: 10.1287/moor.18.3.548.

[35]

P. Hall and C. C. Heyde, Martingale Limit Theory and its Application,, Academic Press Inc., (1980).

[36]

J. Hannan, Approximation to bayes risk in repeated play,, in Contributions to the Theory of Games, (1957), 97.

[37]

S. Hart and A. Mas-Colell, A simple adaptive procedure leading to correlated equilibrium,, Econometrica, 68 (2000), 1127. doi: 10.1111/1468-0262.00153.

[38]

S. Hart and A. Mas-Colell, A general class of adaptive strategies,, J. Econom. Theory, 98 (2001), 26. doi: 10.1006/jeth.2000.2746.

[39]

S. Hart and A. Mas-Colell, Regret-based continuous-time dynamics,, Games Econom. Behav., 45 (2003), 375. doi: 10.1016/S0899-8256(03)00178-7.

[40]

E. Hazan and S. M. Kakade, (weak) calibration is computationaly hard,, J. Mach. Learn. Res.: Workshop Conf. Proc., 23 (2012), 1.

[41]

J. Hofbauer and W. H. Sandholm, On the global convergence of stochastic fictitious play,, Econometrica, 70 (2002), 2265. doi: 10.1111/1468-0262.00376.

[42]

J. Hofbauer, S. Sorin and Y. Viossat, Time average replicator and best-reply dynamics,, Math. Oper. Res., 34 (2009), 263. doi: 10.1287/moor.1080.0359.

[43]

S. M. Kakade and D. P. Foster, Deterministic calibration and Nash equilibrium,, in Learning theory, (2004), 33. doi: 10.1007/978-3-540-27819-1_3.

[44]

O. Kallenberg and R. Sztencel, Some dimension-free features of vector-valued martingales,, Probability Theory and Related Fields, 88 (1991), 215. doi: 10.1007/BF01212560.

[45]

E. Kohlberg, Optimal strategies in repeated games with incomplete information,, Internat. J. Game Theory, 4 (1975), 7. doi: 10.1007/BF01766399.

[46]

J. Kwon, Hilbert Distance, Bounded Convex Functions, and Application to the Exponential Weight Algorithm,, Master's thesis, (2012).

[47]

E. Lehrer, Any inspection is manipulable,, Econometrica, 69 (2001), 1333. doi: 10.1111/1468-0262.00244.

[48]

E. Lehrer, Approachability in infinite dimensional spaces,, Internat. J. Game Theory, 31 (2002), 253. doi: 10.1007/s001820200115.

[49]

E. Lehrer, A wide range no-regret theorem,, Games Econom. Behav., 42 (2003), 101. doi: 10.1016/S0899-8256(03)00032-0.

[50]

E. Lehrer and E. Solan, Excludability and bounded computational capacity,, Math. Oper. Res., 31 (2006), 637. doi: 10.1287/moor.1060.0211.

[51]

E. Lehrer and E. Solan, Learning to play partially-specified equilibrium,, Manuscript., ().

[52]

E. Lehrer and E. Solan, Approachability with bounded memory,, Games Econom. Behav., 66 (2009), 995. doi: 10.1016/j.geb.2007.07.011.

[53]

E. Lehrer and S. Sorin, Minmax via differential inclusion,, J. Conv. Analysis, 14 (2007), 271.

[54]

N. Littlestone and M. Warmuth, The weighted majority algorithm,, Information and Computation, 108 (1994), 212. doi: 10.1006/inco.1994.1009.

[55]

R. D. Luce and H. Raiffa, Games and Decisions: Introduction and Critical Survey,, John Wiley & Sons Inc., (1957).

[56]

S. Mannor and N. Shimkin, Regret minimization in repeated matrix games with variable stage duration,, Games Econom. Behav., 63 (2008), 227. doi: 10.1016/j.geb.2007.07.006.

[57]

S. Mannor and G. Stoltz, A geometric proof of calibration,, Math. Oper. Res., 35 (2010), 721. doi: 10.1287/moor.1100.0465.

[58]

S. Mannor, G. Stoltz and V. Perchet, Robust approachability and regret minimization in games with partial monitoring,, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 515.

[59]

S. Mannor, G. Stoltz and V. Perchet, Set-valued approachability, with applications to regret minimization in games with partial monitoring,, Manuscript., ().

[60]

S. Mannor and J. N. Tsitsiklis, Approachability in repeated games: Computational aspects and a Stackelberg variant,, Games Econom. Behav., 66 (2009), 315. doi: 10.1016/j.geb.2008.03.008.

[61]

D. McFadden, Conditional logit analysis of qualitative choice behavior,, Frontiers in econometrics, (1974), 105.

[62]

J.-F. Mertens, S. Sorin and S. Zamir, Repeated games,, CORE discussion paper, (1994), 9420. doi: 10.1057/9780230226203.3424.

[63]

J. Neveu, Bases Mathématiques du Calcul des Probabilités,, Masson et Cie, (1970).

[64]

J. Neveu, Martingales à Temps Discret,, Masson et Cie, (1972).

[65]

D. Oakes, Self-calibrating priors do not exist,, J. Amer. Statist. Assoc., 80 (1985), 339. doi: 10.1080/01621459.1985.10478117.

[66]

W. Olszewski, Calibration and expert testing,, in Handbook of Game Theory (eds. P. Young and S. Zamir), ().

[67]

V. Perchet, Calibration and internal no-regret with random signals,, Algorithmic learning theory, (5809), 68. doi: 10.1007/978-3-642-04414-4_10.

[68]

V. Perchet, Approchabilité, Calibration et Regret dans les Jeux à Observations Partielles (in French),, PhD thesis, (2010).

[69]

V. Perchet, Approachability of convex sets in games with partial monitoring,, J. Optim. Theory Appl., 149 (2011), 665. doi: 10.1007/s10957-011-9797-3.

[70]

V. Perchet, No-regret with partial monitoring: Calibration-based optimal algorithms,, J. Mach. Learn. Res., 12 (2011), 1893.

[71]

V. Perchet, Exponential weights approachability, applications to regret minimization and calibration,, Manuscript., ().

[72]

V. Perchet and M. Quincampoix, Purely informative game: Application to approachability with partial monitoring,, Manuscript., ().

[73]

A. Rakhlin, Lecture notes on online learning,, Manuscript., ().

[74]

A. Rakhlin, K. Sridharan and A. Tewari, Online Learning: Random Averages, Combinatorial Parameters, and Learnability,, in Neural Information Processing Systems, (2010).

[75]

A. Rakhlin, K. Sridharan and A. Tewari, Online learning: Beyond regret,, J. Mach. Learn. Res.: Workshop Conf. Proc., 19 (2011), 559.

[76]

W. Rudin, Real and Complex Analysis,, McGraw-Hill Series in Higher Mathematics, (1974).

[77]

A. Sandroni, R. Smorodinsky and R. V. Vohra, Calibration with many checking rules,, Math. Oper. Res., 28 (2003), 141. doi: 10.1287/moor.28.1.141.14264.

[78]

E. Seneta, Nonnegative Matrices and Markov Chains,, 2nd edition, (1981).

[79]

S. Sorin, A First Course on Zero-Sum Repeated Games,, Springer-Verlag, (2002).

[80]

S. Sorin, Lectures on Dynamics in Games,, Unpublished Lecture Notes, (2008).

[81]

S. Sorin, Exponential weight algorithm in continuous time,, Math. Program., 116 (2009), 513. doi: 10.1007/s10107-007-0111-y.

[82]

X. Spinat, A necessary and sufficient condition for approachability,, Math. Oper. Res., 27 (2002), 31. doi: 10.1287/moor.27.1.31.333.

[83]

G. Stoltz, Incomplete Information and Internal Regret in Prediction of Individual Sequences,, PhD thesis, (2005).

[84]

G. Stoltz and G. Lugosi, Internal regret in on-line portfolio selection,, Mach. Learn., 59 (2005), 125. doi: 10.1007/s10994-005-0465-4.

[85]

G. Stoltz and G. Lugosi, Learning correlated equilibria in games with compact sets of strategies,, Games Econom. Behav., 59 (2007), 187. doi: 10.1016/j.geb.2006.04.007.

[86]

N. Vieille, Weak approachability,, Math. Oper. Res., 17 (1992), 781. doi: 10.1287/moor.17.4.781.

[87]

Y. Viossat and A. Zapechelnyuk, No-regret dynamics and fictitious play,, J. Econom. Theory, 148 (2013), 825. doi: 10.1016/j.jet.2012.07.003.

[88]

V. Vovk, Aggregating strategies,, in Proceedings of the 3rd Annual Workshop on Computational Learning Theory, (1990), 372.

[89]

V. Vovk, I. Nouretdinov, A. Takemura and G. Shafer, Defensive forecasting for linear protocols,, in Algorithmic Learning Theory, (2005), 459. doi: 10.1007/11564089_35.

[90]

D. Walkup and R. J.-B. Wets, A lipschitzian characterization of convex polyhedra,, Proceedings of the American Mathematical Society, 23 (1969), 167. doi: 10.1090/S0002-9939-1969-0246200-8.

[91]

A. Zapechelnyuk, Better-reply dynamics with bounded recall,, Math. Oper. Res., 33 (2008), 869. doi: 10.1287/moor.1080.0323.

[92]

M. Zinkevich, Online convex programming and generalized infinitesimal gradient ascent,, in Proceedings of the Twentieth International Conference on Machine Learning (ICML), (2003).

[1]

Barak Shani, Eilon Solan. Strong approachability. Journal of Dynamics & Games, 2014, 1 (3) : 507-535. doi: 10.3934/jdg.2014.1.507

[2]

Shie Mannor, Vianney Perchet, Gilles Stoltz. A primal condition for approachability with partial monitoring. Journal of Dynamics & Games, 2014, 1 (3) : 447-469. doi: 10.3934/jdg.2014.1.447

[3]

Richard Tapia. My reflections on the Blackwell-Tapia prize. Mathematical Biosciences & Engineering, 2013, 10 (5&6) : 1669-1672. doi: 10.3934/mbe.2013.10.1669

[4]

Saeid Abbasi-Parizi, Majid Aminnayeri, Mahdi Bashiri. Robust solution for a minimax regret hub location problem in a fuzzy-stochastic environment. Journal of Industrial & Management Optimization, 2018, 14 (3) : 1271-1295. doi: 10.3934/jimo.2018083

[5]

Weinan E, Weiguo Gao. Orbital minimization with localization. Discrete & Continuous Dynamical Systems - A, 2009, 23 (1&2) : 249-264. doi: 10.3934/dcds.2009.23.249

[6]

Quanyi Liang, Kairong Liu, Gang Meng, Zhikun She. Minimization of the lowest eigenvalue for a vibrating beam. Discrete & Continuous Dynamical Systems - A, 2018, 38 (4) : 2079-2092. doi: 10.3934/dcds.2018085

[7]

Mehdi Badsi, Martin Campos Pinto, Bruno Després. A minimization formulation of a bi-kinetic sheath. Kinetic & Related Models, 2016, 9 (4) : 621-656. doi: 10.3934/krm.2016010

[8]

M. Zuhair Nashed, Alexandru Tamasan. Structural stability in a minimization problem and applications to conductivity imaging. Inverse Problems & Imaging, 2011, 5 (1) : 219-236. doi: 10.3934/ipi.2011.5.219

[9]

María Andrea Arias Serna, María Eugenia Puerta Yepes, César Edinson Escalante Coterio, Gerardo Arango Ospina. $ (Q,r) $ Model with $ CVaR_α $ of costs minimization. Journal of Industrial & Management Optimization, 2017, 13 (1) : 135-146. doi: 10.3934/jimo.2016008

[10]

Naoufel Ben Abdallah, Irene M. Gamba, Giuseppe Toscani. On the minimization problem of sub-linear convex functionals. Kinetic & Related Models, 2011, 4 (4) : 857-871. doi: 10.3934/krm.2011.4.857

[11]

Xavier Dubois de La Sablonière, Benjamin Mauroy, Yannick Privat. Shape minimization of the dissipated energy in dyadic trees. Discrete & Continuous Dynamical Systems - B, 2011, 16 (3) : 767-799. doi: 10.3934/dcdsb.2011.16.767

[12]

Tai Chiu Edwin Cheng, Bertrand Miao-Tsong Lin, Hsiao-Lan Huang. Talent hold cost minimization in film production. Journal of Industrial & Management Optimization, 2017, 13 (1) : 223-235. doi: 10.3934/jimo.2016013

[13]

Zhi-Feng Pang, Yu-Fei Yang. Semismooth Newton method for minimization of the LLT model. Inverse Problems & Imaging, 2009, 3 (4) : 677-691. doi: 10.3934/ipi.2009.3.677

[14]

Xiaoli Yang, Jin Liang, Bei Hu. Minimization of carbon abatement cost: Modeling, analysis and simulation. Discrete & Continuous Dynamical Systems - B, 2017, 22 (7) : 2939-2969. doi: 10.3934/dcdsb.2017158

[15]

Hongwei Lou, Xueyuan Yin. Minimization of the elliptic higher eigenvalues for multiphase anisotropic conductors. Mathematical Control & Related Fields, 2018, 8 (3&4) : 855-877. doi: 10.3934/mcrf.2018038

[16]

Dirk Pauly. On Maxwell's and Poincaré's constants. Discrete & Continuous Dynamical Systems - S, 2015, 8 (3) : 607-618. doi: 10.3934/dcdss.2015.8.607

[17]

Anass Belcaid, Mohammed Douimi, Abdelkader Fassi Fihri. Recursive reconstruction of piecewise constant signals by minimization of an energy function. Inverse Problems & Imaging, 2018, 12 (4) : 903-920. doi: 10.3934/ipi.2018038

[18]

Weidong Bao, Wenhua Xiao, Haoran Ji, Chao Chen, Xiaomin Zhu, Jianhong Wu. Towards big data processing in clouds: An online cost-minimization approach. Big Data & Information Analytics, 2016, 1 (1) : 15-29. doi: 10.3934/bdia.2016.1.15

[19]

Zhong Wan, Chunhua Yang. New approach to global minimization of normal multivariate polynomial based on tensor. Journal of Industrial & Management Optimization, 2008, 4 (2) : 271-285. doi: 10.3934/jimo.2008.4.271

[20]

Riccardo Adami, Diego Noja, Nicola Visciglia. Constrained energy minimization and ground states for NLS with point defects. Discrete & Continuous Dynamical Systems - B, 2013, 18 (5) : 1155-1188. doi: 10.3934/dcdsb.2013.18.1155

 Impact Factor: 

Metrics

  • PDF downloads (6)
  • HTML views (0)
  • Cited by (4)

Other articles
by authors

[Back to Top]