doi: 10.3934/bdia.2018002

An application of PART to the Football Manager data for players clusters analyses to inform club team formation

Laboratory for Industrial and Applied Mathematics, York University, Toronto, Ontario, Canada, M3J 1P3

Published October 2018

We aim to show how a neural-network-based machine learning projective clustering algorithm, Projective Adaptive Resonance Theory (PART), can be effectively used to provide data-informed sports decisions. We illustrate this data-driven decision support for the AS Roma player market in the Summer 2018 transfer window, using two separate databases of forty-seven attributes taken from Football Manager 2018 for each of the twenty-four soccer players: the first includes players of the 2017-18 AS Roma squad, and the second consists of all players linked with transfer moves to AS Roma. This is high-dimensional data, as players should be grouped only in terms of their performance on a small subset of attributes. Projective clustering identifies, in a purely data-driven and unsupervised way, the critical attributes and attribute characteristics under which a group of players forms a natural cluster in a lower-dimensional subspace of the data space. By merging the two databases, our unsupervised clustering analysis provides evidence-based recommendations about club team formation, and in particular about decisions to buy and sell players within the same clusters, under different scenarios including financial constraints.
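As a rough sketch of the data preparation described above (not the authors' pipeline), the snippet below assumes the two Football Manager 2018 attribute tables are available as CSV files with one row per player and one numeric column per attribute; the file names and column handling are hypothetical.

```python
import pandas as pd

# Hypothetical file names; in practice these would be exports of the
# Football Manager 2018 attributes for each player in the two data sets.
squad   = pd.read_csv("roma_squad_2017_18.csv")          # current AS Roma squad
targets = pd.read_csv("roma_transfer_targets_2018.csv")  # players linked with a move

# Merge the two data sets (as done for the combined analysis of Figure 5) and
# keep only the numeric attribute columns; metadata such as name, age and
# market value are set aside for interpreting the resulting clusters.
merged = pd.concat([squad, targets], ignore_index=True)
X = merged.select_dtypes(include="number").to_numpy(dtype=float)
```

Since Football Manager attributes share a common 1-20 scale, no further rescaling is assumed here; for attributes on different scales one would normally standardize the columns first.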

Citation: Marco Tosato, Jianhong Wu. An application of PART to the Football Manager data for players clusters analyses to inform club team formation. Big Data & Information Analytics, doi: 10.3934/bdia.2018002

Figure 1.  List of players from the two data sets to be examined, with information on their ages and market values
Figure 2.  An illustration of vector clustering using PART, where each ball represents one dimension of the vector. First there is a similarity check in which the new vector is compared with the representatives of the four existing clusters. Empty balls mark dimensions that are irrelevant for the cluster ($z_{ij} \leq \zeta$), in contrast to full-coloured balls ($z_{ij} > \zeta$). The relevant dimensions are further subdivided into those in which the new vector is similar enough to the corresponding top-down weight $z_{ji}$ (distance smaller than $\sigma$), depicted in green, and those in which they are well separated (distance greater than $\sigma$), depicted in red.
The algorithm first considers the cluster with the highest similarity, determined by computing the $T_j$'s as shown above (here cluster 3), and counts its number of similar relevant dimensions (green balls, here five) to compare against the vigilance parameter $\rho$. If $\rho \leq 5$, the vector is included in the third cluster; otherwise ($\rho > 5$) the vector is compared with the remaining clusters in decreasing order of their $T_j$'s. In this example, since no cluster has more than five green balls, the new vector cannot be assigned to any existing cluster and forms a new one of its own.
Moreover, in the first case ($\rho \leq 5$), the representative of cluster 3 is updated by moving a fraction $\alpha$ towards the new value, and its relevant dimensions are reduced to those corresponding to the green balls; in the second case a fifth cluster is formed in which every dimension has equal importance and whose representative is the new vector itself
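To make the commitment step in the caption concrete, here is a minimal, simplified sketch (not the authors' implementation and not the full PART network dynamics): the similarity $T_j$ is reduced to the count of "green" dimensions, relevance is kept as a boolean mask rather than learned weights, and the names (commit_vector, the cluster dictionaries) are hypothetical.

```python
import numpy as np

def commit_vector(x, clusters, sigma, rho, alpha):
    """Try to place input vector x into an existing cluster; otherwise open a new one.

    Each cluster is a dict with
      'z'        : the cluster representative (top-down weight vector), and
      'relevant' : a boolean mask of the dimensions currently considered relevant.
    """
    # For every cluster, the "green balls": relevant dimensions on which x is
    # close enough (within distance sigma) to the cluster representative.
    matches = [c['relevant'] & (np.abs(x - c['z']) <= sigma) for c in clusters]

    # Visit clusters in decreasing order of similarity T_j
    # (simplified here to the number of green balls).
    for j in np.argsort([-g.sum() for g in matches]):
        green = matches[j]
        if green.sum() >= rho:                # vigilance test passed
            c = clusters[j]
            # Move the representative a fraction alpha towards x on the matching dims.
            c['z'][green] = (1 - alpha) * c['z'][green] + alpha * x[green]
            # Only the matching dimensions remain relevant for this cluster.
            c['relevant'] = green
            return j

    # No cluster passed the vigilance test: the vector forms a new cluster with
    # every dimension relevant and the representative equal to the vector itself.
    clusters.append({'z': x.astype(float).copy(),
                     'relevant': np.ones_like(x, dtype=bool)})
    return len(clusters) - 1
```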
Figure 3.  Unsupervised clustering of AS Roma first team players 2017/2018 ($\sigma = 2; \;\rho = 18; \; \alpha = 0.2 )$
Figure 4.  Unsupervised clustering of AS Roma market objectives for Summer 2018 ($\sigma = 3; \; \rho = 17; \; \alpha = 0.2)$
Figure 5.  Unsupervised clustering of the two data sets merged into a single data set, with the indicative characteristic of each cluster underlined ($\sigma = 3; \; \rho = 20; \; \alpha = 0.2)$
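For reference, a hypothetical driver around the commit_vector sketch above could reproduce the three runs with the parameter settings quoted in the captions of Figures 3-5; the data matrices X_roma, X_targets and X_merged are placeholders rather than the authors' data.

```python
import numpy as np

def part_cluster(X, sigma, rho, alpha):
    """Run the simplified commitment step over every row of X (one player per row)."""
    clusters = []
    labels = [commit_vector(x, clusters, sigma, rho, alpha)
              for x in np.asarray(X, dtype=float)]
    return labels, clusters

# Parameter settings as reported in the figure captions (placeholder data matrices):
# labels_squad,  _ = part_cluster(X_roma,    sigma=2, rho=18, alpha=0.2)  # Figure 3
# labels_target, _ = part_cluster(X_targets, sigma=3, rho=17, alpha=0.2)  # Figure 4
# labels_merged, _ = part_cluster(X_merged,  sigma=3, rho=20, alpha=0.2)  # Figure 5
```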