Big Data and Information Analytics (BDIA)
 

Why curriculum learning & self-paced learning work in big/noisy data: A theoretical perspective

Pages: 111-127, Volume 1, Issue 1, January 2016. doi:10.3934/bdia.2016.1.111

 

Tieliang Gong - Institute for Information and System Sciences and Ministry of Education Key Lab of Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, Shaanxi, China
Qian Zhao - Institute for Information and System Sciences and Ministry of Education Key Lab of Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, Shaanxi, China
Deyu Meng - Institute for Information and System Sciences and Ministry of Education Key Lab of Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, Shaanxi, China
Zongben Xu - Institute for Information and System Sciences and Ministry of Education Key Lab of Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, Shaanxi, China

Abstract: Since their recent introduction, curriculum learning (CL) and self-paced learning (SPL) have attracted increasing attention owing to their many successful applications. Although the rationale for this learning regime is heuristically inspired by human cognitive principles, there is still no sound theory explaining the intrinsic mechanism behind its effectiveness, especially in successful applications to big/noisy data. To address this issue, this paper presents theoretical results that shed light on this learning scheme. Specifically, we first formulate a new learning problem: learning a proper classifier from samples generated from a training distribution that deviates from the target distribution. We then show that the CL/SPL regime provides a feasible strategy for solving this problem. In particular, by first introducing high-confidence/easy samples and gradually involving low-confidence/complex ones in learning, the CL/SPL process implicitly minimizes an upper bound of the expected risk under the target distribution, using only data from the deviated training distribution. We further construct a new SPL algorithm based on random sampling, which complies better with our theory, and substantiate its effectiveness through experiments on synthetic and real data.
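
As an illustration of the easy-to-hard schedule described in the abstract, the following is a minimal sketch of a generic hard-threshold self-paced loop on a toy one-dimensional regression problem with two corrupted labels. It is not the random-sampling SPL algorithm proposed in the paper, and it uses regression rather than the paper's classification setting. The function names, the threshold `lam`, its growth factor, the number of rounds, and the toy data are all assumptions chosen for illustration.

```python
def fit_slope(xs, ys, weights):
    """Weighted least-squares slope for the toy model y = w * x."""
    num = sum(v * x * y for x, y, v in zip(xs, ys, weights))
    den = sum(v * x * x for x, v in zip(xs, weights))
    return num / den


def self_paced_fit(xs, ys, lam=10.0, growth=2.0, rounds=4):
    """Alternate between selecting 'easy' samples and refitting the model.

    Hypothetical parameters: `lam` is the loss threshold below which a
    sample counts as easy, and `growth` enlarges it each round so that
    harder samples are admitted gradually.
    """
    weights = [1.0] * len(xs)
    w = fit_slope(xs, ys, weights)  # initial fit on all (noisy) data
    for _ in range(rounds):
        losses = [(y - w * x) ** 2 for x, y in zip(xs, ys)]
        # Hard-threshold selection: keep samples whose loss is below lam.
        selected = [1.0 if loss < lam else 0.0 for loss in losses]
        if any(selected):  # keep the previous fit if nothing qualifies
            weights = selected
            w = fit_slope(xs, ys, weights)
        lam *= growth
    return w, weights


# Toy data (assumed): eight clean points on y = 2x, two corrupted labels.
xs = [float(x) for x in range(1, 11)]
ys = [2.0 * x for x in xs[:8]] + [0.0, 0.0]

w, weights = self_paced_fit(xs, ys)
print(w, weights)
```

In this toy run the initial fit on all ten points is pulled toward the corrupted labels, the first round admits only the lowest-loss points, and later rounds admit the remaining clean points while the two corrupted points stay above the threshold.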

Keywords:  Curriculum learning, self-paced learning, learning theory, TRECVID MED/MER, classification.
Mathematics Subject Classification:  Primary: 68Q32, 68T05; Secondary: 68T15.

Received: May 2015;      Revised: August 2015;      Available Online: September 2015.
