This is an idea based on rigorous statistical thinking. I need to assess how promising it is before I can share more details. Experiments are planned for both linear models and generalized linear models, so as to provide empirical evidence. I am also thinking about using certain CNN bases/features for image classification, once I have a better understanding of the underlying model and have done some coding exercises.
Prof. Paul Embrechts is visiting HKU as a Hung Hing Ying Distinguished Visiting Professor in Science and Technology. He will give a public lecture on the Thursday after next. He is a co-author of the celebrated QRM book, Quantitative Risk Management: Concepts, Techniques and Tools. For this book (now in its 2nd edition, published 2015), there is an excellent website called QRM Tutorial (and its GitHub repository), with slides and R code available.
Today I attended the department's biweekly time series seminar and, for the first time, listened seriously to material on the GARCH model (cf. Francq and Zakoïan (2010)), stationarity, the portmanteau test, etc. The QRM book gives a great introduction to these concepts (Chapter 4: Financial time series; PDF slides).
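To make these concepts concrete for myself, here is a minimal sketch in plain Python. It is not taken from the seminar or from the QRM materials. It simulates a GARCH(1,1) series, eps_t = sigma_t * z_t with sigma_t^2 = omega + alpha * eps_{t-1}^2 + beta * sigma_{t-1}^2, and computes the Ljung-Box portmanteau statistic Q = n(n+2) * sum_{k=1..h} rho_k^2 / (n-k) on the returns and on the squared returns. The parameter values, the seed and the function names are my own illustrative assumptions.

```python
import random


def simulate_garch11(n, omega=0.1, alpha=0.1, beta=0.8, seed=0):
    """Simulate n observations from a GARCH(1,1) process with Gaussian innovations.

    Parameter defaults are illustrative assumptions, not estimates from data.
    """
    # alpha + beta < 1 is the usual condition for a finite unconditional variance.
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    rng = random.Random(seed)
    sigma2 = omega / (1 - alpha - beta)  # start at the unconditional variance
    eps = []
    for _ in range(n):
        e = sigma2 ** 0.5 * rng.gauss(0, 1)
        eps.append(e)
        sigma2 = omega + alpha * e * e + beta * sigma2  # variance recursion
    return eps


def autocorr(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box(x, max_lag):
    """Ljung-Box portmanteau statistic over lags 1..max_lag."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorr(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


returns = simulate_garch11(5000)
q_returns = ljung_box(returns, 10)
q_squared = ljung_box([r * r for r in returns], 10)
print(f"Ljung-Box Q, returns:         {q_returns:.1f}")
print(f"Ljung-Box Q, squared returns: {q_squared:.1f}")
```

Under the null of no serial correlation, Q is compared against a chi-square distribution with h degrees of freedom (here h = 10). For a GARCH series the returns themselves are uncorrelated while the squared returns are not, so the second statistic should come out much larger than the first.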
PS: Prof. Paul Embrechts also co-authored another well-known and influential book, “Modelling Extremal Events for Insurance and Finance” (1997), which I have not yet read.
A list of online resources, updated from time to time:
Data Science and Machine Learning
Optimization and Computing
- Convex Optimization (Fall 2016 Course) by Ryan Tibshirani
- Advanced Statistical Computing (Spring 2013 Course) by Eric Laber
Deep Learning, NLP, AI, etc.
- Deep Learning book by Goodfellow, Bengio and Courville (2016)
- Stanford CS224d Deep Learning for Natural Language Processing
- Stanford CS231n: Convolutional Neural Networks for Visual Recognition
- Princeton COS 495: Introduction to Deep Learning
- Berkeley CS 294: Deep Reinforcement Learning, Fall 2015
- CUHK ELEG5040 Introduction to Deep Learning
- How do Convolutional Neural Networks work? by Brandon Rohrer (Microsoft)
Start with some general introduction, and
- US Dept. Education: briefing
Groups that work in this area:
- Stanford Lytics
- MIDAS, MIT, the Columbia EDM guy, etc
- Some online courses on learning analytics
- Learning Analytics by European Data Science Academy
New trend towards using modern decision-analytic approaches:
- Large amounts of person-click data are generated by online platforms, including both LBS and MOOC systems
- Modern development of decision-analytic methods and tools, like matrix factorization, deep learning, sparse models, social network analysis
Finally, come up with a brief proposal with nice image(s):
Learning Analytics by Statistical and Machine Learning Techniques
- Person-click data structure: both behaviors and feedback
- Learning pathways through longitudinal and survival analysis: to measure activity and engagement (person-anchored)
- Dropout prediction and retention analysis through machine learning techniques
- Social network analysis of linked users and peer interactions
- Content analysis through ????, e.g. effectiveness of 6-minute micro-videos (content-anchored)
- Recommendation system for online quizzes
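As a small illustration of the survival-analysis item above, here is a minimal sketch of the Kaplan-Meier estimator applied to learner retention, in plain Python. The data are a hypothetical toy example (weeks until a learner's last recorded activity), not real platform data, and the function name is my own. A learner who is still active when the course ends is treated as censored.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival estimate.

    durations: time until dropout or censoring for each learner.
    observed:  True if dropout was observed, False if censored.
    Returns a list of (time, survival_probability) at each distinct dropout time.
    """
    if len(durations) != len(observed):
        raise ValueError("durations and observed must have the same length")
    event_times = sorted({t for t, d in zip(durations, observed) if d})
    survival = 1.0
    curve = []
    for t in event_times:
        # Learners still under observation just before time t.
        at_risk = sum(1 for u in durations if u >= t)
        # Dropouts observed exactly at time t.
        events = sum(1 for u, d in zip(durations, observed) if u == t and d)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy data: weeks until last activity for ten learners.
weeks = [1, 2, 2, 3, 5, 5, 6, 8, 8, 8]
dropped = [True, True, False, True, True, False, True, False, False, False]

for week, prob in kaplan_meier(weeks, dropped):
    print(f"week {week}: estimated retention {prob:.3f}")
```

The estimated curve gives the person-anchored retention measure mentioned in the proposal. Comparing curves across groups, or adding covariates as in a Cox model, would be a natural next step but is beyond this sketch.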
- Bienkowski, M., Feng, M. and Means, B. (2012). Enhancing Teaching and Learning Through Educational Data Mining and Learning Analytics: An Issue Brief. U.S. Office of Educational Technology, Department of Education. PDF
- Guo, P.J., Kim, J. and Rubin, R. (2014). How Video Production Affects Student Engagement: An Empirical Study of MOOC Videos. ACM Conference on Learning at Scale, March 2014. PDF
Two brand-new books were delivered today. One was published in 2006, the other in 2011. Both are already relatively old, given how fast the literature of statistics, machine learning and data science is growing.