
Down-scaled Big Data Modeling

This is an idea based on rigorous statistical thinking. I need to assess how promising it is before sharing more details. Experiments are planned for both linear models and generalized linear models, so as to provide empirical evidence. I am also thinking about using certain CNN bases/features for image classification, once I better understand the underlying model and have done some coding exercises.

Paul Embrechts

Prof. Paul Embrechts is visiting HKU as a Hung Hing Ying Distinguished Visiting Professor in Science and Technology. He will give a public lecture on the Thursday after next. He is a co-author of the celebrated QRM book Quantitative Risk Management: Concepts, Techniques and Tools. For this book (now in its 2nd edition, published 2015), there is an excellent website called QRM Tutorial (and its GitHub repository), with slides and R code available.

Today I happened to attend the biweekly time series seminar organized within the department, and for the first time listened seriously to material on the GARCH model (cf. Francq and Zakoïan (2010)), stationarity, the portmanteau test, etc. There is a great introduction to these concepts in the QRM book (Chapter 4: Financial time series; PDF slides).
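To make these concepts concrete, here is a minimal sketch (not taken from the seminar or the QRM book) that simulates a GARCH(1,1) process and computes the Ljung-Box portmanteau statistic on the series and on its squares. The parameter values, the function names `simulate_garch11` and `ljung_box`, and the choice of 10 lags are all my own assumptions for illustration.

```python
import numpy as np

def simulate_garch11(n, omega=0.1, alpha=0.1, beta=0.8, seed=0):
    """Simulate X_t = sigma_t * Z_t with
    sigma_t^2 = omega + alpha * X_{t-1}^2 + beta * sigma_{t-1}^2.

    Parameter values are illustrative; alpha + beta < 1 is the usual
    condition for covariance stationarity of GARCH(1,1).
    """
    if alpha + beta >= 1:
        raise ValueError("need alpha + beta < 1 for covariance stationarity")
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    x = np.empty(n)
    sigma2 = np.empty(n)
    # Start the recursion at the unconditional (stationary) variance.
    sigma2[0] = omega / (1.0 - alpha - beta)
    x[0] = np.sqrt(sigma2[0]) * z[0]
    for t in range(1, n):
        sigma2[t] = omega + alpha * x[t - 1] ** 2 + beta * sigma2[t - 1]
        x[t] = np.sqrt(sigma2[t]) * z[t]
    return x, sigma2

def ljung_box(x, h=10):
    """Ljung-Box statistic Q = n (n + 2) * sum_{k=1}^{h} rho_k^2 / (n - k)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    lags = np.arange(1, h + 1)
    # Sample autocorrelations at lags 1..h.
    rho = np.array([np.dot(xc[k:], xc[:-k]) / denom for k in lags])
    return n * (n + 2) * np.sum(rho ** 2 / (n - lags))

x, sigma2 = simulate_garch11(5000)
q_returns = ljung_box(x)       # serial correlation in the series itself
q_squared = ljung_box(x ** 2)  # serial correlation in the squares
print(f"Q(returns) = {q_returns:.1f}, Q(squared) = {q_squared:.1f}")
```

Under the i.i.d. null, Q is approximately chi-squared with h degrees of freedom (the 95% quantile for h = 10 is about 18.3). The statistic on the squared series should be far larger, which is the volatility clustering that GARCH is designed to capture. Note that the chi-squared reference is only a rough guide for the raw series when the data are conditionally heteroskedastic.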

PS: Prof. Paul Embrechts also co-authored another well-known and influential book, published in 1997, titled “Modelling Extremal Events for Insurance and Finance”, which I have not yet read.

Machine Learning Resources

A list of online resources with dynamic updates:

Data Science and Machine Learning

Optimization and Computing

Deep Learning, NLP, AI, etc

Learning Analytics

Start with a general introduction:

  • US Dept. of Education: briefing

Groups that work in this area:

New trend towards using modern decision-analytic approaches:

  • Large amounts of person-click data are generated by online platforms, including both LBS and MOOC systems
  • Modern development of decision-analytic methods and tools, such as matrix factorization, deep learning, sparse models and social network analysis

Finally, come up with a brief proposal with nice image(s):

Learning Analytics by Statistical and Machine Learning Techniques

  1. Person-click data structure: both behaviors and feedback
  2. Learning pathways through longitudinal and survival analysis: to measure activity and engagement (person-anchored)
  3. Dropout prediction and retention analysis through machine learning techniques
  4. Social network analysis of linked users and peer interactions
  5. Content analysis through ????, e.g. effectiveness of 6-min micro-videos (content-anchored)
  6. Recommendation system for online quizzes
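As a rough illustration of item 2 above, the sketch below computes a Kaplan-Meier survival curve for learner engagement. It is only one possible starting point, not a settled part of the proposal. The data are entirely hypothetical: `weeks` is the week of each learner's last recorded activity, and `dropped` marks whether that was an observed dropout (1) or the learner was still active when observation ended (0, i.e. censored). The function name `kaplan_meier` is my own.

```python
import numpy as np

def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate S(t) = prod_{t_i <= t} (1 - d_i / n_i),
    where d_i is the number of events at time t_i and n_i the number at risk."""
    durations = np.asarray(durations, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    event_times = np.unique(durations[observed])  # sorted distinct dropout times
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(durations >= t)               # still under observation at t
        events = np.sum((durations == t) & observed)   # dropouts exactly at t
        s *= 1.0 - events / at_risk
        survival.append(s)
    return event_times, np.array(survival)

# Hypothetical person-anchored data for 10 learners.
weeks = [1, 2, 2, 3, 5, 5, 6, 8, 8, 8]
dropped = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

times, surv = kaplan_meier(weeks, dropped)
for t, s in zip(times, surv):
    print(f"week {t:.0f}: estimated share still engaged = {s:.3f}")
```

Censoring matters here: learners who are still active at the end of the course contribute to the at-risk counts without being counted as dropouts, which a naive completion rate would ignore.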


  1. Bienkowski, M., Feng, M. and Means, B. (2012). Enhancing Teaching and Learning Through Educational Data Mining and Learning Analytics: An Issue Brief. U.S. Office of Educational Technology, Department of Education. PDF
  2. Guo, P.J., Kim, J. and Rubin, R. (2014). How Video Production Affects Student Engagement: An Empirical Study of MOOC Videos. ACM Conference on Learning at Scale, March 2014. PDF