This idea is based on rigorous statistical thinking. I need to assess how promising it is before I can share more details. Experiments are planned for both linear models and generalized linear models to provide empirical evidence. I am also thinking about using certain CNN bases/features for image classification, once I better understand the underlying model and have done some coding exercises.
Prof. Paul Embrechts is visiting HKU as a Hung Hing Ying Distinguished Visiting Professor in Science and Technology. He will give a public lecture on the Thursday after next. He is a co-author of the celebrated QRM book Quantitative Risk Management: Concepts, Techniques and Tools. For this book (now in its 2nd edition, published 2015), there is an excellent website called QRM Tutorial (and its GitHub repository), with slides and R code available.
Today I happened to attend the biweekly time series seminar organized within the department, and for the first time listened seriously to talks on the GARCH model (cf. Francq and Zakoïan (2010)), stationarity, the portmanteau test, etc. The QRM book gives a great introduction to these concepts (Chapter 4: Financial time series; PDF slides).
PS: Prof. Paul Embrechts also co-authored another well-known and influential book in 1997, titled “Modelling Extremal Events for Insurance and Finance”, which I have not yet read.
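To make the seminar concepts a bit more concrete, here is a minimal sketch in plain Python (the QRM materials themselves use R). It simulates a GARCH(1,1) series, where the conditional variance follows sigma_t^2 = omega + alpha * X_{t-1}^2 + beta * sigma_{t-1}^2, and computes the Ljung-Box portmanteau statistic on the returns and on the squared returns. The function names, the Gaussian innovations, the parameter values (omega, alpha, beta) and the lag choice are all illustrative assumptions, not taken from the seminar or the book.

```python
import math
import random


def simulate_garch11(n, omega=0.1, alpha=0.1, beta=0.8, seed=0):
    """Simulate n returns from a GARCH(1,1) process with Gaussian innovations.

    All parameter defaults are illustrative assumptions.
    """
    if alpha + beta >= 1:
        # alpha + beta < 1 is the covariance-stationarity condition for GARCH(1,1)
        raise ValueError("alpha + beta must be < 1 for covariance stationarity")
    rng = random.Random(seed)
    sigma2 = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    returns = []
    for _ in range(n):
        x = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        returns.append(x)
        # update the conditional variance for the next step
        sigma2 = omega + alpha * x * x + beta * sigma2
    return returns


def ljung_box_q(series, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1}^{max_lag} rho_k^2 / (n-k)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    total = 0.0
    for k in range(1, max_lag + 1):
        # sample autocorrelation at lag k
        rho_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        total += rho_k * rho_k / (n - k)
    return n * (n + 2) * total


returns = simulate_garch11(2000)
q_returns = ljung_box_q(returns, max_lag=10)
q_squared = ljung_box_q([x * x for x in returns], max_lag=10)
print(f"Q(returns) = {q_returns:.2f}, Q(squared returns) = {q_squared:.2f}")
```

Under the null of no serial correlation, Q is compared against a chi-square distribution with max_lag degrees of freedom. With volatility clustering, one would typically expect the statistic on squared returns to be much larger than the one on raw returns.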
A list of online resources with dynamic updates:
Data Science and Machine Learning
Optimization and Computing
- Convex Optimization (Fall 2016 Course) by Ryan Tibshirani
- Advanced Statistical Computing (Spring 2013 Course) by Eric Laber
Deep Learning, NLP, AI, etc
- Deep Learning book by Goodfellow, Bengio and Courville (2016)
- Stanford CS224d Deep Learning for Natural Language Processing
- Stanford CS231n: Convolutional Neural Networks for Visual Recognition
- Princeton COS 495: Introduction to Deep Learning
- Berkeley CS 294: Deep Reinforcement Learning, Fall 2015
- CUHK ELEG5040 Introduction to Deep Learning
- How do Convolutional Neural Networks work? by Brandon Rohrer (Microsoft)
Start with some general introduction:
- US Dept. Education: briefing
Groups that work in this area:
- Stanford Lytics
- MIDAS, MIT, the Columbia EDM guy, etc
- Some online courses on learning analytics
- Learning Analytics by European Data Science Academy
New trend towards using modern decision-analytic approaches:
- Large amounts of person-click data are generated by online platforms, including both LBS and MOOC systems
- Modern developments in decision-analytic methods and tools, such as matrix factorization, deep learning, sparse models and social network analysis
Finally, come up with a brief proposal with nice image(s):
Learning Analytics by Statistical and Machine Learning Techniques
- Person-click data structure: both behaviors and feedback
- Learning pathways through longitudinal and survival analysis: to measure activity and engagement (person-anchored)
- Dropout prediction and retention analysis through machine learning techniques
- Social network analysis of linked users and peer interactions
- Content analysis through ????, e.g. 6min micro-video effectiveness (content-anchored)
- Recommendation system for online quizzes
- Bienkowski, M., Feng, M. and Means, B. (2012). Enhancing Teaching and Learning Through Educational Data Mining and Learning Analytics: An Issue Brief. U.S. Office of Educational Technology, Department of Education. PDF
- Guo, P.J., Kim, J. and Rubin, R. (2014). How Video Production Affects Student Engagement: An Empirical Study of MOOC Videos. ACM Conference on Learning at Scale, March 2014. PDF
A list of online resources with dynamic updates:
- R graph gallery: http://www.r-graph-gallery.com/
- R graph gallery: http://rgraphgallery.blogspot.hk/
- RStudio Online Learning: https://www.rstudio.com/online-learning/
- Harvard CS171 Visualizations: http://www.cs171.org/
- Majumder (2014): Introduction to data science
- Data Science with R: http://garrettgman.github.io/
Two brand-new books were delivered today. One was published in 2006, the other in 2011. They are already relatively old in the fast-growing literature of statistics, machine learning and data science.
Three weeks ago, when I moved from HKBU to HKU, one of my first priorities was to purchase and set up a new high-performance computing server. The price quoted by DELL is as high as $70K+ in Hong Kong dollars. We then turned to MICROWARE for a quotation but are still waiting for their response. Only about one week remains before the first week of class!
This morning I started googling AWS/EC2, in particular spot instances. I came across the so-called AWS Grants Program for Research and Education. See here. Applications for AWS research grants (also called AWS Cloud Credits for Research) open every 3 months, and the deadline for the next round is September 30. One may also try AWS Educate by providing information such as the institution and course website. An educator might get about $200 in credits (or only $75 at a non-member institution).
Besides possible credits to cover the low-cost AWS EC2 solution, we may also look into using research grants to cover high-end AWS servers (including clusters). Here is what I explained in a letter to our department admin:
Alternatively, the “cloud server” provided by Amazon Web Services is much cheaper, since we pay only monthly rental/usage fees. It has more or less equal computing power and is easier to maintain. The main difference is that such a “cloud server” is a virtual machine, rather than a physical machine like the ones we see in the lab. Such a virtual cloud server is more like a software service.
The good news is that some types of research grants are flexible! This is great! If it really works, we may even try a cloud cluster! Below is a tentative plan based on Amazon Elastic MapReduce (EMR):
Name: Amazon EMR with 1 master and 2 nodes
1 Master: m3.xlarge, 4 vCPU, 15GB Memory, 2x40GB SSD, 100% Utilization
2 Nodes: c4.4xlarge, each with 16 vCPU, 30GB Memory, 2x160GB SSD, 5% Utilization (expected)
Description: Amazon EMR cluster for Big data analytics
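The sizing arithmetic behind this plan can be sketched in a few lines of Python. This is only a rough sketch: the 730 hours per month figure, the class and function names, and the `hourly_rates` argument are assumptions introduced for illustration. No actual AWS prices are included; the real hourly rates would have to be looked up for the chosen region. The sketch covers compute only and ignores any EMR service fee, storage and data transfer.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


@dataclass(frozen=True)
class InstanceGroup:
    role: str
    instance_type: str
    count: int
    vcpus: int          # per instance
    memory_gb: int      # per instance
    utilization: float  # fraction of the month the group is running


# The tentative plan: 1 master at 100% utilization, 2 nodes at 5% (expected).
plan = [
    InstanceGroup("master", "m3.xlarge", count=1, vcpus=4, memory_gb=15, utilization=1.00),
    InstanceGroup("node", "c4.4xlarge", count=2, vcpus=16, memory_gb=30, utilization=0.05),
]


def instance_hours(group):
    """Billed instance-hours per month for one group."""
    return group.count * group.utilization * HOURS_PER_MONTH


def monthly_cost(groups, hourly_rates):
    """Compute-only monthly cost.

    hourly_rates is a hypothetical mapping from instance type to price per
    instance-hour, to be filled in from the current price list.
    """
    return sum(instance_hours(g) * hourly_rates[g.instance_type] for g in groups)


total_vcpus = sum(g.count * g.vcpus for g in plan)
total_memory_gb = sum(g.count * g.memory_gb for g in plan)
total_hours = sum(instance_hours(g) for g in plan)

print(f"{total_vcpus} vCPUs, {total_memory_gb} GB memory, {total_hours:.0f} instance-hours/month")
```

With the plan as listed, this prints 36 vCPUs, 75 GB memory and 803 instance-hours per month (730 for the master plus 73 for the two nodes). Calling `monthly_cost(plan, hourly_rates)` with real rates then gives the compute part of the monthly bill.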
I chose the “Asia Pacific (Singapore)” region because it has the lowest estimated latency from Hong Kong, according to http://www.cloudping.info/.
Let’s wait and see if it can work out!
I will teach STAT3622 “Data Visualization” in the coming 2016/17 Fall semester. This will be the first course I teach at HKU.
The main references, ranging from R to Python to D3, are listed below:
- R Graphics Cookbook
By Winston Chang, O’Reilly 2013
Book website: http://www.cookbook-r.com/
- ggplot2: Elegant Graphics for Data Analysis
By Hadley Wickham, Springer (2009, 1ed; 2016, 2ed)
Book website: http://ggplot2.org/book/; http://hadley.nz/
- Learning IPython for Interactive Computing and Data Visualization
By Cyrille Rossant, Packt Publishing (2013, 1ed; 2015, 2ed)
Book website: http://ipython-books.github.io/minibook/
- Interactive Data Visualization for the Web: An Introduction to Designing with D3
By Scott Murray, O’Reilly 2013
Online Read: http://chimera.labs.oreilly.com/books/1230000000345
- Visualize This: The FlowingData Guide to Design, Visualization, and Statistics
By Nathan Yau, Wiley 2011
Book website: http://book.flowingdata.com/; http://as.wiley.com/WileyCDA/WileyTitle/productCd-0470944889.html
- Interactive Data Visualization: Foundations, Techniques, and Applications
By Matthew O. Ward, Georges Grinstein, Daniel Keim, CRC Press (2015, 2ed)
Book website: http://www.idvbook.com/teaching-aid/
Other thoughts: What is certain is that we will use RStudio Server for weekly teaching. Some IT-like but useful skills such as GitHub and AWS EC2 could be added to either lectures or tutorials. The university prefers Moodle as the LMS, and we shall also design a course web page under statsoft/teaching.