Studying Data Science: edX MS Course on Data Sci and Machine Learning

To get to a destination we can take different routes, though the end goal remains the same. I have changed course a number of times and, as mentioned in previous posts, this was because new courses were being offered or because I found something that was a better fit for me in terms of the time invested and the final outcome. This is another reason for taking the courses on Udacity for free, without registering for the paid nanodegree; I wanted to ensure I had a good foundation while taking the best possible courses in the meantime.

Therefore, I have decided to put my pursuit of the Udacity nanodegree on hold because a course I really want to take is being offered. I am referring to “The Analytics Edge”, offered on edX from April 12th. As stated in my first post, in which I declared that I would pursue the Udacity nanodegree for the Data Analyst specialty, the only course that could derail the plan was “The Analytics Edge”. There are a number of reasons that make me want to pursue this course. First of all, I have started it before but left it incomplete. More important than that is the fact that the course is offered by MIT (how many opportunities does one get in a lifetime to complete a course from MIT, unless of course you are or were a student there?). The course also has very good reviews, which emphasize how it manages to relate the content to real-life examples. Having completed the first few modules last year, I have seen this for myself: each section uses a real example, e.g. from a company or from sports, to explore an analytics topic. Homework exercises are both long and challenging, so I anticipate having to sit through longer sessions in front of the computer to complete them.

Until April 12th I am continuing “Data Science and Machine Learning Essentials”, offered by Microsoft on edX. This is another course which appears to go beyond the basics. Both instructors have extensive experience in data science and statistics: one has 20 years of experience in R and predictive analytics, while the other teaches Statistics at MIT and leads the Prediction Analysis Lab there. I emphasize this because it sets the course apart from other courses offered by Microsoft, and because it signals a notable increase in difficulty (edX lists it as an intermediate-level course).

The course defines data science as follows:

Data Science is the exploration and quantitative analysis of all available structured and unstructured data to develop understanding, extract knowledge, and formulate actionable results.

and further defines the data science process:

The data science process involves:

  1. Data selection.
  2. Preprocessing.
  3. Transformation.
  4. Data Mining.
  5. Interpretation and evaluation.

It is an iterative process in which some, or all, steps may be repeated.
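
As a rough illustration (not taken from the course material), the five steps could be sketched in plain Python as follows. The data, the field names and the helper functions are all invented for this example, and a simple least-squares line stands in for the "data mining" step; a real project would use far richer data and tools.

```python
# Toy illustration of the five-step process.
# All records, field names and helpers below are made up for this sketch.

RAW = [
    {"customer": "a", "minutes": 100.0, "bill": 12.0, "region": "north"},
    {"customer": "b", "minutes": 200.0, "bill": 22.0, "region": "south"},
    {"customer": "c", "minutes": None, "bill": 15.0, "region": "north"},
    {"customer": "d", "minutes": 300.0, "bill": 32.0, "region": "east"},
    {"customer": "e", "minutes": 400.0, "bill": 42.0, "region": "west"},
]


def select(rows, fields):
    """1. Data selection: keep only the fields relevant to the question."""
    return [{f: row[f] for f in fields} for row in rows]


def preprocess(rows):
    """2. Preprocessing: drop records that have missing values."""
    return [row for row in rows if all(v is not None for v in row.values())]


def transform(rows, field):
    """3. Transformation: min-max scale one field to the range [0, 1].

    Assumes the field takes at least two distinct values.
    """
    values = [row[field] for row in rows]
    lo, hi = min(values), max(values)
    return [{**row, field: (row[field] - lo) / (hi - lo)} for row in rows]


def mine(rows, x_field, y_field):
    """4. Data mining: fit y = slope * x + intercept by least squares."""
    xs = [row[x_field] for row in rows]
    ys = [row[y_field] for row in rows]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def evaluate(rows, x_field, y_field, model):
    """5. Interpretation and evaluation: mean absolute error of the fit."""
    slope, intercept = model
    errors = [abs(row[y_field] - (slope * row[x_field] + intercept)) for row in rows]
    return sum(errors) / len(errors)


selected = select(RAW, ["minutes", "bill"])
clean = preprocess(selected)
scaled = transform(clean, "minutes")
model = mine(scaled, "minutes", "bill")
error = evaluate(scaled, "minutes", "bill", model)
print(f"slope={model[0]:.2f} intercept={model[1]:.2f} mean_abs_error={error:.4f}")
```

If the evaluation in step 5 were unsatisfactory, one would go back and, for instance, select different fields or choose another transformation, which is where the iterative nature of the process comes in.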

The course exercises are offered in both R and Python, and build upon Azure, Microsoft’s cloud platform. This has the added benefit of giving students the opportunity to familiarize themselves with working in the cloud, as well as avoiding local installation and setup. I have yet to delve into the essence of the course, but I aim to complete it before April 12th, when “The Analytics Edge” begins. More detailed reviews will follow in the next two posts.

To recap, my study plan is now:

  1. edX: Data Science and Machine Learning Essentials (Microsoft) – ongoing
  2. edX: The Analytics Edge (MITx) – starts April 12th, duration 6 weeks
  3. Udacity Data Analyst nanodegree – will register end of June (upon completion of The Analytics Edge).

I have to admit a bias towards edX courses. I love the platform and the diversity of courses offered by various providers. Initially, I believed that the addition of Microsoft would simply mean more of the same: Windows and Office. However, I have to (again) give kudos to Microsoft for the new approach and strategy. Let me also state that if I believe I can find the knowledge I require to reach my goal from other sources, I will reconsider the pursuit of the Udacity nanodegree. What Udacity offers, though, is the projects, the completion of which will give you a (marketable) portfolio. Time (and experience) will tell. For now, stay tuned; my next reviews will go into more detail and provide tangible samples of work, including graphs and completed exercises from the courses.

Michael Lazarou
Michael Lazarou manages revenue assurance and fraud at Epic, a Cypriot telco, having joined their RA function in March 2011. His background includes a double major in Computer Science and Economics, as well as an MBA. Before being lured into the exciting world of telecoms he worked as a software developer.

Michael is interested in gaining a better understanding of different aspects of RA and data analysis. He shares his insights on the training courses he participates in with Commsrisk. Michael's accumulated experience of online training also led him to volunteer for the role of Coordinator of the RAG Learning online education platform.