Bhaskar Karambelkar's Blog

My Thoughts on Northwestern University's MSPA

 

Tags: MSPA Education Northwestern Predictive Analytics


This is my review of Northwestern University’s Master’s in Predictive Analytics (MSPA) online degree. I enrolled in MSPA in the Summer of 2013 and finished in the Summer of 2016. Normally it shouldn’t take this long to finish this program, but I took a break after Q1/2015 and resumed in Q1/2016. This blog post is a retrospective analysis of the program, what I got out of it, and what it meant to me.

I have not been paid a single cent nor been compensated by any other means for writing this review either by Northwestern University or by any other entity. This review is written with the sole aim of offering my perspective on this program to anyone who is interested in knowing more about the program from an actual student.

TL;DR

Overall I am very satisfied with the program. The program is very flexible, both in terms of scheduling as well as topics covered. There are some caveats that one needs to be aware of and I discuss them towards the end of this post. It gave me a broad overview and also an in-depth understanding of topics related to predictive analytics. The program has been instrumental for me in moving up the career ladder, not to mention better monetary compensation (well deserved I hope). I would recommend this program to anyone who wishes to obtain a Master’s degree in this field, but I would also advise them to compare this program with similar programs offered by other universities. You can even learn and master most of these skills without being part of a formal graduate program.

This was just a teaser and the real juicy stuff is below. It is rather long but hopefully worth the full read, at least to some of you who dare proceed.

Long Read

What is MSPA?

Master’s in Predictive Analytics (MSPA) is a fully online graduate level degree program from Northwestern University’s School of Professional Studies (SPS). You can read more about it on Northwestern’s official site for the program. The program is geared towards working professionals with some years of experience under their belt, who are looking to learn more about data analytics. This program is certainly not for someone straight out of undergrad, and the average age of a student is well over 30. You will have a chance to interact with fellow students from diverse fields and backgrounds, and this to me is a core strength of the program. Besides this program, SPS offers various other online degree and certificate programs, a list of which can be found here. The overall cost of the program was just under 50,000 USD, most of which was paid by my employer/s. Lastly, the program is open to international students too, and due to its online format it does not require you to have a student visa in case you want to take it from outside the USA.

Why MSPA?

I have been a software-developer/tech-lead all my professional life. Building enterprise systems capable of handling tons of data is something I have been doing for a very long time. But in addition to building extremely fast and scalable data processing pipelines, I also wanted to actually analyze data. I had a good enough programming background but not much in terms of data analysis. This is why I decided to pursue a formal education in the field. Mind you, there are tons of non-formal ways to learn data analysis, from some really good free MOOCs from Coursera, Udacity, edX etc. to paid boot camps from the likes of General Assembly and Zipfian Academy. So a graduate program is not the only option you have.

In the program I met several students who had no programming background and came from a traditional management or actuarial background. So fear not if you don’t have coding skills. As long as you are willing to learn, you can make it. Also the program is geared more towards people who want managerial and leadership roles in analytics and as such is not very theory or coding heavy. You do code quite a bit in R/Python/SAS but there is enough help available so not having a coding background is in no way a deterrent.

Why Northwestern?

Back in 2012-2013, when I was researching educational institutions offering a program in data analytics, there were not as many options available as there are today. Of the options available back then Northwestern’s SPS certainly stood out due to the stellar reputation of some of its other schools like McCormick and Kellogg. That is not to say that Northwestern is the only option, and in fact a lot of good universities have started offering similar programs. Here is one list of similar programs. Having said that, Northwestern has put good resources into this program, and they are always willing to listen to feedback. The program has already undergone some changes, and while I was part of the older format the newer format looks even better.

My top 3 reasons for enrolling in this program were

  • Entire program was available online. This gave me the flexibility I needed.
  • Does not require GRE/GMAT.
  • Putting money down meant no excuse for slacking, which I had been guilty of when taking free MOOCs.

Now someone may frown upon the ‘entirely online’ and ‘does not require GRE/GMAT’ parts and take them as indicators of a not-worthy education. And indeed the entry barrier for this program seems very low. But in my opinion that does not take away from the quality of education received. After all, the quality of education you receive also depends on your own efforts and sincerity towards your goals. Every student whom I interacted with as part of this program came into the program very motivated and as such contributed to making the courses and the program that much more worthwhile. For what it’s worth, Northwestern also offers the exact same courses as part of McCormick’s Master’s in Analytics program, which is delivered on-campus.

MSPA structure

The program in its current format requires you to take 12 courses, 8 core, 2 electives, either the leadership or the project management course, and either a capstone or a thesis. I was part of the older format which required only 11 courses. In the new format a new course ‘Predict 400-DL: Math for Modelers’ was added as part of the core courses. I really like the contents of this course, and wish it was available when I started. Discrete math skills go a long way in data analytics. As far as electives are concerned you have a wide choice from some traditional domains like risk analytics, marketing analytics, to some newer domains like sports analytics, text analytics, web analytics etc.

Each course is 10 weeks long, and most of the professors have full-time jobs outside of academia. On one hand you may feel like that is not good, but these professors have real work experience which they are very willing to share as part of the learning experience. There are no set days you meet as a class, but there are anywhere between 3 and 5 sync sessions per course. Many professors also have extensive video recordings made available. And all professors are available for scheduled chat throughout the entire course duration. For every course there are discussion board topics that you must participate in, and you are graded accordingly. Each course has its own format for testing your skills. Some have a proctored exam at the end, while some require periodic submissions. Some courses require group activity and group submissions, while others are just individual study.

There are four terms in a year, winter, spring, summer and fall. You can take as many courses per term as you like, but most tend to take one and in some cases two. If you have a full time job + family doing two courses at a time is stressful enough so taking three courses per term is really out of the question. You can take breaks between terms, and in total you have 5 years to complete the whole program.

I must warn you that MSPA is largely self-study with enough pointers from professors and occasional interactions. If self study format does not work for you then this is not the right program for you. I have seen some students complain about this nature of the program. I don't blame them, as at times it does feel like you are paying the University for self-study.

The Program Materials

Each course will require you to work with at least one book, and more likely three. I preferred to buy most of these books, a not so cheap option. But I would still recommend this as you will likely return to these books even after you are done with your program. You can save money by using ebooks instead of dead trees. Other ways of saving on books are buying international versions instead of US versions of the books (eBay/Amazon), buying from ex-students who want to sell their books, or renting books for the duration of the course. Lastly, as part of your student account you have access to almost the entire Springer book series in PDF format for free. Use it!

You also need to buy some software (SAS/SPSS) albeit at discounted prices, and some software is either free due to it being open source (R/Python) or available free from Northwestern. You can use either Windows or Mac depending on your preference. You can even use Linux, but some courses require software that is available only under Windows or Mac, so you will need virtualization.

The courses

Here I have put some notes on individual courses I took as part of the program. For a full list of all the courses you can take see the official website. For each course I’ve given a rating on a scale of 1 to 10 in terms of course content, professor engagement, overall value to the program, and overall value to me. Mind you the same course is offered by many different professors, so I highly encourage you to check individual professor evaluations available to you when you register for the program.

I was part of the old format and in the new format certain courses have been revised and/or renamed, and I have noted the revised contents/names for the benefit of the reader. Also if you have taken similar courses in a formal setting (read another University) you may be able to skip the course or substitute it with one of the electives. Check with the student advisor if such is the case.

401: Introduction to Statistics

  • Course Content: 9/10
  • Professor Engagement: 10/10
  • Overall Value to the Program: 10/10
  • Overall Value to me: 10/10

This was the first course I took in Summer-2013 along with 475. Prior to taking this course I had a very rudimentary understanding of probability theory and statistics. This was a very good first course to take, and it set the tone for the whole program. The professor was very engaging and made sure that we understood the concepts of probability theory and descriptive and inferential statistics well. Most of the coding was done in SPSS.

My only two complaints were a) use of SPSS instead of R, which I believe is an option now but wasn’t when I took it, and b) no course on Bayesian analysis. I really wish there was a core course covering Bayesian analysis techniques in addition to the traditional frequentist techniques learned in this course.

475: Project Management

  • Course Content: 7/10
  • Professor Engagement: 10/10
  • Overall Value to the Program: 5/10
  • Overall Value to me: 8/10

I took this course in Summer-2013 along with 401. In the newer format of the program you are required to take either this course or the Foundations of Leadership course. I had to take both as part of the older format. I was doubtful about the value of this course for analytics purposes, but the professor delivered a very compelling course, and I came out very satisfied having taken it. Although I have worked very closely with project managers, I had no formal training in PM before this course, and suffice it to say I have a much better understanding of the process of project management as a result of this course. The course did not however cover the Agile/Scrum methodology of PM. I still question the value of this course to the overall program though.

317: Database Management

Now revised and renamed to 420: Database Systems and Data Prep

  • Course Content: 2/10
  • Professor Engagement: 7/10
  • Overall Value to the Program: 2/10
  • Overall Value to me: 0/10

I took this course in Fall-2013, alone. As you have noticed I have given very dismal ratings to this course across every dimension except professor engagement. I have no complaints about the professor under whom I took this course, but I do have major complaints about this course’s contents and its usefulness. I have been working with databases for more than 15 years so I was very doubtful that I was going to learn anything new. Not only were my doubts confirmed, but I was also very disappointed that instead of teaching students how to work with databases using SQL, the entire course was spent on how to design rudimentary database schemas and on normalization. Don’t get me wrong, database schema design and normalization are very important skills, but they are just not as useful as SQL mastery when it comes to data analysis. A data analyst is not the one who should be designing database schemas and worrying about normalization. Also the course did not cover any NoSQL technologies. I felt this was a very poorly designed course, and even people who have no database experience will come away with the wrong impression of having learned something useful.

I would redesign this entire course and make it about mastering SQL, NoSQL, and other data access techniques like APIs/SDKs. These skills will be more beneficial to a data analyst/scientist than what this course currently covers. And in fact it seems like NU has done just that: the course has been revised and renamed to ‘420: Database Systems and Data Prep’, and it looks like its contents are exactly what I am preaching about.

435: Data Mining and Data Warehouse

Now revised and renamed to 422: Practical Machine Learning

  • Course Content: 9/10
  • Professor Engagement: 10/10
  • Overall Value to the Program: 10/10
  • Overall Value to me: 10/10

I took this course in Winter-2014, alone. This was a stellar course. We used the WEKA software for this course, but the newer format is delivered in Python/R. Regardless it was a really good course to study, as far as learning data mining / machine learning is concerned. We covered techniques for association rule mining, classification, clustering, decision trees etc. The data warehouse part was a bit of a miss. I would move it out of this course and put it in the database course.

My only complaint is that this course needs a follow-up course, ‘Advanced Machine Learning’, which covers Neural Networks, Deep Learning, and Image/Text Analysis in depth. This course has since been revised and renamed to ‘422: Practical Machine Learning’; the new contents look even more appealing, and it’s now delivered in Python/R instead of WEKA.

410: Predictive Modeling 1

Now renamed to 410: Regression and Multi Analysis

  • Course Content: 9/10
  • Professor Engagement: 10/10
  • Overall Value to the Program: 10/10
  • Overall Value to me: 10/10

I took this course in Winter-2014, alone. This is where things got serious. Regression analysis, variable selection, PCA, factor analysis, and cluster analysis were all given their due share. This was one tough course and required a lot of time and effort. The professor did a good job of stressing the importance of this course as the foundation for courses to follow. I would advise anyone to take this course alone and not pair it up.

I do have some comments. Towards the end it felt like a lot was crammed into the short 10 weeks. I would have taken factor analysis and cluster analysis out and covered some advanced regression techniques instead. Also the course was delivered in SAS and I hated SAS from the get-go. They should at least offer an option to take this course in R/Python. The course has since been renamed to ‘410: Regression and Multi Analysis’, but the contents look very similar to the old one.

411: Predictive Modeling 2

Now broken into 2 courses, 411: GLM & 413: Time Series and Forecasting

  • Course Content: 4/10
  • Professor Engagement: 2/10
  • Overall Value to the Program: 3/10
  • Overall Value to me: 3/10

I took this course in Winter-2014, alone. In terms of the entire program this was perhaps the most disappointing course, even more so than 317: Database. The course had generalized linear models (GLM), logistic regression, panel data analysis, and time series analysis all crammed in. This meant a very superficial treatment of each subject, with no proper foundation laid for any of them. On top of that the professor was extremely arrogant and delivered the course very poorly. I am not the only one with this opinion.

In short, what could have been a great course was ruined completely by too much content and an incompetent professor. Also the course was once again delivered in SAS, without the option for R/Python. Thankfully the course has now been broken down into two separate courses, 411: Generalized Linear Models and 413: Time Series and Forecasting. I believe NU did get a lot of feedback from students like me and decided to do the right thing. A pity I could not benefit from it, but hopefully future students will.

412: Advanced Predictive Modeling

Now renamed to 454: Advanced Modeling Techniques

  • Course Content: 9/10
  • Professor Engagement: 10/10
  • Overall Value to the Program: 9/10
  • Overall Value to me: 10/10

I took this course in Winter-2014, alone. This was my first elective. I took it before completing the other core courses as it felt like a natural progression from 410 and 411. This was a complete do-it-yourself course. We had to select a Kaggle competition and work on it as a team. The point is to use whatever you learned in 435 (now 422), 410, and 411 (now 411 + 413) to solve real world analytical problems. I was very glad I took this elective as nothing beats practical use of your acquired knowledge. There was no test, but we had to submit a proper report of our work as a team. I would highly recommend this course to everyone as one of their two electives.

481: Foundations of Leadership

  • Course Content: 8/10
  • Professor Engagement: 5/10
  • Overall Value to the Program: 5/10
  • Overall Value to me: 7/10

I took this course in Winter-2015, along with 402. This is one of those soft skills courses which I suspect might prove its worth in the long run, although in the short term it feels like spending way too much money to read a book. Again I question the value of this course in this program, and many other students did too. Hence NU has now made this course optional: you can take either this or 475: Project Management. But I question the value of either of these courses, and would have loved to see a course on Bayesian Analysis instead. The professor was not very engaging, and we did not receive grading or feedback in time.

402: Introduction to Predictive Analytics

  • Course Content: 6/10
  • Professor Engagement: 10/10
  • Overall Value to the Program: 6/10
  • Overall Value to me: 6/10

I took this course in Winter-2015, along with 481. This is the course that the student advisor recommends you take before any other course, but I took it quite late in the program. This was another poorly designed course. Although the professor was very engaging, the contents were very disconnected and non-cohesive. Survey methodology, dashboard design, and an analytics field guide were bundled together, and I got the impression of not learning enough about any of these topics. As there is already a data visualization related elective, I would get rid of dashboard design and make the course all about experiment design and survey methodology. That would make this course more interesting, engaging, and cohesive.

450: Marketing Analytics

  • Course Content: 9/10
  • Professor Engagement: 10/10
  • Overall Value to the Program: 8/10
  • Overall Value to me: 6/10

I took this course in Winter-2016, alone. This was by far the toughest course for me. I am not sure what led me to take this elective as marketing analytics is not my domain. I guess I wanted to broaden my horizons a bit, but it felt like I bit off more than I could chew. The course jumped straight into digital marketing analysis using Bayesian techniques. This was sort of a double whammy for me, as I had no experience with either. And once again I wish NU offered a course on Bayesian analysis as part of its core offerings. This course is really tough if you have no exposure to the domain or the technique. I barely got through this one, and contemplated dropping out of the course almost every week.

498: Capstone

  • Course Content: 7/10
  • Professor Engagement: 10/10
  • Overall Value to the Program: 8/10
  • Overall Value to me: 8/10

I took this course in Spring-2016, alone. The culmination of the program, the Capstone. This was complete hands-on practical work from the viewpoint of an analytics consulting firm helping a fictitious bank build analytics capabilities throughout the organization. No coding was involved, mostly analytics related grunt work: estimations, cost-benefit analysis (CBA), planning and timelines etc. It was a good experience and probably very valuable for someone who wants to work in a consulting capacity. The amount of teamwork here was very heavy. Almost all submissions were teamwork focused and even individual submissions required team effort. The biggest challenge here was being punctual with your submissions, as each subsequent assignment was built on top of the previous one, so there was no room for being late.

Summary of Courses

In summary I really enjoyed 401, 410, 412, 435, 450, 475, 481. I am neutral towards 402, 498 and completely hated 317 and 411. Most of the analytics heavy courses were in SAS, which again I was not too thrilled about. In hindsight there were more positives than negatives, and I got to interact with a lot of talented folks from different fields, which was a very rewarding experience. I am really glad that NU has listened to the feedback, revised many courses, and also offered many of the courses in R/Python. The only thing missing is a thorough introductory course on Bayesian Analysis.

My advice to anyone planning on taking the courses is:

  • Don’t take more than two courses at a time.
  • Take the heavy courses 410,411,413,422 (Prev 435) one at a time.
  • Plan about 15-20 hours per week for each course.
  • Augment NU courses with free MOOCs from Coursera/Udacity etc.
  • You can meet and exchange ideas with some really talented folks in the discussion boards, so don’t ignore them.
  • You have about a 2-week window between terms. Use that to prepare for the upcoming course/s.
  • Brush up or learn some coding chops if you’ve never coded before.
  • Concentrate on understanding the concept rather than the tools or techniques.

Post-MSPA

As I said in the TL;DR section the program has been beneficial to me. I was able to switch to a Data Scientist role within my org after completing just 5 courses, and then switch to a Data Science Lead role at a different organization. Besides the career and monetary benefits, the personal satisfaction of diligently following a plan and executing it successfully is no small matter. I have learned how to better manage my time and juggle a full-time job, family and education, which I did not think was possible.

My domain of operation is information security, and I met just one other cyber guy in the program, which sucks. I would highly recommend that more infosec pros educate themselves in the art of data analysis. The program has given me enough understanding of analytical skills that I feel confident in being able to tackle analytical problems not just in my domain but in other domains too. Domain knowledge/expertise certainly plays a big role, but a lot of analytics skill sets are easily transferable from one domain to another, provided you are willing to learn the new domain.

Should you enroll?

  • Are you a mid-career person who is looking for a new challenge or something new to learn?
  • Are you curious about data and would love to know about the many ways in which it can be used?
  • Do you believe that by adding data analysis skills to your repertoire you can contribute more effectively to your organization?
  • Are you looking to boost your career prospects and long term employability?
  • Do you feel that a structured learning environment is better for your educational needs?

I don’t need to tell you that if the answer to these questions is yes, then you owe it to yourself to build your analytical skills. Whether you do it through NU’s MSPA program or any other means is up to you. I hope I have given you enough food for thought if you are considering NU’s MSPA.

Onwards

Learning never stops and time is never enough. This program will give you a good insight into the world of analytics, but there is tons more to learn and explore. For starters I want to revisit some of my Math skills (learned ages ago in high school and undergrad), so as to better understand some of the foundations of statistics and machine learning. Revisiting some of my comp-sci fundamentals like data structures and algorithms would go a long way too. I have fairly decent exposure to things like Hadoop, Spark etc. and I’ve also built up my R and Python skills, but there are always new things happening in this space. I am very much interested in building some domain expertise outside my core domain of information security, especially in digital marketing analytics and risk analytics. Did I mention the need to master Bayesian analysis?

The list of things to learn is endless, but I now have access to the resources I need and the confidence to tackle these one by one. If you have read this nearly 5,000-word essay, I congratulate you and thank you for your patience. If you want me to answer any questions I would gladly do so, just reach out to me on Twitter (@bhaskar_vk) or LinkedIn (bhaskarvk).