Bhaskar Karambelkar's Blog

The 10 commandments for hiring Data Scientists


Tags: satire data scientist humor

As a Data Scientist (whatever that means), I get a lot of job offers over LinkedIn and other channels. Although I’m not actively looking for a job, I still go through them: first because I’m curious to find out what exactly organizations look for in a Data Scientist, and second to amuse myself. This post is about the latter. It amuses me to no end what some people want in a Data Scientist, so I’ve made a consolidated list for all the recruiters and organizations who are looking to hire one (or more).

Warning: If satire is not your cup of tea (coffee/soda), you should most certainly not refrain from not reading this article.

Rule 1: Thou shalt not have a freaking clue what Data Science is all about.

But you should still throw in terms like Artificial Intelligence, deep learning, Neural Networks, SVM (admit it, you don’t even know what it stands for). You are not concerned whether the applicant can apply his knowledge to solve the problem at hand, all you really want to know is whether he knows the difference between supervised and unsupervised learning.
In short, don’t worry about what the applicant can do, just worry about how much he can memorize and regurgitate. Oh! And have him describe the Apriori algorithm during the phone interview.

Rule 2: Thou shalt use a fishing net to grab as many as you can, and figure out what to do with them later.

Have like 10 or 15 openings for the same post. Be very vague about what exactly you expect these people to do. Better yet, have a complete lack of understanding of what your problems are and how you think Data Scientists can help you solve them. Just be sure to mention you have tons of data. Yeah, that’ll make them bite.

Rule 3: Thou shalt put ‘Data Scientist’ even if what you really want is a code monkey.

Well, why not? I mean, you can’t attract developers to work for you with ‘We need you to work 12 hours a day, 7 days a week, 365 days a year’. But hey, if you just change that position from Software Developer to Data-Scientist, lo and behold, the bees come flying to your honeypot. And if they can fix your crappy website CSS code on the side while doing data science-y stuff, it’s a win-win.

Rule 4: Thou shalt never mention salary range or benefits.

It’s not like Data-Scientists are in hot demand or anything. They should be grateful you even put up a job post for them to see and apply. And who do they think they are, demanding top-notch compensation for the effort and hard work they put into acquiring their skills? And if you really think about it, free laundry is all the benefits someone needs anyways.

Rule 5: Thou shalt not care about the age vs. experience paradox.

We want a PhD with 10 years of work experience, who’s young and has the zeal and energy of someone just out of college, coz you know we need his skills to make more people click our in-your-face video pop-up ads. Plus we really can’t just say under-30 single male (that would get us sued), so we just go with young, energetic, likes to work in a startup environment, doesn’t mind staying late at the office (hey, free pizza and sodas!).

Rule 6: Thou shalt extol the virtues of working in a startup in a way that would make a Bangladeshi garment factory owner blush.

  • Long, never-ending hours at work - CHECK
  • Jack-of-all-trades job duties - CHECK
  • Low pay but promise of stock options - CHECK
  • No real usable health care coverage - CHECK
  • Foosball/Ping-Pong table - CHECK
  • Screwing over loyal employees by selling and cashing out - PRICELESS

Rule 7: Thine post shalt be scattered with worthless terms like web-scale, big data, well-funded startup.

As if terms like ‘leverage’ and ‘synergy’ were not enough, our dear data scientist must know how to work with ‘BIG DATA’. The more meaningless and worthless terms in our job posting, the better; they will allow us to hire the crème de la crème of analytics talent. Also throw in the fact that all the founders have PhDs in bio-informatics or AI or machine learning etc., coz you know that is so critical to have when it comes to effective leadership.
Also when was the last time someone advertised themselves as a ‘piss-poorly funded startup’?

Rule 8: Thine Data-Scientists must know every programming language under the sun (even the ones not invented so far).

Coz you know programming is where it’s at. If you can’t code, you can’t do jack. And what do you mean you only know R or Python? All the cool kids are using Ruby or Node.js. And don’t tell us you can’t write enterprise applications using Java/J2EE/EJBs. Oh! And please do explain in great detail how a Hashtable works. No job interview is complete without it. In short, if you see an IP address, the first thing that should cross your mind is Visual Basic GUI.

Rule 9: Hadoop Hadoop Hadoop, wait I’m forgetting something, Ah! yes Hadoop.

GEORGE: “Why don’t they have Hadoop in the mix?”

JERRY: “What do you need Hadoop for?”

GEORGE: “Hadoop is now the number one data crunching framework in America.”

JERRY: “You know why? Because people like to say ‘Hadoop.’ ‘Excuse me, do you have any Hadoop?’ ‘We need more Hadoop.’ ‘Where is Hadoop? No Hadoop?’”

Rule 10: Thine Data-Scientist should be able to code, design web-apps, be an Agile Scrum master, Software Architect, Project Manager, Product Manager, Sales/Marketing Guru. Did I mention a unicorn?

OK, no satire on this one. Just straight-up practical advice. Data scientists are not all-knowing superhuman beings. Figure out what it is that your organization wants to do with data, and hire well-trained and decently experienced people who can solve your challenges. If you are looking for novice employees, make sure the job has enough breathing room for them to grow into it gradually. And lastly, don’t go looking for unicorns, because a) they don’t exist, and b) the best thing they can do is make 6-8 year old girls scream at a high pitch.

Footnote: The difference between the percentage of animals harmed in writing this post and the percentage of animals that would have been harmed had I not written this post is not statistically significant.