People starting out in data science and analytics often ask me: ‘What are the key capabilities of a successful data science and analytics professional, now and in the future?’ In other words, what should I learn if I want to become a really good data science and analytics professional?
When it comes to building a strong foundational skill set in data science and analytics, I normally suggest that people consider what I call the 4Ts of Data Science and Analytics skills.
The 4Ts of Data Science Skills revolve around four core competencies that every analyst should possess: Tool, Techniques, Tactic and Tact.
The first T relates to Tool. That is, what are the key tools or software I should learn in order to do my job now and in the future?
Here I usually recommend that people move away from the more common discussion around ‘this is the best tool to start my career…’ or ‘… a lot of people are starting to use this tool now, so I should follow suit and learn it too…’. Obviously, it’s important to keep on top of new developments in the space and of which tools are most popular. But I also recommend stepping back and looking at how data science and analytics is normally delivered, which is: I’ve got a problem or an opportunity, data can help me solve it, so I go on and analyse things, and then I can provide a recommendation or a solution.
This analytics workflow, the steps from defining a problem to delivering the results, can also help when deciding which tools analysts need to master. And it’s not one tool or another. I see it as a combination of different tools.
For example, many of us know that the old spreadsheet is still very widely used for data analysis, particularly when you work with small or medium businesses. When it comes to dealing with larger datasets, SQL knowledge is a must-have for any serious analyst. R and Python, along with many other open source or commercial options, do a phenomenal job in the more statistical and machine learning type tasks. And you can’t ignore the power and flexibility of some of the most popular commercial data visualisation tools out there.
So instead of saying ‘okay, I’ve got to learn this and that tool because I read about it somewhere or because it’s the flavour of the month’, step back and think of the general workflow that many organisations use, decide which tools are the most appropriate for each step, and then dedicate yourself to learning them.
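One way to make this concrete is to write the workflow down and note a candidate tool against each step. The sketch below is only an illustration: the step names, the tool choices and the `learning_plan` helper are assumptions drawn loosely from the examples above, not a recommended stack.

```python
# Hypothetical workflow steps paired with candidate tools.
# Both the step names and the tool choices are illustrative assumptions.
WORKFLOW_TOOLS = {
    "define the problem or opportunity": ["spreadsheet"],
    "extract and prepare data": ["SQL"],
    "analyse and model": ["R", "Python"],
    "communicate the recommendation": ["data visualisation tool"],
}


def learning_plan(workflow, known_tools):
    """Return the steps not covered by any known tool, with their candidate tools."""
    known = {tool.lower() for tool in known_tools}
    plan = {}
    for step, tools in workflow.items():
        # A step counts as covered if the analyst knows at least one of its tools.
        if not any(tool.lower() in known for tool in tools):
            plan[step] = list(tools)
    return plan


plan = learning_plan(WORKFLOW_TOOLS, known_tools=["Spreadsheet", "python"])
for step, tools in plan.items():
    print(f"{step}: consider learning {' or '.join(tools)}")
```

The point is the habit rather than the code: start from the steps, then see where your current toolkit leaves a gap.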
Obviously, you should also consider the infrastructure capabilities of the organisation you are working with, the expected learning curve of one tool versus another, the cost of acquisition and maintenance, and many other factors too.
The second T stands for Techniques. Once a problem or opportunity has been defined, what is the most appropriate technique to apply to solve it? Obviously, you’ve got to take into account whether you’re applying data science and analytics to build products, for example, or to provide business insights and recommendations to senior business leaders. Depending on the application, the techniques required will vary.
For example, when you’re trying to solve a marketing mix and attribution problem for marketers, I’ve found that the interpretability of your work is, at least initially, more important than having a remarkably accurate model. And many data science and analytics professionals in the industry know that there is a bit of a trade-off between interpretability and accuracy when building solutions.
In other applications, you or your client may not care that much about interpretability, and what you’re trying to optimise for is some sort of accuracy metric, because you need your solution to give the right answer most, if not all, of the time.
Also, there may be cases where predictive analytics is not even the answer. Sometimes, in business particularly, people just need you to help them explore what the existing data tells them and provide some sort of historic view of their business through a static report or interactive dashboard.
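A historic view of this kind can be very simple. As a minimal sketch, assuming sales records arrive as (order date, revenue) pairs, the snippet below totals revenue by month. The data is made up for illustration and the `monthly_revenue` helper is hypothetical; in practice the same summary might equally come from a spreadsheet pivot table or a SQL query.

```python
from collections import defaultdict
from datetime import date

# Made-up sales records for illustration only: (order date, revenue).
sales = [
    (date(2023, 1, 5), 120.0),
    (date(2023, 1, 20), 80.0),
    (date(2023, 2, 3), 150.0),
    (date(2023, 3, 14), 90.0),
    (date(2023, 3, 30), 60.0),
]


def monthly_revenue(records):
    """Sum revenue per (year, month), returned in chronological order."""
    totals = defaultdict(float)
    for day, revenue in records:
        totals[(day.year, day.month)] += revenue
    return dict(sorted(totals.items()))


summary = monthly_revenue(sales)
for (year, month), total in summary.items():
    print(f"{year}-{month:02d}: {total:.2f}")
```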
A good analyst should really understand the application and the setting they’re working in, so they can pick the right technique for the job.
The third T is for Tactic. Tactic here is about being able to define the level of formality that the work you’re performing requires. Is this a full-blown project that requires a lot of engagement with everyone in the organisation, or is this a simple descriptive analytics type project involving reports and strategic insights?
Sometimes all it takes is a couple of days of work: you’ve got a well-defined problem, the data is well structured and easily accessible, and the end product is really quick to develop.
And if it is a full-blown project, then it’s important to consider to what degree the organisation or the team you’re working with can accommodate a more agile form of delivery. I’ve found that in some cases a more iterative approach to building solutions is far more beneficial to the client and the analyst alike. You learn fast, you deliver faster and you fail fast too, if that’s needed.
The last T stands for Tact. Tact is about having the ‘skill and sensitivity to deal with a variety of stakeholders, with external solution providers, business leaders and the ability to deal with difficult issues or difficult people’. This ability is very important, if not one of the most important skills, for data science and analytics professionals now and in the future. Many surveys have found that business leaders are looking to hire analytics people who have a good balance between technical and soft skills. The successful data science professional who wants to grow in their career will invariably have to learn how to manage stakeholders, how to deal with difficult situations, how to sell their ideas, how to document their work in a way that is friendly to non-technical readers, and so forth.
I also add to this the ability to tell stories, to ask questions and to find root causes. In fact, as we as a profession become more capable in dealing with large datasets, the differentiation will come from the ability to ask questions, to interact with complex environments, to deal with ambiguity and ultimately to tell a story. So here I suggest that those who are starting out, or who’ve been in the space for a while, really get out of their comfort zone as much as possible.
I believe the 4Ts proposed above give a broader perspective on building data science and analytics capabilities, now and in the future, both for those starting out and for those already working in the profession.