Exploring the Field of Data Science
Today we will explore the field of Data Science with a focus on the financial industry. Navigating the data science market can be hard. The role comes in many variations, and they are not always cleanly packaged. I had the opportunity to interview Smith (name changed), Chief Data Scientist for a Fortune 500 company. This presentation-style article summarizes the insights gained from the interview.
LinkedIn ranked Data Scientist as the best job of 2021, with 41% year-over-year growth, while Glassdoor ranked it second best. The US Bureau of Labor Statistics (BLS) projects that jobs for computer scientists will grow 15% over the next decade, which is three times the growth rate of the American labor market.
So, what does it take to be successful in this field? First, a strong theoretical foundation. Next, a good grasp of algorithms and the knowledge to select among them aptly, along with practical, real-world experience in use cases. The ability to think beyond the obvious cannot be overstated.
Next, let’s talk about the best way of getting started in this field. Most jobs require a Master’s degree in Computer Science or a related field, which covers machine learning, AI, and analytics. Smith suggests picking a specialization such as robotics, visualization, or modeling. Some aspirants may hold PhDs or have a mash-up of skills combining business and the social sciences. It is important to build a portfolio of personal projects based on real-world use cases, ideally focused on a specific industry domain.
Let’s see the skills, knowledge, and training required. For his own role, Smith cites domain knowledge of the financial industry, payments technology, and fraud management; the skills to refine risk models for any new anomalies; and training in MapReduce, Spark, Kafka, Flink, Scala, and Snowflake.
Let’s discuss the approach Smith takes on Data Science projects. Ideas propagate from the business side. The Business Analytics team facilitates conversations via ideation workshops. They engage the team to understand why they have the data questions and what they are trying to uncover. Projects are then prioritized based on the business value as well as the complexity, feasibility, and timeline criteria.
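One way to picture the prioritization step is a simple weighted score across the four criteria mentioned above. This is only an illustrative sketch: the interview names the criteria but not how they are weighed, so the weights, the 1-to-5 scales, the `ProjectIdea` class, and the sample project names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights: the article lists the criteria, not their relative importance.
WEIGHTS = {"business_value": 0.4, "feasibility": 0.3, "complexity": 0.2, "timeline": 0.1}


@dataclass
class ProjectIdea:
    name: str
    business_value: int  # 1 (low) to 5 (high)
    feasibility: int     # 1 (hard to deliver) to 5 (easy to deliver)
    complexity: int      # 1 (simple) to 5 (very complex)
    timeline: int        # 1 (short) to 5 (long)


def priority_score(idea: ProjectIdea) -> float:
    # Complexity and a long timeline count against a project,
    # so both are inverted (6 - x) before weighting.
    return (
        WEIGHTS["business_value"] * idea.business_value
        + WEIGHTS["feasibility"] * idea.feasibility
        + WEIGHTS["complexity"] * (6 - idea.complexity)
        + WEIGHTS["timeline"] * (6 - idea.timeline)
    )


def prioritize(ideas):
    # Highest score first.
    return sorted(ideas, key=priority_score, reverse=True)


# Made-up ideas, as might come out of an ideation workshop.
ideas = [
    ProjectIdea("real-time risk scoring", 5, 2, 5, 5),
    ProjectIdea("churn dashboard", 2, 5, 1, 1),
    ProjectIdea("fraud alert tuning", 5, 4, 3, 2),
]

for idea in prioritize(ideas):
    print(f"{idea.name}: {priority_score(idea):.2f}")
```

In practice the scores would come out of the workshop conversations themselves, and a team may well prefer a plain discussion over a formula. The sketch only shows how value can be traded off against effort.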
Let’s see how Smith prepares and gathers project information. They start by understanding the business context and then dive deep to frame the problem statement. Information needs are identified in collaboration with experts. Sometimes data is available internally, and sometimes it has to be sourced.
Let’s look at the importance of communication. Smith shares that it is important for business and IT visions to be aligned, with leadership driving a data-driven culture and expressing a sense of urgency. Speak the language of the business, set clear expectations on timelines, and be transparent. As interesting patterns are identified, validate them before any deep analysis. Finally, craft compelling stories and justify actionable recommendations.
Let’s see the different types of communications Smith describes. These include brainstorming, storytelling, facilitating, coaching, and influencing across executive and cross-functional teams.
Now, let’s see what’s involved in undertaking a data science project. The lifecycle starts with defining goals, gathering relevant data from identified sources, cleaning and enriching the data, and exploring it for insights, followed by creating models. Feedback loops are established to adjust the model. Once the model is launched, there is ongoing training with new data and fine-tuning so it does not degrade over time.
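The last step, keeping a launched model from degrading, can be sketched as a small monitoring check. This is an assumption-laden illustration, not Smith's actual process: the function name `needs_retraining`, the 0.05 tolerance, and the accuracy figures are all made up for the example.

```python
from statistics import mean


def needs_retraining(baseline_accuracy, recent_accuracies, tolerance=0.05):
    """Return True when average recent accuracy falls more than
    `tolerance` below the accuracy measured at launch."""
    if not recent_accuracies:
        return False  # nothing to compare against yet
    return baseline_accuracy - mean(recent_accuracies) > tolerance


# Made-up weekly accuracy readings for a model launched at 0.92 accuracy.
weekly_accuracy = [0.91, 0.90, 0.86, 0.83, 0.81]

early_check = needs_retraining(0.92, weekly_accuracy[:2])   # first two weeks
later_check = needs_retraining(0.92, weekly_accuracy[-3:])  # last three weeks
print(early_check, later_check)
```

A real feedback loop would track whichever metric the business cares about and would usually watch the input data for drift as well. The point here is only that "does not degrade over time" implies a measurable threshold that triggers retraining.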
Smith shared a few emerging trends. These include the use of natural language processing and the move toward cloud computing. He shares that today there are more complex and curated datasets that blend and correlate demographic data, social statistics, and even geospatial data. Finally, AI governance is taking a front seat, aligning engineering practices like DataOps, ModelOps, and DevOps.
Smith shares some of the best ways to gain knowledge. He feels that self-study works for him. He places high value on practicing what he learns in addition to staying connected with peers and mentors.
A typical workday for Smith starts with reviewing the data and model pipelines, running status checks, attending standup meetings, reviewing model designs, and performing regression testing. Communicating and collaborating with the business makes up 50% of his day. He also reads articles pertaining to the payments industry in his spare time.
Smith shares a few golden nuggets of advice for anyone aspiring to a career in this field. He says that you first need to love data. It matters to keep a student’s attitude regardless of how long you have been in the industry and to keep abreast of technology changes. He suggests being self-aware, keeping the ego in check, and surrounding yourself with people who will help you grow. He adds that one should not be emotionally attached to work done in the past and should be open to change. Finally, work hard and balance life. That brings me to the end of my presentation. We learned about the ins and outs of a career in Data Science.
References
Level Up your Career: These are the Best Jobs for 2021. 2021. https://www.glassdoor.com/research/best-jobs-in-america-for-2021/#
The Best Jobs in America in 2020. 2020. https://www.glassdoor.com/blog/the-best-jobs-in-america-2020/
US Bureau of Labor Statistics. 2021. https://www.bls.gov/ooh/computer-and-information-technology/computer-and-information-research-scientists.htm