Analysis of the Internet Movie Database (source: Kaggle) for a hypothetical strategic objective: expanding the Disney+ platform.

About Disney+

Disney+ (pronounced Disney Plus) is an on-demand streaming service and a division of The Walt Disney Company. It distributes movies and TV series produced in-house, in addition to the Fox catalog of new and classic movies and shows.

Target audience & why it is apt for this presentation

The primary audience is Disney+ executives, the key decision-makers for Disney+ strategy and execution. The executives are looking for an analysis of the IMDb database as input to their upcoming strategy planning exercise…

Understanding the sources of unfairness in data-driven decisions

This paper shares my takeaways, best practices, and steps to mitigate unfairness and bias when designing machine learning algorithms.

Data science projects go wrong because of flawed models, insufficiently or incorrectly trained algorithms, or emergent bias in new or unanticipated contexts. Fairness is a human decision, not a mathematical one, grounded in shared ethical beliefs. While machine learning does not make decisions based on feelings and emotions, it does inherit many human biases, which can lead to disparate impact. In this era where consequential decisions are algorithm-based, it is imperative that those decisions are fair and that bias is not perpetuated without users' knowledge. …
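One common way to put a number on disparate impact is to compare the rate of favorable outcomes between two groups. The sketch below is only an illustration: the function names, the two groups, and the decision lists are hypothetical, and the 0.8 threshold is the widely used "four-fifths" rule of thumb rather than anything taken from the paper.

```python
def selection_rate(outcomes):
    """Fraction of favorable (1) outcomes in a list of 0/1 decisions."""
    return sum(outcomes) / len(outcomes)


def disparate_impact_ratio(protected, reference):
    """Protected group's selection rate divided by the reference group's."""
    return selection_rate(protected) / selection_rate(reference)


# Hypothetical model decisions (1 = approved, 0 = denied) for two groups.
group_a = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]  # 3 of 10 approved
group_b = [1, 1, 0, 1, 1, 0, 1, 0, 1, 0]  # 6 of 10 approved

ratio = disparate_impact_ratio(group_a, group_b)
print(f"Disparate impact ratio: {ratio:.2f}")
# A ratio under 0.8 is often treated as a signal worth investigating.
print("Below the 0.8 rule of thumb" if ratio < 0.8 else "At or above 0.8")
```

A low ratio does not by itself prove unfairness; it flags a gap that still needs human judgment about context and cause.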

ARIMA Model, Holt-Winters Double Exponential Smoothing, Bayesian Linear Regression

Photo Credits: Hush Naidoo.

This paper uses the Fargo Health Group dataset to forecast the demand for heart examinations expected in 2014 at Abbeville Health Center. It outlines how a business problem can be solved with a data-driven decision-making approach and explains the methodology, the models used, the ethical implications, and the recommendations for Fargo Health Group.
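As a rough illustration of one of the techniques named in the title, double exponential smoothing tracks a level and a trend and projects both forward. The sketch below is a minimal hand-written version, not the paper's model: the function name `holt_forecast`, the smoothing weights, and the monthly counts are all hypothetical and are not taken from the Fargo Health Group data.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Double exponential smoothing (level + trend, no seasonality).

    alpha smooths the level, beta smooths the trend; both lie in (0, 1).
    Returns a list of `horizon` point forecasts.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Simple initialization: first value as level, first difference as trend.
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + step * trend for step in range(1, horizon + 1)]


# Hypothetical monthly examination counts (illustrative only).
monthly_exams = [100, 110, 120, 130, 140, 150]
forecast = holt_forecast(monthly_exams, alpha=0.5, beta=0.5, horizon=3)
print(forecast)
```

In practice the smoothing weights would be tuned against held-out months, and the result compared with the other candidate models before choosing one.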

1 Problem Statement

Fargo Health Group faces the following business problems:

  • Health Center (HC) staffing is not optimal to meet the overall demand. HCs do not have the capacity to meet the deadline of <30 days for examinations due to a lack of examining physicians and other human resource constraints.
  • Significant amount of revenue is…

Analysis of diamond dataset to discover patterns & behaviors based on categorical and continuous features

Photo Credits: Edgar Soto.

Diamond pricing involves a complex mechanism influenced by multiple factors such as carat, cut, color, and clarity. This article analyzes the correlation between these factors and price and depicts it with visualizations.

Exploratory data analysis
The diamond.csv dataset (R) includes approximately 54K observations of 10 variables: carat, cut, color, clarity, depth, table, price, x (length in mm), y (width in mm), and z (depth in mm). Overall, it is a clean dataset with no missing values or messy data.
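For the continuous variables, the correlation mentioned above can be measured with the Pearson coefficient. The sketch below uses only the Python standard library; the helper name `pearson` and the five carat and price values are hypothetical placeholders, not rows from diamond.csv.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


# Hypothetical sample values (illustrative only, not from the dataset).
carat = [0.3, 0.5, 0.7, 1.0, 1.5]
price = [400, 1200, 2500, 5000, 11000]

r = pearson(carat, price)
print(f"carat vs. price correlation: {r:.3f}")
```

Categorical features such as cut, color, and clarity need a different treatment, for example comparing price distributions per category.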

Structure of the dataset (R lang)

Data visualizations created for academic purposes in Python

PayPal has long been the go-to for online as well as in-store payments, with a wallet share of 48% across all payment categories. Its nearest competitor, Apple Pay, has a 15% market share, followed by Google Pay at 11%.

On the platform end, Android smartphone users dominate the market at 73%, followed by the affluent Apple users at 26%.


Photo Credits: Mauro Mora

About Face Recognition Technology

In the modern world, our face serves as our identity, much like a fingerprint. Face recognition technology has gained attention in the last decade. It uses biometric techniques to detect faces (of both humans and pets) in an image or at a location, mimicking how humans recognize faces and classify gender and race. Artificial Intelligence (AI) enables detecting faces with cameras and comparing them against a searchable database of faces, using sophisticated methods that analyze the depth of the eyes, the angle of the jawline, and other facial traits. The photos are usually passport-size images used for driver's licenses, passports, and other IDs. Additional…

Animated visualizations of the correlation between GDP per capita and Life Expectancy

Bubble Chart

Choropleth Map

Understanding legal considerations and ethics in web scraping

This paper critically reviews the literature on the ethics of web scraping.

Web Scraping Explained
Big Web Data is dynamic content that includes HTML tables, blogs, tweets, photos, audio, videos, and other structured and unstructured data. It evolves at extreme velocity and has high volume and variety.

Web scraping, a revolutionary research practice, is described as the automated method of extracting and harvesting publicly available web data (Luscombe et al., 2021). Macapinlac (2019) defines a web scraper as a bot with three components: website analysis, web crawling, and data organization. A web request to retrieve data is sent to the…
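To make the data organization component concrete, the sketch below turns an HTML table into a list of records using Python's built-in `html.parser`. It deliberately skips the request and crawling steps: the HTML is a hypothetical inline string, and the class name `TableExtractor` is an illustrative choice, not something from the cited literature.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical page content; a real scraper would first fetch this.
SAMPLE_HTML = """
<table>
  <tr><th>Title</th><th>Year</th></tr>
  <tr><td>Example A</td><td>2019</td></tr>
  <tr><td>Example B</td><td>2021</td></tr>
</table>
"""

parser = TableExtractor()
parser.feed(SAMPLE_HTML)
header, *records = parser.rows
organized = [dict(zip(header, record)) for record in records]
print(organized)
```

Before fetching real pages, a scraper should also check the site's terms of use and robots.txt, which is where the legal and ethical questions reviewed in this paper come in.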

Concept introduction, application, approach, and methodology

Photo Credits: Piotr Łaskawski

Most businesses have untapped volumes of structured, semi-structured, and unstructured text-based data from internal and external sources. In a small-shop setup, the owner or proprietor would eyeball such data to get a pulse of customer sentiment. That approach, however, is both inaccurate and expensive. Given the storm of data brought by Big Data, it is cumbersome, time-consuming, and nearly impossible for humans to do this manually.

What is Text Analysis, Text Analytics or Text Mining?

The three terms are used interchangeably. Text analysis is a machine learning technique that helps mine enormous volumes of data efficiently, in a scalable, unbiased, and consistent fashion, extracting valuable insights, trends, and patterns…
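A very small example of extracting patterns from text is counting the most frequent meaningful terms across a set of documents. The sketch below is a simplification that uses plain counting rather than machine learning; the stopword list, the function name `top_terms`, and the three reviews are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; real projects use a much fuller one.
STOPWORDS = {"the", "a", "an", "and", "is", "was", "to", "of", "it", "but"}


def top_terms(documents, n=3):
    """Return the n most frequent non-stopword terms across all documents."""
    counts = Counter()
    for text in documents:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(token for token in tokens if token not in STOPWORDS)
    return counts.most_common(n)


# Hypothetical customer reviews (illustrative only).
reviews = [
    "The delivery was fast and the packaging was great",
    "Great product but the delivery was late",
    "Delivery is fast, product is great",
]

terms = top_terms(reviews)
print(terms)
```

Term counts like these are typically a first step before richer techniques such as sentiment scoring or topic modeling.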

Poonam Rao

Exec Director StratEx - I bring to the table a blend of data science, finance, and strategy management skills, with 20+ years of experience in insurance & fintech.
