Analysis of Internet Movie Database (source: Kaggle) for an imaginary strategic objective for expanding the Disney+ Platform.

About Disney+

Disney+ (pronounced Disney Plus) is an on-demand streaming service and a division of The Walt Disney Company. It distributes movies and TV series produced in-house, in addition to the Fox catalog of both new and classic movies and shows.

Target audience & why it is apt for this presentation

The primary audience is Disney+ executives, the key decision-makers for Disney+ strategy and execution. The executives are looking for analysis of the IMDb database as input to their upcoming strategy planning exercise…


Understanding the sources of unfairness in data-driven decision

Image Source: https://www.infoclutch.com/installed-base/artificial-intelligence-big-data-both-together/

Abstract

This paper shares my takeaways, best practices, and steps to mitigate unfairness and bias when designing machine learning algorithms.

Data science projects go wrong due to flawed models, insufficiently or incorrectly trained algorithms, or emergent bias in new or unanticipated contexts. Fairness is a human, not a mathematical, decision, grounded in shared ethical beliefs. While machine learning does not make decisions based on feelings and emotions, it does inherit many human biases, leading to disparate impact. In this era where consequential decisions are algorithm-based, it is imperative that they are fair and that bias is not perpetuated without users' knowledge. …


ARIMA Model, Holt-Winters Double Exponential Smoothing, Bayesian Linear Regression

Photo Credits: Hush Naidoo. Source: https://unsplash.com/photos/yo01Z-9HQAw

Preface

This paper uses the Fargo Health Group dataset to forecast the demand for heart examinations expected in 2014 at Abbeville Health Center. It outlines how a business problem can be solved with a data-driven decision-making approach and explains the methodology, the model used, ethical implications, and recommendations for Fargo Health Group.
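As a rough illustration of one of the techniques named in the title, double exponential smoothing (Holt's linear trend method) can be sketched in a few lines of Python. The function name `holt_forecast`, the smoothing parameters `alpha` and `beta`, and the monthly counts are all assumptions made for illustration; they are not taken from the Fargo Health Group dataset or from the model used in the paper.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Forecast `horizon` steps ahead with double exponential smoothing.

    `alpha` smooths the level and `beta` smooths the trend (both in 0..1).
    """
    if len(series) < 2:
        raise ValueError("need at least two observations to initialise the trend")

    # Simple initialisation: first value as level, first difference as trend.
    level = series[0]
    trend = series[1] - series[0]

    for y in series[1:]:
        prev_level = level
        # Blend the new observation with the previous one-step-ahead forecast.
        level = alpha * y + (1 - alpha) * (level + trend)
        # Blend the latest change in level with the previous trend estimate.
        trend = beta * (level - prev_level) + (1 - beta) * trend

    # Project the final level forward along the final trend.
    return [level + h * trend for h in range(1, horizon + 1)]


# HYPOTHETICAL monthly examination counts, invented for this sketch only.
monthly_exams = [120, 132, 128, 141, 150, 149, 158]
print(holt_forecast(monthly_exams, alpha=0.5, beta=0.3, horizon=3))
```

In practice `alpha` and `beta` would be tuned against held-out months rather than fixed by hand, and this variant handles trend only, not seasonality.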

1 Problem Statement

Fargo Health Group faces the following business problems:

  • Health Center (HC) staffing is not optimal to meet the overall demand. HCs do not have the capacity to meet the deadline of <30 days for examinations due to lack of examining physicians and human resource constraints.
  • Significant amount of revenue is…


Analysis of diamond dataset to discover patterns & behaviors based on categorical and continuous features

Photo Credits: Edgar Soto. https://unsplash.com/photos/gb0BZGae1Nk

Diamond pricing involves a complex mechanism influenced by multiple factors such as carat, cut, color, and clarity. This article analyzes the correlation between these factors and price and illustrates it with visualizations.

Exploratory data analysis
The R diamond.csv dataset includes approximately 54K observations with 10 variables: carat, cut, color, clarity, depth, table, price, x (length in mm), y (width in mm), and z (depth in mm). Overall, it is a clean dataset with no missing values or messy data.

Structure of the dataset (R lang)


Understanding legal considerations and ethics in web scraping

Purpose
This paper critically reviews the literature on ethics in web scraping.

Web Scraping Explained
Big Web Data is dynamic content including HTML tables, blogs, tweets, photos, audio, videos, and other structured and unstructured data. It evolves at extreme velocity and has high volume and variety.

Web scraping, a revolutionizing research practice, is described as the automated method of extracting and harvesting publicly available web data (Luscombe et al., 2021). Macapinlac (2019) defines a web scraper as a bot that involves components of website analysis, web crawling, and data organization. A web request to retrieve data is sent to the…


Concept introduction, application, approach, and methodology

Photo credits Piotr Łaskawski, Unsplash.com

Most businesses have untapped volumes of structured, semi-structured, and unstructured text-based data from internal and external sources. In a small-shop setup, the owner or proprietor would eyeball such data to get a pulse of customer sentiment. This would, however, be both inaccurate and expensive. Given the storm of data brought by Big Data, it is cumbersome, time-consuming, and nearly impossible for humans to do this manually.

What is Text Analysis, Text Analytics or Text Mining?

These terms all mean the same thing and are used interchangeably. Text analysis is a machine learning technique that helps efficiently mine enormous volumes of data in a scalable, unbiased, and consistent fashion, extracting valuable insights, trends, and patterns…


About AgAnalytics

Big Data is the fourth technological revolution in agriculture (the previous three being industrial, green, and biotechnology). Today, Smart Information Systems (SIS) are used in almost all agribusiness operations, from seeding to harvesting, and are expected to play a big role in the future. This literature review evaluates the advantages, sociotechnical system impacts, and ethical issues of artificial intelligence and data-driven farming analytics for the agricultural industry and farming practices. Mark (2019) mentions that, at the time of his writing, key journals[1] on agricultural and environmental ethics lacked published research on the domain.

Approximately 27% of the world population is…


People v. Collins was a 1968 American robbery trial noted for its misuse of probability and as an example of the prosecutor’s fallacy. (Wikipedia, 2021)

Trial (Source: Wikipedia, 2021)

After a mathematics instructor testified about the multiplication rule for probability, though ignoring conditional probability, the prosecutor invited the jury to consider the probability that the accused (who fit a witness’s description of a black male with a beard and mustache and a Caucasian female with a blond ponytail, fleeing in a yellow car) were not the robbers, suggesting that they estimated the odds as:

Black man with beard 1 in 10
Man with mustache 1 in…
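The gap between the naive multiplication rule and the general rule that accounts for conditional probability can be illustrated with a small sketch. Only the 1-in-10 figure comes from the excerpt; the two mustache probabilities are hypothetical values chosen purely for illustration and are not figures from the trial.

```python
from fractions import Fraction

# From the excerpt: "Black man with beard 1 in 10".
p_beard = Fraction(1, 10)

# HYPOTHETICAL values for illustration only (not from the trial record).
p_mustache = Fraction(1, 5)              # assumed share of men with a mustache
p_mustache_given_beard = Fraction(4, 5)  # assumed: most bearded men also have one

# Naive rule: P(A and B) = P(A) * P(B), valid only if A and B are independent.
naive = p_beard * p_mustache

# General rule: P(A and B) = P(A) * P(B | A), valid for dependent traits too.
conditional = p_beard * p_mustache_given_beard

print("naive (assumes independence):", naive)
print("with conditional probability:", conditional)
print("naive rule understates the joint probability by a factor of", conditional / naive)
```

Under these assumed numbers, treating correlated traits as independent makes the combination look rarer than it is, which is the kind of error the multiplication-rule testimony invited.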


Jay Wennington. https://unsplash.com/s/photos/tfl

Major customers served by Transport for London (TfL) include domestic and international passengers on all its modes of transportation, from overground and underground services to cycles, river services, and everything in between. In addition, TfL customers include contract operators of bus, river, and tram services.

With its slogan of “Every journey matters,” TfL has leveraged Big Data analytics to improve customer engagement. A few ideas include:

  • Guide customers on choosing the most efficient and cost-saving route, especially important for international visitors. Make it easier to get cash back on Oyster Card balance at the end of their trips.
  • Provide staff…

Poonam Rao

Exec Director StratEx - I bring to the table a blend of data science, finance and strategy management skills with 20+ years of experience in insurance & fintech.
