Analysis of Internet Movie Database (source: Kaggle) for an imaginary strategic objective for expanding the Disney+ Platform.
Disney+ (pronounced Disney Plus) is an on-demand streaming service and a division of The Walt Disney Company. It distributes movies and TV series produced in-house, in addition to the Fox catalog of both new and classic movies and shows.
Target audience & why it is apt for this presentation
The primary audience is Disney+ executives, the key decision-makers for Disney+ strategy and execution. The executives are looking for analysis of the IMDb database as input to their upcoming strategy planning exercise…
This paper shares my takeaways, best practices, and steps to mitigate unfairness and bias when designing machine learning algorithms.
Data science projects go wrong due to flawed models, insufficiently or incorrectly trained algorithms, or emergent bias in new or unanticipated contexts. Fairness is a human, not a mathematical, decision, grounded in shared ethical beliefs. While machine learning does not make decisions based on feelings and emotions, it does inherit many human biases, leading to disparate impact. In this era where consequential decisions are algorithm-based, it is imperative that they are fair and that bias is not perpetuated without users' knowledge. …
This paper uses the Fargo Health Group dataset to forecast the expected 2014 demand for heart examinations at Abbeville Health Center. It outlines how a business problem can be solved using a data-driven decision-making approach and explains the methodology, the model leveraged, the ethical implications, and recommendations for Fargo Health Group.
Fargo Health Group faces the following business problems:
Diamond pricing involves a complex mechanism influenced by multiple factors such as carat, cut, color, and clarity. This article analyzes the correlation between these factors and price and illustrates it with visualizations.
Exploratory data analysis
The diamond.csv dataset (R) includes approximately 54K observations with 10 variables: carat, cut, color, clarity, depth, table, price, x (length in mm), y (width in mm), and z (depth in mm). Overall, it is a clean dataset with no missing values or messy data.
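As a minimal sketch of the kind of correlation check described above, the snippet below computes a Pearson correlation coefficient between carat and price. It is written in Python rather than R purely for illustration. The `pearson` helper and the five toy rows are assumptions made for this example; the values are not taken from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy rows, NOT taken from the diamonds dataset.
carat = [0.3, 0.5, 0.7, 1.0, 1.5]
price = [500, 1200, 2300, 5000, 11000]

r = pearson(carat, price)
print(f"carat vs. price correlation: {r:.3f}")
```

A value of `r` close to 1 would indicate a strong positive linear relationship. On the real data, the same check could be repeated for each numeric variable against price.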
Structure of the dataset (R lang)
This paper does a critical review of literature on the topic of ethics in Web Scraping.
Web Scraping Explained
Big Web Data is dynamic content including HTML tables, blogs, tweets, photos, audio, videos, and other structured and unstructured data. It evolves at extreme velocity and has high volume and variety.
Web scraping, a revolutionizing research practice, is described as the automated method of extracting and harvesting publicly available web data (Luscombe et al., 2021). Macapinlac (2019) defines a web scraper as a bot involving components of website analysis, web crawling, and data organization. A web request to retrieve data is sent to the…
Most businesses have untapped volumes of structured, semi-structured, and unstructured text-based data from internal and external sources. In a small-shop setup, the owner or proprietor would eyeball such data to get a pulse of customer sentiment. This would, however, be both inaccurate and expensive. Given the storm of data brought by Big Data, it is cumbersome, time-consuming, and nearly impossible for humans to do this manually.
They are all the same terms used interchangeably. Text analysis is a machine learning technique that helps efficiently mine enormous volumes of data in a scalable, unbiased, and consistent fashion, extracting valuable insights, trends, and patterns…
Big Data is the fourth technological revolution in agriculture (the earlier ones being the industrial, green, and biotechnology revolutions). Today, Smart Information Systems (SIS) are used in almost all agribusiness operations, from seeding to harvesting, and are expected to play a big role in the future. This literature review evaluates the advantages, sociotechnical system impacts, and ethical issues of artificial intelligence and data-driven farming analytics for the agricultural industry and farming practices. Mark (2019) mentions that, at the time of his writing, key journals on agricultural and environmental ethics lacked published research on the domain.
Approximately 27% of the world population is…
People v. Collins was a 1968 American robbery trial noted for its misuse of probability and as an example of the prosecutor’s fallacy (Wikipedia, 2021).
After a mathematics instructor testified about the multiplication rule for probability, though ignoring conditional probability, the prosecutor invited the jury to consider the probability that the accused (who fit a witness’s description of a black male with a beard and mustache and a Caucasian female with a blond ponytail, fleeing in a yellow car) were not the robbers, suggesting that they estimated the odds as:
Black man with beard 1 in 10
Man with mustache 1 in…
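To see why applying the multiplication rule while ignoring conditional probability is a problem, consider a minimal sketch with two overlapping traits. All counts below are hypothetical and chosen only for illustration; they are not the figures used in the trial. The naive product treats the traits as independent, while the correct joint probability uses P(A and B) = P(A) × P(B | A).

```python
from fractions import Fraction

# Hypothetical population of 1,000 men (illustrative counts, not case data).
population = 1000
with_beard = 100      # assumed: 1 in 10 has a beard
with_mustache = 200   # assumed: 1 in 5 has a mustache
with_both = 80        # assumed: most bearded men also have a mustache

p_beard = Fraction(with_beard, population)
p_mustache = Fraction(with_mustache, population)

# Naive multiplication rule: only valid if the two traits are independent.
naive_joint = p_beard * p_mustache

# Correct rule: condition the second trait on the first.
p_mustache_given_beard = Fraction(with_both, with_beard)
true_joint = p_beard * p_mustache_given_beard

print(f"naive joint probability: {naive_joint}")
print(f"true joint probability:  {true_joint}")
print(f"naive estimate is too small by a factor of {true_joint / naive_joint}")
```

Under these assumed counts, treating the traits as independent understates how common the combination is. Multiplying several such correlated traits together compounds the error.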
Major customers served by Transport for London (TfL) include domestic and international passengers on all its modes of transportation, from overground and underground services to cycles, river services, and everything in between. In addition, TfL customers include contract operators of bus, river, and tram services.
Guided by its slogan, “Every journey matters,” TfL has leveraged Big Data analytics to improve customer engagement. A few ideas include:
Exec Director StratEx - I bring to the table a blend of data science, finance, and strategy management skills, with 20+ years of experience in insurance & fintech.