Enterprise-Intelligence-Development-CS-666

The purpose of this course is to give students a solid foundation in the most important tools and strategies for addressing the three most common challenges in enterprise business intelligence: 1) reducing the time needed to produce insightful metrics and reports; 2) freeing data trapped inside legacy tools and federated data sources; 3) providing a centralized framework for user interaction and consumption of analysis, reports, and curated data.


Project maintained by akshayjadhav21

Enterprise Intelligence Development Project

I have addressed two business problems:

  1. Regression Problem
  2. Classification Problem

The professional deliverable includes the following sections:

  1. Definition of the business problem
  2. Objectives
  3. Approach
  4. Data Import
  5. Data Analysis
  6. Data Visualization
  7. Valuable insights from data visualization
  8. Predictive modeling
  9. Challenges/Limitations
  10. Recommendations
  11. Future Scope

Regression Problem - Boston Housing Prices

Objectives

A real estate agent was asked to help sellers sell their houses in Boston at reasonable prices, based on the details each seller provides about the house.

Seller’s housing information:

Based on these details, the project's price forecasts draw on the research outlined in the approach below.

Approach

Data analysis and visualization of past Boston housing data to find insights in both the raw data and the data supplied by sellers

Building predictive models to recommend house prices

Justification of recommended housing prices

Other necessary factors to consider for selling the house
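As a rough illustration of the data analysis step above, the sketch below computes the Pearson correlation between each candidate feature and the house price, which is one common way to see which attributes move with price. It uses plain Python only. The column names (`rooms`, `crime_rate`) and all values are made up for illustration and are not taken from the Boston Housing data or from the project's own code.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two columns."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical columns with made-up values (assumption, not real data).
features = {
    "rooms": [4.0, 5.0, 6.0, 7.0, 8.0],
    "crime_rate": [9.0, 6.5, 4.0, 3.0, 1.0],
}
price = [14.0, 19.0, 21.0, 27.0, 30.0]  # in thousands of dollars, made up

correlations = {name: pearson_r(col, price) for name, col in features.items()}

# List features from strongest to weakest linear association with price.
for name, r in sorted(correlations.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: r = {r:+.3f}")
```

A coefficient near +1 or -1 suggests a strong linear relationship worth plotting and carrying into the model; it does not by itself show that the feature causes the price to change.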

Machine Learning Algorithms

Linear Regression

Optimization Technique

Ordinary Least Squares (OLS)

Based on predictive modeling, a real estate agent can recommend reasonable house prices, which helps sellers in the Boston area sell their houses at good value.
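To make the linear regression and ordinary least squares combination named above concrete, here is a minimal sketch for a single predictor, using the closed-form OLS solution in plain Python. It is not the project's own code: the function names `fit_ols` and `predict` are hypothetical, and the data is made up rather than drawn from the Boston Housing dataset. A full model would use several predictors and usually a numerical library.

```python
def fit_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution that minimizes the sum of squared residuals:
    #   slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
    #   intercept = mean_y - slope * mean_x
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Return the fitted value for a single x."""
    return intercept + slope * x


# Made-up illustrative data (assumption, not real housing records):
# number of rooms vs. price in thousands of dollars.
rooms = [4.0, 5.0, 6.0, 7.0, 8.0]
prices = [14.0, 18.0, 22.0, 26.0, 30.0]

intercept, slope = fit_ols(rooms, prices)
estimate = predict(intercept, slope, 6.5)
print(f"price = {intercept:.2f} + {slope:.2f} * rooms")
print(f"estimated price for 6.5 rooms: {estimate:.2f}")
```

The slope is the estimated change in price per additional room, which is the kind of figure an agent could use to justify a recommended price to a seller.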

Boston Housing Dataset: Boston Housing Prices

Classification Problem - PIMA Indians Diabetes Data

Introduction

The PIMA Indians are a group of Native American people who live in the Phoenix, Arizona area. For many years they have lived with a poor diet, in which carbohydrate deficiency appears to be common, and as a result type 2 diabetes affects children as well as adults.

To deal with such a widespread and deadly disease, many healthcare organizations are working to improve how diabetes is diagnosed in children and adults. Their goals are to mitigate the future risk of diabetes, to shorten the time needed for diagnosis by identifying people with diabetes as quickly and accurately as possible, and to provide proper treatment that reduces the severity of the complications associated with the disease.

Objectives

The client wants to correctly classify people as diabetic or non-diabetic, so that valuable time can be invested in the proper treatment of diabetic patients rather than in diabetes testing.

Based on this business problem, the project's forecasts draw on the research outlined in the approach below.

Approach

Data analysis and visualization of the PIMA Indians diabetes data to find insights into the important factors and their contributions

Building predictive models to identify whether a person has diabetes

Justification, including the medical attributes that contribute to identifying a person as diabetic

Recommendation for Client

Machine Learning Algorithms

  1. Logistic Regression
  2. K-Nearest Neighbor (KNN)
  3. Random Forest
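Of the three algorithms listed, K-Nearest Neighbor is the simplest to show end to end. The sketch below is a plain-Python illustration, not the project's implementation: the function name `knn_predict`, the two features (a glucose reading and a BMI value), and every data point are assumptions made up for the example. In practice the features would normally be scaled first so that no single attribute dominates the distance.

```python
import math
from collections import Counter


def knn_predict(train_rows, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    if k < 1 or k > len(train_rows):
        raise ValueError("k must be between 1 and the number of training rows")
    # Euclidean distance from the query to every training row.
    distances = [
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    # Count the labels of the k closest rows and return the most frequent.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up rows of (glucose, BMI); label 1 = diabetic, 0 = non-diabetic.
train_rows = [
    (85, 22.0), (90, 24.5), (95, 26.0),
    (150, 33.0), (160, 35.5), (170, 31.0),
]
train_labels = [0, 0, 0, 1, 1, 1]

high_risk = knn_predict(train_rows, train_labels, (155, 34.0), k=3)
low_risk = knn_predict(train_rows, train_labels, (88, 23.0), k=3)
print("prediction for (155, 34.0):", high_risk)
print("prediction for (88, 23.0):", low_risk)
```

Choosing an odd `k` avoids tied votes when there are only two classes.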

Based on predictive modeling, the client should be able to make a confident decision, using the performance of the final model at distinguishing diabetic from non-diabetic people.
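One way to judge the performance of a final model is to compare its predictions with the known outcomes and summarize the result as accuracy, precision, and recall. The sketch below shows that calculation in plain Python. The function names and the label lists are illustrative assumptions, not results from this project.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fp, tn, fn) counts for a binary classifier."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # diabetic, correctly flagged
            else:
                fp += 1  # non-diabetic, wrongly flagged
        else:
            if a == positive:
                fn += 1  # diabetic, missed by the model
            else:
                tn += 1  # non-diabetic, correctly cleared
    return tp, fp, tn, fn


def summarize(actual, predicted, positive=1):
    """Return accuracy, precision, and recall as a dictionary."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels for eight people (1 = diabetic, 0 = non-diabetic).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = summarize(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

For a screening problem like this one, recall deserves particular attention, because a false negative is a diabetic person the model fails to identify.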

PIMA Indians Diabetes Dataset: PIMA Indians Diabetes Dataset

Reference Books

1) An Introduction to Statistical Learning (supervised learning models) by James, Witten, Hastie, and Tibshirani. Available for free online: http://www-bcf.usc.edu/~gareth/ISL

2) The Elements of Statistical Learning (deeper mathematical explanations) by Hastie, Tibshirani, and Friedman. Available for free online: http://www.stat.stanford.edu/ElemStatLearn

3) R for Data Science by Wickham and Grolemund. Available for free online: https://r4ds.had.co.nz/

4) Automate the Boring Stuff with Python by Al Sweigart. Available for free online: https://automatetheboringstuff.com/

5) Think Python by Allen B. Downey. Available for free online: http://greenteapress.com/thinkpython/html/index.html