

KPMG Data Science Virtual Internship Task 2 | KPMG virtual internship

I'm sharing the content of my PPT that I submitted as Task 2 solution for KPMG Virtual Internship.

Let me add some snapshots of my presentation. (It isn't very good-looking, but it concisely covers all the meaningful points.)




Now all you have to do is copy the content below and paste it into an MS PowerPoint file.

Instead of PowerPoint, you can also use a document file.

-------------------------------
Sections

1. Data exploration
2. Model development
3. Interpretation and report


Data Exploration

Understand the characteristics of the given fields in the underlying data, such as variable distributions, whether the dataset is skewed towards a certain demographic, and the validity of the data in each field. For example, a training dataset may be highly skewed towards the younger age bracket. If so, how will this impact your results when you use it to predict over the remaining customer base?
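One way to check for that kind of skew is to compare the share of each age bracket in the training data with its share in the wider customer base. The snippet below is only a rough sketch: the ages are made up for illustration, and the function name bracket_shares and the 10-year bracket width are my own assumptions, not part of the task.

```python
from collections import Counter

def bracket_shares(ages, width=10):
    """Return the share of records falling in each age bracket (e.g. '20-29')."""
    counts = Counter()
    for age in ages:
        low = (age // width) * width
        counts[f"{low}-{low + width - 1}"] += 1
    total = sum(counts.values())
    return {bracket: n / total for bracket, n in sorted(counts.items())}

# Made-up ages, for illustration only
training_ages = [22, 24, 25, 27, 29, 31, 33, 45]
customer_ages = [23, 35, 38, 41, 47, 52, 58, 64]

training_shares = bracket_shares(training_ages)
customer_shares = bracket_shares(customer_ages)

# Print both shares side by side; a large gap in a bracket suggests skew
for bracket in sorted(set(training_shares) | set(customer_shares)):
    print(bracket, training_shares.get(bracket, 0.0), customer_shares.get(bracket, 0.0))
```

If one bracket dominates the training data but not the customer base, that is worth noting as a limitation.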

There are some limitations in the given datasets: some values are missing, and some fields contain values whose data types are inconsistent.
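A quick profile of each field can surface both problems. This is a minimal sketch using made-up records; the field names and the profile helper are assumptions for illustration and do not come from the actual KPMG dataset.

```python
from collections import Counter

# Hypothetical records; None and "" stand for missing values
records = [
    {"customer_id": 1, "age": 34, "postcode": "2000"},
    {"customer_id": 2, "age": None, "postcode": 2010},
    {"customer_id": 3, "age": "41", "postcode": ""},
]

def profile(rows):
    """Count missing values and collect the Python types seen in each field."""
    missing = Counter()
    types_seen = {}
    for row in rows:
        for field, value in row.items():
            if value is None or value == "":
                missing[field] += 1
            else:
                types_seen.setdefault(field, set()).add(type(value).__name__)
    return missing, types_seen

missing, types_seen = profile(records)

# Fields that hold more than one type need cleaning before analysis
mixed = sorted(field for field, kinds in types_seen.items() if len(kinds) > 1)
print(dict(missing))
print(mixed)
```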

Furthermore, transform the required data so that it is in an appropriate format for analysis. This may include steps such as ensuring that the data types are appropriate and rolling data up to an aggregated level, or joining in already aggregated ABS data at a geographic level to create additional variables.

Document assumptions, limitations and exclusions for the data, as well as how you would improve the analysis in the next stage if there were additional time to address assumptions and remove limitations.
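As a rough illustration of rolling up and joining, the sketch below aggregates transaction rows to one row per customer and then attaches area-level values by postcode. All the data, the field names and the median_income variable are hypothetical; the real ABS tables will look different.

```python
from collections import defaultdict

# Hypothetical transaction-level rows: (customer_id, postcode, amount)
transactions = [
    (1, "2000", 120.0),
    (1, "2000", 80.0),
    (2, "2010", 50.0),
]

# Hypothetical, already aggregated area-level data keyed by postcode
area_stats = {
    "2000": {"median_income": 70000},
    "2010": {"median_income": 65000},
}

# Roll transactions up to one row per customer
customers = defaultdict(lambda: {"postcode": None, "n_purchases": 0, "total_spend": 0.0})
for customer_id, postcode, amount in transactions:
    row = customers[customer_id]
    row["postcode"] = postcode
    row["n_purchases"] += 1
    row["total_spend"] += amount

# Left join: keep every customer, add area variables where the postcode matches
for row in customers.values():
    row.update(area_stats.get(row["postcode"], {"median_income": None}))

print(dict(customers))
```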

Model Development

1. First of all, determine a hypothesis related to the business question that can be answered with the existing data. Perform statistical testing to determine whether the hypothesis is valid.

2. Create calculated fields based on existing data; for example, convert the D.O.B. into an age bracket.

3. Test the performance of the model using factors relevant to the chosen model (for example, residual deviance, AIC, ROC curves, R squared). Appropriately document the model performance, assumptions and limitations.
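Two of the steps above can be sketched in a few lines: turning a date of birth into an age bracket, and computing R squared for a set of predictions. This is only an illustration under my own assumptions: the helper names, the fixed reference date and the 10-year bracket width are not specified in the task, and the numbers are made up.

```python
from datetime import date

def age_on(dob, as_of):
    """Whole years between a date of birth and a reference date."""
    had_birthday = (as_of.month, as_of.day) >= (dob.month, dob.day)
    return as_of.year - dob.year - (0 if had_birthday else 1)

def age_bracket(age, width=10):
    """Map an age to a bracket label such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

reference = date(2024, 1, 1)  # assumed reference date, pick your own
bracket = age_bracket(age_on(date(1990, 6, 15), reference))
score = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])  # made-up values

print(bracket, score)
```

R squared suits regression models; for a classification model, something like a ROC curve would be the more relevant check.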

Interpretation and Report

Visualization and presentation of findings. This may involve interpreting the significant variables and coefficients from a business perspective.

With the help of this slide, we get an idea of the business issue and support our case with quantitative and qualitative observations.

_______________________________


That's how you will make your presentation.
Please note: Sorry for the inconvenience. I'm receiving a number of emails regarding the PPT, and it is not possible for me to send the PPT to each and every one.

Thank you.

Comments

  1. The content in this PPT is simply a guide for virtual interns to use and not the solution they are seeking. For example, in the "Interpretation and Report" section, the PPT is supposed to include actual visualizations and representations, not a description of what interpretations and reports are...

