Types of Data in Data Science

In my last post, INTRODUCTION TO DATA SCIENCE, I used two terms for data: structured data and unstructured data. In this post I'll give you a brief introduction to the types of data we generally talk about in Data Science.

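There are basically four types of data. The first two describe how the data is organized, and the last two describe the kind of values it holds.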
1. Structured
2. Unstructured
3. Continuous
4. Discrete




Structured Data

Data that comes in a well-defined format, so that you can easily access it and run algorithms on it (searching, sorting, or anything else), is called structured data.
In other words:
"Structured data lives in a formatted repository, such as a database, so that its elements are addressable for further processing or analysis."

Example
Name of a person, age, number of books, etc., all given in the form of rows and columns.
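
To make the rows-and-columns idea concrete, here is a minimal Python sketch. The names, ages, and book counts are made-up sample values, and `find_by_name` is a hypothetical helper written only for this illustration. It shows why structured data is easy to search and summarize: every field has a name, so we can address it directly.

```python
# Hypothetical sample records: every row has the same named fields (columns).
people = [
    {"name": "Asha", "age": 24, "books": 12},
    {"name": "Ravi", "age": 31, "books": 3},
    {"name": "Meera", "age": 19, "books": 7},
]


def find_by_name(rows, name):
    """Linear search on the 'name' column; returns the matching row or None."""
    for row in rows:
        if row["name"] == name:
            return row
    return None


# Because each field is addressable, filtering and aggregating are one-liners.
older_than_20 = [row["name"] for row in people if row["age"] > 20]
average_books = sum(row["books"] for row in people) / len(people)

print(find_by_name(people, "Ravi"))
print(older_than_20)
print(average_books)
```

In a real project the same table would more likely live in a database or a spreadsheet, but the idea is the same: fixed columns make every element easy to reach.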


Unstructured Data

It is the opposite of structured data: we can't run algorithms on unstructured data directly, and working with it takes more time and adds complexity.
In other words:
"Information that doesn't have a predefined data model or is not organized in a predefined manner. Unstructured data is typically text-heavy, but may contain data such as dates, numbers, and facts as well."

Example
Images, videos, MP3 files, etc.
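
As a rough sketch of the extra work unstructured data needs, the Python snippet below pulls dates and numbers out of a free-text note. The note is an invented example, and the regular expressions assume dates are written as YYYY-MM-DD, which real text will often not follow.

```python
import re

# Invented free-text note: the facts are in there, but not in named fields.
note = "Met Ravi on 2021-03-14; he had read 3 books and planned 2 more by 2021-04-01."

# Assumption: dates are written as YYYY-MM-DD.
date_pattern = r"\b\d{4}-\d{2}-\d{2}\b"
dates = re.findall(date_pattern, note)

# Remove the dates first so their digits are not counted as standalone numbers.
without_dates = re.sub(date_pattern, " ", note)
numbers = [int(n) for n in re.findall(r"\b\d+\b", without_dates)]

print(dates)
print(numbers)
```

Compare this with structured data, where the date and the count would each sit in their own column and no pattern matching would be needed.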


Continuous Data

Data that can take any value within a range, so the number of possible values is infinite (for example, height, weight, or temperature). When we plot such data on a graph, the curve has no gaps or breaks.




Discrete Data

Data that can take only distinct, countable values (for example, the number of books a person owns). When we plot such data on a graph, there are gaps between the points or bars.
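
The difference between continuous and discrete data can be sketched in a few lines of Python. The heights and book counts are made-up sample values, and `midpoint` and `is_valid_count` are hypothetical helpers for this illustration only. The idea is that between any two continuous values there is always another valid value, while between two neighbouring discrete values there is not.

```python
# Made-up sample values.
heights_cm = [162.4, 175.05, 158.9, 171.333]  # continuous: measured
books_read = [3, 12, 7, 3]                    # discrete: counted


def midpoint(a, b):
    """Value halfway between a and b."""
    return (a + b) / 2


def is_valid_count(x):
    """A count must be a whole number that is not negative."""
    return x >= 0 and float(x).is_integer()


# Continuous: the value between two heights is still a meaningful height.
between_heights = midpoint(162.4, 162.5)

# Discrete: the value between 3 and 4 books is not a valid number of books.
between_counts = midpoint(3, 4)

print(between_heights)
print(between_counts, is_valid_count(between_counts))
```

This is also why continuous data is usually drawn as a connected curve, while discrete data is drawn as separate points or bars.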



With this I'm concluding this post. I have tried my best to be concise and clear about what these types of data are and how we interact with them in the Data Science field.


Thank you.


If you have any questions, feel free to connect:
https://www.instagram.com/kavyansh.pandey
