Exploratory Data Analysis

EDA (Exploratory Data Analysis)

Abhishek Mehta
2 min read · Feb 21, 2023

What is EDA (Exploratory Data Analysis)?

EDA (Exploratory Data Analysis) is the process of examining data that has already been collected, for example extracted from a website, once it has been saved in a form that is easy to read, understand and work on.

When we say ‘easy to work on’, we mean that the data can be used to draw useful insights and answer questions that would be hard to answer if the data were not stored in a simple, sorted form, typically an Excel file or a CSV file.
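To make 'easy to work on' concrete, here is a minimal sketch, assuming pandas is installed. The table is made up for illustration and is read from a string rather than a real file; the column names and the question asked (which city earns the most revenue?) are assumptions, not part of any real dataset.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file saved on disk.
csv_text = """city,product,units,price
Delhi,Pen,10,5.0
Delhi,Notebook,4,40.0
Mumbai,Pen,7,5.5
Mumbai,Notebook,6,38.0
Pune,Pen,3,5.0
"""

# Load the CSV into a DataFrame, the usual starting point for EDA.
df = pd.read_csv(io.StringIO(csv_text))

# Derive a new column, then answer a simple question with a group-by.
df["revenue"] = df["units"] * df["price"]
revenue_by_city = (
    df.groupby("city")["revenue"].sum().sort_values(ascending=False)
)

print(df.head())
print(revenue_by_city)
```

With a real file, the same steps would start from pd.read_csv("your_file.csv") instead of the in-memory string.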

Exploratory Data Analysis, or EDA, is an important step in any data analysis or data science project. It is the process of investigating a dataset to discover patterns, spot anomalies (outliers), and form hypotheses based on our understanding of the data.
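One common way to spot outliers is the interquartile range (IQR) rule. The sketch below assumes pandas and uses a small made-up series; the 1.5 × IQR threshold is a widely used convention, not something prescribed by this article.

```python
import pandas as pd

# Made-up daily order counts; the last value is deliberately unusual.
values = pd.Series([12, 14, 15, 13, 16, 14, 15, 95], name="daily_orders")

# Summary statistics give a first feel for the distribution.
print(values.describe())

# IQR rule: flag points more than 1.5 * IQR outside the middle 50%.
q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

outliers = values[(values < lower) | (values > upper)]
print(outliers)
```

Whether a flagged point is an error or a genuine event is a question for the analyst; the rule only tells us where to look.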

EDA is used to see what the data can reveal beyond formal modelling or hypothesis testing, and it provides a better understanding of the variables in a data set and the relationships between them. Originally developed by American mathematician John Tukey in the 1970s, EDA techniques remain widely used in the data discovery process today.
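A quick way to look at relationships between numeric variables is a correlation matrix. This is a rough sketch with invented numbers, assuming pandas; the column names are hypothetical.

```python
import pandas as pd

# Invented figures purely for illustration.
df = pd.DataFrame({
    "ad_spend": [10, 20, 30, 40, 50],
    "sales": [25, 45, 62, 85, 105],
    "returns": [5, 3, 6, 2, 4],
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()
print(corr.round(2))
```

A value close to 1 or -1 suggests a strong linear relationship worth investigating further; correlation alone does not show that one variable causes the other.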

EDA can help us deliver better business results by improving our existing knowledge of the data, and it can surface new insights that we might not otherwise be aware of.

Tools Used

  • Downloading Dataset:

opendatasets (Jovian library to download a Kaggle dataset)

  • Data cleaning:

1. Pandas

2. NumPy

  • Data Visualization:

1. Matplotlib

2. Seaborn

3. Plotly

4. Folium
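As a rough sketch of how some of these tools fit together, the example below cleans a tiny made-up table with Pandas and NumPy and draws a histogram with Matplotlib. The data and column names are assumptions for illustration. The download step is left as a comment because opendatasets needs a real Kaggle dataset URL and Kaggle credentials; Seaborn, Plotly and Folium are not used here.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Downloading would typically look like this (URL is a placeholder):
# import opendatasets as od
# od.download("<kaggle-dataset-url>")

# Made-up raw data with duplicates and missing values.
raw = pd.DataFrame({
    "age": [25, 31, np.nan, 42, 31, 25],
    "city": ["Delhi", "Mumbai", "Pune", None, "Mumbai", "Delhi"],
})

# Data cleaning: drop exact duplicate rows, then fill the gaps.
clean = raw.drop_duplicates().copy()
clean["age"] = clean["age"].fillna(clean["age"].median())
clean["city"] = clean["city"].fillna("Unknown")

# Data visualization: a simple histogram of the cleaned column.
fig, ax = plt.subplots()
ax.hist(clean["age"], bins=5)
ax.set_xlabel("age")
ax.set_ylabel("count")
ax.set_title("Age distribution after cleaning")
plt.close(fig)  # call fig.savefig("age_hist.png") before this to keep the plot

print(clean)
```

Filling missing ages with the median is only one possible choice; dropping those rows or using a different fill value may suit other datasets better.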

