Exploratory Data Analysis
Pandas Profiling
Automate your exploratory data analysis (EDA) with one line of code!
When I start working on any new dataset, I subconsciously follow the same set of EDA steps to understand the data better.
The pandas df.describe() function is too basic for serious EDA work, so I manually create many plots and summary statistics. Along the way, I usually have to Google some specific syntax or fix a coding bug, which makes the whole EDA process quite tedious at times.
Enter Pandas profiling!
Pandas profiling is an open-source Python library that automates many of those “best-known methods” in EDA to prepare a detailed interactive report with just one line of code!
You can then click through the various tabs and analyze the results without manually creating everything yourself. What a time saver!
Specifically, Pandas profiling automatically calculates column statistics, plots histograms, and computes correlation coefficients, among other things.
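To get a feel for what the report automates, here is a rough sketch of the manual equivalent using plain pandas and NumPy. The dataset and its column names are made up for illustration, and this covers only a small slice of what the report produces.

```python
import numpy as np
import pandas as pd

# Small made-up dataset, for illustration only.
df = pd.DataFrame(
    {
        "age": [23, 35, 41, 29, 52, 38],
        "income": [32000, 54000, 61000, 45000, 80000, 58000],
        "city": ["Oslo", "Lima", "Oslo", "Pune", "Lima", "Oslo"],
    }
)

# Column statistics for numeric and non-numeric columns in one table.
summary = df.describe(include="all")

# Histogram bin counts for one numeric column (the report draws these as plots).
counts, bin_edges = np.histogram(df["age"], bins=3)

# Pairwise Pearson correlation between the numeric columns.
correlations = df.select_dtypes("number").corr()

# Number of missing values per column.
missing = df.isna().sum()

print(summary)
print(counts, bin_edges)
print(correlations)
print(missing)
```

Each of these steps has to be repeated per column and per dataset, which is the tedium the report is meant to remove.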
pip install pandas-profiling
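Once installed, generating the report might look like the sketch below. The build_report helper, the sample data, and the output file name are hypothetical and only there for illustration. As far as I know, newer releases of the library are published under the name ydata-profiling, so the sketch tries that import first and falls back to the pandas_profiling name that matches the pip command above.

```python
import pandas as pd


def build_report(df, title="EDA report"):
    """Return a ProfileReport for df, importing the library lazily."""
    try:
        # Assumption: newer releases ship as the ydata-profiling package.
        from ydata_profiling import ProfileReport
    except ImportError:
        # Older releases, matching the pip command above.
        from pandas_profiling import ProfileReport
    # This is the "one line" that does the work.
    return ProfileReport(df, title=title)


# Small made-up dataset, for illustration only.
df = pd.DataFrame(
    {
        "age": [23, 35, 41, 29, 52, 38],
        "income": [32000, 54000, 61000, 45000, 80000, 58000],
        "city": ["Oslo", "Lima", "Oslo", "Pune", "Lima", "Oslo"],
    }
)
```

Calling build_report(df).to_file("eda_report.html") should then write a standalone HTML report you can open in a browser. Inside a Jupyter notebook, build_report(df).to_notebook_iframe() displays it inline instead.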
🌟 GitHub: https://github.com/ydataai/pandas-profiling
▶️ Play with it on Binder