This is Part III of my series on Getting Started with Data Science.

- Part I — EDA on Titanic Survival Problem
- Part II — EDA on Iowa Housing Prices Prediction
- Part III — Model Building, Evaluation, and Ensembling

Parts I and II cover Exploratory Data Analysis: the Titanic Survival Problem, where the target variable 'Survived' is binary, and Iowa Housing Prices Prediction, where the target variable 'SalePrice' is continuous. My analysis included data cleaning, feature engineering, a correlation study, and univariate, bivariate, and multivariate analysis, using some basic statistics, a few pandas operations, and a whole bunch of visualization tools from seaborn.

In this write-up, we tackle the problem of predicting the sale price of houses located in Ames, Iowa, using 79 explanatory variables that describe almost every aspect of each house. This is Part II of my series on Getting Started with Data Science. It covers EDA on the Iowa Housing Prices data from the Kaggle competition House Prices: Advanced Regression Techniques.


A detailed notebook with the comprehensive EDA is available on GitHub at reyscode/start-datascience.

EDA is a statistical approach to visualizing and analyzing data before forming a hypothesis or building models.
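As a toy illustration of the univariate step of EDA, the snippet below computes a few summary statistics using only Python's standard library (the sample values are made up for illustration; the real analysis in the notebook uses pandas and seaborn):

```python
import statistics

# Hypothetical sample of a continuous target such as SalePrice
prices = [208500, 181500, 223500, 140000, 250000]

# Basic univariate summary: central tendency and spread
summary = {
    "mean": statistics.mean(prices),
    "median": statistics.median(prices),
    "stdev": statistics.stdev(prices),
}
```

Comparing the mean and median like this is a quick first check for skew in the target distribution before plotting it.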

This is **Part I** of my series on Getting Started with Data Science.


In this write-up, we will explore the Titanic: Machine Learning from Disaster competition from Kaggle. I present a summary of my analysis and insights in this blog post. A detailed notebook containing a comprehensive study of the Titanic data is available on GitHub at reyscode/start-datascience.

With the advent of powerful machine learning algorithms, more and more of our applications are becoming data-driven. The largest source of information humankind has managed to accumulate is the internet. When we think of the internet as a source of useful data, we think of scraping text, images, and other valuable information from web pages. The acquired data is then wrangled, cleaned, and transformed into a format suitable for further analysis and predictive modeling. A big chunk of the feature engineering process has been made obsolete by deep learning.

The internet is full of structured data. When we scrape the…
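The step of turning scraped markup into structured records can be sketched with Python's standard-library HTML parser. This is a minimal, self-contained illustration; the table markup and field contents are hypothetical, and a real scraper would fetch pages with a library such as requests:

```python
from html.parser import HTMLParser

class CellCollector(HTMLParser):
    """Collects the text inside <td> cells of a listings table."""
    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.cells.append(data.strip())

# Hypothetical scraped fragment of a housing listings page
html = "<table><tr><td>3 bed</td><td>250000</td></tr></table>"
parser = CellCollector()
parser.feed(html)
# parser.cells now holds the raw cell text, ready for cleaning
```

From here, the raw strings would be typed and cleaned (e.g. parsing "250000" as an integer) before any analysis.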

This write-up explains how to solve a polynomial equation of any degree.

There is no general algebraic formula for solving polynomials of degree five or higher (by the Abel–Ruffini theorem), and even the cubic and quartic formulas are unwieldy in practice. Numerical methods are mathematical tools that can be used to find the approximate roots of a polynomial.

So, we implement an algorithm that combines a numerical method with synthetic division to solve a polynomial equation. Check out my code here.

Our implementation takes a polynomial's list of coefficients as input and returns all of its roots.
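The synthetic-division half of the algorithm deflates the polynomial by each root as it is found, so the numerical method can be reapplied to a smaller polynomial. A minimal sketch (coefficients ordered from highest degree to lowest; the function name is my own, not necessarily the one in the linked code):

```python
def synthetic_division(coeffs, r):
    """Divide the polynomial with the given coefficients by (x - r).

    Returns (quotient_coefficients, remainder). If r is a root,
    the remainder is (approximately) zero and the quotient is the
    deflated polynomial.
    """
    quotient = [coeffs[0]]
    for c in coeffs[1:]:
        # Bring down, multiply by r, add the next coefficient
        quotient.append(c + quotient[-1] * r)
    remainder = quotient.pop()
    return quotient, remainder
```

For example, dividing x² − 3x + 2 by (x − 1) yields the quotient x − 2 with remainder 0, confirming that 1 is a root.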

We can make use of the Bisection method to find a root…
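The bisection step mentioned above can be sketched as follows. It assumes we have already bracketed a root, i.e. found an interval where the polynomial changes sign (parameter names and tolerances here are my own choices):

```python
def bisect(f, lo, hi, tol=1e-9, max_iter=200):
    """Find a root of f in [lo, hi], where f(lo) and f(hi) differ in sign."""
    flo, fhi = f(lo), f(hi)
    if flo * fhi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        fmid = f(mid)
        # Stop when the value or the interval is small enough
        if abs(fmid) < tol or (hi - lo) / 2.0 < tol:
            return mid
        if flo * fmid < 0:
            hi = mid          # root lies in the left half
        else:
            lo, flo = mid, fmid  # root lies in the right half
    return (lo + hi) / 2.0
```

Each iteration halves the bracketing interval, so convergence is slow but guaranteed; once a root is found, synthetic division deflates the polynomial and the search repeats on the quotient.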

Aspiring Data Scientist, actively looking for job opportunities...