Day 7: House Price Predictor Using Machine Learning In Python

Hello, everyone! How was day six of the challenge for you? Tell me in the comments first. I am here with day 7 of my #7daysofml.

I hope it was great. Here’s how my house price predictor turned out to be.

#7daysofml

Goal For The Day

Make the house price predictor model come into a working condition.

That’s it!

Sounds simple, doesn’t it?

But today, it seemed like I would fail on the very last day. Until I sat down to write this article, I thought I wouldn’t make it.

Experience Of The Day

I wanted to finish up the challenge right away in the morning.

But, surprise, I was sick. So I pushed it back to my usual time in the evening, after finishing the day’s tasks.

When I started putting the model together, all I got for more than 4 hours was not a house price predictor but a lot of errors.

As I am new to machine learning, it all seemed more difficult than it was.

At some point it felt like I would have to write: Sorry guys, I couldn’t complete the challenge.

But as I kept working, everything started falling into place. Of course, Bard helped a lot. (Thank you, Bard!!)

The quality of the code is not what’s important right now. It’s about what I learnt in these 7 days.

Learnings From The Seven Days

Yes, I did get a crisp and fairly deep idea of how machine learning works, ins and outs included.

And a lot more around that.

I learned a lot of other skills around this challenge:

Time management: Ask anyone who knows me and they will say I am never on time anywhere. But this time, wow.
Prioritizing Myself and My Goals: I am an always-yes sort of person. I can’t say no, because I think it is my responsibility to help everyone around me. That’s not bad, but this “everyone” didn’t include me.🥺 This time it was different.
Giving up the Perfection: I am a crazy perfectionist. And that’s not good. I am too scared to do anything most of the time. What if someone says it’s not good enough? With this challenge, I understood that constructive criticism is all that’s needed to take things to the level of perfection.

There is a lot more, but I am sure you are bored by now.

And I have made a lot of blunders along the way too. I can’t share them all, as they are just as normal for me even without the challenge.

So, tell me, how did it go for you?

Did you get a house price predictor or a list of errors?

House Price Predictor Complete Source Code

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LinearRegression

# Load the housing data
housing = pd.read_csv("housing.csv")

# Separate features (X) and target variable (y)
X = housing.drop("median_house_value", axis=1)
y = housing["median_house_value"]

# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a pipeline for numerical and categorical feature preprocessing (with OneHotEncoder)
num_pipeline = Pipeline([
    ('imputer', SimpleImputer(strategy="median")),
    ('std_scaler', StandardScaler()),
])

cat_pipeline = Pipeline([
    ('onehot', OneHotEncoder()),
])

full_pipeline = ColumnTransformer([
    ("num", num_pipeline, list(X_train.select_dtypes(include=[np.number]))),
    ("cat", cat_pipeline, ["ocean_proximity"]),
])

# Fit the pipeline to training data
X_train_prepared = full_pipeline.fit_transform(X_train)

# Train the model
model = LinearRegression()
model.fit(X_train_prepared, y_train)

# Function to get user input and make predictions (handling categorical input)
def predict_housing_value():
    try:
        # Get user input for each feature
        features = []
        for col in X.columns:
            if col == "ocean_proximity":
                print("Valid categories for ocean_proximity:", housing["ocean_proximity"].unique())
                value = input(f"Enter value for {col} (one of the categories): ")
            else:
                value = input(f"Enter value for {col}: ")
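                try:
                    value = float(value)
                except ValueError:
                    # Stop here so a non-numeric string never reaches the pipeline;
                    # the outer except block prints the message
                    raise ValueError(f"Can't convert {value!r} to float for {col}")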
            features.append(value)

        # Preprocess the input features using the pipeline
        input_df = pd.DataFrame([features], columns=X_train.columns)  # Use column names from training data
        input_prepared = full_pipeline.transform(input_df)

        # Make prediction using the trained model
        prediction = model.predict(input_prepared)[0]
        print("Predicted median house value:", prediction)
    except ValueError as e:
        print(e)
        print("Invalid input. Please enter numerical values for numerical features and valid categories for ocean_proximity.")

# Get user input and make prediction
predict_housing_value()
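
One thing the code above never does is use X_test and y_test, so there is no number telling me how good the predictions are. Here is a minimal sketch of how that check could look. It is not the exact code from my challenge: the helper names make_synthetic_housing and evaluate_on_test_set are made up for illustration, the columns median_income and total_rooms and the three category labels are assumptions, and the data is randomly generated so the snippet runs without housing.csv. It also wraps preprocessing and LinearRegression in one Pipeline and sets handle_unknown="ignore" on the encoder, which are my own choices for the sketch.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def make_synthetic_housing(n_rows=200, seed=42):
    """Build a small made-up stand-in for housing.csv (illustration only)."""
    rng = np.random.default_rng(seed)
    income = rng.uniform(1.0, 10.0, n_rows)
    rooms = rng.integers(500, 5000, n_rows).astype(float)
    # Placeholder category labels, not read from the real file
    proximity = rng.choice(["INLAND", "NEAR BAY", "NEAR OCEAN"], n_rows)
    # Invented relationship plus noise; this is not real housing data
    value = 40000 * income + 10 * rooms + rng.normal(0, 20000, n_rows)
    return pd.DataFrame({
        "median_income": income,
        "total_rooms": rooms,
        "ocean_proximity": proximity,
        "median_house_value": value,
    })


def evaluate_on_test_set(housing):
    """Train on 80% of the rows and return the fitted model and the test RMSE."""
    X = housing.drop("median_house_value", axis=1)
    y = housing["median_house_value"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    num_cols = list(X_train.select_dtypes(include=[np.number]).columns)
    num_pipeline = Pipeline([
        ("imputer", SimpleImputer(strategy="median")),
        ("std_scaler", StandardScaler()),
    ])
    preprocess = ColumnTransformer([
        ("num", num_pipeline, num_cols),
        # Unknown categories become all-zero rows instead of raising an error
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["ocean_proximity"]),
    ])

    # One pipeline, so the same preprocessing runs in fit() and predict()
    model = Pipeline([
        ("preprocess", preprocess),
        ("regressor", LinearRegression()),
    ])
    model.fit(X_train, y_train)

    predictions = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, predictions)))
    return model, rmse


if __name__ == "__main__":
    model, rmse = evaluate_on_test_set(make_synthetic_housing())
    print(f"Test RMSE: {rmse:.0f}")
```

To try it on the real data, swapping make_synthetic_housing() for pd.read_csv("housing.csv") should be enough, assuming the file has the median_house_value and ocean_proximity columns used above. The RMSE is in the same units as the house values, so a smaller number means the predictions land closer to the true prices.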