Fill missing data using SimpleImputer

If you see the warning below while filling in missing values, you should use SimpleImputer instead of Imputer.

DeprecationWarning: Class Imputer is deprecated; Imputer was deprecated in version 0.20 and will be removed in 0.22. Import impute.SimpleImputer from sklearn instead.
  warnings.warn(msg, category=DeprecationWarning)

Sample code that shows how to use SimpleImputer is given below.

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
dataset = pd.read_csv('my_data.csv')
# Get the matrix of independent variables: all columns except the last one
Independent = dataset.iloc[:, :-1].values
# Our original matrix has 5 columns, so the dependent variable is the last column, at index 4
Dependent = dataset.iloc[:, 4].values

# Fill the missing data using SimpleImputer

# Imputer was used for this earlier, but it is deprecated, so we now use SimpleImputer
# from sklearn.preprocessing import Imputer
from sklearn.impute import SimpleImputer

# Old code, now deprecated
# imputer = Imputer(missing_values = 'NaN', strategy = 'mean', axis = 0)
# New code: SimpleImputer has no axis argument and always imputes column by column
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')

imputer = imputer.fit(Independent[:, 1:4])
Independent[:, 1:4] = imputer.transform(Independent[:, 1:4])
print(Independent)
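
If you do not have my_data.csv at hand, the same idea can be tried on a small array. This is only a minimal sketch: the array X and its values are made up for illustration and are not part of the dataset above. It shows that the 'mean' strategy learns one mean per column during fit and uses it to replace the NaN entries in that column.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical toy data (not from my_data.csv): rows are samples, columns are features
X = np.array([
    [1.0, 10.0],
    [np.nan, 20.0],
    [3.0, np.nan],
])

imputer = SimpleImputer(missing_values=np.nan, strategy='mean')

# fit_transform learns the column means and fills the gaps in one call
X_filled = imputer.fit_transform(X)

# statistics_ holds the value used to fill each column
print(imputer.statistics_)
print(X_filled)
```

Here the first column is filled with 2.0 (the mean of 1.0 and 3.0) and the second with 15.0 (the mean of 10.0 and 20.0). Calling fit and transform separately, as in the sample code above, gives the same result and is useful when the same fitted imputer has to be applied to other data later.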