I have a data file that can be downloaded from here: https://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data
I want to define a function that reads the data and returns the dataset as a NumPy array. The dataset should have 14 columns: the 13 attributes of each housing property (x) and the housing price value (y).
def loadData(filename):
    dataset = None
    file = open(filename, "r")
    data = file.read()
    print(data)
    x = np.genfromtxt(filename, usecols = [0,1,2,3,4,5,6,7,8,9,10,11,12])
    y = np.genfromtxt(filename, usecols = 13)
    print("x: ", x)
    print("y: ", y)
    dataset = np.concatenate((x,y), axis = 1)
    return dataset
My y output seems to be alright. However, my x output is wrong as seen below:
Part of the output of x should contain the values below, as part of an np array:
What am I doing wrong?
Edit: the question above has been answered and resolved. However, I would also like to know how to ensure that the output is float64.
I have edited the np.genfromtxt lines to pass dtype = np.float64, as shown:
x = np.genfromtxt(filename, usecols = [0,1,2,3,4,5,6,7,8,9,10,11,12], dtype = np.float64)
y = np.genfromtxt(filename, usecols = 13, dtype = np.float64)
I have also tried dataset.astype(float64), but neither approach has worked. I would appreciate some help again. Thank you!




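You have already read the data from the file into the data variable. Use the data variable instead of filename in genfromtxt(). Note that genfromtxt() treats a plain string as a file name, so the text has to be wrapped in a file-like object (or split into lines) first, as below: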
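A minimal sketch of that idea, not necessarily the exact code this answer originally showed. The helper names load_data and parse_housing_text are hypothetical, and the two sample rows are made-up values that only mimic the 14-column, whitespace-delimited layout of housing.data. It assumes the whole file fits in memory and contains only numeric fields.

```python
import io
import numpy as np

def parse_housing_text(data):
    """Parse whitespace-delimited text into an (n_rows, 14) float64 array."""
    # genfromtxt treats a plain str as a file name, so wrap the
    # already-read text in a file-like object before passing it in.
    return np.genfromtxt(io.StringIO(data), dtype=np.float64)

def load_data(filename):
    """Read the file once, then parse the in-memory text (hypothetical helper)."""
    with open(filename, "r") as f:
        data = f.read()
    return parse_housing_text(data)

# Two made-up rows in the same 14-column layout (illustrative values only).
sample = (
    "0.1 18.0 2.3 0 0.5 6.5 65.2 4.1 1 296.0 15.3 396.9 5.0 24.0\n"
    "0.2 0.0 7.1 0 0.4 6.4 78.9 5.0 2 242.0 17.8 396.9 9.1 21.6\n"
)
dataset = parse_housing_text(sample)

# Slice instead of concatenating: x is the 13 attributes, y is the price.
x, y = dataset[:, :13], dataset[:, 13]
print(dataset.shape, dataset.dtype)  # (2, 14) float64
```

Reading all 14 columns in one call and slicing afterwards avoids np.concatenate((x, y), axis = 1), which raises a ValueError when x is 2-D and y is 1-D. Printing dataset.dtype is a quick way to confirm the result is float64; genfromtxt already defaults to float, so the explicit dtype argument mainly documents the intent.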