I have a dataset consisting of student marks in 2 subjects and the result if the student is admitted in college or not. I need to perform a logistic regression on the data and find the optimum parameter θ to minimize the loss and predict the results for the test data. I am not trying to build any complex non linear network here.
I have the loss function defined for logistic regression like this which works fine
predict(X) = sigmoid(X*θ)
loss(X,y) = (1 / length(y)) * sum(-y .* log.(predict(X)) .- (1 - y) .* log.(1 - predict(X)))
I need to minimize this loss function and find the optimum θ. I want to do it with Flux.jl or any other library which makes it even easier. I tried using Flux.jl after reading the examples but not able to minimize the cost.
My code snippet:
function update!(ps, η = .1)
for w in ps
w.data .-= w.grad .* η
print(w.data)
w.grad .= 0
end
end
for i = 1:400
back!(L)
update!((θ, b))
@show L
end
You can use either GLM.jl (simpler) or Flux.jl (more involved but more powerful in general). In the code I generate the data so that you can check if the result is correct. Additionally I have a binary response variable - if you have other encoding of target variable you might need to change the code a bit.
Here is the code to run (you can tweak the parameters to increase the convergence speed - I chose ones that are safe):
And here are the results: