Coding logistic SGD manually in a for loop in Python -- am I interpreting this correctly?

Question

Coding logistic SGD manually in a for loop in Python -- am I interpreting this correctly?

38 Views Asked by user10443249 At 18 December 2023 at 11:26

I'm following the steps of SGD, but unsure if I am interpreting the steps correctly.

Let's say there are two w terms:

$equation1$

where x is tensors and w are two logistic function params.

$equation2$

and the goal is to find argmin(lambda)

And SGD formula is given as:

$equation3$

This is the code I have so far, with library constraints:

import numpy as np
import torch

some_data = torch.rand(1000, 2)
x = some_data[:,0]
y = (x + 0.4 * some_data[:,1] > 0.5).to(torch.int) # doesn't work with bool?

x = torch.split(x, 3) # batch
y = torch.split(y, 3)

# optim problem
w1 = torch.autograd.Variable(torch.tensor([0.1]), requires_grad=True)
w2 = torch.autograd.Variable(torch.tensor([0.1]), requires_grad=True)

lambda_min = torch.zeros(1)

for epoch in range(10):
  #print(f'epoch: ', epoch)
  for xx, yy in zip(x, y):
    
    p_x = 1 / (1 + torch.exp(-w1 - (torch.mul(w2, x))))
    
    # NLL
    lambda = torch.sum(yy * torch.log(p_x) + (1 + yy) * torch.log(1 - p_x))
    
    # Gradient???
    if lambda < lambda_min :
      lambda_min = lambda # Q: loss value, right?
      # Q: update params...?
      w2 = w1 - torch.mul(0.01, lambda_min)
      print(f'w2 = ', w2)
    else:
      pass

I would like to confirm the parts with "Q:" Then how would I plot this with y and x to look like a regular logistic function with tensors, similar to below, but with all the 1s and 0s in the graph?

Original Q&A

There are 1 best solutions below

**ForceBru** · Answer 1 · 2023-12-18T11:41:59.847000

There are multiple issues with this code.

You never calculate any gradients, so this is not gradient descent.
Gradient descent updates values of parameters always, not just when lambda < lambda_min. BTW, lambda is a keyword and can't be used as a variable name.
w2 = w1 - torch.mul(0.01, lambda_min) is not a gradient descent step:
1. Gradient descent sets new w2 equal to old w2 plus some adjustment. You code sets it equal to w1 (!) plus some adjustment.
2. That adjustment is supposed to be the scaled negative gradient. In your code it's the scaled value of the loss function, which it not the gradient.
According to your logistic function formula, (1 + yy) * torch.log(1 - p_x) should be (1 - yy) * torch.log(1 - p_x) (note the "1 minus yy" part).

Coding logistic SGD manually in a for loop in Python -- am I interpreting this correctly?

There are 1 best solutions below

Related Questions in PYTHON

Related Questions in TORCH

Trending Questions

Popular # Hahtags

Popular Questions