
EDIT: The data set is the MNIST data set from the Week 4 homework of Andrew Ng's Machine Learning course.

I've looked at the existing questions on scipy.optimize, but I still can't figure out what is wrong with my code. I am trying to optimize theta for the oneVsAll exercise in Andrew Ng's Coursera course.

Here is the relevant code:

def sigmoid(x):
    a = []
    for item in x:
        a.append(1/(1+math.exp(-item)))
    return a

def hypothesis(x, theta):
    return np.array(sigmoid(np.dot(x, theta)))

def costFunction(theta, x, y, lambda_):
    m = X.shape[0]

    part1 = np.dot(y.T, np.log(hypothesis(x, theta)).reshape(m,1))
    part2 = np.dot((np.ones((m,1)) - y).T, np.log( 1 - hypothesis(x, theta)).reshape(m,1))

    summ = (part1 + part2)

    return -summ[0]/m

def gradientVect(theta, x, y, lambda_):
    n = X.shape[1]
    m = X.shape[0]
    gradient = []

    theta = theta.reshape(n,1)

    beta = hypothesis(x, theta) - y

    reg = theta[1:] * lambda_/m

    grad = np.dot(X.T, beta) * 1./m

    grad[1:] = grad[1:] * reg

    return grad.flatten()


from scipy import optimize

def optimizeTheta(x, y, nLabels, lambda_):

    for i in np.arange(0, nLabels):
        theta = np.zeros((n,1))
        res = optimize.minimize(costFunction, theta, args=(x, (y == i)*1, lambda_), method=None,
                       jac=gradientVect, options={'maxiter':50})
        print(res)
    return result

But running

optimizeTheta(X, y, 10, 0) # X shape = 401, 500

gives me the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-247-e0e6e4c1eddd> in <module>()
      3 n = X.shape[1]
      4 
----> 5 optimizeTheta(X, y, 10, 0)

<ipython-input-246-0a15e9f4769a> in optimizeTheta(x, y, nLabels, lambda_)
     54         theta = np.zeros((n,1))
     55         res = optimize.minimize(costFunction, x0 = theta, args=(x, (y == i)*1, lambda_), method=None,
---> 56                        jac=gradientVect, options={'maxiter':50})
     57         print(res)
     58     return result

//anaconda/lib/python3.5/site-packages/scipy/optimize/_minimize.py in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
    439         return _minimize_cg(fun, x0, args, jac, callback, **options)
    440     elif meth == 'bfgs':
--> 441         return _minimize_bfgs(fun, x0, args, jac, callback, **options)
    442     elif meth == 'newton-cg':
    443         return _minimize_newtoncg(fun, x0, args, jac, hess, hessp, callback,

//anaconda/lib/python3.5/site-packages/scipy/optimize/optimize.py in _minimize_bfgs(fun, x0, args, jac, callback, gtol, norm, eps, maxiter, disp, return_all, **unknown_options)
    859     gnorm = vecnorm(gfk, ord=norm)
    860     while (gnorm > gtol) and (k < maxiter):
--> 861         pk = -numpy.dot(Hk, gfk)
    862         try:
    863             alpha_k, fc, gc, old_fval, old_old_fval, gfkp1 = \

ValueError: shapes (401,401) and (2005000,) not aligned: 401 (dim 1) != 2005000 (dim 0)

And I can't figure out why the shapes are not aligned.

Thanks!

Daniel Pereira
  • The vector returned by `gradientVect` must have exactly the same number of elements as `theta`, since it is the gradient of a scalar function with respect to theta. Apparently, what your function computes is not the gradient, so you need to debug this function. – pv. Nov 10 '16 at 19:40
  • It has, but the documentation says that the gradient must be a 1D array, so I flattened it (401,500 --> 2005000). That is why it looks like a different thing. – Daniel Pereira Nov 10 '16 at 20:35
  • http://stackoverflow.com/questions/25880634/logistic-regression-objects-are-not-aligned; http://stackoverflow.com/questions/38837155/how-to-get-dimensions-right-using-fmin-cg-in-scipy-optimize – hpaulj Nov 10 '16 at 23:18
  • This code is very confusing. For each function, document the expected shape of each input and the shape of the output. Pay particular attention to the shape of the `gradientVect` output. The error message implies that it should be `(401,)`. Why is it 401*500*10? – hpaulj Nov 11 '16 at 05:52
  • @Daniel Pereira: if the shape is (401,500), and theta does not have 401*500 elements (which seems to be the case in your code, although this is difficult to say because the code appears inconsistent and is not runnable), then it is not a gradient. – pv. Nov 11 '16 at 19:28
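
The shape requirement raised in these comments can be checked before calling the optimizer. The following is only a minimal sketch under stated assumptions: the tiny random data set and the function names `cost` and `grad` are illustrative and not taken from the question, `y` is kept 1-D, and the standard regularized logistic-regression formulas are used, which adds the regularization term to the gradient instead of multiplying by it. `scipy.optimize.check_grad` compares the analytic gradient against a finite-difference estimate.

```python
import numpy as np
from scipy import optimize

def sigmoid(z):
    # Vectorized: the output has the same shape as the input
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, x, y, lambda_):
    # theta: (n,), x: (m, n), y: (m,)  ->  scalar
    m = x.shape[0]
    h = sigmoid(x @ theta)
    reg = lambda_ / (2 * m) * np.sum(theta[1:] ** 2)  # bias term is not regularized
    return -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m + reg

def grad(theta, x, y, lambda_):
    # Returns a 1-D array with exactly theta.size elements
    m = x.shape[0]
    g = x.T @ (sigmoid(x @ theta) - y) / m
    g[1:] += lambda_ / m * theta[1:]  # regularization is added, not multiplied
    return g

# Illustrative data only (hypothetical sizes, not the course data set)
rng = np.random.default_rng(0)
m, n = 20, 4
x = np.hstack([np.ones((m, 1)), rng.normal(size=(m, n - 1))])
y = (rng.random(m) > 0.5).astype(float)
theta0 = np.zeros(n)

g0 = grad(theta0, x, y, 1.0)
err = optimize.check_grad(cost, grad, theta0, x, y, 1.0)
res = optimize.minimize(cost, theta0, args=(x, y, 1.0),
                        jac=grad, options={'maxiter': 50})
print(g0.shape, err, res.x.shape)
```

If `g0.shape` differs from `theta0.shape`, `minimize` will fail with an alignment error like the one in the question; a large `err` would instead point to a mistake in the gradient formula.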

1 Answer


So I realized what was wrong with my code. The problem was that the sigmoid function returned a Python list rather than a NumPy array, which messed up the matrix multiplications afterwards. The new sigmoid function is:

def sigmoid(z):
    return 1 / (1 + np.exp(-z))
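
One plausible mechanism for the original error, sketched here under assumptions: if `y` is a column of shape `(m, 1)` while the hypothesis comes back as a flat array of `m` values (which is what converting a list of floats with `np.array` produces), then the subtraction broadcasts to an `(m, m)` matrix and the flattened gradient ends up with `n * m` elements instead of `n`. The small sizes below are illustrative only; with `n = 401` and an assumed `m = 5000` the product would be 2005000, the number in the error message.

```python
import numpy as np

def sigmoid(z):
    # Vectorized sigmoid: preserves the shape of z
    return 1 / (1 + np.exp(-z))

m, n = 5, 3                      # illustrative sizes, not the real data
x = np.ones((m, n))
theta = np.zeros((n, 1))
y = np.zeros((m, 1))             # assumption: labels stored as a column

h_col = sigmoid(x @ theta)       # shape (m, 1), same as y
h_flat = h_col.ravel()           # shape (m,), like np.array(list_of_floats)

beta_ok = h_col - y              # (m, 1): element-wise difference
beta_bad = h_flat - y            # (m, m): broadcasting of (m,) against (m, 1)

grad_ok = (x.T @ beta_ok).flatten()    # n elements, matches theta
grad_bad = (x.T @ beta_bad).flatten()  # n * m elements, cannot match theta

print(beta_ok.shape, beta_bad.shape, grad_ok.shape, grad_bad.shape)
```

Printing the shapes of intermediate arrays like this is a quick way to spot where a column vector silently turns into a matrix.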
Daniel Pereira