Neural Net - Initial Values
I'm attempting to train a perceptron on some training data and have run into a problem. I understand that we generally set the input weights to random values and go from there. My problem mainly has to do with the activation function I'm using.

Say I have only two inputs with values 200 and 150, and random initial weights 0.25 and 0.5. The input to the neuron is then 200*0.25 + 150*0.5 = 125. I'm using the learning rule Wj = Wj + learning_rate * error * g'(in) * xj (as in AI: A Modern Approach). With the sigmoid derivative, my value for g'(in) comes out as 0.0 because in is so large, and if g'(in) is 0 the whole update collapses to Wj = Wj and no learning happens.

How should I handle this? It seems I either have to select my initial weights more carefully or normalize the inputs. It feels like I'm missing something obvious about how this should work; neither the book nor any examples I've seen go into much detail.
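Here's a quick Python check of the numbers (my own sketch, not from the book), showing where the update dies:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    # g'(x) = g(x) * (1 - g(x)) for the logistic sigmoid
    s = sigmoid(x)
    return s * (1.0 - s)

# My example: raw inputs with random initial weights.
inputs = [200.0, 150.0]
weights = [0.25, 0.5]
in_value = sum(w * x for w, x in zip(weights, inputs))  # 125.0

print(sigmoid(in_value))             # 1.0 -- completely saturated
print(sigmoid_derivative(in_value))  # 0.0 -- underflows, so Wj never changes
```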
Typically you'd normalize the inputs to [-1, +1] and set the initial weights to a random range like [-0.5, +0.5]. That way the net input lands on the "best" part of the sigmoid, where the error derivative is largest.
(Note there's a positive-only sigmoid (the logistic function, range (0, 1)) and one that's symmetrical around zero (tanh, range (-1, 1))... I prefer the second one, but it's up to you.)
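For example, a minimal sketch of that setup (the [0, 255] input range here is just an assumption for illustration; use your dataset's actual min/max):

```python
import random

def normalize(values, lo, hi):
    """Linearly map raw values from [lo, hi] into [-1, +1]."""
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

raw_inputs = [200.0, 150.0]
inputs = normalize(raw_inputs, 0.0, 255.0)  # now roughly 0.57 and 0.18

# Small random initial weights in [-0.5, +0.5].
weights = [random.uniform(-0.5, 0.5) for _ in inputs]

in_value = sum(w * x for w, x in zip(weights, inputs))
print(in_value)  # stays near zero, where the sigmoid's derivative is largest
```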
I've encountered similar issues in the past with some of the datasets I've used. As indicated in the previous response, just normalize the input values. Also, if the output values for your training set are outside an acceptable range, you'll need to normalize those as well. Just keep track of the normalization factors you use so that you can map the outputs back to the correct order of magnitude.
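Something along these lines works (a sketch; the 0 to 1000 target range is hypothetical, substitute your training set's min/max):

```python
def make_scaler(lo, hi):
    """Return a (scale, unscale) pair so outputs can be mapped back later."""
    def scale(v):
        return 2.0 * (v - lo) / (hi - lo) - 1.0
    def unscale(v):
        return (v + 1.0) * (hi - lo) / 2.0 + lo
    return scale, unscale

scale, unscale = make_scaler(0.0, 1000.0)

t = scale(750.0)     # 0.5, the value the network actually trains against
print(unscale(t))    # 750.0, back at the original order of magnitude
```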
Mike
I know very little about neural nets, but is it really worth randomizing the initial state?