Training Error vs Testing Error
Hi all,
I have trained three different kinds of networks on a time-series problem: a Real Time Recurrent Learning (RTRL) network, a one-hidden-layer feedforward network, and a one-hidden-layer Elman network.
Upon training these nets I find that after the training process the MSE comes out in the range of 1e-5 to 1e-4. But no matter how much I train, and irrespective of the order of presentation of the inputs (it is randomized), the error on the test data stays more or less stagnant at about 1e-3. Sometimes when I train my RTRL network longer, the error on the test data actually increases (versus less training). Is this because the weights are getting stuck in some local minimum?

I should also add that the data I am using (currency exchange rate data) doesn't really seem to have any underlying pattern (this is what I gathered from a plot of the data). Also, the data has *not* been preprocessed. My particular investigation is really about neural net structure, and preprocessing the data might bias my findings, so I haven't done anything about it. What do you recommend?
Thanks,
Sidhant
If you're noticing test performance degrade as training goes on, that suggests your network is over-fitting the training data.
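One common guard against this is early stopping: hold out a validation set, track its error each epoch, and stop once it hasn't improved for a while. Here's a minimal sketch; the function name and `patience` parameter are illustrative, not part of any specific library:

```python
def early_stop(val_errors, patience=5):
    """Return True if the best validation error occurred more than
    `patience` epochs ago, i.e. training should stop."""
    if len(val_errors) <= patience:
        return False
    best_epoch = min(range(len(val_errors)), key=lambda i: val_errors[i])
    return len(val_errors) - 1 - best_epoch >= patience

# Example: validation error improves for a few epochs, then creeps up.
errs = [0.010, 0.008, 0.007, 0.0069, 0.0071, 0.0072, 0.0073, 0.0074, 0.0075]
print(early_stop(errs, patience=5))  # best at epoch 3, 5 epochs ago -> True
```

Stopping at the epoch with the best held-out error gives you the weights before over-fitting sets in, rather than the weights at the end of a long training run.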
As for the RMSE of your test data, you should expect it to be worse than your training data as a function of the cumulative distance between each training set datum and test set datum. That is, you don't expect the network to represent the data perfectly (particularly in your domain where there is a significant element of noise in the data), so you expect that it will not perform as well on data it has not seen before.
Since you're dealing with time series data, you might consider a hybrid approach whereby you pre-filter the data and the filter parameters are chosen so as to minimise the RMSE on the test set.
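As a rough illustration of such pre-filtering (the moving-average choice and the window size are assumptions for the sketch, not a specific recommendation), the window would be the filter parameter you tune against held-out error:

```python
def moving_average(series, window):
    """Smooth a 1-D series with an unweighted moving average.
    The output is shorter than the input by window - 1 samples."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

# Example: a noisy ramp; the filtered version varies less point to point.
raw = [0.0, 1.2, 0.8, 2.1, 1.9, 3.2, 2.8, 4.1]
smooth = moving_average(raw, window=3)
```

The smoothed series suppresses some of the sample-to-sample noise before the network ever sees it, at the cost of a little lag.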
Cheers,
Timkin