Ever wonder how this machine learning thing works?
I've always stuck with Finite State Machines or Behavior Trees because they were concepts that could be understood.
Neural Nets have been some magic black box that didn't make a whole lot of sense.
When you're in the “From Scratch” crowd, there is always something else to occupy your time.
But time doesn't stop, and eventually it would be nice to build some of these concepts for the toolbox.
Starting my adventure with a single lonely neuron with one input.
What can that do, you ask?
Consider a training set with pairs of int values.
For each pair, the second value is the first scaled by a common factor. That hidden scalar is what the neuron tries to determine (in the code below, {1, 5}, {2, 10}, {3, 15}, … hide the scalar 5).
What weight? What bias value? Where do you start? Who cares…
Let the system figure it out, and that it does. Below, the weight and bias start out as random values.
A couple of very small magic numbers are introduced to keep the nudging of the guess stable and to limit potential oscillation.
Basically, the model converges on the unknown value so that, for every sample pair, the first value times that unknown lands on the second.
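In symbols (my sketch of what the training loop below does): the cost is the mean squared error C(w, b) = (1/N) · Σ (w·xᵢ + b - yᵢ)², each slope is probed with a one-sided finite difference, dC/dw ≈ (C(w + ε, b) - C(w, b)) / ε, and the guess is nudged downhill with w ← w - η · dC/dw (and likewise for b). The two magic numbers are the probe step ε = 1e-6 and the learning rate η = 1e-2.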
It was pretty cool to see it land on a nearly perfect solution even after I changed the training set to something similar. This was a nice exercise to get my feet wet, even though I know it doesn't scale.
We'll be trying that next with proper layers, connections and structure.
Want to talk about it? Jump into the comment section and remember to stay away from the revShare weirdos.
// references
// https://stackoverflow.com/questions/19665818/generate-random-numbers-using-c11-random-library
// https://m.youtube.com/watch?v=PGSba51aRYU Machine Learning in C (Episode 1) - Tsoding Daily
//
// My last brain cell
// * * * *
// * _____ *
// weight * \ *
// x ---------> * >- * ---------> w
// bias * /____ *
// * y*y *
// * * * *
//
// task:
// with a single neuron, find the unknown value so that each pair in the vector satisfies
// first * unknown_value = second (feel free to change the training data to describe another value to evaluate)
//
#include <cstdio>   // printf
#include <random>   // std::random_device, std::mt19937, std::uniform_real_distribution
#include <utility>  // std::pair
#include <vector>   // std::vector
std::vector<std::pair<int, int>> training_data = { {0, 0}, {1, 5}, {2, 10}, {3, 15}, {4, 20} }; // w=5, i.e. 1*5=5, 2*5=10, 3*5=15
double eval(double weight, double bias)
{
    double result = 0;
    for (auto data_item : training_data)
    {
        double x = (double)data_item.first;     // input value
        double y = (x * weight) + bias;         // the neuron's prediction
        double distance = y - data_item.second; // error against the expected output
        result += (distance * distance);        // square to amplify the error
    }
    return (result / training_data.size());     // mean squared error over the set
}
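// aside (my own sketch, not the post's method): because this model is linear,
// the slopes can also be derived analytically instead of probed with epsilon.
// dC/dw = (2/N) * sum(x * err) and dC/db = (2/N) * sum(err), where err = w*x + b - y.
// a hypothetical helper that could replace the two extra eval() calls per iteration:
void eval_grad(double weight, double bias, double& dw, double& db)
{
    dw = 0;
    db = 0;
    for (auto data_item : training_data)
    {
        double x = (double)data_item.first;
        double err = (x * weight) + bias - data_item.second;
        dw += 2.0 * x * err; // contribution of d(err*err)/dw
        db += 2.0 * err;     // contribution of d(err*err)/db
    }
    dw /= training_data.size();
    db /= training_data.size();
}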
int main()
{
    std::random_device rd;                       // prepare a random number device
    std::mt19937 mt(rd());                       // seed the mersenne twister engine
    std::uniform_real_distribution<double> rndf; // uniform in [0, 1)
    double w = rndf(mt);                         // starting guess for the weight (anywhere)
    double b = rndf(mt);                         // starting guess for the bias (anywhere)
    double epsilon = 1e-6;                       // tiny probe step for the finite difference
    double learn_rate = 1e-2;                    // how hard to nudge the guess each iteration
    // drive the evaluated cost towards zero
    for (size_t i = 0; i < 2400; ++i) {          // iterate to approach a best fit for w and b
        double c = eval(w, b);                                // current cost / loss
        double dw = (eval(w + epsilon, b) - c) / epsilon;     // finite-difference slope of the cost w.r.t. the weight
        double db = (eval(w, b + epsilon) - c) / epsilon;     // finite-difference slope of the cost w.r.t. the bias
        w -= dw * learn_rate;                                 // nudge the guess closer to the actual solution
        b -= db * learn_rate;
        //printf("cost=%f weight=%f bias=%f \n", c, w, b);    // print training progress per iteration
    }
    printf("%lf\n", w); // print the final approximation of the hidden scalar
}
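If you want to try it yourself, any C++11 compiler will do; assuming a file name like neuron.cpp (just my placeholder), g++ -std=c++11 -O2 neuron.cpp && ./a.out should build and run it. With the training data above, a typical run prints a value very close to 5.000000, with the bias drifting toward zero.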
While debugging a larger model, I tried experimenting with graph visualization in 3D, because… why not. 🙂
It was fun, but a conventional topology overview would be more appropriate. That's next.