
ANN - learning AND / OR

Started by August 25, 2005 07:49 PM
3 comments, last by fadeh 19 years, 3 months ago
Hi all, it's my first time here, so let me start with an apology for my bad English: I'm not a native English speaker, but I'll try to do my best. :D

I wrote a little ANN with these characteristics:

- It has to learn bitwise AND and OR, so it takes two inputs (x1, x2) and calculates d = x1*w1 + x2*w2, where (w1, w2) are the weights.
- The threshold I chose is 0, and it is compared with d = x1*w1 + x2*w2: if (d <= 0) d = 0; else d = 1;
- The weights are modified by the formula wi(t+1) = wi(t) + n * (p - d) * xi, with n = learning rate, p = expected value, d = computed output.
- The program takes as input a file (the exercise book) with this syntax (e.g.):

    1 0 0
    0 0 1
    etc.

  In the first line, 1 is the first operand, 0 is the second operand and 0 is the expected result. And so on.

All this works fine if this simple feed-forward perceptron tries to learn OR (learning rate 0.2, fewer than 20 examples are enough), but it doesn't work if I give it an AND exercise book…

The source is something like this (apologies again: ugly coding style, don't try this at home! :D)

    //main.cpp
    #include "main.h"
    #include "training.h"

    struct exercise ex[ESEMPI];
    struct weight w;
    FILE *book = NULL; // exercise book

    int main(int argc, char *argv[])
    {
        int n = 0;
        double ris;
        char filename[20];

        w.wa = wrand();
        w.wb = wrand();
        cout << "Init weights: " << w.wa << " " << w.wb << endl;

        cout << "Book filename: ";
        cin >> filename;
        book = fopen(filename, "r");
        if(book == NULL)
        {
            cout << "fopen error" << endl;
            exit(-1);
        }

        for(int i = 0; i < ESEMPI; i++)
            n = getexercise(&ex[i], &w, n);
        fclose(book);

        cout << "Final weight: " << w.wa << " " << w.wb << endl;

        for(;;)
        {
            cout << endl << "First operand: ";
            cin >> ex[0].a;
            cout << endl << "Second operand: ";
            cin >> ex[0].b;
            ris = (ex[0].a * w.wa) + (ex[0].b * w.wb);
            if(ris < 0) ris = 0;
            if(ris > 0) ris = 1;
            cout << endl << ex[0].a << " OP " << ex[0].b << " = " << ris << endl;
        }
        return 0;
    }

    //main.h
    #include <stdlib.h>
    #include <stdio.h>
    #include <iostream>

    #define ESEMPI 20 // book size

    using namespace std;

    //training.cpp
    #include "training.h"

    double wrand() // return a random number to initialize the weights
    {
        return (rand() / ((double)RAND_MAX + 1));
    }

    int getexercise(struct exercise *ex, struct weight *w, int n)
    {
        // open an exercise book and fills an array of
        // struct exercise
        extern FILE *book;

        ex->a = (fgetc(book) - 48);
        fseek(book, n+=2, SEEK_SET);
        ex->b = (fgetc(book) - 48);
        fseek(book, n+=2, SEEK_SET);
        ex->r = (fgetc(book) - 48);
        fseek(book, n+=3, SEEK_SET);

        learning(ex, w);
        return n;
    }

    void learning(struct exercise *ex, struct weight *w)
    {
        double d;
        double deltaa;
        double deltab;
        const double n = 0.2;

        d = (ex->a * w->wa) + (ex->b * w->wb);
        if(d <= 0) d = 0;
        else d = 1;

        if(d != ex->r)
        {
            deltaa = n * (ex->r - d) * ex->a;
            deltab = n * (ex->r - d) * ex->b;
        }
        else
        {
            deltaa = 0;
            deltab = 0;
        }
        w->wa = w->wa + deltaa;
        w->wb = w->wb + deltab;
    }

    //training.h
    #include "main.h"

    struct exercise
    {
        double a; // first operand
        double b; // second operand
        double r; // expected result
    };

    struct weight
    {
        double wa; // weight a
        double wb; // weight b
    };

    double wrand();
    int getexercise(struct exercise *ex, struct weight *w, int n);
    void learning(struct exercise *ex, struct weight *w);

If anyone has some hints, they would be much appreciated…

Thanx,
fadeh

[Edited by - fadeh on August 26, 2005 3:29:10 AM]
It's been a while since I worked with ANNs, but if I'm not mistaken, it's possibly not a coding problem.

You'll never get it working with that threshold. You either add a bias to the node (you know, an input which is always 1: d = x1*w1 + x2*w2 + 1*w3) or change the threshold to something positive (I would make the neuron biased).

If I'm wrong in my assumptions, I'm sure someone here will correct me.
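
For anyone who wants to try the bias idea, here is a minimal sketch, not the original poster's code. It assumes the same update rule and the same threshold of 0 as in the first post, and adds a third weight for an input that is always 1. The names `Weights`, `predict`, `learn` and `trainAnd` are made up for this example, and the truth table is hard-coded instead of being read from an exercise book.

```cpp
// Weights of a two-input threshold unit plus a bias input that is always 1.
struct Weights {
    double wa = 0.0;  // weight of the first operand
    double wb = 0.0;  // weight of the second operand
    double wc = 0.0;  // weight of the bias input (assumption: starts at 0)
};

// Step activation with threshold 0: output is 0 when d <= 0, otherwise 1.
int predict(const Weights& w, int a, int b) {
    double d = a * w.wa + b * w.wb + 1.0 * w.wc;
    return d <= 0.0 ? 0 : 1;
}

// One update step: wi(t+1) = wi(t) + rate * (expected - output) * xi.
// The bias weight uses xi = 1.
void learn(Weights& w, int a, int b, int expected, double rate) {
    int out = predict(w, a, b);
    double step = rate * (expected - out);
    w.wa += step * a;
    w.wb += step * b;
    w.wc += step * 1.0;
}

// Runs the AND truth table through learn() for a fixed number of passes.
Weights trainAnd(int epochs, double rate) {
    const int table[4][3] = {{0, 0, 0}, {0, 1, 0}, {1, 0, 0}, {1, 1, 1}};
    Weights w;
    for (int e = 0; e < epochs; ++e)
        for (const auto& row : table)
            learn(w, row[0], row[1], row[2], rate);
    return w;
}
```

Calling `trainAnd(100, 0.2)` and then `predict(w, a, b)` for the four input pairs should reproduce the AND table. The number of passes is a generous guess, not a tuned value.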
It just occurred to me that maybe I should try to explain. I'm not very good at explaining, but let's try:

You want d to be non-positive if the answer is false (A AND B = false)

So that means A * w1 + B * w2 must be less than or equal to zero. Because that has to hold for A = 0 and B = 1 and also for A = 1 and B = 0, both w1 and w2 have to be non-positive (do the math: with A = 0 and B = 1, d is just w2, and with A = 1 and B = 0, d is just w1, so neither of them can be positive).

So we need w1 and w2 <= 0. How will we ever get a positive d for A = 1 and B = 1? It's not possible.

Now suppose you have a bias. Then you can get an answer with something like

w1 = 0.6; w2 = 0.6; w3 = -1;
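
A quick way to check weights like these is to run all four input pairs through the biased unit. This is only a sketch: `biasedUnit` and `computesAnd` are names invented here, and the threshold rule (output 0 when d <= 0) is the one from the first post.

```cpp
// Output of a biased threshold unit: 0 when a*w1 + b*w2 + 1*w3 <= 0, else 1.
int biasedUnit(int a, int b, double w1, double w2, double w3) {
    double d = a * w1 + b * w2 + 1.0 * w3;
    return d <= 0.0 ? 0 : 1;
}

// Returns true when the given weights reproduce the whole AND truth table.
bool computesAnd(double w1, double w2, double w3) {
    for (int a = 0; a <= 1; ++a)
        for (int b = 0; b <= 1; ++b)
            if (biasedUnit(a, b, w1, w2, w3) != (a & b))
                return false;
    return true;
}
```

With w1 = 0.6, w2 = 0.6, w3 = -1 the sums are -1, -0.4, -0.4 and 0.2, so only 1 AND 1 ends up positive. Setting w3 to 0 (no bias) with the same w1 and w2 makes the check fail, since 1 AND 0 then gives d = 0.6.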

Many thanx man, now it works perfectly (adding a bias as you explained)... but another question came up: why doesn't it work with XOR? :D
If I understood what you said, things don't work because:
0 XOR 1 = 1 and 1 XOR 0 = 1, so wa > 0 and wb > 0, but in that case things don't work for 1 XOR 1 = 0... so if wc < 0 and |wc| > |wa + wb|, things are fine for 1 XOR 1 but messed up for 0 XOR 1 and 1 XOR 0... so let's choose wa < 0, wb < 0, wc > 1... this way 0 XOR 1, 1 XOR 0 and 1 XOR 1 are ok, but 0 XOR 0 is not... so it seems that the bias doesn't help in this case, or am I missing something?

Thanx

fadeh
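
The case analysis above can be written down compactly. This is a sketch of the standard argument, assuming the same single unit with weights wa, wb, a bias weight wc, and the rule "output 0 when d <= 0, otherwise 1". The four XOR cases would need all of these at once:

```latex
\begin{align*}
0 \oplus 0 = 0 &: \quad w_c \le 0 \\
0 \oplus 1 = 1 &: \quad w_b + w_c > 0 \\
1 \oplus 0 = 1 &: \quad w_a + w_c > 0 \\
1 \oplus 1 = 0 &: \quad w_a + w_b + w_c \le 0
\end{align*}
Adding the second and third lines gives $w_a + w_b + 2 w_c > 0$, hence
\[
  w_a + w_b + w_c > -w_c \ge 0 ,
\]
which contradicts the fourth line.
```

So no choice of wa, wb and wc works for a single unit of this kind, with or without the bias.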

Ok, found an answer here: http://talkabout.editthispage.com/stories/storyReader$377 (in case someone else has the same problem)..

Bye,

fadeh

