Implementing a vintage XOR solution in Python with an FNN and backpropagation
I'm still not convinced that implementing XOR with so little mathematics can really be that simple. However, since the basic rule of innovating is to be unconventional, I write the code.
To stay in the spirit of a 1969 vintage solution, I decide not to use NumPy, TensorFlow, Theano, or any other high-level library. Writing a vintage FNN with backpropagation using nothing more than high-school mathematics is fun.
This also shows that if you break a problem down into very elementary parts, you understand it better and provide a solution to that specific problem. You don't need to use a huge truck to transport a loaf of bread.
Furthermore, by thinking it through the way children would, I avoided the 20,000 or more episodes that modern CPU-rich solutions often run to solve the XOR problem. The logic used shows that, basically, both inputs can share the same parameters as long as one bias is negative (the elder, reasonably critical child) so that the system provides a sensible answer.
The basic Python solution quickly reaches a result in 3 to 10 iterations (epochs or episodes) depending on how we think it through.
The top of the code simply contains a result matrix with four columns. Each column represents the status (1=correct, 0=false) of one of the four predicates to solve:
#FEEDFORWARD NEURAL NETWORK(FNN) WITH BACK PROPAGATION SOLUTION FOR XOR
result=[0,0,0,0] #trained result
train=4 #dataset size to train
The train variable is the number of predicates to solve: (1,1), (0,0), (1,0), and (0,1). The variable holding the index of the predicate currently being solved is pred.
The core of the program is practically the sheet of paper I wrote, as in the following code.
#II hidden layer 1 and its output
def hidden_layer_y(epoch,x1,x2,w1,w2,w3,w4,b1,b2,pred,result):
    h1=(x1*w1)+(x2*w4)   #II.A.weight of hidden neuron h1
    h2=(x2*w3)+(x1*w2)   #II.B.weight of hidden neuron h2
    #III.threshold I, a hidden layer 2 with bias
    if(h1>=1):h1=1
    if(h1<1):h1=0
    if(h2>=1):h2=1
    if(h2<1):h2=0
    h1=h1*-b1            #apply the negative bias to h1
    h2=h2*b2             #apply the positive bias to h2
    #IV. threshold II and OUTPUT y
    y=h1+h2
    if(y<1 and pred>=0 and pred<2):
        result[pred]=1
    if(y>=1 and pred>=2 and pred<4):
        result[pred]=1
pred is an argument of the function that runs from 0 to 3, one value per predicate. The four predicates can be represented in the following table:

pred   x1   x2   expected XOR output
0      1    1    0
1      0    0    0
2      1    0    1
3      0    1    1
That is why y must be <1 for predicates 0 and 1. Then, y must be >=1 for predicates 2 and 3.
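These thresholds can be checked by calling hidden_layer_y directly. The parameter set below (w1=w4=0.5, w2=w3=1, b1=b2=1) is a hand-picked illustration that happens to satisfy all four predicates; it is not necessarily the set the training loop shown next converges to.

#Illustrative check of hidden_layer_y with a hand-picked parameter set
result=[0,0,0,0]
samples=[(1,1,0),(0,0,1),(1,0,2),(0,1,3)]  #(x1,x2,pred)
for x1,x2,pred in samples:
    hidden_layer_y(0,x1,x2,0.5,1,1,0.5,1,1,pred,result)
print(result)  #prints [1, 1, 1, 1]: all four predicates are satisfied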
Now, we run the training loop, limiting the training to 50 epochs, which is more than enough:
#I Forward and backpropagation
for epoch in range(50):
    if(epoch<1):
        w1=0.5;w2=0.5;b1=0.5
        w3=w2;w4=w1;b2=b1
At epoch 0, the weights and biases are all set to 0.5. No use thinking! Let the program do the job. As explained previously, the weights and bias applied to x2 are set equal to those applied to x1 (w3=w2, w4=w1, b2=b1).
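As a quick check that is not part of the original listing, calling hidden_layer_y with all parameters at 0.5 shows why a single epoch is not enough: only the (1,1) and (0,0) predicates are satisfied, which is exactly what the corrections in the following epochs have to fix.

#Quick check: the epoch 0 parameters only solve two of the four predicates
result=[0,0,0,0]
for x1,x2,pred in [(1,1,0),(0,0,1),(1,0,2),(0,1,3)]:
    hidden_layer_y(0,x1,x2,0.5,0.5,0.5,0.5,0.5,0.5,pred,result)
print(result)  #prints [1, 1, 0, 0]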
Now the hidden layer and y calculation function is called four times, once for each predicate to train, as shown in the following code snippet:
    #I.A forward propagation on epoch 1 and IV.backpropagation starting epoch 2
    for t in range (4):
        if(t==0):x1 = 1;x2 = 1;pred=0
        if(t==1):x1 = 0;x2 = 0;pred=1
        if(t==2):x1 = 1;x2 = 0;pred=2
        if(t==3):x1 = 0;x2 = 1;pred=3
        #forward propagation on epoch 1
        hidden_layer_y(epoch,x1,x2,w1,w2,w3,w4,b1,b2,pred,result)
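The excerpt stops before the correction and stopping steps of the loop. As a rough sketch only, here is one hypothetical way to complete it, reusing the hidden_layer_y function defined earlier, with a stopping test and a simple correction step that nudges w2/w3 and both biases upward whenever a predicate fails. This is an assumption for illustration, not the author's exact backpropagation code.

#Hypothetical completion of the training loop (assumption, not the original code)
result=[0,0,0,0]
train=4
for epoch in range(50):
    if(epoch<1):
        w1=0.5;w2=0.5;b1=0.5
        w3=w2;w4=w1;b2=b1
    result=[0,0,0,0]  #reset the scoreboard for this epoch
    for t in range (4):
        if(t==0):x1 = 1;x2 = 1;pred=0
        if(t==1):x1 = 0;x2 = 0;pred=1
        if(t==2):x1 = 1;x2 = 0;pred=2
        if(t==3):x1 = 0;x2 = 1;pred=3
        hidden_layer_y(epoch,x1,x2,w1,w2,w3,w4,b1,b2,pred,result)
    if(sum(result)==train):  #all four predicates are solved
        print("solved at epoch",epoch,"with",w1,w2,w3,w4,b1,b2)
        break
    w2+=0.25;w3=w2  #hypothetical correction: strengthen w2/w3
    b1+=0.25;b2=b1  #and both biases, keeping both inputs symmetrical

With this correction rule, the loop stops at epoch 2, that is, on the third iteration, which is consistent with the 3 to 10 iterations mentioned above.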