Applying the FNN XOR solution to a case study to optimize subsets of data
The case study described here is a real-life project. The environment and functions have been modified to respect confidentiality, but the philosophy is the same as the one that was actually applied.
We are 7.5 billion people breathing air on this planet. By 2050, there will be about 2.5 billion more. All of these people need to wear clothes and eat. Just those two activities involve classifying data into subsets for industrial purposes. Grouping is a core concept for any kind of production. Producing clothes and food requires grouping to optimize production costs. Imagine not grouping at all and delivering one t-shirt at a time from one continent to another instead of grouping t-shirts in a container and grouping many containers on a ship. Let's focus on clothing, for example.
A brand of stores needs to replenish the stock of clothing in each store as customers purchase its products. In this case, the corporation has 10,000 stores. The brand produces jeans, for example. Its average product is a faded jean that sells about 50 units a month per store. That adds up to 10,000 stores x 50 units = 500,000 stock keeping units (SKUs) per month. These units are sold in all sizes, grouped into average, small, and large. The sizes sold per month are random.
The main factory for this product has about 2,500 employees producing those jeans at an output of about 25,000 jeans per day. The employees work in the following main fields: cutting, assembling, washing, laser, packaging, and warehouse. See Chapter 12, Automated Planning and Scheduling, for Amazon's patented approach to apparel production.
The first difficulty arises with the purchase and use of fabric. The fabric for this brand is not cheap, and large amounts are necessary. Each pattern (the shape of the pieces of the pants to be assembled) needs to be cut while wasting as little fabric as possible.
Imagine you have an empty box you want to fill up to optimize the volume. If you only put soccer balls in it, a lot of empty space will remain. If you slip tennis balls into the empty spaces, the free space will decrease. If, on top of that, you fill the remaining gaps with ping pong balls, you will have optimized the box.
Building optimized subsets can be applied to containers, warehouse flows and storage, truckload optimizing, and almost all human activities.
In the apparel business, if 1% to 10% of fabric is wasted while manufacturing jeans, the company will survive the competition. At over 10%, there is a real problem to solve. Losing 20% on all the fabric consumed to manufacture jeans can bring the company down and force it into bankruptcy.
The main rule is to combine larger pieces and smaller pieces to make optimized cutting patterns.
Optimization of space through larger and smaller objects can be applied to cutting the forms, which are the patterns of the jeans, for example. Once they are cut, they will be assembled at the sewing stations.
The problem can be summed up as:
- Creating subsets of the 500,000 SKUs to optimize the cutting process for the month to come in a given factory
- Making sure that each subset contains both smaller and larger sizes to minimize the loss of fabric, by choosing six sizes per day to build the subsets covering the 25,000 units produced per day
- Generating cut plans of an average of three to six sizes per subset per day for a production of 25,000 units per day
In mathematical terms, this means trying to find subsets of sizes among 500,000 units for a given day.
The task is to find six well-matched sizes among 500,000 units. The number of possibilities is given by the following combination formula: C(500000, 6) = 500000! / (6! x (500000 - 6)!), which is on the order of 10^31 combinations.
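As a quick order-of-magnitude check (not part of the original project code), Python's math.comb can evaluate this number directly:
import math

# number of ways to choose 6 sizes among 500,000 units
combinations=math.comb(500000,6)
print(combinations)   # roughly 2.2 x 10^31 possible combinations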
At this point, most people abandon the idea and just find some easy way out, even if it means wasting fabric. The problem was that in this project, I was paid a percentage of the fabric I would manage to save. The contract stipulated that I had to save 3% of the company's total fabric consumption per month to get paid a share of that, or receive nothing at all. Believe me, once I solved it, I kept that contract as a trophy and a tribute to simplicity.
The first reaction we all have is that this is more than the number of stars in the universe and all that hype!
That's not the right way to look at it at all. The right way is to look in exactly the opposite direction. The key to this problem is to observe the data at a microscopic level, at the bits-of-information level. This is a fundamental concept of machine learning and deep learning. Translated into our field, it means that to process an image, ML and DL process pixels. So, even if the pictures to analyze represent large quantities of data, it all comes down to small units of information to analyze.
Today, Google, Facebook, Amazon, and others have yottabytes of data to classify and make sense of. Using the word big data doesn't mean much. It's just a lot of data, and so what?
You do not need to analyze the individual position of each data point in a dataset; you can use the probability distribution instead.
To understand that, let's go to a store to buy some jeans for a family. One of the parents wants a pair of jeans, and so does a teenager in that family. They both go and try to find their size in the pair of jeans they want. The parent finds 10 pairs of jeans in size x. All of the jeans are part of the production plan. The parent picks one at random, and the teenager does the same. Then they pay for them and take them home.
Some systems work fine with random choices: the random transportation (taking jeans from the store to home) of the particles (jeans, other product units, pixels, or whatever is to be processed) that make up a fluid (a dataset).
Translated into our factory, this means that a stochastic (random) process can be introduced to solve the problem.
All that was required was to pick small and large sizes at random among the 500,000 units to produce. If six sizes, numbered from 1 to 6, were to be picked per day, the sizes could be classified as follows:
Smaller sizes = S = {1, 2, 3}
Larger sizes = L = {4, 5, 6}
Converting this into numerical subset names, S=1 and L=6. By selecting large and small sizes to produce at the same time, the fabric will be optimized.
Doesn't this sound familiar? It looks exactly like our vintage FNN, with 1 instead of 0 and 6 instead of 1. All that has to be done is to stipulate that subset S = value 0 and subset L = value 1, and the previous code can be generalized.
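As a simple illustration (a hypothetical helper, not part of the original program), the mapping from a size to its subset value could be written as follows:
# sizes 1, 2, 3 belong to subset S (value 0); sizes 4, 5, 6 belong to subset L (value 1)
def size_to_subset(size):
    return 0 if size<=3 else 1

print(size_to_subset(1),size_to_subset(6))   # 0 1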
If this works, then smaller and larger sizes will be chosen to send to the cut planning department, and the fabric will be optimized. Following the randomness concept of Bellman's equation, a stochastic process is applied: customer unit orders are chosen at random (each order is one size with a unit quantity of 1):
import random

w1=0.5;w2=1;b1=1
w3=w2;w4=w1;b2=b1
s1=random.randint(1,500000) #choice of one order in set s1
s2=random.randint(1,500000) #choice of one order in set s2
The weights and biases are now constants obtained from the result of the XOR training FNN. The training is done; the FNN is simply used to provide results. Bear in mind that the word learning in machine learning and deep learning doesn't mean you have to train systems forever. In stable environments, training is run only when the datasets change. At some point in a project, you are hopefully using deep trained systems and not just stuck in the deep learning process. The goal is not to spend all corporate resources on learning but on using trained models.
Deep learning architecture must rapidly become deep trained models to produce a profit or disappear from a corporate environment.
For this prototype validation, the size of a given order is random. 0 means the order fits in the S subset; 1 means the order fits in the L subset. The data generation function reflects the random nature of consumer behavior in the following six-size jean consumption model.
result=[0] #one-element list that stores the FNN output (0 or 1)
x1=random.randint(0,1) #property of choice: smaller size=0
x2=random.randint(0,1) #property of choice: bigger size=1
hidden_layer_y(x1,x2,w1,w2,w3,w4,b1,b2,result)
Once two customer orders have been chosen at random in the right size category, the FNN is activated and runs like the previous example. Only the result array has been changed because no training is required. Only a yes (1) or no (0) is expected, as shown in the following code:
#II hidden layer 1 and its output
def hidden_layer_y(x1,x2,w1,w2,w3,w4,b1,b2,result):
    h1=(x1*w1)+(x2*w4) #II.A.weight of hidden neuron h1
    h2=(x2*w3)+(x1*w2) #II.B.weight of hidden neuron h2
    #III.threshold I, a hidden layer 2 with bias
    if(h1>=1):h1=1
    if(h1<1):h1=0
    if(h2>=1):h2=1
    if(h2<1):h2=0
    h1=h1*-b1
    h2=h2*b2
    #IV. threshold II and OUTPUT y
    y=h1+h2
    if(y<1):
        result[0]=0
    if(y>=1):
        result[0]=1
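As a quick sanity check (not part of the original program), the function can be called on the four possible input pairs to confirm that, with the constants above, it reproduces the XOR truth table: only mixed pairs (one smaller size and one larger size) return 1:
result=[0]
for x1,x2 in [(0,0),(0,1),(1,0),(1,1)]:
    hidden_layer_y(x1,x2,w1,w2,w3,w4,b1,b2,result)
    print(x1,x2,"->",result[0]) #prints 0, 1, 1, 0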
The number of subsets to produce needs to be calculated to determine the volume of positive results required.
The choice is made of six sizes among 500,000 units. But the request is to produce a daily production plan for the factory. The daily production target is 25,000 units. Also, each subset can be used about 20 times: on average, the same size of a given jean model is available about 20 times among the orders.
Each subset result contains two orders, hence two units, each repeated about 20 times:
R = 2 x 20 = 40
Each result produced by the system represents a quantity of 40 units for 2 sizes.
Six sizes are required to obtain good fabric optimization. This means that after three choices, the results represent one subset of potentially optimized choices:
R = 40 x 3 results of 2 sizes = 120
The magic number has been found. For every 3 choices, the goal of producing 6 sizes, each multiplied by a repetition of about 20, will be reached.
The production request per day is 25,000 units:
The number of subsets requested = 25000/3 = 8333.33
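In code, the stopping value used by the loop below can be derived from this target (a minimal sketch mirroring the calculation above):
daily_target=25000                  #daily production target in units
subsets_requested=daily_target//3   #following the 25,000/3 calculation above
print(subsets_requested)            #8333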
The system can run through as many products as necessary to produce the 8,333 subsets requested. In this case, the range is set to 1,000,000 draws because only the positive results are accepted. The system filters the correct subsets through the following loop:
subsets=0
for element in range(1000000):
    #the random draws and the call to hidden_layer_y shown above are repeated at each iteration
    if(result[0]>0):
        subsets+=1
        print("Subset:",subsets,"size subset #",x1," and ","size subset #",x2," result:",result[0],"order #"," and ",s1,"order #",s2)
    if(subsets>=8333):
        break
When the 8333 subsets have been found respecting the smaller-larger size distribution, the system stops, as shown in the following output.
Subset: 8330 size subset # 1 and size subset # 0 result: 1 order # and 53154 order # 14310
Subset: 8331 size subset # 1 and size subset # 0 result: 1 order # and 473411 order # 196256
Subset: 8332 size subset # 1 and size subset # 0 result: 1 order # and 133112 order # 34827
Subset: 8333 size subset # 0 and size subset # 1 result: 1 order # and 470291 order # 327392
This prototype proves the point. Not only was this project a success with a similar algorithm, but the software also ran for years in various forms on key production sites, reducing material consumption and generating a profit each time it ran. The software later evolved into a powerful advanced planning and scheduling application.
Two main functions, among some minor ones, must be added:
- After each choice, the orders chosen must be removed from the 500,000-order dataset (see the sketch after this list). This will preclude choosing the same order twice and reduce the number of choices to be made.
- An optimization function to regroup the results into optimized cutting patterns, handled by an automated planning program for production purposes, as described in Chapter 12, Automated Planning and Scheduling.
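A minimal sketch of the first function, assuming orders are simply identified by their index from 1 to 500,000 (the names below are hypothetical, not from the original program):
import random

remaining=set(range(1,500001))   #order numbers still available

def pick_order(remaining):
    order=random.choice(tuple(remaining))   #draw one of the remaining orders at random
    remaining.remove(order)                 #remove it so it cannot be chosen twice
    return order

s1=pick_order(remaining)
s2=pick_order(remaining)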
Application information:
- The core calculation part of the application is less than 50 lines long
- When a few control functions and dataset tensors are added, the program might reach 200 lines maximum
- This guarantees easy maintenance for a team
It takes a lot of time to break a problem down into elementary parts and find a simple, powerful solution; it often takes much longer than just typing hundreds to thousands of lines of code to make things work. The simple solution, however, will always be more profitable, and software maintenance will prove more cost-effective.