## Neural Network “Machine Vision” using Singular Value Decomposition for Feature Extraction

This article discusses a simple case of using a Neural Network to interpret an image of a traffic light system, with the objective of correctly identifying whether the light is red, yellow, or green. Here is a video link which goes over the basic running of the software. Send a donation to my PayPalMe along with your email address and I’ll send you a copy of the codebase, which runs in Matlab or Octave.

Summary

The basic technical approach is as follows:

1. The Levenberg-Marquardt (LM) optimization algorithm is used to train the Neural Networks – in this case, the excellent Pyrenn library (click here to access the site) provides the LM engine – its source code is available in both Matlab and Python formats.
2. As with all Neural Network applications, it’s best to start with the very basics of a learning solution (the training and test sets) in order to see if the methodology is going to work. Thus basic traffic light images are drawn using PowerPoint.
3. Extract the R, G, and B matrices from each image, and then perform Singular Value Decomposition (SVD) on the R, G, and B matrices for each image. The SVD values are inputs to the Neural Network.
4. Assign values to the Neural Network outputs that map to “Yellow”, “Green”, and “Red”. In this case, an output value of -0.5 is assigned to “Yellow”, an output value of 0.0 is assigned to “Red”, and an output value of +0.5 is assigned to “Green”.
5. Train and test multiple Neural Networks on the processed images (using the top principal SVD values as inputs), and then select the best performer.

Following the start-simple rule above, the Neural Network will be trained and tested on images that depict PowerPoint-drawn traffic light systems – each training and test image displays a green, yellow, or red light in the normal corresponding location on the traffic light box.

Eighteen sets of traffic light drawings were created – thus eighteen data sets for training and testing. Half of the images (9) were used to train a Neural Network, and the other half (the remaining 9) were used for testing. This may seem like a small data set, but it’s best to start with a minimal amount of data and only add more images if doing so is likely to increase performance.

When performing image object recognition, it’s important to maximize the available image information while also using some form of feature extraction – such as Principal Component Analysis (PCA) – to pull out the key features and thus minimize the number of inputs to the Neural Network.

In this case, two approaches were used to extract the important image information:

1) the R (Red), G (Green), and B (Blue) matrices were extracted from each image, and

2) Singular Value Decomposition, or SVD, was used as the PCA-style technique to draw out the key image features from each of the three extracted R, G, and B matrices; the top SVD values from each matrix were used as inputs to the Neural Network.

One of the top-performing Neural Networks scored very well with low error values. It was able to correctly identify whether an image contained a red, yellow, or green light in all of the test images as shown below in Figure 1. Note that +0.5 on the Y-axis (the Neural Network output) represents a green light, 0.0 represents a red light, and -0.5 represents a yellow light.

Technical Discussion

The rest of the article discusses the technical path, start to finish.

The first step is to build the training and test images. A rectangle was constructed with a gray background, and circles were inserted in the manner of a traffic light. Each circle was either black or one of the three possible colors – red, yellow, or green – as shown below. Each training image contained the traffic light objects at a different location within the image and at a different angle. Thus the objective was for the Neural Network to correctly interpret the light color regardless of the traffic light’s location and angle in the image. The training images are shown below in Figure 2.

The test images are shown below in Figure 3.

The process from feature extraction to input data set consists of the following steps:

1. Reduce the image size for all of the training and test images – in this case the reduced image was 0.3 times the size of the original. This reduces the time to process the images without sacrificing important resolution – critical for real-time applications.

2. Extract the three R, G, and B matrices from each of the reduced images.

3. Perform the SVD on each of the R, G, and B matrices for each image.

4. Harvest the top 30 SVD values for each of the R, G, and B matrices and load these into an array to be used as input to a Neural Network.
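The four steps above can be sketched in Python with NumPy (the article’s own code base is Matlab / Octave, so this is an illustrative translation – the function name, the strided downsample standing in for a proper resize, and the parameter defaults are my own):

```python
import numpy as np

def image_features(image, scale=0.3, top_k=30):
    """Reduce an RGB image, split it into R/G/B planes, and return the
    top singular values of each plane as one feature vector.
    `image` is an (H, W, 3) uint8 array; `scale` and `top_k` follow the
    0.3 resize factor and 30-value harvest described above."""
    # Crude downsample by striding -- a stand-in for a proper resize.
    step = max(1, int(round(1.0 / scale)))
    small = image[::step, ::step, :]

    features = []
    for channel in range(3):            # R, G, B planes
        plane = small[:, :, channel].astype(float)
        # Singular values only; they arrive sorted largest-first.
        s = np.linalg.svd(plane, compute_uv=False)
        features.extend(s[:top_k])
    return np.array(features)           # length 3 * top_k (at most)
```

The resulting vector (90 values per image with these defaults) is what gets loaded into the Neural Network input array.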

The basic process is shown below in Figure 4.

The outputs are mapped such that “Yellow” corresponds to a -0.5 output value, “Red” corresponds to a 0.0 output value, and “Green” corresponds to a +0.5 output value. Thus the Neural Network should output around 0.0 when observing a red light, -0.5 when observing a yellow light, and +0.5 when observing a green light, as shown below in Figure 5.
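A minimal Python sketch of that decoding step (the snap-to-nearest-target policy is an assumption on my part – the article doesn’t state how raw outputs are rounded):

```python
# Target output values and their light colors, as defined above.
TARGETS = {-0.5: "yellow", 0.0: "red", 0.5: "green"}

def decode_output(y):
    """Classify a raw Neural Network output by snapping it to the
    nearest of the three target values."""
    nearest = min(TARGETS, key=lambda t: abs(y - t))
    return TARGETS[nearest]
```

So an output of 0.47 decodes to “green” and -0.4 decodes to “yellow”.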

In Figure 6, the Neural Network data structure layout is shown on the right side. The details are reviewed on the left side.

The code base layout is shown below in Figure 7. There are four primary functions:

1. loadData.m – this function is run first. It loads the image files, reduces the image sizes, extracts the R, G, and B matrices from each image, and performs SVD on each image.

2. buildTrnTstDataSets.m – this function is run next. It takes the arrays built in the first function and sorts them into input training and test arrays, and output training and test arrays.

3. runNet.m – this is run last. This is the function that will use the training data to build a Neural Network (using the LM optimization algorithm to generate the weights or gains), then it will test the Neural Network’s performance on the test input and output data.

4. monteCarlo.m – this function is used to train and test multiple Neural Networks. It is currently set to keep going until it has 20 Neural Networks that have passed the acceptable test criteria (those Neural Networks that don’t pass the test are discarded).

In summary, a Neural Network can be trained to recognize a red, yellow, or green light from PowerPoint images, with just a few training images. An example of a top performing Neural Network is shown below in Figure 8.

The next step would be to use real images of traffic lights, with the traffic lights close up in the image as was the case with the PowerPoint-drawn images. Spoiler alert … it works with real traffic light images as well. That will be the next article.

## Use Matlab API to Increase Performance of Java, C++, or Python Applications

MATLAB (https://www.mathworks.com/) – hereafter referred to as “Matlab” – has come a long way since the early days, especially regarding execution time for matrix and other operations. It’s almost unparalleled when compared to the math libraries available for other coding languages. As such, it has become another important toolbox for the engineer, no matter the software development language he or she is using for a specific application.

MathWorks has made it very easy to interface Java, C, C++, Python, and FORTRAN applications with standard and custom (user-developed) Matlab functions. This article shows an example of a Java application test bench which performs matrix multiplication using: 1) Matlab’s lightning-fast matrix multiplication capability, 2) the very fast Efficient Java Matrix Library (EJML), and 3) the conventional method of matrix multiplication. Matlab is one to two orders of magnitude faster than EJML, which is in turn one to two orders of magnitude faster than the conventional method.

A video overview is shown below – followed by a similar written discussion. At the end of this article, you can watch the code walk-through video as well as download the source code configured in an Apache NetBeans 12.2 Java project environment. The only item you will need to provide is your own Matlab “engine.jar” which is in your licensed copy of Matlab.

A simple high-level block diagram of the application-to-Matlab interface is shown below. Note that the software application can access not only the standard Matlab functions, but also any of the Matlab toolbox functions (for those toolboxes the engineer has purchased), AND any custom scripts, functions, and classes developed by the engineer.

The following is an example of a simple Java application which performs matrix multiplication (the user selects the matrix size) with three methods: 1) the conventional way of multiplying matrices, 2) the Efficient Java Matrix Library (EJML), and 3) the Matlab workspace (“A * B”). While the EJML matrix multiplication algorithm is very fast, the Matlab algorithm is much faster. I suspect that MathWorks uses multithreading / parallelization to achieve the blistering speeds. Note that the CPU was an Intel i9, 8-core processor.

The following is the same plot but with specific numbers for the multiplication of matrices with 1,000 x 1,000 elements, 2,500 x 2,500 elements, and 5,000 x 5,000 elements. For example, for a 5,000 x 5,000 matrix multiplication operation, the conventional approach takes almost 1,064 seconds. The EJML algorithm is much faster at 76 seconds. But the Matlab algorithm is even faster at just over 2 seconds.
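The speed gap between a conventional triple loop and an optimized, BLAS-backed multiply can be reproduced in miniature. This is a Python / NumPy sketch, not the article’s Java test bench, and it uses a much smaller matrix so the demo runs in seconds:

```python
import time
import numpy as np

def naive_matmul(a, b):
    """Conventional triple-loop matrix multiplication on nested lists."""
    n, p, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for k in range(p):
            aik = a[i][k]
            row_b = b[k]
            row_c = c[i]
            for j in range(m):
                row_c[j] += aik * row_b[j]
    return c

n = 120                        # small size so the demo stays quick
A = np.random.rand(n, n)
B = np.random.rand(n, n)

t0 = time.perf_counter()
C_naive = naive_matmul(A.tolist(), B.tolist())
t_naive = time.perf_counter() - t0

t0 = time.perf_counter()
C_blas = A @ B                 # BLAS-backed, analogous to Matlab's A * B
t_blas = time.perf_counter() - t0

print(f"conventional: {t_naive:.3f}s   numpy/BLAS: {t_blas:.5f}s")
```

The optimized multiply wins by orders of magnitude for the same general reasons (vectorization, cache blocking, multithreading) that Matlab’s engine outpaces hand-rolled loops.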

You may say “well I don’t multiply large matrices very often …”. But you may want to perform real-time machine vision (perception) applications such that you have to process large matrices of image pixel elements in very short amounts of time. Ah – now I have your attention. That will be the next article.

The following is a simple Java test bench project layout for demonstrating the interface with the Matlab engine API. It’s a simple setup – there’s a driver class, a timer class, a class for building the two matrices to be multiplied, and the three different matrix multiplication classes (conventional, EJML, and Matlab). The source code is available as a download at the end of this article.

The Java-to-Matlab interface (not a Java Interface) class, which connects to the Matlab workspace, invokes four basic calls to the Matlab engine, as shown below.

The Java method that performs the direct interface with the Matlab workspace is shown below.

Note that after launching the Matlab engine, there is a “warm-up” period during which some type of algorithm optimization is performed – thus the Matlab algorithm needs to be run several times before it reaches full speed. In the plot below, the Matlab engine is allowed to warm up for 20 cycles, and then all three algorithms are tested together for the next 40 cycles. Once warmed up, the Matlab algorithm continues to perform at blistering speed until the Matlab engine is shut down (it could run for weeks at a time with the same high performance) – the warm-up phase is only required at the very beginning, after the Matlab engine has been activated.

The following Java console output shows part of the run-time results from the Java test bench project.

The following is a code walk-through video of the Java bench test code (you can also see it here on YouTube).

The following is the link for downloading the Java source code as an Apache NetBeans 12.2 project. Note that the Matlab “engine.jar” file is missing since you will have to use your own “engine.jar” file that is tied directly to the licensed version of Matlab running on your computer.

If you have any questions, feel free to contact me at mikescodeprojects@protonmail.com.

## Consulting for AI Projects

I’m currently open to consulting on AI projects – my AI resume can be reviewed here and my “Visual Resume” can be reviewed here. The best way to contact me is by email at first – mikescodeprojects@protonmail.com.

Send me a description of what you want to accomplish and we can start discussing the issues around it. If I don’t think I can perform the work (e.g., it will take too long for your budget, my gut tells me that the likelihood of success is low, etc.) then I’ll be very up front with you and explain why. Otherwise I’ll work up a schedule with milestones and we can iterate until both of us agree.

## Neural Network Stock Selector

I’ve been developing this code base for about 6 years – even longer if you count earlier casual work. Over the past 6 months I’ve been upgrading the code base from a very old version of Matlab to Matlab R2017b (4 years old, but still reasonably recent).

In a nutshell, the system develops Neural Networks to analyze a large number of stock profiles and then predict the movement of the stock prices over the next year – so it forecasts over a one-year time period. The key component is the one in which the Neural Networks carefully select stocks that are highly likely to surge in the upcoming year.

During the code upgrade process, I periodically run full end-to-end tests to make sure the system architecture’s integrity has been preserved. Below are the results from a test case run this morning. The Neural Network system analyzes a list of companies over 10 years and makes predictions for each year. Part of the process is to “team” the Neural Networks – that is, a stock only makes it onto a list if at least a specified number of Neural Networks have selected it. So for a teaming number of 20, the system would show the companies that were selected by at least 20 Neural Networks.
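The teaming rule can be sketched in a few lines – this is illustrative Python (the actual system is Matlab), and the function and variable names are my own:

```python
from collections import Counter

def team_picks(selections, teaming_number):
    """Return the stocks chosen by at least `teaming_number` of the
    Neural Networks. `selections` is a list of per-network pick lists."""
    votes = Counter(ticker for picks in selections for ticker in set(picks))
    return sorted(t for t, n in votes.items() if n >= teaming_number)
```

With a teaming number of 20, only tickers that 20 or more Neural Networks independently selected survive onto the final list.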

Below is a set of plots from this test run (the forecast period is 2008 through 2017) – a teaming number of 20 was selected, and the selected companies were used in simulated stock purchases and sales.

The vertical yellow bar was added in to highlight the teaming number 20 performance results. The performance of the system is shown below in a table format.

The Neural Network system abstained from selecting any stocks in 2008, 2014, and 2015.  For the other years the team of 20 selected various stocks.  The only bad year was 2017 – the Neural Networks selected two stocks that were sold at an automatic -10% loss limit. The average Return-On-Investment (ROI) for the Dow Jones Industrial Average (DJIA) was 7.9%. The average ROI for the Neural Network system was 43.5%.

The bottom line is that over 10 years, the Neural Network system outperformed the DJIA by a factor of 5.5. Below is the result of a short script that computes an initial investment of \$100 for both the DJIA and the Neural Network system, and then adds in the return on investment for each year.

After 5 years, the Neural Network system would have put the investor ahead of the DJIA by a factor of 6.1, and by the end of 10 years, the investor would have made 12.2 times as much by using the Neural Network system.
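That kind of script can be sketched as follows – a hedged Python illustration; the actual per-year ROI figures from the test run are not reproduced here, so any numbers you feed it are hypothetical:

```python
def grow(initial, yearly_rois):
    """Apply each year's ROI (as a percent) to a starting balance,
    compounding year over year."""
    balance = initial
    for roi in yearly_rois:
        balance *= 1.0 + roi / 100.0
    return balance
```

For example, `grow(100, [10, 10])` compounds two 10% years into \$121.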

Below is an example of the selection output for a teaming number of 20 for the year 2016. The aggregate return on investment was 97.6%.

The stock chart profiles for the selected companies, LEE and MTZ, are shown below. The forecast period was 2016.

The current objective is to finish the code upgrade and then configure the system to test weekly predictions – that is, the system will select certain stocks that should do well over the next week (purchased on Monday and sold on Friday or before). I’ll be running real-time tests – predictions will be published on my YouTube channel so that an absolute time-stamp is attached to them. Then we’ll see how the selected stocks fare.

## What are Super Nets?

Let’s start with the example of students in a medical school. There are 1,000 students and the top 100 students (the top 10%) are getting straight A’s because they are bright and they have studied diligently. Would you say that all of these 100 students (the top 10 percenters) will do equally well out in the real world since they all scored similar grades in medical school? Would you say that all 100 students will become world-renowned brain and heart surgeons that create modern and ground-breaking methods of surgery?

No – you intuitively know that only a handful of the 100 students will “set the world on fire,” while the rest of the 100 students do good work as doctors and surgeons without doing anything Earth-shattering.

Why is that? After all, they all scored the same on all of their tests, so we’d assume that they’d all set the world on fire, correct? Of course not – because despite their having scored similarly on their tests (A’s), their brains are wired differently, and some of those brains are particularly suited to being creative and innovative in the real world. But we can’t measure that capacity in the university (med school) with tests – we don’t find out until “the rubber hits the road” and each of these graduates goes out into the real world and begins tackling challenges in their field.

Well, the same applies to Neural Networks. First you train many Neural Networks and select only the top performers (each takes the same series of tests – just as with the students in med school). So out of 1,000 trained Neural Networks, perhaps only 10% (100) of them score above a specified threshold. Do you assume that all of these Neural Networks will perform equally well in “the real world” – the full application domain space of your application? No – you must test them in this domain space, and just a handful of Neural Networks will be the “renowned brain and heart surgeons” (speaking figuratively, of course), while the remaining Neural Networks out of the original 100 will perform in an average way.

These super high-performing Neural Networks (the “brilliant and innovative brain surgeons”) are called Super Nets. These are the ones that demonstrate blistering performance outside of the original training regime – but you won’t discover them until you fully test the top 10% of Neural Networks in the full application domain space.

In the video below, Neural Networks are being trained (using Matlab’s extremely fast Levenberg-Marquardt optimization algorithm) for stock market prediction purposes – specifically, to predict a company’s future stock performance based on its previous history; one could call it Neural Network Technical Analysis. The Neural Networks that achieve a prediction ROI (Return On Investment) of greater than 50% are saved as part of the high-performer group (similar to the top 10% of the med school students). Thus when you see the “SUCCESSFUL!!” text, a high-achieving Neural Network has done very well with a test set of companies.

However, the real test is when these high-achieving Neural Networks are tested against a 10-year rolling forecast data set – that is, they must make predictions for each year of a 10-year time span. Those that score the highest ROI with the lowest standard deviation (and there are just a few) are the Super Nets – the Super Star performers.

Autonomous Driving Application

With this application, many Neural Networks would be trained on the sensor inputs (many different images), with the outputs being the appropriate driving commands. The Neural Networks which surpassed a specified threshold for issuing the correct commands would be saved. For simplicity, we’ll say that 1,000 Neural Networks were trained but only 100 scored above the specified threshold.

The next step is to test those 100 high-scoring Neural Networks on the open road in the autonomous vehicle, where each Neural Network is tested and scored on performance. From these road tests, the top 10 performing Neural Networks are derived. These top 10 are considered the “final product” – they are the Super Nets which will perform the autonomous control of the vehicle.

These Super Nets form a team such that a “consensus solution” is used – all 10 Super Nets constantly process the road images and issue correction commands, but the solution is taken from the consensus. So, for example, if 7 out of 10 Super Nets agree that a “slow down, turn right” command should be issued, then that is the command selected.
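A minimal sketch of that consensus vote in Python (the fallback behavior when no command reaches the quorum is my assumption – the article doesn’t specify one):

```python
from collections import Counter

def consensus_command(commands, quorum=7):
    """Return the command issued by at least `quorum` of the Super Nets,
    or None if no command reaches the quorum (fallback policy assumed)."""
    command, votes = Counter(commands).most_common(1)[0]
    return command if votes >= quorum else None
```

If 7 of 10 networks say “slow down, turn right”, that command wins; a 5–5 split yields no consensus.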

For most cases, we can assume that all 10 Super Nets will issue the same command or set of commands. However, for cases where there are ambiguities (i.e. a situation is encountered for which they were not trained – maybe a tilted road with fog at night, etc.), the teaming approach will produce a good solution since it is by consensus.

## A Software Profiler is Your Best Friend

One of the key assets in your suite of software testing tools is the Profiler, and you should get to know it well. The Profiler is “standard equipment” with most software development environments and has a wide array of capabilities to help point out weak areas of the code, to demonstrate the bottlenecks where most of the time is being spent, to find memory leaks, etc.

A few days ago, I needed to use the Matlab Profiler on some code I was working on, and I decided it would make a good article because of the amount of run-time the Profiler helped me save by pointing out a specific function that was unnecessarily taking the most time.

I’d been updating some old Matlab code – this particular function processes (parses, manipulates, assembles, etc.) a large number of ASCII text files (over three hundred files with 7,000–8,000 rows each), which can be time-intensive (much slower than just crunching numbers). But in this case, the run time of over 1,200 seconds seemed excessive – as shown below. Keep in mind that this code was running on a relatively new desktop with an SSD drive and an Intel i7-9700 processor with 8 cores – so I expected the function to run faster (a gut feeling).

Given that I didn’t know where to start looking, I decided it was best to run the Matlab Profiler to see if there were any functions that were slowing down the process unnecessarily. The quick way in Matlab to launch the Profiler is to simply have the function opened and then, in the Editor tab, click on “Run and Time” as shown below.

The run-time with the Profiler, shown below, was longer than the original run because of time spent by the Profiler performing its measurements.

At the end of the run, a Profiler summary was generated, as shown below. mn is the main function in this example – note that the Profiler shows that the function dateCheck, called by mn, seemed to be the resource hog, accounting for 1,183 seconds of the 1,473-second total. So the first step was to click on mn to dig down into the Profiler trace.

After clicking on the mn function (above), the next level down is shown below. The top line of the mn function diagnostic page shows that [ds] = dateCheck(ds) is taking up 80.6% of the run-time. Thus the function dateCheck, called by mn, is the culprit, and the next step is to click on dateCheck lower down in the diagnostic page (see the red arrow below) and dig further down.

The Profiler then takes us to the next level down, into the dateCheck function’s diagnostic page – the line of code in dateCheck that uses the most resources is at the top of the page (shown below). The child functions are listed below that section, and the main culprit is the Matlab function datenum (see the red arrow below).

So the issue is the Matlab datenum function, which is used in my dateCheck function.

Now we go to my dateCheck function in the Matlab source code file and find the line – currDateNum = datenum(tDate) – as shown below. That is the culprit, which is apparently causing a big drain on resources.

The next step is to search the forums for a solution – the question is, why does this function take an excessive amount of time? A quick search found the very useful solution shown below. The answer is that datenum works much more efficiently when the date format is specified in its argument list (instead of the function having to figure out the format itself).

With the answer in hand, the next step is to implement the solution – that is, specify the date format in the datenum argument, as shown below.
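The same principle can be demonstrated in Python: a parser that must guess the format does trial-and-error work on every call, while an explicit format goes straight to the parse (the candidate-format list below is a hypothetical stand-in for datenum’s format inference):

```python
import time
from datetime import datetime

dates = ["%02d-Jan-2021" % (d % 28 + 1) for d in range(5000)]

# Formats a "guessing" parser tries in order -- a stand-in for what
# a date parser must do when no format is supplied.
CANDIDATES = ["%Y-%m-%d", "%m/%d/%Y", "%d-%b-%Y"]

def parse_by_guessing(s):
    for fmt in CANDIDATES:
        try:
            return datetime.strptime(s, fmt)
        except ValueError:
            pass
    raise ValueError("unrecognized date: " + s)

t0 = time.perf_counter()
guessed = [parse_by_guessing(s) for s in dates]
t_guess = time.perf_counter() - t0

t0 = time.perf_counter()
# Explicit format: no trial-and-error on each call.
direct = [datetime.strptime(s, "%d-%b-%Y") for s in dates]
t_direct = time.perf_counter() - t0

print(f"guessing: {t_guess:.3f}s   explicit format: {t_direct:.3f}s")
```

Both produce identical results, but the explicit-format path skips the failed attempts and the exception overhead – the same effect as passing the format string to datenum.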

With the solution implemented, the final step is to re-run the software and see how much time was saved with this solution. As shown below, the run time was 571 seconds vs the original run time of 1,218 seconds!!

Now you understand that the Profiler is your best friend!! And keep in mind that it can not only save you a lot of time with your software runs but it can help you debug other issues as well.

Most software development environments and toolchains have profilers built in. The example below is from NetBeans running a Java project – in this case I selected specific methods to be profiled, and the percentage of run-time is displayed. Profilers such as Valgrind, used on Linux for C/C++ applications, are commonly employed to detect memory leaks.
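As one more concrete example, Python’s built-in cProfile / pstats pair gives the same kind of per-function time breakdown (the functions here are toy stand-ins for a real workload):

```python
import cProfile
import io
import pstats

def slow_part():
    # Deliberately heavy work -- stands in for a resource hog like dateCheck.
    return sum(i * i for i in range(200000))

def fast_part():
    return 42

def main():
    slow_part()
    fast_part()

profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Sort by cumulative time -- the same "which callee eats the run-time?"
# view that the Matlab Profiler summary provides.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(10)
report = buf.getvalue()
print(report)
```

The report immediately points at `slow_part` as the place to dig, just as the Matlab Profiler pointed at dateCheck.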

## Good Data Means Fast Neural Network Training Times

The Pyrenn Levenberg-Marquardt training algorithm for Feed-Forward Neural Networks is extremely fast – 0.140 seconds to train a Neural controller which must simultaneously balance an inverted pendulum, mounted on a cart, while moving the cart back to the origin – watch the short video below.

In the vast majority of the Neural Network applications that I’ve developed, the training time was a tiny fraction of my time spent on the project. The major time hit on these kinds of efforts is not training time – instead it is the time spent to develop quality and robust training and test data sets (thinking it through and careful analysis of the data – many times this is an iterative process). If this is done correctly, the resultant Neural Networks are built extremely quickly and yield high performance.

## Neural Network Performance Shaping Preview

This is just a quick preview of what will be coming on my first Patrons-only post on my Patreon account (sometime in the next 2 weeks) – https://www.patreon.com/realAI. A Neural Network was trained on a single pass of the behavior of a cart with inverted pendulum system being controlled by a conventional controller. The Performance Shaping technique was then implemented which allows the user to command the Neural controller to either quickly minimize the Pendulum angle error (and maintain the minimum error) or quickly minimize the Cart position error. This is a powerful technique that allows you to use a single data set while building in the ability to modulate the performance of the Neural controller in favor of the Pendulum or in favor of the Cart.

The video first shows the Neural controller being commanded to quickly minimize the Cart position error while keeping the inverted pendulum upright. Then the Neural controller is commanded to quickly minimize the Pendulum angle error – it does this and slowly walks the Cart back to the zero reference point (thus zeroing out the Cart position error). The horizontal red arrows are the Neural controller’s commanded forces acting on the Cart. A set of plots is shown at the end of the video.

## Keep It Simple

Neural Networks don’t always require complex frameworks and other mathematical algorithms to support them – it’s always best to start simple and only increase the complexity when absolutely needed.

A case in point is this Neural Network control system that was designed to control one specific RC helicopter airframe and yet … was able to fly several different types of RC helicopters with different airframes and different powerplants (gas, electric, and jet). In addition, the Neural Network control system could easily handle sling-loads and gusting / turbulent winds – two nonlinear disturbances that were never part of the training and test sets.

The flight software, with Neural Network functions:
1) was coded in C,
2) used procedural, not object-oriented, programming, and
3) ran on the DOS 6.22 operating system.

It was uncomplicated yet highly effective. The flight software executed the following functions:
1) Sensor and actuator checks were performed during the start-up mode and the flight software would refuse to execute the take-off maneuver if anything was off.
2) RS-232 messages were received and processed from the onboard RC data link via another IO processor board – these were the pilot’s basic commands such as “take-off”, “hover”, “ascend”, “forward-flight”, etc.
3) RS-232 messages were received and processed from an onboard 900 MHz data link. These were also the pilot commands plus various commands for autonomous flight. In addition, the flight software also performed a telemetry function by sending out flight and system data to the 900 MHz data link so that the operators on the ground could visually monitor the geographic location of the helicopter and the health statuses on the ground control station.
4) All sensor messages – direct RS-232 from the sensor and RS-232 messages from an IO processor board, were processed and the servo actuator positions were monitored.
5) It performed all of the flight control functions such as hover, transition to forward flight and forward flight, velocity-set, take-off, landing and also managed the execution of an autonomous flight plan (setting up the flight modes on its own). Thus it continuously commanded all of the servo actuators.
6) If the datalink was lost for a period of time, the flight software would execute the “Return Home” mode and fly back autonomously to its original takeoff point (including landing).
7) It recorded all pertinent flight and system data and continuously wrote it out to a binary file which could be reviewed later as a diagnostic tool if there were any observed anomalies.

And despite the simplicity, the Neural Network flight control performance was extremely powerful. The Neural Networks easily handled different airframes, different powerplants, gusting winds, etc.

The video (approximately 9 minutes) shows all of the different airframes performing various maneuvers – the same Neural Network control system stabilized and guided each of them.  There are four slides in the beginning and the rest of the video shows flight maneuvers.

This is not to say that you shouldn’t use modern tools and processes – but don’t overcomplicate the process. In the beginning it’s really important to keep things simple and only use what is needed to execute the objectives.

If you’d like to learn about building Neural Network applications, consider becoming a Patron on my Patreon site. I will be posting articles on a monthly basis with specific applications that will include source code, documentation, and video discussions.

## New Patreon Site for Learning to Apply AI

My new Patreon site is now up and running – https://www.patreon.com/realAI.

It can be very intimidating to see all of the requirements for Data Scientists and Machine Learning engineers (multiple languages, frameworks, etc.). Thus, the intent of this Patreon effort is to help you lose your fear of attempting to use Neural Networks for real-world applications and to get you up to speed on basic methods and techniques. These tutorials will teach you the important core fundamentals that you need in order to: 1) understand and code up the application, and 2) form a good understanding of the solution so you can tailor and build a high-performance Neural Network.

The coding language for each project will either be Matlab / Octave script or Java. Eventually Python may be added to the mix. No purchase of tools will be necessary – Octave and Java Integrated Development Environments (IDEs) and the Software Development Kits (SDKs) can be downloaded at no cost from the internet.

The first lesson will be published for subscribers sometime in mid February. I’m excited and passionate about this new path and will do my best to provide a superior and satisfying product for my subscribers. I want you to learn and become cutting-edge AI engineers.

There are two subscription tiers, as discussed below.

Tier 1 (\$5 per month):