09 – A short introduction to Artificial Neural Networks
Are you ready for a new post?!
Here we go then!
Today I’m going to introduce you to a more technical topic. In fact, we are going to talk about machine learning and Artificial Neural Networks (ANNs) in particular.
Machine learning is a subfield of computer science which aims to give computers the ability to do something without being explicitly programmed to do it. It originally comes from the study of pattern recognition and computational learning theory in artificial intelligence. It explores the study and construction of algorithms that can learn from experience (historical data): such an algorithm operates by building a model from example inputs in order to make data-driven predictions or decisions.
ANNs are just one branch of machine learning. They are data processing paradigms inspired by the way the biological nervous system processes information in human beings (Biological Neural Networks, BNNs). Usually they are used to estimate or approximate functions that depend on a large number of inputs and are generally unknown.
Actually, there is no single formal definition of what an artificial neural network is. However, a class of statistical models may commonly be called "neural" if it contains sets of adaptive weights (numerical parameters tuned by a learning algorithm) and is capable of approximating the output from a set of inputs (based on previous training and accumulated experience).
So crew, does it mean that we can solve any statistics problem we like just by creating a neural network? Well, theoretically yes, but it depends on the kind of problem you’d like to solve and on the quality and amount of information that you are able to collect (and analyse). Moreover, you probably noticed that I talked about “approximating”, but you know, sometimes in Engineering, as in other disciplines, a good estimate is more than enough for taking decisions and reducing risks to an acceptable minimum.
ANNs work as pure black boxes. This is their main disadvantage. They give results (and usually good results) but without explaining why. Or, more accurately, I should probably say: it is difficult to understand why. Their ability to learn by example makes them very flexible and powerful, but they need lots of training (existing data, information and examples). Therefore they also require high computational power.
Fig. 1 – An example of Artificial Neural Network (en.wikipedia.org)
Is it worth having the correct results without understanding why we obtain them? Well, it depends. In Science it is very important to be able to answer the question “why?”. However, in statistics and computer science in general, sometimes the number of variables to control in the problem is too large and their relationships are too complicated. In these cases, it is necessary to make a good prediction, and therefore to obtain results, without having a deep understanding of how they were obtained.
However, both in Engineering and Science, a solid theoretical preparation is fundamental. For this reason, in order to have a better understanding of what a neural network is, how it works and what its capabilities, strengths and weaknesses are, let’s start from the beginning with some examples:
Have you ever answered the question “What’s 2+2?”? Easy, right? But did you actually do any calculation when you gave the answer? No, right? We all just know that 2+2 is 4. We could do the calculation quite easily, but we simply do not. And really, it doesn’t matter whether you calculate it or you guess: the answer is correct.
What about when you want to go to the kitchen to eat a sandwich and the door is closed? Subconsciously, you just know (from experience) that you have to push down the door handle and then pull the door to open it. Right? You just do it without thinking every time “What should I do now?”.
And what about telling the handwriting of your good friend “Jack” apart from your own or from your brother’s? If the three of us write a six on a piece of paper, we all recognise that it is a six even if each of us writes it in a slightly different way, right?
Fig. 2 – Sixes written by hand in 5 different ways.
You can easily recognise they are all sixes, right?
That’s the way biological neural networks work in real life in humans (as well as in animals). Based on previous experience, we react to certain inputs very quickly and precisely without performing any difficult calculation or time-consuming analysis.
Much is still unknown about how the brain trains itself to process information, but crew, you should know this: in human brains, a neuron collects signals from others through a host of fine structures called dendrites. Each neuron sends out spikes of electrical activity through a long, thin strand known as an axon, which splits into thousands of branches. At the end of each branch, a structure called a synapse converts the activity from the axon into electrical effects that inhibit or excite activity in the connected neurons. This web of connected neurons is the neural network itself.
Fig. 3 – Basic illustration of a biological neuron (Brown & Benchmark, 1995)
When a neuron receives excitatory input that is sufficiently large compared with its inhibitory input, it sends a spike of electrical activity down its axon. Learning occurs by changing the effectiveness of the synapses so that the influence of one neuron on another changes.
Similarly to what happens in nature, an artificial neuron is a device with many inputs and one output. The neuron has two operation modes: the training mode and the using mode. In the training mode, the neuron learns to fire (or not) for particular input patterns. In the using mode, when a taught input pattern is detected at the input, its associated output becomes the current output. If the input pattern does not belong to the taught list of input patterns, the firing rule is used to determine whether to fire or not. Firing rules can be established by applying, for example, filters or thresholds to data values, or by using more complex techniques.
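To make the two operation modes a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the class name `ArtificialNeuron`, its `train` and `use` methods, and the choice of a simple count threshold as the firing rule are all hypothetical and were picked just to keep the example short.

```python
class ArtificialNeuron:
    """Toy neuron with binary inputs and one binary output (illustrative only)."""

    def __init__(self, threshold):
        # Assumed firing rule: fire when at least `threshold` inputs are active.
        self.threshold = threshold
        # Taught list: maps an input pattern to the output it should produce.
        self.taught = {}

    def train(self, pattern, fires):
        """Training mode: remember whether to fire for this input pattern."""
        self.taught[tuple(pattern)] = bool(fires)

    def use(self, pattern):
        """Using mode: return True if the neuron fires for this pattern."""
        key = tuple(pattern)
        if key in self.taught:
            # Taught pattern: its associated output becomes the current output.
            return self.taught[key]
        # Unknown pattern: fall back to the (assumed) threshold firing rule.
        return sum(key) >= self.threshold


neuron = ArtificialNeuron(threshold=2)
neuron.train([1, 1, 1], fires=True)
neuron.train([0, 0, 0], fires=False)

print(neuron.use([1, 1, 1]))  # taught pattern -> True
print(neuron.use([1, 1, 0]))  # unseen pattern, two active inputs -> True
print(neuron.use([0, 0, 1]))  # unseen pattern, one active input -> False
```

A real network would tune numerical weights instead of storing patterns in a dictionary, but the split between "taught patterns" and "firing rule for everything else" is the same idea.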
In my project I’m working on assessing the impact of road pavement conditions on truck fleet fuel consumption at network level under real driving conditions. The quantity of available data may seem huge, but the number of variables acting on truck fleet fuel economy is even bigger, and the resolution of some of the data threatens to ruin any attempt made using traditional approaches.
In fact, computers traditionally follow a list of instructions in order to solve a problem (algorithmic approach). With this approach, unless the specific steps that the computer needs to follow are known, the computer cannot solve the problem. ANNs are extremely useful in such cases, because they give results just by comparing previous evidence with the new data available. An ANN builds its own experience and, as human beings do, given an input it produces an estimate of the output based on that experience.
Nowadays, the range of applications of ANNs is very wide. For example, they are used in spam filtering, computer vision, optical character recognition (by pixel comparison), search engines (Google, for example) and economics (predicting what happens in the stock market based on historical data), amongst others.
In my project, I’m first going to assess the impact of road conditions on truck fleet fuel consumption based on the Big Data approach; then, using this technique (ANNs), a model predicting fuel consumption from the actual road conditions will be generated and validated.
At the moment, a multivariable linear model has been generated based on the available data and suggestions taken from the literature (have a look at the main references provided here). This first analysis showed that only a weak relationship between the variables exists. However, although the analysis was performed on different data, the results match what previous authors found using different approaches (Zaabar and Chatti, 2010). This gives more confidence in the undertaken “Big Data” approach. The low correlation between the predicted values and the collected data may be due either to the relationship between the variables not being linear, or to other variables (not considered in this preliminary study, for example the payload and the aerodynamic force) having a significant influence on the output of the model. Given that previous studies usually assumed a linear relationship between the variables, and found the impact of variables like the payload and the aerodynamic resistance to be higher than that of road conditions (which are just a part of the rolling resistance) (Sandberg, 1990; Beuving, 2004; Zaabar and Chatti, 2010), the second reason is probably the main issue with the generated model.
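For the curious, this is roughly what fitting a multivariable linear model and checking how well it matches the data can look like. It is only a sketch and not the code used in the project: the function name `fit_linear_model` is hypothetical, the three predictor columns are unnamed placeholders, and the numbers are randomly generated synthetic values with no relation to the real truck data. A lot of noise is added on purpose, to mimic what happens when important variables are left out of the model.

```python
import numpy as np


def fit_linear_model(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk.

    Returns the coefficients (intercept first) and the R^2 of the fit.
    """
    A = np.column_stack([np.ones(len(y)), X])      # add the intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # ordinary least squares
    predicted = A @ coef
    ss_res = np.sum((y - predicted) ** 2)          # unexplained variation
    ss_tot = np.sum((y - np.mean(y)) ** 2)         # total variation
    return coef, 1.0 - ss_res / ss_tot


# Synthetic, purely illustrative data (NOT measurements from the project).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.uniform(0.5, 4.0, n),    # placeholder predictor 1
    rng.uniform(0.3, 2.0, n),    # placeholder predictor 2
    rng.uniform(60.0, 90.0, n),  # placeholder predictor 3
])
# Output = weak linear effect of the predictors + large unexplained noise.
y = 25.0 + 0.4 * X[:, 0] + 0.3 * X[:, 1] + 0.05 * X[:, 2] + rng.normal(0.0, 3.0, n)

coef, r2 = fit_linear_model(X, y)
print("coefficients:", coef)
print("R^2:", r2)
```

With this set-up the R^2 comes out low even though the underlying relationship really is linear, which is the situation described by the second explanation above: the model is missing whatever drives most of the variation.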
I’m sorry crew, but I cannot give you more information about it at the moment. Apparently, though, the results of this preliminary study may be published soon! ;)
So, what’s the plan for the future of the project? Well:
Last week I started a 3-month secondment at Microlise (one of the industry partners involved in the project). It is a very good opportunity for me and I really want to learn and get the most out of it. At Microlise I will continue developing the project in an environment different from academia. Microlise collects the truck performance data used in the project, and they are experts in data mining and Big Data analysis. Usually they use these data to help truck fleet managers make decisions about driver training requirements and vehicle maintenance. There, a model estimating the payload of trucks in motion from the current performance of the vehicle will be developed and validated by collecting some more data. At the same time, more frequent and more precise measurements of fuel consumption will be collected and analysed as well. Using this new information, the multilinear model can be updated to include the variables that were missing from its first version. Finally, the ANN technique will be implemented in the code in order to generate more precise estimates.
It sounds like a good plan, right?
More information will follow soon crew!
So: just stay tuned!
Beuving, E., De Jonghe, T., Goos, D., Lindhal, T., and Stawiarski, A., 2004. Environmental Impacts and Fuel Efficiency of Road Pavements. Industry report. Eurobitume & EAPA, Brussels.
Brown & Benchmark, 1995. Introductory Psychology – Electronic Image Bank. Times Mirror Higher Education Group, Inc.
Sandberg, Ulf S. I. 1990. Road Macro- and Megatexture Influence on Fuel Consumption. ASTM STP 1031 pp. 460-479.
Stergiou C. & Siganos D. 2011. Neural Networks. At: https://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/
Wikipedia, 2016 – Artificial neural network, Wikipedia, The Free Encyclopedia, at: https://en.wikipedia.org/wiki/Artificial_neural_network
Zaabar, I. & Chatti, K., 2010. Calibration of HDM-4 models for estimating the effect of pavement roughness on fuel consumption for U.S. conditions. Transportation Research Record, (2155), pp. 105–116.