09 – A short introduction to Artificial Neural Networks
Hey crew!
Are you ready for a new post?!
Here we go then!
Today I’m going to introduce you to a more technical topic.
In fact, we are going to talk about machine learning and Artificial Neural Networks
(ANNs) in particular.
Machine learning is a subfield of computer science which aims
to give computers the ability to do something without being explicitly
programmed to do it. It grew out of the study of pattern
recognition and computational learning theory in artificial intelligence.
It explores the study and construction of algorithms that can learn from
experience (historical data): such an algorithm builds a model from
example inputs in order to make data-driven predictions or decisions.
ANNs are just one branch of Machine Learning. They are data
processing paradigms inspired by the way the biological nervous system processes
information in human beings (Biological Neural Networks, BNNs). Usually they
are used to estimate or approximate functions that depend on a large number
of inputs and whose form is generally unknown.
Actually, there is no single formal definition of what an
artificial neural network is. However, a class of statistical models may
commonly be called "neural" if it contains sets of adaptive weights,
that is, numerical parameters tuned by a learning algorithm, and it is capable
of approximating the output of the analysis from a set of inputs (based on
previous training and accumulated experience).
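To make "adaptive weights tuned by an algorithm" a bit more concrete, here is a
minimal Python sketch. It is only an illustration under stated assumptions: the
toy data, the single weight and bias, the learning rate and the plain gradient
descent update are all made up for the example and do not come from the project
or from any particular library.

```python
# Toy data, invented for illustration: points generated from y = 2x + 1.
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
targets = [2.0 * x + 1.0 for x in inputs]


def train(inputs, targets, learning_rate=0.05, epochs=2000):
    """Tune one weight and one bias by gradient descent on the mean squared error."""
    weight, bias = 0.0, 0.0  # the "adaptive weights" start with no knowledge
    n = len(inputs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, t in zip(inputs, targets):
            error = (weight * x + bias) - t  # prediction minus target
            grad_w += 2.0 * error * x / n
            grad_b += 2.0 * error / n
        # Move each parameter a small step against its gradient.
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias


weight, bias = train(inputs, targets)
prediction = weight * 5.0 + bias  # x = 5 was never shown during training
print(weight, bias, prediction)
```

After training, the weight and bias should end up very close to 2 and 1, so the
model can estimate the output for an input it has never seen. A real network
has many such weights, but the idea of tuning them from examples is the same.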
So crew, does it mean that we can solve any problem we’d
like to solve in statistics just by creating a neural network? Well, theoretically
yes, but it depends on the kind of problem you’d like to solve and on the
quality and amount of information that you are able to collect (and
analyse). Moreover, you probably noticed that I talked about “approximating”,
but you know, sometimes in Engineering, as in other disciplines, a good
estimate is more than enough for taking decisions while reducing risks to an
acceptable minimum.
ANNs work as pure black boxes. This is their
main disadvantage. They give results (and usually good results) but without
explaining why. Or, more accurately, I should probably say: it is difficult to
understand why. Their ability to learn by example makes them very flexible and
powerful, but they need lots of training (existing data, information and
examples). Therefore they also require high computational power.
Fig. 1 – An example of an Artificial Neural Network (en.wikipedia.org)
Is it worth having correct results without understanding why we
obtain them? Well, it depends. In Science it is very important to be
able to answer the question “why?”. However, in statistics and computer
science in general, sometimes the number of variables in the problem is too
large and their relationships are too complicated. In these cases, it is
necessary to make a good prediction, and therefore to obtain results, without
having a deep understanding of how they were obtained.
However, both in Engineering and Science, a solid
theoretical preparation is fundamental. For this reason, in order to have a
better understanding of what a neural network is, how it works and what
its capabilities, strengths and weaknesses are, let’s start from the beginning
with some examples:
Have you ever answered the question “What’s 2+2?”? Easy,
right? But did you do any calculation when you gave the answer? No,
right? We all just know that 2+2 is 4. We could do the calculation quite easily,
but we simply do not. And really it doesn’t matter whether you calculate it
or you guess: the answer is correct.
What about when you want to go to the kitchen to eat a
sandwich and the door is closed? Subconsciously, you just know (from
experience) that you have to push down the handle of the door and then pull the
door to open it. Right? You just do it without thinking every time “What
should I do now?”.
And what about telling the handwriting of your
good friend “Jack” apart from your own or your brother’s? If the
three of us write a six on paper, we all recognise that it is a six even if
each of us writes it in a slightly different way, right?
Fig. 2 – Sixes written by hand in 5 different ways.
You can easily recognise that they are all sixes, right?
That’s the way biological neural networks work in real life
in humans (as well as in animals). Based on previous experience, we react to
certain inputs very quickly and precisely without performing any difficult
calculation or time-consuming analysis.
Much is still unknown about how the brain trains
itself to process information, but crew, you should know this: in human brains,
neurons collect signals from others through a host of fine structures called
dendrites. Each neuron sends out spikes of electrical activity through a long,
thin strand known as an axon, which splits into thousands of branches. At the
end of each branch, a structure called a synapse converts the activity from the
axon into electrical effects that inhibit or excite activity in the connected
neurons. Together, these connected neurons form the neural network itself.
Fig. 3 – Basic illustration of a biological neuron (Brown & Benchmark, 1995)
When a neuron receives excitatory input that is sufficiently
large compared with its inhibitory input, it sends a spike of electrical
activity down its axon. Learning occurs by changing the effectiveness of the
synapses so that the influence of one neuron on another changes.
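The firing behaviour just described can be sketched in a few lines of Python.
This is a deliberately crude picture, not a model of a real neuron: the function
name, the numeric synapse strengths (positive for excitatory, negative for
inhibitory) and the threshold value are all assumptions chosen for illustration.

```python
def neuron_fires(signals, synapse_strengths, threshold=1.0):
    """Return True when the weighted sum of incoming signals reaches the threshold.

    Positive strengths stand for excitatory synapses, negative ones for
    inhibitory synapses (a simplifying assumption for this sketch).
    """
    net_input = sum(s * w for s, w in zip(signals, synapse_strengths))
    return net_input >= threshold


signals = [1.0, 1.0, 1.0]  # three connected neurons, all active

# Before learning: excitation (0.8 + 0.6) barely outweighs inhibition (0.7).
fires_before = neuron_fires(signals, [0.8, 0.6, -0.7])

# "Learning" strengthens the second synapse, so the same signals now win.
fires_after = neuron_fires(signals, [0.8, 1.2, -0.7])

print(fires_before, fires_after)
```

With these made-up numbers the neuron stays silent before the change and fires
after it, although the incoming signals are identical. Only the effectiveness
of one synapse has changed.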
Similarly to what happens in nature, an artificial neuron is
a device with many inputs and one output. The neuron has two operation modes:
the training mode and the using mode. In the training mode, the neuron learns
to fire (or not) for particular input patterns. In the using mode, when a
taught input pattern is detected at the input, its associated output
becomes the current output. If the input pattern does not belong to the taught
list of input patterns, the firing rule is used to determine whether to fire or
not. Firing rules can be established by applying, for example, filters or
thresholds to data values, or by using more complex techniques.
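Here is a small Python sketch of the two modes. The class and method names are
hypothetical, and the firing rule used for unknown patterns (side with the
nearest taught pattern, measured by the number of differing inputs, the Hamming
distance) is just one simple possibility assumed for this example. Other rules
could be used in its place.

```python
def hamming(a, b):
    """Number of positions where two equal-length patterns differ."""
    return sum(x != y for x, y in zip(a, b))


class BinaryNeuron:
    """Toy neuron with a training mode and a using mode (hypothetical names)."""

    def __init__(self):
        self.taught = {}  # maps an input pattern to True (fire) or False

    def train(self, pattern, should_fire):
        """Training mode: remember whether to fire for this pattern."""
        self.taught[tuple(pattern)] = should_fire

    def use(self, pattern):
        """Using mode: return True, False, or None when the rule cannot decide."""
        pattern = tuple(pattern)
        if pattern in self.taught:
            return self.taught[pattern]  # taught pattern: replay its output
        # Firing rule (assumed): compare the distance to the nearest taught
        # "fire" pattern with the distance to the nearest "do not fire" one.
        to_fire = min((hamming(pattern, p) for p, f in self.taught.items() if f),
                      default=float("inf"))
        to_rest = min((hamming(pattern, p) for p, f in self.taught.items() if not f),
                      default=float("inf"))
        if to_fire == to_rest:
            return None  # tie: the output stays undefined
        return to_fire < to_rest


neuron = BinaryNeuron()
neuron.train((1, 1, 1), True)
neuron.train((1, 0, 1), True)
neuron.train((0, 0, 0), False)
neuron.train((0, 0, 1), False)

known = neuron.use((1, 0, 1))      # taught pattern
near_fire = neuron.use((1, 1, 0))  # unknown, closest to (1, 1, 1)
near_rest = neuron.use((0, 1, 0))  # unknown, closest to (0, 0, 0)
undecided = neuron.use((0, 1, 1))  # equally close to both groups
print(known, near_fire, near_rest, undecided)
```

The interesting part is the third branch: the neuron gives a sensible answer
for patterns it was never taught, which is what lets it generalise from a
handful of examples.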
In my project I’m working on assessing the impact of road
pavement conditions on truck fleet fuel consumption at network level under real
driving conditions. The quantity of available data may seem huge, but the
number of variables acting on truck fleet fuel economy is even bigger, and the
resolution of some of the data threatens to ruin any attempt made using
traditional approaches.
In fact, computers traditionally follow a list of
instructions in order to solve a problem (algorithmic approach). With this
approach, unless the specific steps that the computer needs to follow are known,
the computer cannot solve the problem. ANNs are extremely useful in this
situation. In fact, they give results just by comparing previous evidence with
the new data available. An ANN builds up its own experience and, as human beings
do, given an input it produces an estimate of the output based on that experience.
Nowadays, the range of applications of ANNs is very wide.
For example they are used in spam filtering, computer vision, optical character
recognition (by pixel comparison), search engines (Google for example), or in
economics, predicting what happens in the stock market (based on historical
data), amongst others.
In my project, first I’m going to assess the impact of road
conditions on truck fleet fuel consumption based on the Big Data approach, then
using this technique, a model predicting the value of fuel consumption based on
the actual road conditions will be generated and validated.
At the moment a multivariable linear model has been
generated based on available data and suggestions taken from the literature (have
a look at the main references provided here). From this first analysis of the
data, it is possible to say that only a weak relationship between the variables
exists. However, although the analysis was performed on different data, the
results match what previous authors have found using different approaches
(Zaabar and Chatti, 2010). This gives more confidence in the “Big Data”
approach undertaken. The fact that the model gives a low correlation between the
predicted values and the collected data may be due to the relationship between
the variables not being linear, or to other variables (which are not considered
in this preliminary study, such as the payload and the aerodynamic force)
having a significant influence on the output of the model. Given that previous
studies usually assumed a linear relationship between the variables, and
assessed the impact of variables like the payload and the aerodynamic
resistance to be higher than that of road conditions (which are just a part of
the rolling resistance) (Sandberg, 1990; Beuving et al., 2004; Zaabar and Chatti,
2010), the second reason is probably the main issue with the generated model.
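The second explanation (a missing variable hiding behind a weak fit) can be
illustrated with a small Python sketch. Everything in it is an assumption made
for the illustration: the data are randomly generated and are not project data,
the coefficients are invented, and the variable names are hypothetical. It only
shows the general statistical effect, not the actual model or its results.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic records: fuel use depends a little on road roughness and a lot
# on payload (invented coefficients, arbitrary units).
roughness = [rng.uniform(1.0, 5.0) for _ in range(500)]
payload = [rng.uniform(5.0, 25.0) for _ in range(500)]
fuel = [30.0 + 2.0 * r + 2.0 * p + rng.gauss(0.0, 0.5)
        for r, p in zip(roughness, payload)]

# Linear model that only knows about roughness.
slope, intercept = fit_line(roughness, fuel)
predicted = [intercept + slope * r for r in roughness]
r_roughness_only = pearson(predicted, fuel)

# What the model failed to explain lines up with the ignored variable.
residuals = [f - p for f, p in zip(fuel, predicted)]
r_residual_payload = pearson(residuals, payload)

print(r_roughness_only, r_residual_payload)
```

In this toy setting the roughness-only model correlates weakly with the
observed fuel values, while its residuals correlate strongly with the payload
that was left out. That is the kind of pattern that would suggest adding the
missing variables to the model.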
I’m sorry crew, but I cannot give you more information about
it at the moment. Apparently, though, results from this preliminary study may be
published soon! ;)
So, what’s the plan for the future of the project? Well:
Last week I started a three-month secondment at Microlise (one
of the industry partners involved in the project). It is a very good opportunity
for me and I really want to learn and get the most from it. At Microlise I continue
developing the project in a different environment from academia. Microlise collects
the data about truck performance used in the project, and they are experts in
data mining and Big Data analysis. Usually they use these data to help
truck fleet managers in the decision-making process about driver training
requirements and vehicle maintenance. There, a model estimating the payload
of trucks in motion based on the current performance of the vehicle will be
developed and validated by collecting some more data. At the same time, more
frequent and more precise measurements of fuel consumption will be collected
and analysed as well. Using this new information, the multilinear model could
be updated to include the variables that were missing in its first version.
Finally, the ANN technique will be implemented in the code in order to
generate more precise estimates.
It sounds like a good plan, right?
More information will follow soon crew!
So: just stay tuned!
Cheers,
FP13
References:
Beuving, E., De Jonghe, T., Goos, D., Lindhal, T., and Stawiarski, A., 2004.
Environmental Impacts and Fuel Efficiency of Road Pavements. Industry report.
Eurobitume & EAPA, Brussels.
Brown & Benchmark, 1995. Introductory Psychology – Electronic Image Bank. Times
Mirror Higher Education Group, Inc.
Sandberg, Ulf S. I., 1990. Road Macro- and Megatexture Influence on Fuel
Consumption. ASTM STP 1031, pp. 460-479.
Stergiou, C. & Siganos, D., 2011. Neural Networks. At:
https://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html
Wikipedia, 2016. Artificial neural network, Wikipedia, The Free Encyclopedia, at:
https://en.wikipedia.org/wiki/Artificial_neural_network
Zaabar, I. & Chatti, K., 2010. Calibration of HDM-4 models for estimating the
effect of pavement roughness on fuel consumption for U.S. conditions.
Transportation Research Record, (2155), pp. 105–116.