04 – Just a brief introduction to R
Hello crew, how is it going? “R” you ready for a new post?
Today I’ll introduce the software that I’m using for the data analysis in my project.
Have you ever heard of it? When I arrived at NTEC, I knew of it only because my
brother works with it. My brother is a mathematician (strange people, but I love
my brother anyway). Anyway, I already knew how to handle programming languages
such as MATLAB®, C++, Python and Microsoft® Visual Basic for Applications (VBA),
so it was not a problem to learn a new one. Furthermore, R is quite easy to
learn and its syntax is in many ways similar to Python's. Yes, it can be
confusing at times, because it is easy to write a command in another language
instead of the one you are using, but basically once you know how to handle one
of them, you know (almost) all of them. You just change the syntax of the
command or the keyword that calls the specific function you need. Learning by
practice is the best (and probably the only) way in these cases.
R is a powerful, free and open-source yet very lightweight piece of software,
generally used for statistical computing. It offers a wide range of facilities
for data manipulation, calculation and graphical display. Being free and open
source is its major strength, as new libraries and functions are added all the
time by users from all over the world, covering almost any kind of issue or
topic.
I chose R over other software for the data analysis, apart from it being free,
because of its versatility. R is, in fact, able to handle different types of
data without difficulty, quickly and using just a few lines of code. The latter
is another major advantage of the software, made possible by the large variety
of libraries and functions that it offers.
At the same time, using R, I can handle almost any kind of file, treating it as
a matrix or a data frame (a sort of database table). Microsoft® Excel files as
well as “.csv” and “.txt” files, among many other file formats, can easily be
read and written. SQL databases can also be accessed and queried. Furthermore,
georeferenced data can be handled and quickly mapped too. It is perfect for the
aim of my project.
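To give an idea of what reading and writing a file looks like, here is a minimal sketch in base R. The data, the column names (`site`, `depth`) and the temporary file are all made up for illustration; they are not from my project. Excel files and SQL databases normally need an add-on package, so this sketch sticks to a plain “.csv” file.

```r
# Hypothetical example data: a few made-up measurements.
samples <- data.frame(
  site  = c("A", "B", "C"),
  depth = c(1.5, 2.0, 3.2)
)

# Write the data frame to a temporary ".csv" file, then read it back.
csv_path <- tempfile(fileext = ".csv")
write.csv(samples, csv_path, row.names = FALSE)
loaded <- read.csv(csv_path)

# Inspect the structure of the data frame that was read.
str(loaded)

# Numeric columns can be turned into a matrix or summarised directly.
depth_matrix <- as.matrix(loaded["depth"])
mean_depth <- mean(loaded$depth)
```

The same pattern (read, inspect, compute) should carry over to other file formats; only the reading function changes.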
Essentially, R comes with a set of pre-installed basic functionalities. Once R
has been installed, it can readily be used for data analysis and graphs.
However, as mentioned before, the software can be expanded just by downloading
packages that add specific capabilities (functions) to R. A package is
basically a set of functions (scripts or parts of them) that help to perform
specific tasks. In this sense, a function can be seen as a procedure that can
be used (called) whenever it is needed and that always performs a fixed
step-by-step sequence of activities, like a routine. Anybody can create their
own package, ready for other people to use. Contributed packages are collected
in a central repository known as CRAN (the Comprehensive R Archive Network).
CRAN is central to using R because from it every user can download packages,
and contributors can submit their own, in order to customize their own R. The
idea of needing to add packages to the software might seem odd, but it lets
users download just the packages they really need, and it allows the software
to remain quick and light as well as powerful. For this reason IT people
usually call software like R “modular software”. Being modular gives R enormous
flexibility: every time a new issue needs to be solved or a new statistical
technique is developed, contributors can react quickly by producing a new R
package and uploading it to CRAN.
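As a rough illustration of the ideas of “function” and “package”, here is a small sketch. The function `rescale` is a hypothetical example written just for this post, and `somepackage` is only a placeholder name, which is why those two lines are left commented out.

```r
# A hypothetical function: a fixed sequence of steps that can be
# called whenever it is needed.
rescale <- function(x) {
  # Shift and scale the values so that they span the range 0 to 1.
  (x - min(x)) / (max(x) - min(x))
}

rescale(c(10, 15, 20))  # returns 0.0 0.5 1.0

# A package is installed once from CRAN and then loaded in each session.
# "somepackage" is a placeholder, so these lines are left commented out.
# install.packages("somepackage")
# library(somepackage)

# Packages that ship with R, such as "stats", can be loaded straight away.
library(stats)
median(c(3, 1, 2))  # returns 2
```

Once a package is loaded with `library()`, its functions are called exactly like the hand-written one above.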
The main disadvantage of the software is that in R you work mainly from the
command line, with no point-and-click GUI (Graphical User Interface) in the
base software to guide users. Moreover, the debugging tools are not as advanced
as those in MATLAB®, in Spyder (Python) or in other similar software, and for
this reason it can be difficult to find out where and why an error occurs.
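That said, base R does offer a few tools for tracking errors down. The sketch below uses `tryCatch()` to intercept an error and print its message; the function `half` is a made-up example, not something from my project. After an uncaught error in an interactive session, `traceback()` shows the chain of calls that led to it.

```r
# A hypothetical function that fails on non-numeric input.
half <- function(x) {
  if (!is.numeric(x)) stop("x must be numeric")
  x / 2
}

# tryCatch() intercepts the error so that its message can be
# inspected instead of halting the whole script.
result <- tryCatch(
  half("ten"),
  error = function(e) {
    message("Error caught: ", conditionMessage(e))
    NA  # value returned in place of the failed call
  }
)
```

Here `result` ends up as `NA` and the script keeps running, which at least tells you which call failed and why.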
It is possible to install R onto your PC just by visiting the website of the R
project (https://www.r-project.org/) and following these steps:
1) On the left-hand side of the website, click on the link labelled CRAN;
2) Choose the mirror closest to your country by clicking on it;
3) Once you have chosen the mirror you prefer, you will be asked to choose the platform of your PC (Windows in most cases, or alternatively Linux or macOS);
4) Then, click on the “base” link if you are downloading R for the first time;
5) Finally, download the installer for the latest version of R and just follow its instructions (on Windows it is an .exe file).
What else? Just practice!
R is easy to learn and very quick at data analysis, so give it a try and let me
know what you think about it!
You can find nice tutorials about R on the internet just by googling keywords
like: “R”, “R data analysis”, “R tutorial”, etc.
Or you can visit some of these websites, which I found very useful:
Videos on YouTube® and many other websites about programming in R are also
available on the Internet. Just search for them if you are interested and want
to try this software.
That’s all folks for this post! While you wait for the next one, enjoy the
Easter break!
In the next post (it will probably be released immediately after Easter) I will
show you how I intend to process the data we collected. So, one more time: stay
tuned!
Cheers,
FP13