R – What Is It?

I have blogged often on the subject of R, but have not previously addressed what R is and why you should use it. In this blog post, I will set out what R is, why you should use it and how you can learn it.

What Is R?

R is a powerful tool for statistical programming and graphics. Plenty of software can do all of these things: spreadsheet applications like Excel; point-and-click applications like SPSS; data mining applications like SSAS; and so on. But what sets R apart from applications like those listed?

R is a free and open source application. Because it is free you don’t have to worry about subscription fees, usage caps or licence managers. Just as importantly, R is open. You can inspect the source code and tinker with it as much as you want.

Leading academics and researchers use R to develop the latest methods in statistics, machine learning and predictive modelling. These methods are distributed as packages which anyone can access for free! There are thousands of packages available to download and use.

R is an interactive language. In R you do analysis by writing functions and scripts, not by pointing and clicking. As an interactive language (as opposed to a data-in-data-out black box), R promotes experimentation and exploration, which improves data analysis and sometimes leads to discoveries that would not have been made otherwise. Scripts document all your work, from data access to reporting, and can be re-run at any time. This makes it easier to update results when the data changes. Scripts also make it easy to automate a sequence of tasks that can be integrated into other processes, such as an ETL process.
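To give a flavour of what working in a script looks like, here is a minimal sketch. It assumes nothing beyond base R and uses mtcars, a small data set that ships with R; the choice of data set and of the questions asked is mine, purely for illustration.

```r
# Load a data set that ships with base R (no download needed).
data(mtcars)

# Explore interactively: summary statistics for fuel economy (miles per gallon).
print(summary(mtcars$mpg))

# Average fuel economy for each cylinder count.
mpg_by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
print(mpg_by_cyl)

# Fit a simple linear model: how does weight relate to fuel economy?
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))
```

Each line can be run on its own at the console while you explore, and the saved script can be re-run from top to bottom whenever the data changes.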

One of the design principles of R was that visualisation of data through charts and graphs is an essential part of data analysis. As a result, it has excellent tools for creating graphics, from staples like bar charts to brand new graphics of your own design.
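As a small illustration of one of those staples, the sketch below draws a bar chart with base R graphics. The data set (mtcars, bundled with R), the labels and the colour are my own choices for the example.

```r
# Count how many cars in the built-in mtcars data set have 4, 6 or 8 cylinders.
counts <- table(mtcars$cyl)

# Draw a bar chart of those counts using base R graphics.
barplot(
  counts,
  main = "Cars by number of cylinders",
  xlab = "Cylinders",
  ylab = "Number of cars",
  col  = "steelblue"
)
```

In an interactive session the chart appears in the plot window; when run as a non-interactive script, R writes it to a file instead.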

With R you are not restricted to a rigid set of routines and procedures. You can use code and packages contributed by others in the community, or extend R with your own functions and packages. R is also excellent for mash-ups with other applications. For example, you can build it into your SSIS routine, or take advantage of the new integration in SQL Server 2016.
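Extending R with your own function takes only a few lines. The sketch below defines a hypothetical helper, coef_var, which computes the coefficient of variation (standard deviation divided by mean); the function name and the columns it is applied to are assumptions made for this example.

```r
# A hypothetical helper: coefficient of variation (standard deviation / mean).
coef_var <- function(x, na.rm = TRUE) {
  stopifnot(is.numeric(x))
  sd(x, na.rm = na.rm) / mean(x, na.rm = na.rm)
}

# Apply the new function to several columns of the built-in mtcars data set.
cv_by_column <- sapply(mtcars[, c("mpg", "hp", "wt")], coef_var)
print(round(cv_by_column, 2))
```

Once defined, the function behaves like any built-in one, and a collection of such functions can later be bundled into a package of your own.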

We’ve covered, briefly, what R is. But why do you want to use it?

Why Use R?

There is a vibrant community built around R. With thousands of contributors and millions of users around the world, if you have a question about R, chances are someone has already answered it or can answer it.

R is quickly becoming an integral part of the Microsoft BI stack. Since Microsoft’s acquisition of Revolution Analytics, R has featured in the more recent releases across the Microsoft BI world, from Power BI to SQL Server and from Visual Studio to Azure ML.

Where Can You Learn R?

There are various books and online courses on R which can quickly bring you up to speed in this powerful language. These are some which I recommend:

EdX: Introduction to R for Data Science

Book: R Cookbook by Paul Teetor

Book: R Programming for Data Science by Roger Peng

Here at Adatis, we run internal R training courses, which means that all of our employees have the opportunity to learn from internal subject matter experts and improve their knowledge and skills.

You can also try out the R tutor from Revolution Analytics, which is a package built for R.