I have blogged often on the subject of R, but have not previously addressed what R is and why you should use it. In this blog post, I will set out what R is, why you should use it and how you can learn it.
What Is R?
R is a powerful tool for statistical programming and graphics. There is plenty of software available which can do all of these things: spreadsheet applications like Excel; point-and-click applications like SPSS; data mining applications like SSAS; and so on. But what sets R apart from applications like those listed?
R is a free and open source application. Because it is free you don’t have to worry about subscription fees, usage caps or licence managers. Just as importantly, R is open. You can inspect the source code and tinker with it as much as you want.
Leading academics and researchers use R to develop the latest methods in statistics, machine learning and predictive modelling. These methods are distributed as packages, which anyone can access for free. There are thousands of packages available to download and use.
R is an interactive language. In R you do analysis by writing functions and scripts, not by pointing and clicking. As an interactive language (as opposed to a data-in-data-out black box), R promotes experimentation and exploration, which improves data analysis and sometimes leads to discoveries that would not have been made otherwise. Scripts document all your work, from data access to reporting, and can be re-run at any time. This makes it easier to update results when the data changes. Scripts also make it easy to automate a sequence of tasks that can be integrated into other processes, such as an ETL.
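As a rough illustration of what working in a script looks like, here is a minimal sketch using the mtcars dataset that ships with R. The function name mean_by_group is my own, chosen for this example. Because the analysis lives in a function, it can be re-run unchanged whenever the data is refreshed.

```r
# Summarise the mean of one column within each level of another.
# `mean_by_group` is a hypothetical helper written for this example.
mean_by_group <- function(data, value, group) {
  # aggregate() applies FUN to the values within each group
  out <- aggregate(data[[value]], by = list(data[[group]]), FUN = mean)
  names(out) <- c(group, paste0("mean_", value))
  out
}

# mtcars is a built-in dataset: mean miles per gallon by cylinder count
result <- mean_by_group(mtcars, "mpg", "cyl")
print(result)
```

You can run the same lines one at a time at the console to explore, then save them as a script once you are happy with the result.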
One of the design principles of R was that visualisation of data through charts and graphs is an essential part of data analysis. As a result, it has excellent tools for creating graphics, from staples like bar charts to brand new graphics of your own design.
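To give a flavour of the staple end of that range, the sketch below draws a bar chart with base R graphics alone, again using the built-in mtcars data. The title and axis labels are my own choices for the example.

```r
# Count how many cars in the built-in mtcars data have 4, 6 and 8 cylinders
counts <- table(mtcars$cyl)

# barplot() draws the chart and returns the x position of each bar
mids <- barplot(
  counts,
  main = "Cars by number of cylinders",
  xlab = "Cylinders",
  ylab = "Number of cars"
)
```

When run non-interactively (for example with Rscript), R will typically write the chart to a file rather than open a window.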
With R you are not restricted to a rigid set of routines and procedures. You can use code and packages contributed by others in the community, or extend R with your own functions and packages. R is also excellent for mash-ups with other applications. For example, you can build it into your SSIS routine, or take advantage of the new integration in SQL Server 2016.
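Extending R with your own function can be as small as the sketch below. The coefficient of variation is just a convenient example of a statistic to define yourself, and the name coef_var is hypothetical. Once defined, the function works with the rest of R like any built-in.

```r
# Coefficient of variation: standard deviation relative to the mean.
# `coef_var` is a name chosen for this example.
coef_var <- function(x, na.rm = TRUE) {
  sd(x, na.rm = na.rm) / mean(x, na.rm = na.rm)
}

# Apply the new function to several columns of the built-in mtcars data
cv <- sapply(mtcars[, c("mpg", "hp", "wt")], coef_var)
print(round(cv, 3))
```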
We’ve covered, briefly, what R is. But why do you want to use it?
Why Use R?
There is a vibrant community built around R. With thousands of contributors and millions of users around the world, if you have a question about R, chances are someone has already answered it, or can answer it.
R is also quickly becoming an integral part of the Microsoft BI stack. Since Microsoft’s acquisition of Revolution Analytics, R has featured in the more recent releases across the Microsoft BI world, from Power BI and SQL Server to Visual Studio and Azure ML.
Where Can You Learn R?
There are various books and online courses on R which can be used to quickly skill you up in this powerful language. These are some which I recommend:
EdX: Introduction to R for Data Science
Book: R Cookbook by Paul Teetor
Book: R Programming for Data Science by Roger Peng
Here, at Adatis, we run internal R training courses which mean that all of our employees have the opportunity to learn from internal subject matter experts and improve their knowledge and skills.
You can also try out the R tutor from Revolution Analytics, which is a package built for R.