Spark is all the rage at the moment (and has been for a while) in the Big Data and Analytics communities, seeing application for all aspects of working with data, from Streaming to Data Science. It offers a very performant, multi-purpose scalable platform with a very strong user community. In this series I’ll be looking at setting Spark up on a local machine for learning purposes, working with the Jupyter notebook environment for data wrangling, mungeing and visualisation. We’ll also take a quick look at cloud platform offerings and some of the basics of the extensions to the core Spark platform such as Spark Structured Streaming and Spark ML. Spark is a very large subject and I won’t be going into too much depth, just enough to give readers a taster for capabilities and ease of use. There are some fantastics sources of information out there in the Spark community for those interested in a deeper understanding, which I’ll provide references to along the way.
Part 1: Installing Spark on Windows
Part 3: Installing Jupyter Notebook Kernels
Part 4: Spark on Azure HDInsight
Part 5: Spark on Azure Databricks
Part 6: Spark Core
Part 7: Spark SQL
Part 8: Spark Structured Streaming
Part 9: Spark ML
Part 10: Spark GraphX
Pareto Charts in Power BI and the DAX behind them
The Pareto principle, commonly referred to as the 80/20 rule, is a concept of prioritisation.
Apr
Databricks: Cluster Configuration
Databricks, a cloud-based platform for data engineering, offers several tools that can be used to
Apr
AI Assistance in Microsoft Fabric
The exponential growth of Large Language Models (LLMs) couples with Microsoft’s close partnership with OpenAI
Apr
10 reasons why it’s worth the effort to understand the value of your data
“If leaders really want to create a data driven culture, the journey starts with them!
Apr
Content Safety in Azure AI Studio
Azure AI Content Safety is a solution designed to identify harmful content, whether generated by
Apr
Model Benchmarks in Azure AI Studio
In the constantly changing field of artificial intelligence (AI) and machine learning (ML), choosing the
Apr
Celebrating International Women’s Day: from Classroom to Code
As we celebrate International Women’s Day, I want to share my journey of breaking stereotypes
Mar
Pretty Power BI – Adding Pagination to Bar Charts
Good User Experience (UX) design is crucial in enabling stakeholders to maximise the insights that
Feb