The first blog, ‘Part 1 – Introduction to Geospatial data’, gave an overview of geospatial data analysis and its increasing importance in big data analytics. This blog gives a high-level outline of a sample architecture for managing the ingestion, analytical processing and display of geospatial datasets.
Many companies and organisations struggle to deal with geospatial data at scale. The rapid growth of geospatial data, coupled with its technical requirements, has overwhelmed traditional storage and processing systems. Pressure from data volumes, storage costs and redundant database processing has led organisations to underutilise geospatial analytics. Additionally, few have the architectural experience needed to prepare complex geospatial datasets for analytics.
The diagram below illustrates an architecture for managing large volumes of geospatial data:
It is rare in modern data warehouse analytics for an end user to use geospatial data or analytical business data on its own. Combined, the two types of data can provide powerful insights and deliver meaningful conclusions. The architecture diagram shows both the geospatial and the analytical data processes, which come together to form a composite model in Power BI.
Raw geospatial data arrives in two broad forms: vector (points, lines and polygons) and raster (gridded cells, such as satellite imagery). Both the geospatial and the analytical data are ingested via Azure Data Factory (ADF), which is an orchestration service.
Once ingested, the geospatial data is stored in Data Lake Storage and copied into Azure Database for PostgreSQL, a PaaS database offering similar to Azure SQL DB. PostGIS is an extension to PostgreSQL that adds spatial data types and many spatial functions. In a PostGIS-enabled database, spatial values are held in a geometry column, and each geometry carries a spatial reference identifier (SRID). The SRID identifies the coordinate reference system in which the coordinates are expressed, for example 4326 for WGS 84 longitude and latitude. Analytical data can be analysed using Azure Synapse Analytics, which combines big data analytics, data warehousing and data integration into a unified service.
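To make the geometry column and SRID concrete, here is a minimal Python sketch. It only builds the text that would be sent to PostGIS and does not connect to a database. The table name `sites`, its columns and the helper `to_ewkt_point` are hypothetical and are not part of the architecture described here. `ST_GeomFromEWKT` and the `geometry(Point, 4326)` column type are standard PostGIS features.

```python
# Sketch only: the table name, column names and helper are hypothetical.

def to_ewkt_point(lon: float, lat: float, srid: int = 4326) -> str:
    """Format a point as PostGIS extended WKT (EWKT), which prefixes the SRID."""
    # WKT orders coordinates as X Y, i.e. longitude then latitude for SRID 4326.
    return f"SRID={srid};POINT({lon} {lat})"


# DDL for a table whose geometry column only accepts points in SRID 4326.
CREATE_TABLE_SQL = """
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE TABLE IF NOT EXISTS sites (
    id   serial PRIMARY KEY,
    name text NOT NULL,
    geom geometry(Point, 4326)
);
"""

# Parameterised insert; ST_GeomFromEWKT parses the SRID-prefixed text.
INSERT_SQL = "INSERT INTO sites (name, geom) VALUES (%s, ST_GeomFromEWKT(%s));"

london = to_ewkt_point(-0.1276, 51.5072)
print(london)
```

Declaring the SRID on the column means PostGIS rejects geometries supplied in a different coordinate reference system, which helps keep data from mixed sources consistent.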
Once the data has been transformed and analysed, it can be visualised in Power BI, for example through ArcGIS and its geospatial functionality. Alternatively, Azure Maps can be used to provide geographic context and location intelligence. Azure Data Explorer can also be used for visualisation, as it offers geospatial functionality such as creating scatterplots from geospatial data.
One alternative to PostgreSQL is Azure Cosmos DB, which is a non-relational database. Cosmos DB supports indexing and querying of spatial data represented using the GeoJSON specification. A benefit of this is that GeoJSON is plain JSON, so its data structures do not require specialised tools or libraries. Data queried in Cosmos DB can be brought into Synapse for data enrichment and big data analytics.
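As an illustration of the GeoJSON approach, the sketch below builds a document with a GeoJSON point and a parameterised spatial query using only the Python standard library. The document fields (`name`, `location`), the helper `make_site_document` and the 5,000 metre threshold are assumptions made for the example. `ST_DISTANCE` is one of the spatial functions in the Cosmos DB SQL query language. Nothing here connects to a Cosmos DB account.

```python
import json


def make_site_document(doc_id: str, name: str, lon: float, lat: float) -> dict:
    """Build a JSON document whose 'location' field is a GeoJSON Point."""
    return {
        "id": doc_id,
        "name": name,
        # GeoJSON orders coordinates as [longitude, latitude].
        "location": {"type": "Point", "coordinates": [lon, lat]},
    }


# Hypothetical query: sites within a given distance (in metres) of an origin point.
QUERY = (
    "SELECT c.id, c.name FROM c "
    "WHERE ST_DISTANCE(c.location, @origin) < @max_metres"
)

# Query parameters expressed as name/value pairs.
PARAMETERS = [
    {"name": "@origin", "value": {"type": "Point", "coordinates": [-0.1276, 51.5072]}},
    {"name": "@max_metres", "value": 5000},
]

doc = make_site_document("site-1", "Head office", -0.1276, 51.5072)
print(json.dumps(doc, indent=2))
```

Because the document is an ordinary dictionary serialised with `json`, no geospatial library is needed to produce or read it.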
Part 3 of the blog series will give a more detailed explanation of the technical solutions used to process and analyse geospatial data. To make sure you don’t miss Part 3, follow us on LinkedIn!