In the world of BI, the SQL Server system databases (with the exception of tembdb) are often forgotten about. You won’t be surprised to learn that I am a BI Consultant and until recently, I was also in this boat. System database are usually left to a DBA, whilst we develop ETL packages, model data or produce fancy reports.
After attending Ryan Adam’s session at SQL Pass 2016, my mindset has changed. The msdb database used to be the hub for SSIS logging but since 2012, this is all stored in the SSISDB database. However, there are still a number of processes and functions that are logged in msdb and we need to manage this, to prevent the database from becoming too large and impacting storage costs.
This blog will provide a brief overview of what is stored and logged in msdb, as well as a little trick of how to manage SQL Agent Jobs.
What is Stored in msdb?
Before I provide a brief list of what is in msdb, one obvious bit of advice is it should be backed up regularly. We all have backup strategies for our OLTP or data warehouse database, but the system ones are often neglected. All this logging and auditing history in msdb is useful!
The msdb database consists of the following:
o SSIS – Only if packages are stored in SQL Server (not the SSIS Catalog).
o Backup and Restore
o Maintenance Plans
o Database Mirroring
o Policy Based Management
o Log Shipping
o Service Broker
o Database Engine Tuning Advisor
SQL Agent Jobs
Whilst all of the logging tables in msdb are important for a DBA, one real area of interest in BI is around SQL Agent jobs. There can be many SSIS packages being called from the Catalog, spanning across multiple jobs. It is important that logging is available, yet we don’t want job history to be stored forever.
Out of the box, the logging for msdb looks like the below. Right click on SQL Server Agent properties in SSMS and navigate to the ‘History’ tab.
Let’s work through a use case, extracted from Ryan Adam’s Pass presentation:
1. Imagine you have 10 database backup jobs running every hour in your production environment.
a. 10 Databases * 24 rows per day = 240 Total rows per day.
b. Total rows for all jobs would be maxed out in 4 days and 4 hours.
Ok, so now we have a problem. All DBA’s and BI developers would want to see more than 4 days’ worth of history, which means the ” setting is not a good default to retain. How about changing the Agent job to the below?
Sounds perfect, right? Wrong. The GUI setting is not persisted, meaning the Agent Job will log history for 30 days on the next run only. This seems crazy, but is what we are dealing with unfortunately. There is a solution, which involves executing a stored procedure within the msdb database.
DECLARE @DeletePrior DATETIME;
SET @DeletePrior = CONVERT(VARCHAR(10), DATEADD(dd, –30, GETDATE()), 101);
EXEC msdb.dbo.sp_purge_jobhistory @oldest_date = @DeletePrior;
NOTE: Ensure the ‘row restriction’ property (shown above) is unchecked against the SQL Server Agent Job settings before executing.
Although you cannot see the changes in the GUI settings, you will now retain 30 days of history in msdb. The query can be tweaked to days, months or even years if you like.
Conclusion
If you take anything away from this article, then it should be to not ignore system databases. Microsoft create them automatically for us but unfortunately, the out of the box configurations are not always correct for your situation or business. For a deeper dive into msdb and other system databases, try some of the recommended blogs below.
If anyone would like to share their experiences with using/refining msdb, feel free to comment.
Recommended Reading
o MSDN – https://msdn.microsoft.com/en-us/library/ms187112.aspx?f=255&MSPPError=-2147217396
o Hemantgiri Goswami’s Blog – https://www.toadworld.com/platforms/sql-server/b/weblog/archive/2013/10/21/msdb-cleanup-is-necessary
o Ryan Adam’s Blog – http://www.ryanjadams.com/
Introduction to Data Wrangler in Microsoft Fabric
What is Data Wrangler? A key selling point of Microsoft Fabric is the Data Science
Jul
Autogen Power BI Model in Tabular Editor
In the realm of business intelligence, Power BI has emerged as a powerful tool for
Jul
Microsoft Healthcare Accelerator for Fabric
Microsoft released the Healthcare Data Solutions in Microsoft Fabric in Q1 2024. It was introduced
Jul
Unlock the Power of Colour: Make Your Power BI Reports Pop
Colour is a powerful visual tool that can enhance the appeal and readability of your
Jul
Python vs. PySpark: Navigating Data Analytics in Databricks – Part 2
Part 2: Exploring Advanced Functionalities in Databricks Welcome back to our Databricks journey! In this
May
GPT-4 with Vision vs Custom Vision in Anomaly Detection
Businesses today are generating data at an unprecedented rate. Automated processing of data is essential
May
Exploring DALL·E Capabilities
What is DALL·E? DALL·E is text-to-image generation system developed by OpenAI using deep learning methodologies.
May
Using Copilot Studio to Develop a HR Policy Bot
The next addition to Microsoft’s generative AI and large language model tools is Microsoft Copilot
Apr