Git as a version control has many advantages over Azure DevOps Server (formerly Teams Foundation Server, TFS) . The ability for multiple people to work on the same project at the same time – Azure DevOp’s Server lock out policy, while understandable, became quickly unworkable across a multi-person team. The command line tools also offer a lot of flexibility
However Git is not without its issues when working on a SQL Server solution. One of the most notable is the issues that come when trying to merge changes. Anyone who has merged changes in Git within Visual Studio will be familiar with the following:
This happens when Git is unable to automatically merge the two files and you have to manually solve conflicts using Visual Studio’s merge tool. Often this is a straightforward manner of choosing whether to take something from the local or remote file (or sometimes both). However there are times where the merge tool can’t quite work out what needs to be merged and you end up with duplicate elements in your project definition file and get the “An item with the same key has already been added” error when your project is loaded:
Working on a project with many hundreds of packages per project trying to track down these duplicate elements quickly became a time sink so I wrote a Python script which will identify them for me. It traverses the project metadata file, which is an XML file, and pulls out each instance of the PackageMetaData tag. It then loads these into a Pandas dataframe and does a “group by” to get a count and then only return those with a count greater than 1. A quick way to identify where duplicate elements exist in your merged file which can then be deleted from the XML and the project metadata file reloaded in Visual Studio:
import xml.etree.ElementTree as ET import pandas as pd # Variable pointing to your local xml localXml = "C:\\Users\\DPS\\Desktop\\SSIS.xml" namespace = "{www.microsoft.com/SqlServer/SSIS}" e = ET.parse(localXml).getroot() # traverses the XML document to get all the PackageMetaData elements t = e.findall("DeploymentModelSpecificContent/Manifest/{0}Project/{0}DeploymentInfo/{0}PackageInfo/{0}PackageMetaData".format(namespace)) # creates an empty list l = [] # loops through each instance of PackageMetaData and pulls out the Package and the Parameter names and puts them in a list for a in t: a_name = a.attrib.get('{0}Name'.format(namespace)) for z in a.getiterator('{0}Parameter'.format(namespace)): z_name = z.attrib.get('{0}Name'.format(namespace)) l.append(a_name + '-' + z_name) # loads the list into a dataframe, splits on the "-" and renames the columns df = pd.DataFrame(l, columns=["column"]) df = df["column"].str.split('-', expand=True) df.columns = ['package', 'parameter'] # counts the number of instances of each and filter where greater than 1 df = df.groupby(['package', 'parameter']).size().to_frame('size') df = df[df['size'] > 1] # print to the console print(df)
Introduction to Data Wrangler in Microsoft Fabric
What is Data Wrangler? A key selling point of Microsoft Fabric is the Data Science
Jul
Autogen Power BI Model in Tabular Editor
In the realm of business intelligence, Power BI has emerged as a powerful tool for
Jul
Microsoft Healthcare Accelerator for Fabric
Microsoft released the Healthcare Data Solutions in Microsoft Fabric in Q1 2024. It was introduced
Jul
Unlock the Power of Colour: Make Your Power BI Reports Pop
Colour is a powerful visual tool that can enhance the appeal and readability of your
Jul
Python vs. PySpark: Navigating Data Analytics in Databricks – Part 2
Part 2: Exploring Advanced Functionalities in Databricks Welcome back to our Databricks journey! In this
May
GPT-4 with Vision vs Custom Vision in Anomaly Detection
Businesses today are generating data at an unprecedented rate. Automated processing of data is essential
May
Exploring DALL·E Capabilities
What is DALL·E? DALL·E is text-to-image generation system developed by OpenAI using deep learning methodologies.
May
Using Copilot Studio to Develop a HR Policy Bot
The next addition to Microsoft’s generative AI and large language model tools is Microsoft Copilot
Apr