(678) 835-8539 info@key2consulting.com

Ken Adams, Senior Business Intelligence Consultant at Key2 Consulting
By: Ken Adams

 

JSON is a file format that has become more common for database professionals to encounter. Over the years, we have become accustomed to delimited text files, fixed length text files, Excel (I threw up in my mouth a little bit), and just as we got more familiar with XML – here comes JSON. It seems that JSON has become the preferred format for applications talking back and forth over the Internet (h/t to Two-Bit History).

SQL Server 2016 introduced support for JSON data, and prior to the release of SQL Server 2016, JSON support was one of the top feature requests (h/t to Brent Ozar). SQL Server 2016 gave us the ability to import, query, and transform JSON data. It also let us export our relational data in a JSON format. I must admit, though, while this was a good advancement for the database engine, it didn’t jump to the forefront like DIE! 🙂

Recently, I had a customer who was struggling with the loading of JSON files using SSIS. The process involved making an API call and downloading a JSON file. At that point, an attempt was made to write custom C# code to parse the file and load it to their local database. The customer had been successful in pulling the file down locally. While not fully formed, they were able to grab the file from the API and put it in a local folder. All in all, this was good, and I actually would have expected things to break down sooner. I say this only because these folks are trained in other areas and aren’t paid to write data integration scripts. They reached out to us after getting the retrieval portion working but not being able to get the C# script in place to parse the JSON.

The customer provided the C# code they were using and I’m sure I could have made that work. My preference was to avoid the custom code and lean on the database engine. The solution I provided was a combination of SSIS and SQL to parse the incoming JSON file and write that to a SQL Server database. Let’s go through what I recommended using some sample data. I found this file containing current US Senators on this GitHub list. (BTW don’t bother with most .gov datasets during a government shutdown. 😉 )

The downloaded file can be viewed in a more palatable manner using a free JSON viewer; I’ve used this one with good success. You can copy and paste the contents of your file and then let the viewer work its magic.

Before:


After:


The viewer allows you to investigate your incoming JSON file and determine which data elements are important. In this case, I selected a handful of tags and staged those into a local stage database. Using OPENROWSET, you can read the contents of the JSON file and return that file as a single value. The query below validates the incoming file:

 

Additionally, using SELECT with OPENJSON, you can navigate to the appropriate subset of JSON elements. The WITH clause allows you to grab the desired tags, including those that are nested, and return those as a row. In this example, I chose party, description, firstname, middlename, lastname, name, gender, phone, state, and website. The logic is below:

 

The contents of the JSON are now seen as a table:


Taking it a step further, I provided a sample package back to the customer that parsed their JSON file and loaded that into a local table. For purposes of this post, I’ve done the same – I created a simple package that reads a JSON file and loads that to a staging table.

Here is what that project/package looks like:

Database work involves having a staging database/schema with the following table:

 

Create 1 project parameter:

ETLFilePath = C:\Users\Key2\Documents\Blog\JSON (your file path)

Create a package with 4 local variables:

 

Add connection managers to the Staging database – I created both OLEDB and ADONET:

Add an Execute SQL Task named SQL Truncate Staging:

 

Parameter Mapping:

Add an Execution SQL Task SQL Load Senator Detail:

 

Parameter Mapping:

Executing this package truncates the staging table, reads the JSON file from the specified location, and loads that to the staging table. The data in the staging table looks just like the query results from above:


This package is pretty straightforward and avoids using custom C# code or custom SSIS components. Custom SSIS components can be useful, but in this case, that was not an option. C# was viable, but given the timeframe and skillset of folks who will ultimately maintain this solution, it didn’t make sense. My preference was to leverage the JSON functionality – introduced in SQL Server 2016 – to load that data into the database. In this scenario, SSIS functions as the shepherd for the JSON data rather than having a Data Flow do the “heavy lifting.”

Conclusion

Has JSON started popping up on your radar? If it hasn’t already, it will soon enough. This blog detailed ingesting data from a JSON file and loaded that into a database table. My plan is to take this a few steps further in subsequent posts:

  • Querying an API to download JSON
  • Loading 1 or more files to staging
  • Integrating staged data to a set of relational tables

In the meantime, I hope you found this useful. If you are struggling with JSON and integrating JSON with your existing SQL Server database, please let us know. We’ll be glad to help!

Additional Supporting Information

 

Other Microsoft SQL Server JSON Links:

 

Questions?

Thanks for reading! We hope you found this blog post to be useful. Do let us know if you have any questions or topic ideas related to BI, analytics, the cloud, machine learning, SQL Server, (Star Wars), or anything else of the like that you’d like us to write about. Simply leave us a comment below, and we’ll see what we can do!

Keep your data analytics sharp by subscribing to our mailing list

Get fresh Key2 content around Business Intelligence, Data Warehousing, Analytics, and more delivered right to your inbox!

 


 

Key2 Consulting is a data warehousing and business intelligence company located in Atlanta, Georgia. We create and deliver custom data warehouse solutions, business intelligence solutions, and custom applications. 

Share
Tweet
Share
Reddit