By: Andy Kim
Introduction
If you ask a SQL professional to name some data types, you might hear: varchar, int, datetime, decimal, or something to that effect. Rarely do people mention the spatial data types of: geography and geometry. This is because most SQL professionals haven’t been exposed to these data types.
Well – I would like to change that here.
The spatial data types are implemented using windows CLR functions, which becomes more apparent when working with the spatial insert scripts, queries, indexing, etc.
The Spatial Data Types
SQL Server comes with two spatial data types built in:
Geography – Represents data in a round-earth coordinate system like GPS Latitude/Longitude coordinates
Geometry – Represents data in flat coordinate system
These data types have many similarities and some differences as well.
Spatial Data Analysis – Demo Outline
In this demo, I would like to:
1) Create a custom restaurant map using Google Maps.
2) Export the restaurant geography data from Google Maps in Keyhole Markup Language (KML) format.
3) Prepare and load the data into SQL Server.
4) Run a simple query on the spatial data.
5) Demonstrate spatial data analysis by adding spatial richness to a PowerBI report.
Typically, you would get large, complete data sets from GIS systems, or other files, but I would like to do something simpler here. For this demo, I have manually put together a list of coordinates using Google Maps.
1. Creating a custom restaurant map using Google Maps.
a) Go to Google Maps and sign in.
b) Click the menu button to open it.
c) Click on “Your Places”.
d) Click on “Maps”.
e) Click “Create Map”.
Now, there is a blank map to add interest points.
f) You can rename the map, and beginning adding points to your map. I searched for restaurants and added them to my map.
Notice the Latitude/Longitude shows up once you add a place to your map.
g) In this way, you can create a map with all your points of interest. Here is an example of a finished map.
2. Exporting the Data from Google Maps in KML format.
a) Click the menu button, then click on “Export to KML/KMZ”…
b) Set the appropriate options and download the KML.
The KML looks a lot like XML. You can open it with a text editor like Notepad.
3. Loading the Data into SQL Server.
There are several ways you can load this data into SQL Server, but in this example, I will use a free online tool and Microsoft Excel.
a) Navigate to http://convertcsv.com/kml-to-csv.htm. Click “Choose File”, and select the KML file you downloaded. Click the “KML to Excel” button and download the outputted Excel file.
b) Open the downloaded file in Excel, then add a new column to calculate the Insert command.
In this case, the Excel formula is:
= CONCATENATE(“INSERT INTO dbo.DimRestaurantGeography (RestaurantName,RestaurantGeography) VALUES (‘”,A2,”‘, geography::STPointFromText(‘POINT(“,C2,” “,D2,”)’, 4326))”)
c) In SQL Server, let’s run a script to create the table and load it with the data from above.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 |
CREATE TABLE [dbo].[DimRestaurantGeography]( [RestaurantKey] [int] IDENTITY(1,1) NOT NULL, [RestaurantName] [varchar](100) NULL, [RestaurantGeography] [geography] NULL ) INSERT INTO dbo.DimRestaurantGeography (RestaurantName,RestaurantGeography) VALUES ('Mavericks Cantina', geography::STPointFromText('POINT (-84.1769333 34.0509891)', 4326)) INSERT INTO dbo.DimRestaurantGeography (RestaurantName,RestaurantGeography) VALUES ('City Barbeque and Catering', geography::STPointFromText('POINT (-84.1673536 34.0716767)', 4326)) INSERT INTO dbo.DimRestaurantGeography (RestaurantName,RestaurantGeography) VALUES ('Taqueria Los Hermanos', geography::STPointFromText('POINT (-84.1593168 34.0839946)', 4326)) INSERT INTO dbo.DimRestaurantGeography (RestaurantName,RestaurantGeography) VALUES ('Burger 21', geography::STPointFromText('POINT (-84.174795 34.056289)', 4326)) INSERT INTO dbo.DimRestaurantGeography (RestaurantName,RestaurantGeography) VALUES ('Verra Zanno Pizzeria', geography::STPointFromText('POINT (-84.1715453 34.064237)', 4326)) INSERT INTO dbo.DimRestaurantGeography (RestaurantName,RestaurantGeography) VALUES ('Jason''s Deli', geography::STPointFromText('POINT (-84.1699434 34.0662492)', 4326)) -- extra apostrophe INSERT INTO dbo.DimRestaurantGeography (RestaurantName,RestaurantGeography) VALUES ('Firehouse Subs', geography::STPointFromText('POINT (-84.1717192 34.068203)', 4326)) INSERT INTO dbo.DimRestaurantGeography (RestaurantName,RestaurantGeography) VALUES ('QDOBA Mexican Eats', geography::STPointFromText('POINT (-84.1692103 34.0692028)', 4326)) INSERT INTO dbo.DimRestaurantGeography (RestaurantName,RestaurantGeography) VALUES ('Kabob-ish', geography::STPointFromText('POINT (-84.1793856 34.0469995)', 4326)) INSERT INTO dbo.DimRestaurantGeography (RestaurantName,RestaurantGeography) VALUES ('Bai Tong Thai Restaurant', geography::STPointFromText('POINT (-84.1587254 34.0615508)', 4326)) select *, RestaurantGeography.STAsText() AS RestaurantGeographyText --computed text column for readability from dbo.DimRestaurantGeography |
4. Running a simple query on the Spatial Data.
As I mentioned earlier, querying spatial data looks a little different than your standard SQL queries due to the Windows CLR implementation. However, these functions are very powerful. Here is an example query below that finds all points (restaurants) within 0.4 miles of my office.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
-- Spatial Query to determine all points (restaurants) within a certain distance from the origin -- Distance is in meters. Conversion factor: 1 mile = 1609.344 meters. -- define the origin point and store in variable DECLARE @origin geography = GEOGRAPHY::STGeomFromText('POINT (-84.170665 34.062334)', 4326); -- define the origin and store in a variable (my office) SELECT @origin.STDistance(a.RestaurantGeography)/1609.344 as DistanceFromPointInMiles -- calculate the distance between the origin and each point in the table ,* , RestaurantGeography.STAsText() AS RestaurantGeographyText -- add a computed text column for WKT readability FROM [TestDB].dbo.DimRestaurantGeography a WHERE @origin.STDistance(a.RestaurantGeography)/1609.344 <= 0.4 -- filter out records where distance is greater than the value here (0.4 miles here) ORDER BY @origin.STDistance(a.RestaurantGeography) -- order by closest distance |
5. Add Geography Dimension to PowerBI Report.
a) Create a simple fact table and load it for this demo.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
--Create a sample fact table that holds lunch information CREATE TABLE FactLunchPrice ( LunchRecordID INT IDENTITY (1,1), LunchDateSID INT NOT NULL, RestaurantKey INT NOT NULL, Price smallmoney NOT NULL, ); GO INSERT INTO [dbo].[FactLunchPrice] ([LunchDateSID],[RestaurantKey],[Price]) VALUES (20190101,1,13.29) INSERT INTO [dbo].[FactLunchPrice] ([LunchDateSID],[RestaurantKey],[Price]) VALUES (20190103,4,13.03) INSERT INTO [dbo].[FactLunchPrice] ([LunchDateSID],[RestaurantKey],[Price]) VALUES (20190108,6,8.18) INSERT INTO [dbo].[FactLunchPrice] ([LunchDateSID],[RestaurantKey],[Price]) VALUES (20190110,5,10.08) INSERT INTO [dbo].[FactLunchPrice] ([LunchDateSID],[RestaurantKey],[Price]) VALUES (20190115,7,10.33) INSERT INTO [dbo].[FactLunchPrice] ([LunchDateSID],[RestaurantKey],[Price]) VALUES (20190117,2,14.99) INSERT INTO [dbo].[FactLunchPrice] ([LunchDateSID],[RestaurantKey],[Price]) VALUES (20190122,8,9.89) INSERT INTO [dbo].[FactLunchPrice] ([LunchDateSID],[RestaurantKey],[Price]) VALUES (20190124,1,12.88) INSERT INTO [dbo].[FactLunchPrice] ([LunchDateSID],[RestaurantKey],[Price]) VALUES (20190129,4,12.57) INSERT INTO [dbo].[FactLunchPrice] ([LunchDateSID],[RestaurantKey],[Price]) VALUES (20190131,1,12.99) |
b) Open PowerBI Desktop and Get Data from the SQL Server source similar to below:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
-- Use the .Long and.Lat to get the coordinate components and concatenate them with a comma in between select f.LunchRecordID ,f.LunchDateSID ,f.RestaurantKey ,f.Price ,d1.RestaurantName ,d1.RestaurantGeography.Long as Longitude ,d1.RestaurantGeography.Lat as Latitude ,CONCAT(CONVERT(VARCHAR(50), d1.RestaurantGeography.Long), ' , ', CONVERT(VARCHAR(50),d1.RestaurantGeography.Lat)) as LatLongCoordinates from [FactLunchPrice] f join [DimRestaurantGeography] d1 on (f.RestaurantKey = d1.RestaurantKey) |
Here are some examples of standard reports you might see:
Next, let’s use the “Map” visual to leverage the geography data. Configure as shown below. Be sure to map the Latitude and Longitude data to the appropriate fields.
As you can see, adding a spatial dimension can add value and richness to your data model. In many industries such as utilities, this spatial aspect is crucial. This is how all their equipment is mapped, where customers are located, how problem areas are detected, etc.
Alternatively, Power BI can use postal addresses or the concatenated coordinates in the “Location” field to perform the same operation, but the real power of the spatial data types is the ability to run queries on them. Here, we have just scratched the surface of spatial data types. In reality, there are more complex objects such as lines, polygons, and buffers. There are also other functions to calculate all kinds of spatial metrics.
Now that you have been exposed to spatial data types, I encourage you to learn more about them and their functions. If you have recently worked with any cool projects using spatial data types, please tell me about it. I would love to hear how you have used them.
References
- Microsoft: Spatial Data Types Overview
- Microsoft: Spatial Data (SQL Server)
- Microsoft: Spatial Data (Geography)
- Microsoft: Spatial Types (Geometry – Transact-SQL)
Questions?
Thanks for reading! We hope you found this blog post to be useful. Do let us know if you have any questions or topic ideas related to BI, analytics, the cloud, machine learning, SQL Server, (Star Wars), or anything else of the like that you’d like us to write about. Simply leave us a comment below, and we’ll see what we can do!
Keep your data analytics knowledge sharp by subscribing to our mailing list
Get fresh Key2 content around Business Intelligence, Data Warehousing, Analytics, and more delivered right to your inbox!
Key2 Consulting is a data warehousing and business intelligence company located in Atlanta, Georgia. We create and deliver custom data warehouse solutions, business intelligence solutions, and custom applications.