Monday, May 2, 2011

SQL Server Spatial Datums

As part of my investigation of spatial indexes in SQL Server, I performed some analysis on some spatial data sets, so during the import of my spatial data I had to consider the datum for my data, so I thought I would devote this blog post to a dicussion about datums and spatial reference systems in SQL Server.

What is a Datum?

What is a datum?  Put simply, a datum is the reference system that a map's coordinates relate to.  If we think of a graph for example, the graph's axes represent the datum to which the data in the graph is referenced.

When we talk about a datum in a mapping coordinate system, we are typically referring to the spheroidal model of the Earth that we are using to represent geographic coordinates (longitude, latitude) against.

What is a Spheroid?

In the paragraph above I used the term spheroidal, but what is a spheroid?  The shape that is the closest approximation for the shape of the Earth is referred to as the spheroid, or ellipsoid.  This shape can be thought of as a basketball that is being squished at the north and south poles, such that the line around the equator is the longest distance around the shape.

So if we know the Earth is spheroidal, then why do I use the plural term datums?  Well, in fact the Earth is not a perfect spheroid.  It is actually a shape known as the Geoid.  The Geoid is an undulating three dimensional surface representing the gravitational surface of the earth.  I often picture this surface in my mind as a potato with divots and growths, but the undulations are actually fairly subtle.

A spheroid is a three dimensional ellipse that closely approximates this geoidal surface.  But obviously different regions on the earth would have higher or lower average gravitational density, so there are different datums throughout the world that represent spheroids that best approximates the Geoid in their regional area.

The SQL Server SRID

In SQL Server, when you create an instance of the geography data type, one of the parameters is an SRID, but what is an SRID?  You guessed it - the SRID is an identifier that represents the datum that the geography feature is referenced to.  SRID stands for Spatial Reference Identifier.

I couldn't find a lot of information about the SRIDs in SQL Server, but I found that I could perform a query against the spatial_reference_systems system view to see which SRIDs it contained.
select * from sys.spatial_reference_systems
If you execute this query you will see a column that represents the Well Known Text (WKT) of each SRID, which defines the parameters of the spheroid, but in addition to this has the spheroid name.

Among my data sets I had a shape file of GDA94 (Geocentric Datum of Australia) geographic data representing the cadastral boundaries of the state of Queensland.  So I performed the following query to narrow my search.
select * from sys.spatial_reference_systems where well_known_text like '%gda%'
Resulting in SRID 4283 being found.  I then used this to import my data using the Shape2SQL utility from Morten Nielsen's blog.  

What about Projections?

Ok then, so the next question is - what is a projection?  A projection is a representation of spheroidal data on a flat piece of paper.

When we talk about geographic coordinates, the longitude and latitude represent polar coordinates, and differences between points are defined as angular arc measurements on the earth's surface, but these measurements are less meaningful to us than units we interact with every day such as meters and kilometres, or feet and miles.  So a projection is a way that we can represent the polar geographic coordinates on a piece of paper in a grid of units we are familiar with.

Like spheroidal datums, different countries usually have standard map grids that are grid representations of their spheroidal datum.  In my case I had another set of data in MGA (Map Grid of Australia) coordinates, which is a grid coordinate system based on the GDA94 datum.

Since this data is represented purely as grid values, the data could be represented as geometry shapes using a SRID of 0 when the data was imported.

Comparison Between Datums

When creating spatial data it is quite important to define the correct SRID, the first reason being that two spheroids can give slightly different results when calculating distances or areas, simply because of the difference in their mathematical parameters.

e.g. If I draw a line from the top left corner to the bottom right corner of the extent of my Queensland cadastral data in a GDA94 datum and get the distance of the line, I get a slightly  different answer than if I use the AGD66 (Australian Geodetic Datum) datum.
declare @line geography
set @line = geography::STLineFromText('LINESTRING(137.9946 -9.142,153.5518 -29.178)', 4283)
print @line.STLength()
set @line = geography::STLineFromText('LINESTRING(137.9946 -9.142,153.5518 -29.178)', 4202)
print @line.STLength()

The other important thing about comparisons between datums, is that SQL Server is not a GIS, it is a spatial data storage system, so it won't perform transformations between datums or projections, so you cannot use coordinates from one datum to select features in another datum.

In order to perform spatial selections between data in different datums or projections, a transformation must be done so that the comparison is done in a common datum.

In a future post I will discuss an example where I needed to convert data in an older AMG (Australian Map Grid) projection based on AGD66 to the MGA projection based on GDA94 datum.


Just some final thoughts about datums in SQL Server - while the SRID argument allows a geographic instance to define its datum, it serves more of a technical purpose to SQL Server to make spatial calculations such as distances and areas, than actually represent any meta data about the spatial data itself.  This is especially true of planar grid based geometry data instances that use a SRID of 0.

I think it is vitally important to record the datum/projection details for different spatial tables in your database, because at some time in the future you will need to know the origin of the data, and possibly how to transform it to another coordinate system.

For this purpose I think it is worth creating a table within your SQL Server database to define this data for your spatial tables, with the table name, datum, and the projection as an optional column. e.g. local_auth_1, AGD66, AMG Zone 56.

No comments:

Post a Comment