Wednesday, June 1, 2011

SQL Server Invalid Geometry

While recently investigating SQL Server spatial indexes I encountered an issue regarding invalid geometry in some of the features in my source shape files, so I thought I would do a quick blog post on what this means, and how to detect and resolve these issues in SQL Server.


What is valid geometry?

If you research the STIsValid method in the SQL Server documentation you will come across an explanation of the method that indicates that it will return whether the geometry instance is "well formed based on its Open Geospatial Consortium (OGC) type", but what does this mean?


Well, this checks for basic spatial integrity depending on the data type, such as polygons not having inner and outer rings crossing, and multilines not retracing back over the same points.


Part of the documentation from the OGC standards for simple feature access describe validity of polygons being:
  1. Polygons are topologically closed
  2. The boundary of a Polygon consists of a set of LinearRings that make up its exterior and interior boundaries
  3. No two Rings in the boundary cross, and the Rings in the boundary of a Polygon may intersect at a Point but only as a tangent
  4. A Polygon may not have cut lines, spikes or punctures 
  5. The interior of every Polygon is a connected point set
  6. The exterior of a Polygon with 1 or more holes is not connected. Each hole defines a connected component of the exterior


STIsValid Method

The STIsValid method can be used to detect if you have shape features that are invalid.  The method will return 0 if the feature is invalid

e.g.  The following TSQL command will count the number of records in the table with invalid shape features
select count(id) from cadastre where geom.STIsValid() = 0

MakeValid Method

Luckily for us there is a method in SQL Server to fix invalid shape data - MakeValid.

Be aware that MakeValid could alter your shape in a way that you consider incorrect, depending on the original problem with the shape.  So you should really investigate any shapes that are invalid using STIsValid to find them, and then if using MakeValid to resolve them, check the resulting shapes produced by MakeValid before performing an update.

My sample dataset only had a handful of errors caused by overshoots etc, so I was able to resolve my data with the following command
update cadastre set geom = geom.MakeValid() where geom.STIsValid() = 0

If you have a lot of invalid shapes you may want to perform commits in some kind of unit of work, such as this script that commits every 1000 updated rows
declare @rowCount int

set @rowCount = 9999

while @rowCount > 0
begin

    begin tran

    update top(1000) cadastre set geom = geom.MakeValid() where geom.STIsValid() = 0

    set @rowCount = @@rowcount

    commit

end

2 comments:

  1. Thank you for your post, it was a great help to solve my problems regarding geometries.

    ReplyDelete
  2. Thanks - exactly what I needed. Stan

    ReplyDelete