## FME 2012 Use Case: The Concave Hull and the Alpha

May 24th, 2012

Hi folks,
You may well have noticed that the ConvexHullReplacer and ConvexHullAccumulator transformers got a bit of a makeover in FME2012.

The most obvious change is that we dropped the “Convex” part off their names to become the HullReplacer and HullAccumulator. This occurred because the transformers now support not only convex hulls, but concave ones too.

But with the concave support comes a new parameter called “Alpha Value” and I wanted to try to explain what this means.

Our mathematically minded developers at Safe will probably be able to pick a thousand holes in what I’m about to say, but it’s how I managed to simplify things to understand the concept.

Convex vs Concave
OK, you probably know the difference but I’ll mention it anyway. A convex hull is the one that’s described as a rubber-band wrapped around your data. It will look something like this:

There’s a bunch of points, you put a line around them, and draw it tight. Simple eh? Well read the Wikipedia entry for convex hulls and your eyes will start to glaze over at the mention of Minkowski Summations and equations that look like this: ∑ Sn = { ∑ xn : xn ∈ Sn }

So a concave hull, where the bounds of a dataset cling tightly to it, like a sports shirt to a long-distance runner, ought to be unbelievably hard, right? In fact there’s even a website called ConcaveHull.com, which shows examples such as this:

Well, once you understand the alpha value it’s not so difficult to do this with FME; and this parameter can even be manipulated to the point where it’s so wrong that the results are good again! Let me show you how…

Alpha Values
OK, so the alpha value can range from zero to - in theory - infinity.

At infinity what you get is pretty much the standard convex hull. So the higher the number the more the hull around a set of points will resemble the rubber band look as shown in the above image.

But at zero you get the opposite. The hull is compressed so much that it has basically collapsed into nothing; sort of like our runner’s shirt becoming so tight that it looks like it’s been painted on.

I was going to insert an image of said athlete to illustrate the point. But Googling “body paint” returned images that are definitely Not Suitable For Work! Just imagine the point dataset above where each point has its own, infinitely small, hull.

Let’s look at an example, and we’ll see shortly why that happens.

Example
Here I am using the standard city parks dataset we use in a lot of our training:

A convex hull around all these features (using a HullAccumulator) would look like this:

And, given what I’ve said, you’d not be surprised to learn that I get the same thing with a Concave hull, when the alpha parameter is set to a high number like 9999999. I don’t need a screenshot - it’s identical to above.

Then, when I start to reduce the alpha value (to 8500) I start to get a more ‘clingy’ outline:

And then further reduced (6500) makes it clingier still:

So to find the ideal alpha value, the technique I would recommend is you start with a high number and reduce it until you get what works for you.

But what does the number mean?
Good question. And you’ll need to know this because you will find - as I did - that the range of useful values varies according to coordinate system units. So whereas a range of 6500 to 8500 seemed to work in my ‘State Plane’ data above, 0.5 to 3.0 was much better when I did a test using Lat/Long data.

For this section, I’m really grateful for this blog post by Alastair Aitchison (who occasionally also uses FME in some of his other posts) and the experts in our core team at Safe.

Basically the concave hull process first turns the data into a Delaunay triangulation, something like this:

Now the alpha value is applied. Any triangle with a circumradius more than the alpha value is discarded. Anything remaining (ie with a value less than the alpha) gets dissolved together to form the hull.

In essence, triangles bigger than the alpha value are discarded, the rest merged together. This screenshot shows which triangles are removed when the alpha is 6500 (as in the section above) and which remain:

Now you can see why a larger alpha produces a more convex boundary - because fewer triangles are discarded and more are added to the overall shape. When your alpha is larger than the largest circumradius, you won’t discard anything.

Conversely a smaller alpha produces a more concave boundary because more triangles are being discarded. And, of course, when your alpha value is smaller than the smallest circumradius, you discard every triangle and get nothing: FME will output a non-geometry feature in this case:

This leads to two important points. Firstly, you get a better “concave-ness” when the data is regularly distributed, because the largest triangles then appear on the edge of the triangulation model and are the first to be removed. When the data is not evenly distributed, then you can get large triangles inside the model, which makes for an ugly hull shape. We’ll see that below.

Secondly you can see that, because the size/units of this circumradius are closely tied to the size/units of the data, the alpha value that you need to use will vary greatly according to the coordinate system used.

Incidentally, if you don’t know what a circumradius is, then don’t worry. It’s really simple. It’s just the radius of the minimum bounding circle that takes in the three points of a given triangle.

For example, this triangle is surrounded by a circle. In fact it’s the smallest circle inside which the triangle can be inscribed. This is a circumcircle, and the radius of a circumcircle (the red line) is the circumradius:

So we calculate this for every triangle in the network, and when the circumradius is greater than the alpha value, then the triangle will be removed. What’s left is merged together to form the concave hull.

OK, so you would imagine that you don’t really want to reduce the alpha value too much, for fear of removing triangles in a way that would produce an odd shape.

For example, if I reduce the parks alpha value to 6000, then I lose one particular triangle and end up with an ugly-looking donut:

You can see in the triangulation screenshot above exactly which triangle has suddenly breached my alpha limit and been removed. It’s mostly the fault of the data: it’s just not evenly distributed enough. Like I mentioned, this can lead to large interior triangles that create holes - just like this.

Of course, if I reduce the alpha even further, I would get more and more strange shapes. I might even get to the point where all of the triangles in a particular swathe get removed, like these I’ve highlighted:

…which would leave me with two isolated hulls with a big gap in the middle.

Incidentally, if you leave the alpha parameter empty in the transformer, then FME computes the smallest alpha that will result in a single polygon feature; so you don’t need to worry about gaps if you are happy with an automated method.

So what I could do is reduce the alpha value to deliberately make gaps, but surely that couldn’t be good? Or could it? Let’s try with an alpha value of 3500:

Not particularly useful, though it does look rather amusingly like a poodle in outline. Let’s try an alpha of 1500:

Wow! I’ve removed so many triangles that the output has gone right through “bad” and actually looks useful. This could be a nice outline for “parked areas” in general, as you can see if I turn the park features back on:

So what we have now is a nice way to create a hull/outline around a group of features, without actually needing a common attribute to group by! I see it as a great method for generalizing data.

In fact I learned of this method while working on a support ticket last week, when my colleague Dmitri suggested I give it a try. A user had a point cloud dataset, but there were distinct sets of points within it, like a series of mini clouds. Each group was a separate building I think. The requirement was to create an outline around each mini point cloud and this method worked perfectly.

As an aside, and speaking of generalization, doesn’t the above “parked areas” data look more curvy than you would expect? Like this feature:

The method I used for that I learned recently from Dale Lutz: the Generalizer NURBfit method. I don’t know when this particular algorithm was added to the Generalizer transformer, but I’m finding it very useful in turning sharp edges into gentler curves.

Obviously generalizing the data makes it no longer as accurate as before. But in many cases - particularly hulls that are by nature fairly imprecise - it’s not a big problem.

And you have to admit that it looks way better.

Well, I seemed to have turned a single parameter into a major article here. But I’m very excited about some of the things I found out about concave hulls, and I’ll definitely be keeping these techniques in my FME armoury.

Why not fire up Workbench and give them a try yourself?

Regards,

Entry Filed under: Data Transformation, FME Desktop, GIS, Point Cloud, Science Lab

-->

## The FME Evangelist

Welcome! The FME Evangelist delivers insider news, cutting edge examples and the latest functional developments for Safe Software’s FME application.