In this article we want to explore the point-in-polygon (PIP) problem. The problem raises the question whether a given point in the plane lies inside, outside or on the boundary of a polygon. This is in particular a basic operation in computer graphics.
The article cannot be considered as a fully comprehensive coverage of every aspect of the problem. Errors cannot be excluded. Corrections or additions are always desired and welcome.
The following sample code written in C can perform a PIP check. The output is only the distinction between inside or outside the polygon.
https://github.com/digitsensitive/point-in-polygon-c
A point is a two-dimensional primitive, which is infinitely small and has an x and y coordinate.
A polygon is a plane figure with a finite number of straight line segments connected to create a closed polygonal chain. The points where two segments meet can be called vertices (or the vertex).
Polygons may be classified depending on number of sides, convexity and intersection, equality and symmetry and more.
There are different approaches how you can solve the problem. Two common approaches (ray casting and angle summation) have been used as early as 1974 and are described in the paper “A Characterization of Ten Hidden-Surface Algorithms” by Ivan Sutherland et al. The ray casting algorithm has even been presented earlier in 1962 by Shimrat. We will introduce the ray casting algorithm here shortly and then look at the Franklin’s Point Inclusion in Polygon (PNPOLY) algorithm in more detail. The PNPOLY method essentially differs from the fact that the rays are only shot horizontally (increasing x, fixed y).
This algorithm is also known as the crossing number algorithm or the even–odd rule algorithm. The observation of that even-odd rule may be mathematically proved using the Jordan curve theorem.
From a random point you draw a semi-infinite line (raycast) in a random direction and check for collisions with the line segments of the polygon. If the ray collides with an even number of times the point must be outside the polygon. If the ray collides with an odd number of times the point must be inside the polygon. What happens when the point lies on the edge of the polygon depends on the details of the ray intersection algorithm. Imagine that two polygons are next to each other. We select a point that is exactly on the line where both polygons touch each other. If we perform a PIP check for the left polygon and get the result “outside”, the PIP check for the right polygon must output the result “inside”!
You need to pay attention with this algorithm when the ray collides exactly with a vertex of a line segment (special case!). To get correct results you should not count the collision, if the line segment you are colliding with, is under the raycast. You will see how this works later.
Be aware that if the point lies very close to the boundary or exactly on the boundary of the polygon it might output a wrong result because of finite precision arithmetics of computers. If the point lies on the boundary, should you count the collision, meaning the point is outside? To fix this problem you might want to introduce a numerical tolerance ε and test whether the point lies within ε of the line. If that’s the case the algorithm should stop and inform the user that the point lies too close to the polygons boundary.
Randolph Franklin makes no claim to have invented the idea of this approach. Since he produced and published the approach in Fortran code already in 1970 it may be assumed that he is the Mastermind behind the idea.
Here is the C code, published on his website.
|
|
Let’s dive in and separate each part of the upper code to understand what is going on. The function takes the number of vertices, two pointers to an array of float values and the x and y position of the point to be tested.
int c = 0;
...
c = !c;
...
return c;
The variable c
is initialized to the value of zero (basic signed integer
type). This variable represents the status of how often the edges of the polygon
were crossed. At the beginning it is an even number (= zero, 0), which means the
point has not crossed any edges yet. If the semi-infinite ray does not collide
the polygon at all, the point must be outside. Or if it collides the polygon
twice, it must be outside too. You get the point. The collisions with the edges
are checked in the if statement (see more later). Each time the if statement is
true
the variable c
switches its state between 0 and 1.
To be more specific and clear I would recommend to rename the variable c
to
is_number_crossed_edges_odd
. So if the variable is_number_crossed_edges_odd
is 1 or true the number of crossed edges is odd. If it is 0 or false the number
of crossed edges is even. It is always worth to be clear with the variable
naming!
Next we want to unterstand what the for loop does.
for (i = 0, j = nvert - 1; i < nvert; j = i++) {}
On the first cycle of the for loop the values i
and j
are initialized to 0
respectively nvert-1
(f.e. six for a polygon with seven vertices). As long as
the condition i < nvert
is true (seven cycles for a polygon with seven
vertices) the i
and j
values are augmented by one. Be aware that j
is
always set first to i
and only then the i
value will be increased by one.
j = ++i
would increase i
first and then set j to i!
The values for each cycle for a polygon with seven vertices would be:
So now we know the values for i
and j
for each cycle. In addition, we know
that the number of cycles is always equal to the number of vertices/edges. But
the most important conclusion to take here is: We loop through each line segment
of the polygon, starting with the last line segment. Be aware that the order of
the vertices in the arrays (clockwise or counterclockwise) does not matter.
Finally, we would like to understand the if statement. For this we will break up
the two conditions, which are connected with an &&
. The if statement will only
be true if both conditions below are true.
In the C language, if the first statement is false, then the second statement will not be evaluated. In other languages you might need to be careful, because you might get a divide-by-zero constellation.
First condition:
(verty[i] > testy) != (verty[j] > testy)
In the first condition we are only interested in the vertical position (y
position) of the test point in relation to the current line segment being
evaluated. The two conditions in brackets are connected with an !=
, so it will
only return true if they are not equal. This is the case if the y position of
the test point is between the two points of the line segment or exactly on the
same height as the lower point of the line segment (see later). Let’s have a
detailed look at it and test what happens for some possible constellations. We
will always look at the last line segment of the polygon (segment g) that
connects the point G with the point A.
Let’s check for the test points H, I and J and fill the numbers:
// Test Point H:
(1 > 2) != (3 > 2) => false != true => true
// Test Point I:
(1 > 3) != (3 > 3) => false != false => false
// Test Point J:
(1 > 1) != (3 > 1) => false != true => true
Here we see, that if the y position of the test point is between the two points of the line segment, it will return true. We also note, that if the y position of the test point is the same as the y position of the upper point of the line segment it will return false (Point I in our example). That is not the case for the lower point, here it will return true.
This algorithm has the following rules:
It is crucial, that the collisions with vertices are evaluated that way. The algorithm would return false values, if it would count the collision with the upper vertex. It is worth to look at one more example.
If the algorithm would count the collision with the upper vertex you would get three (lower vertex of line segment c, upper vertex of line segment d and line segment b) collisions with the semi-infinite line (raycast) g. That is not correct because the test point E is clearly outside of the polygon!
Here you will find a lot of further informations regarding this topic. The following illustration from Mark VandeWettering was helpful to understand the problem better.
Let’s have a look at a different constellation.
Let’s check for the test point H and fill the numbers:
// Test Point H:
(2.5 > 2.5) != (2.5 > 2.5) => false != false => false
This is a special case and we observe that it does not count as a collision.
Condition 2:
(testx < (vertx[j] - vertx[i]) * (testy - verty[i]) / (verty[j] - verty[i]) + vertx[i])
This condition checks if the x position of the test point is on the left side of the line segment. It is important to mention, that if the test point is exactly on the line segment that no collision is counted. This explains, as we have already mentioned above in the part “Ray casting algorithm”, why you get different results (inside or outside) depending on which line segment the test point lies. If the test point lies on a line segment on the left side of the polygon it will output “outside”. If the test point lies on a line segment on the right side of the polygon it will output “inside”.
If you want to speed up your algorithm, you can create an axis-aligned bounding box (AABB) around the polygon first and check, if the point lies outside of this rectangle. If that is the case, no further tests need to be performed. Great!
Find below the quick test written in C. The function implementations can be found in the sample code.
|
|
In this method the bounding box surrounding the polygon is split into multiple bins. When a test point is evaluated the corresponding bin is retrieved and collisions are evaluated. For more details have a look at the great article “Point in Polygon Strategies” from Eric Haines.
This optimization is not included in the upper source code.
For even faster results you can use a lookup grids. Lay out a grid inside the bounding box and decide for each cell if it is fully inside, fully outside or indeterminate. For more details have a look at the great article “Point in Polygon Strategies” from Eric Haines.
This optimization is not included in the upper source code.
You might want to use different algorithms depending the type of the polygon (f.e. convex vs non-convex).