Point in Polygon & Intersect

Finding out if a certain point is located inside or outside of an area, or finding out if a line intersects with another line or polygon are fundamental geospatial operations that are often used e.g. to select data based on location. Such spatial queries are one of the typical first steps of the workflow when doing spatial analysis. Performing a spatial join (will be introduced later) between two spatial datasets is one of the most typical applications where Point in Polygon (PIP) query is used.

How to check if point is inside a polygon?

Computationally, detecting if a point is inside a polygon is most commonly done using a specific formula called Ray Casting algorithm. Luckily, we do not need to create such a function ourselves for conducting the Point in Polygon (PIP) query. Instead, we can take advantage of Shapely’s binary predicates that can evaluate the topolocical relationships between geographical objects, such as the PIP as we’re interested here.

There are basically two ways of conducting PIP in Shapely:

  1. using a function called .within() that checks if a point is within a polygon

  2. using a function called .contains() that checks if a polygon contains a point

Notice: even though we are talking here about Point in Polygon operation, it is also possible to check if a LineString or Polygon is inside another Polygon.

  • Let’s first create a Polygon using a list of coordinate-tuples and a couple of Point objects

from shapely.geometry import Point, Polygon

# Create Point objects
p1 = Point(24.952242, 60.1696017)
p2 = Point(24.976567, 60.1612500)

# Create a Polygon
coords = [(24.950899, 60.169158), (24.953492, 60.169158), (24.953510, 60.170104), (24.950958, 60.169990)]
poly = Polygon(coords)

# Let's check what we have
print(p1)
print(p2)
print(poly)
POINT (24.952242 60.1696017)
POINT (24.976567 60.16125)
POLYGON ((24.950899 60.169158, 24.953492 60.169158, 24.95351 60.170104, 24.950958 60.16999, 24.950899 60.169158))

Let’s check if those points are within the polygon

# Check if p1 is within the polygon using the within function
p1.within(poly)

# Check if p2 is within the polygon
p2.within(poly)
False

Okey, so we can see that the first point seems to be inside that polygon and the other one doesn’t.

  • In fact, the first point is close to the center of the polygon as we can see if we compare the point location to the centroid of the polygon:

# Our point
print(p1)

# The centroid
print(poly.centroid)
POINT (24.952242 60.1696017)
POINT (24.95224242849236 60.16960179038188)
  • It is also possible to do PIP other way around, i.e. to check if polygon contains a point:

# Does polygon contain p1?
poly.contains(p1)

# Does polygon contain p2?
poly.contains(p2)
False

Thus, both ways of checking the spatial relationship results in the same way.

Which one should you use then? Well, it depends:

  • if you have many points and just one polygon and you try to find out which one of them is inside the polygon:

    • you need to iterate over the points and check one at a time if it is within() the polygon specified

  • if you have many polygons and just one point and you want to find out which polygon contains the point

    • you need to iterate over the polygons until you find a polygon that contains() the point specified (assuming there are no overlapping polygons)

Intersect

Another typical geospatial operation is to see if a geometry intersect or touches another one. The difference between these two is that:

  • if objects intersect, the boundary and interior of an object needs to intersect in any way with those of the other.

  • If an object touches the other one, it is only necessary to have (at least) a single point of their boundaries in common but their interiors shoud NOT intersect.

Let’s try these out.

  • Let’s create two LineStrings

from shapely.geometry import LineString, MultiLineString

# Create two lines
line_a = LineString([(0, 0), (1, 1)])
line_b = LineString([(1, 1), (0, 2)])
  • Let’s see if they intersect

line_a.intersects(line_b)
True
  • Do they also touch each other?

line_a.touches(line_b)
True

Indeed, they do and we can see this by plotting the features together

# Create a MultiLineString from line_a and line_b
multi_line = MultiLineString([line_a, line_b])
multi_line
../../_images/point-in-polygon_15_0.svg

Thus, the line_b continues from the same node ( (1,1) ) where line_a ends.

However, if the lines overlap fully, they don’t touch due to the spatial relationship rule, as we can see:

  • Check if line_a touches itself

# Does the line touch with itself?
line_a.touches(line_a)
False
  • It does not. However, it does intersect

# Does the line intersect with itself?
line_a.intersects(line_a)
True

Point in Polygon using Geopandas

Next we will do a practical example where we check which of the addresses from previous tutorial are located in Southern district of Helsinki. Let’s start by reading a KML-file PKS_suuralue.kml that has the Polygons for districts of Helsinki Region (data openly available from Helsinki Region Infoshare.

  • Let’s start by reading the addresses from the Shapefile that we saved earlier.

import geopandas as gpd

fp = "data/addresses.shp"
data = gpd.read_file(fp)

data.head()
address id addr geometry
0 Ruoholahti, 14, Itämerenkatu, Ruoholahti, Läns... 1000 Itämerenkatu 14, 00101 Helsinki, Finland POINT (24.9155624 60.1632015)
1 Kamppi, 1, Kampinkuja, Kamppi, Eteläinen suurp... 1001 Kampinkuja 1, 00100 Helsinki, Finland POINT (24.9316914 60.1690222)
2 Bangkok9, 8, Kaivokatu, Keskusta, Kluuvi, Etel... 1002 Kaivokatu 8, 00101 Helsinki, Finland POINT (24.9416849 60.1699637)
3 1, Hermannin rantatie, Hermanninmäki, Hermanni... 1003 Hermannin rantatie 1, 00580 Helsinki, Finland POINT (24.9655355 60.2008878)
4 Hesburger, 9, Tyynenmerenkatu, Jätkäsaari, Län... 1005 Tyynenmerenkatu 9, 00220 Helsinki, Finland POINT (24.9216003 60.1566475)

Reading KML-files in Geopandas

It is possible to read the data from KML-files with GeoPandas in a similar manner as Shapefiles. However, we need to first, enable the KML-driver which is not enabled by default (because KML-files can contain unsupported data structures, nested folders etc., hence be careful when reading KML-files). Supported drivers are managed with fiona.supported_drivers, which is integrated in geopandas. Let’s first check which formats are currently supported:

import geopandas as gpd
gpd.io.file.fiona.drvsupport.supported_drivers
{'AeronavFAA': 'r',
 'ARCGEN': 'r',
 'BNA': 'raw',
 'DXF': 'raw',
 'OpenFileGDB': 'r',
 'ESRI Shapefile': 'raw',
 'GeoJSON': 'rw',
 'GPKG': 'rw',
 'GPX': 'raw',
 'GPSTrackMaker': 'raw',
 'Idrisi': 'r',
 'MapInfo File': 'raw',
 'DGN': 'raw',
 'PCIDSK': 'r',
 'SEGY': 'r',
 'SUA': 'r',
 'KML': 'rw'}
  • Let’s enable the read and write functionalities for KML-driver by passing 'rw' to whitelist of fiona’s supported drivers:

gpd.io.file.fiona.drvsupport.supported_drivers['KML'] = 'rw'

Let’s check again the supported drivers:

gpd.io.file.fiona.drvsupport.supported_drivers
{'AeronavFAA': 'r',
 'ARCGEN': 'r',
 'BNA': 'raw',
 'DXF': 'raw',
 'OpenFileGDB': 'r',
 'ESRI Shapefile': 'raw',
 'GeoJSON': 'rw',
 'GPKG': 'rw',
 'GPX': 'raw',
 'GPSTrackMaker': 'raw',
 'Idrisi': 'r',
 'MapInfo File': 'raw',
 'DGN': 'raw',
 'PCIDSK': 'r',
 'SEGY': 'r',
 'SUA': 'r',
 'KML': 'rw'}

Now we should be able to read a KML file with geopandas.read_file.

  • Let’s read the data from a following KML -file:

# Filepath to KML file
fp = "data/PKS_suuralue.kml"

polys = gpd.read_file(fp, driver='KML')

#Check the data
print("Number of rows:",len(polys))
polys.head(11)
Number of rows: 23
Name Description geometry
0 Suur-Espoonlahti POLYGON Z ((24.775059677807 60.1090604462157 0...
1 Suur-Kauklahti POLYGON Z ((24.6157775254076 60.1725681273527 ...
2 Vanha-Espoo POLYGON Z ((24.6757633262026 60.2120070032819 ...
3 Pohjois-Espoo POLYGON Z ((24.767921197401 60.2691954732391 0...
4 Suur-Matinkylä POLYGON Z ((24.7536131356802 60.1663051341717 ...
5 Kauniainen POLYGON Z ((24.6907528033566 60.2195779731868 ...
6 Suur-Leppävaara POLYGON Z ((24.797472695835 60.2082651196077 0...
7 Suur-Tapiola POLYGON Z ((24.8443596422129 60.1659790707387 ...
8 Myyrmäki POLYGON Z ((24.8245867448802 60.2902531157585 ...
9 Kivistö POLYGON Z ((24.9430919106369 60.3384471629062 ...
10 Eteläinen POLYGON Z ((24.7827651307035 60.09997268858 0,...

Nice, now we can see that we have 23 districts in our area. We are interested in an area that is called Eteläinen (‘Southern’ in english).

  • Let’s select the Eteläinen district and see where it is located on a map

# Select data and reset index
southern = polys.loc[polys['Name']=='Eteläinen']
southern.reset_index(drop=True, inplace=True)

Let’s create a map which shows the location of the selected district, and let’s also plot the address points on top of the map

import matplotlib.pyplot as plt

# Create a figure with one subplot
fig, ax = plt.subplots()

# Plot polygons
polys.plot(ax=ax, facecolor='gray')
southern.plot(ax=ax, facecolor='red')

# Plot points
data.plot(ax=ax, color='blue', markersize=5)

plt.tight_layout()
../../_images/point-in-polygon_33_0.png

Okey, so we can see that, indeed, certain points are within the selected red Polygon.

Let’s find out which one of them are located within the Polygon. Hence, we are conducting a Point in Polygon query.

  • Let’s first enable shapely.speedups which makes some of the spatial queries running faster.

import shapely.speedups
shapely.speedups.enable()
  • Let’s check which Points are within the southern Polygon. Notice, that here we check if the Points are within the geometry of the southern GeoDataFrame.

  • Hence, we use the .loc[0, 'geometry'] to parse the actual Polygon geometry object from the GeoDataFrame.

pip_mask = data.within(southern.loc[0, 'geometry'])
print(pip_mask)
0      True
1      True
2      True
3     False
4      True
5     False
6     False
7     False
8     False
9     False
10     True
11    False
12    False
13    False
14    False
15    False
16    False
17    False
18    False
19    False
20    False
21    False
22    False
23    False
24    False
25    False
26    False
27    False
28    False
29    False
30     True
31     True
32     True
33     True
dtype: bool

As we can see, we now have an array of boolean values for each row, where the result is True if Point was inside the Polygon, and False if it was not.

  • We can now use this mask array to select the Points that are inside the Polygon. Selecting data with this kind of mask array (of boolean values) is easy by passing the array inside the loc indexing function of Pandas.

pip_data = data.loc[pip_mask]
pip_data
address id addr geometry
0 Ruoholahti, 14, Itämerenkatu, Ruoholahti, Läns... 1000 Itämerenkatu 14, 00101 Helsinki, Finland POINT (24.9155624 60.1632015)
1 Kamppi, 1, Kampinkuja, Kamppi, Eteläinen suurp... 1001 Kampinkuja 1, 00100 Helsinki, Finland POINT (24.9316914 60.1690222)
2 Bangkok9, 8, Kaivokatu, Keskusta, Kluuvi, Etel... 1002 Kaivokatu 8, 00101 Helsinki, Finland POINT (24.9416849 60.1699637)
4 Hesburger, 9, Tyynenmerenkatu, Jätkäsaari, Län... 1005 Tyynenmerenkatu 9, 00220 Helsinki, Finland POINT (24.9216003 60.1566475)
10 Rautatientori, Keskusta, Kluuvi, Eteläinen suu... 1011 Rautatientori 1, 00100 Helsinki, Finland POINT (24.9440942536239 60.17130125)
30 Kampin keskus, 1, Urho Kekkosen katu, Kamppi, ... 1031 Urho Kekkosen katu 1, 00100 Helsinki, Finland POINT (24.9331155798105 60.1690911)
31 Ruoholahdenkatu, Hietalahti, Kamppi, Eteläinen... 1032 Ruoholahdenkatu 17, 00101 Helsinki, Finland POINT (24.9252584 60.1648863)
32 3, Tyynenmerenkatu, Jätkäsaari, Länsisatama, E... 1033 Tyynenmerenkatu 3, 00220 Helsinki, Finland POINT (24.9212065 60.1587845)
33 4, Vilhonkatu, Keskusta, Kluuvi, Eteläinen suu... 1034 Vilhonkatu 4, 00101 Helsinki, Finland POINT (24.9473289 60.1718719)

Let’s finally confirm that our Point in Polygon query worked as it should by plotting the points that are within the southern district:

# Create a figure with one subplot
fig, ax = plt.subplots()

# Plot polygons
polys.plot(ax=ax, facecolor='gray')
southern.plot(ax=ax, facecolor='red')

# Plot points
pip_data.plot(ax=ax, color='gold', markersize=2)

plt.tight_layout()
../../_images/point-in-polygon_41_0.png

Perfect! Now we only have the (golden) points that, indeed, are inside the red Polygon which is exactly what we wanted!