Point-in-polygon queries
Contents
Point-in-polygon queries#
Finding out if a certain point is located inside or outside of an area, or finding out if a line intersects with another line or polygon are fundamental geospatial operations that are often used e.g. to select data based on location. Such spatial queries are one of the typical first steps of the workflow when doing spatial analysis. Performing a spatial join (will be introduced later) between two spatial datasets is one of the most typical applications where Point in Polygon (PIP) query is used.
For further reading about PIP and other geometric operations, see Chapter 4.2 in Smith, Goodchild & Longley: Geospatial Analysis - 6th edition.
How to check if point is inside a polygon?#
Computationally, detecting if a point is inside a polygon is most commonly done using a specific formula called Ray Casting algorithm. Luckily, we do not need to create such a function ourselves for conducting the Point in Polygon (PIP) query. Instead, we can take advantage of Shapely’s binary predicates that can evaluate the topolocical relationships between geographical objects, such as the PIP as we’re interested here.
Point-in-polygon queries on shapely
geometries#
There are basically two ways of conducting PIP in Shapely:
using a function called within() that checks if a point is within a polygon
using a function called contains() that checks if a polygon contains a point
Note
Even though we are talking here about Point in Polygon operation, it is also possible to check if a LineString or Polygon is inside another Polygon.
Let’s first create a couple of point geometries:
import shapely.geometry
point1 = shapely.geometry.Point(24.952242, 60.1696017)
point2 = shapely.geometry.Point(24.976567, 60.1612500)
… and a polygon:
polygon = shapely.geometry.Polygon(
[
(24.950899, 60.169158),
(24.953492, 60.169158),
(24.953510, 60.170104),
(24.950958, 60.169990)
]
)
print(point1)
print(point2)
print(polygon)
POINT (24.952242 60.1696017)
POINT (24.976567 60.16125)
POLYGON ((24.950899 60.169158, 24.953492 60.169158, 24.95351 60.170104, 24.950958 60.16999, 24.950899 60.169158))
Let’s check if the points are within()
the polygon:
point1.within(polygon)
True
point2.within(polygon)
False
It seems that the first point is inside the polygon, but the second one is not.
We can turn the logic of the look-up around: Rather than check of the point is
within the polygon, we can also ask whether the polygon contains()
the point:
polygon.contains(point1)
True
polygon.contains(point2)
False
Hint
The two ways of checking the spatial relationship are complementary and yield
equivalent results;
contains()
is inverse to
within()
,
and vice versa.
Then, which one should you use? Well, it depends:
if you have many points and just one polygon and you try to find out which one of them is inside the polygon: You might need to iterate over the points and check one at a time if it is
within()
the polygon.if you have many polygons and just one point and you want to find out which polygon contains the point: You might need to iterate over the polygons until you find a polygon that
contains()
the point specified
Point-in-polygon queries on geopandas.GeoDataFrame
s#
In the following practical example we find which of the addresses we obtained in the geocoding section are located within a certain city district of Helsinki.
The data set we are using is from Helsinki Region Infoshare, and licensed under a Creative-Commons-Attribution-4.0 license.
import pathlib
NOTEBOOK_PATH = pathlib.Path().resolve()
DATA_DIRECTORY = NOTEBOOK_PATH / "data"
import geopandas
city_districts = geopandas.read_file(
DATA_DIRECTORY / "helsinki_city_districts" / "helsinki_city_districts_2021.gpkg"
)
city_districts.head()
name | geometry | |
---|---|---|
0 | Eteläinen | POLYGON ((24.78280 60.09996, 24.80437 60.07607... |
1 | Läntinen | POLYGON ((24.83140 60.25406, 24.83168 60.25321... |
2 | Keskinen | POLYGON ((24.93345 60.18317, 24.93502 60.18005... |
3 | Pohjoinen | POLYGON ((24.90081 60.23526, 24.89944 60.23500... |
4 | Koillinen | POLYGON ((24.97163 60.24253, 24.97163 60.24246... |
city_districts.plot()
<AxesSubplot: >

Specifically, we want to find out which points are within the ‘Eteläinen’ (‘southern’) city district. Let’s start by obtaining a separate data set for this district, loading the addresses data, and plotting a multi-layer map that shows all districts, the ‘Eteläinen’ district, and all the points in one map:
southern_district = city_districts[city_districts.name == "Eteläinen"]
southern_district
name | geometry | |
---|---|---|
0 | Eteläinen | POLYGON ((24.78280 60.09996, 24.80437 60.07607... |
addresses = geopandas.read_file(DATA_DIRECTORY / "addresses.gpkg")
Plotting multiple map layers
To plot several map layers in one figure, use the ax
parameter to specify in
which axes data should be plotted. We used this in lesson 7 of
Geo-Python to add text to a plot, or modify axes’ properties.
The easiest way to obtain an axes is to save the first plot()
’s
return value (see below). Another option is to create subplots()
, possibly with only one row and one column.
axes = city_districts.plot(facecolor="grey")
southern_district.plot(ax=axes, facecolor="red")
addresses.plot(ax=axes, color="blue", markersize=5)
<AxesSubplot: >

Some points are within the ‘Eteläinen’ district, but others are not. To find
out which are the ones inside the district, we can use a point-in-polygon
query, this time on the entire geopandas.GeoDataFrame
. Its method
within()
returns Boolean (True
/False
) values that indicate whether or not
a row’s geometry is contained in the supplied other geometry:
geometry vs. geometry column
In the example below, we use southern.at[0, "geometry"]
to obtain a single
value, a shapely.geometry.Polygon
, instead of an entire column (a
GeoSeries
). This is in order to match each row’s geometry of the entire
addresses
data frame against the same polygon. If, in contrast, we would
run within()
against a column, the operation would be carried out row-wise,
i.e. the first address point would be checked against the first polygon, the
second address point against the second polygon, and so forth.
Check the documentation for
within()
to learn more!
addresses.within(southern_district.at[0, "geometry"])
0 True
1 True
2 True
3 False
4 True
5 False
6 False
7 False
8 False
9 False
10 True
11 False
12 False
13 False
14 False
15 False
16 False
17 False
18 False
19 False
20 False
21 False
22 False
23 False
24 False
25 False
26 False
27 False
28 False
29 False
30 True
31 True
32 True
33 True
dtype: bool
This list of Boolean values, also called a mask array can be used to filter the input data frame:
addresses_in_the_southern_district = addresses[
addresses.within(southern_district.at[0, "geometry"])
]
addresses_in_the_southern_district
address | geometry | |
---|---|---|
0 | Ruoholahti, 14, Itämerenkatu, Salmisaari, Ruoh... | POINT (24.91556 60.16320) |
1 | Kamppi, 1, Kampinkuja, Kamppi, Eteläinen suurp... | POINT (24.93166 60.16905) |
2 | Kauppakeskus Citycenter, 8, Kaivokatu, Keskust... | POINT (24.94179 60.16989) |
4 | 9, Tyynenmerenkatu, Jätkäsaari, Länsisatama, E... | POINT (24.92151 60.15662) |
10 | Rautatientori, Keskusta, Kluuvi, Eteläinen suu... | POINT (24.94269 60.17118) |
30 | Kampin keskus, 1, Urho Kekkosen katu, Kamppi, ... | POINT (24.93312 60.16909) |
31 | Ruoholahdenkatu, Kamppi, Eteläinen suurpiiri, ... | POINT (24.93039 60.16641) |
32 | 3, Tyynenmerenkatu, Jätkäsaari, Länsisatama, E... | POINT (24.92121 60.15878) |
33 | 4, Vilhonkatu, Kaisaniemi, Kluuvi, Eteläinen s... | POINT (24.94694 60.17198) |
Finally, let’s plot this list of addresses one more time to visually verify that all of them, indeed, are located within the ‘Eteläinen’ city district:
axes = city_districts.plot(facecolor="grey")
southern_district.plot(ax=axes, facecolor="red")
addresses_in_the_southern_district.plot(
ax=axes,
color="gold",
markersize=5
)
<AxesSubplot: >

Perfect! Now we are left with only the (golden) points which, indeed, are inside the red polygon. That’s exactly what we wanted!