Geocoding in geopandas

Geopandas supports geocoding through the geopy library, which must be installed before the geopandas.tools.geocode() function can be used. geocode() expects a list or pandas.Series of addresses (strings) and returns a GeoDataFrame with resolved addresses and point geometries.

Let’s try this out.

We will geocode addresses stored in a semicolon-separated text file called addresses.txt. These addresses are located in the Helsinki Region in Southern Finland.

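import pathlib

# Build paths relative to the current working directory
NOTEBOOK_PATH = pathlib.Path().resolve()
DATA_DIRECTORY = NOTEBOOK_PATH / "data"

import pandas

# The input file separates its columns with semicolons
addresses = pandas.read_csv(
    DATA_DIRECTORY / "helsinki_addresses" / "addresses.txt",
    sep=";",
)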

addresses.head()
   id    addr
0  1000  Itämerenkatu 14, 00101 Helsinki, Finland
1  1001  Kampinkuja 1, 00100 Helsinki, Finland
2  1002  Kaivokatu 8, 00101 Helsinki, Finland
3  1003  Hermannin rantatie 1, 00580 Helsinki, Finland
4  1005  Tyynenmerenkatu 9, 00220 Helsinki, Finland

We have an id for each row and an address in the addr column.
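To make the role of sep=";" concrete, here is a minimal sketch that reads the same kind of data from an in-memory string instead of a file. The sample text is an assumption modelled on the rows shown above; the real addresses.txt may differ in its details.

```python
import io

import pandas

# Hypothetical stand-in for addresses.txt, modelled on the rows shown above
sample_text = (
    "id;addr\n"
    "1000;Itämerenkatu 14, 00101 Helsinki, Finland\n"
    "1001;Kampinkuja 1, 00100 Helsinki, Finland\n"
)

# With sep=";" the commas inside each address stay part of the addr value
sample_addresses = pandas.read_csv(io.StringIO(sample_text), sep=";")

print(sample_addresses.columns.tolist())
print(sample_addresses.loc[0, "addr"])
```

Because the separator is a semicolon, each address is read as one string, commas included.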

Geocode addresses using Nominatim

In our example, we will use Nominatim as the geocoding provider. Nominatim is a library and service that uses OpenStreetMap data and is run by the OpenStreetMap Foundation. Geopandas’ geocode() function supports it natively.

Fair-use

Nominatim’s terms of use require that users send no more than one request per second and attach a custom user-agent string to each query.

Geopandas’ implementation allows us to specify a user_agent; the library also takes care of respecting the rate-limit of Nominatim.
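To illustrate what a limit of one request per second means in practice, here is a minimal sketch of a throttle written with the standard library only. It is not how geopandas or geopy implement rate limiting; the Throttle class and its min_delay parameter are hypothetical names used for illustration, and no request is actually sent.

```python
import time


class Throttle:
    """Ensure that consecutive calls are at least `min_delay` seconds apart."""

    def __init__(self, min_delay=1.0):
        self.min_delay = min_delay
        self._last_call = None  # monotonic timestamp of the previous call

    def wait(self):
        """Sleep just long enough to respect the minimum delay."""
        now = time.monotonic()
        if self._last_call is not None:
            remaining = self.min_delay - (now - self._last_call)
            if remaining > 0:
                time.sleep(remaining)
        self._last_call = time.monotonic()


# A short delay keeps the demonstration quick;
# one request per second would correspond to min_delay=1.0
throttle = Throttle(min_delay=0.2)
timestamps = []
for address in ["first address", "second address", "third address"]:
    throttle.wait()  # a real request would be sent right after this call
    timestamps.append(time.monotonic())

# Time that passed between consecutive (simulated) requests
gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
print(gaps)
```

When using geocode() with Nominatim, we do not need such a helper ourselves, since the library handles the waiting for us.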

Looking up an address is quite an expensive database operation. This is why the public, free-to-use Nominatim server sometimes takes longer to respond. In this example, we pass the parameter timeout=10 to wait up to 10 seconds for a response.

import geopandas

geocoded_addresses = geopandas.tools.geocode(
    addresses["addr"],
    provider="nominatim",
    user_agent="autogis2023",
    timeout=10
)
geocoded_addresses.head()
   geometry                   address
0  POINT (24.91556 60.16320)  Ruoholahti, 14, Itämerenkatu, Salmisaari, Ruoh...
1  POINT (24.93166 60.16905)  Kamppi, 1, Kampinkuja, Kamppi, Eteläinen suurp...
2  POINT (24.94179 60.16989)  Kauppakeskus Citycenter, 8, Kaivokatu, Keskust...
3  POINT (24.97846 60.19206)  Hermannin rantatie, Verkkosaari, Kalasatama, S...
4  POINT (24.92151 60.15662)  9, Tyynenmerenkatu, Jätkäsaari, Länsisatama, E...

Et voilà! As a result we received a GeoDataFrame that contains a parsed version of our original addresses and a geometry column of shapely.geometry.Points that we can use, for instance, to export the data to a geospatial data format.

However, the id column was discarded in the process. To combine the input data set with our result set, we can use pandas’ join operations.

Join data frames

Joining data sets using pandas

For a comprehensive overview of different ways of combining DataFrames and Series based on set theory, have a look at pandas documentation about merge, join and concatenate.

Joining data from two or more data frames or tables is a common task in many (spatial) data analysis workflows. As you might remember from our earlier lessons, combining data from different tables based on a common key attribute can be done easily in pandas/geopandas using the merge() function. We used this approach in exercise 6 of the Geo-Python course.
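As a reminder of how a key-based merge() behaves, here is a minimal sketch. Both tables and the visitors column are made up for illustration; only the id and addr values are taken from the address table above.

```python
import pandas

# Hypothetical tables that share the key column "id"
addresses_sample = pandas.DataFrame(
    {
        "id": [1000, 1001, 1002],
        "addr": [
            "Itämerenkatu 14, 00101 Helsinki, Finland",
            "Kampinkuja 1, 00100 Helsinki, Finland",
            "Kaivokatu 8, 00101 Helsinki, Finland",
        ],
    }
)
# Made-up values, listed in a different row order and with one id missing
visitors = pandas.DataFrame({"id": [1002, 1000], "visitors": [250, 120]})

# Rows are matched on the values of "id", not on their position;
# how="left" keeps every row of addresses_sample
merged = addresses_sample.merge(visitors, on="id", how="left")
print(merged)
```

The row with id 1001 has no match in the second table, so its visitors value is missing (NaN) in the result.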

However, sometimes it is useful to join two data frames together based on their index. The data frames have to have the same number of records and share the same index (simply put, they should have the same order of rows).
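Strictly speaking, join() pairs rows by their index labels rather than by their position. The following sketch, using made-up miniature tables, shows that rows with matching labels are combined even if they are stored in a different order, and that a join silently pairs the wrong rows once the labels no longer refer to the same records.

```python
import pandas

left = pandas.DataFrame({"id": [1000, 1001, 1002]}, index=[0, 1, 2])

# Same index labels as `left`, but stored in a different row order
right = pandas.DataFrame(
    {"street": ["Kaivokatu", "Itämerenkatu", "Kampinkuja"]},
    index=[2, 0, 1],
)

# join() matches rows by index label and keeps the order of `left`
joined = left.join(right)
print(joined)

# After resetting the index, the labels follow the storage order,
# so the streets end up next to the wrong ids (deliberate mistake)
mismatched = left.join(right.reset_index(drop=True))
print(mismatched)
```

This is why an index-based join is only safe when both data frames carry an index that identifies the same records.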

We can use this approach, here, to join information from the original data frame addresses to the geocoded addresses geocoded_addresses, row by row. The join() function, by default, joins two data frames based on their index. This works correctly for our example, as the order of the two data frames is identical.

geocoded_addresses_with_id = geocoded_addresses.join(addresses)
geocoded_addresses_with_id
    geometry                   address                                            id    addr
 0  POINT (24.91556 60.16320)  Ruoholahti, 14, Itämerenkatu, Salmisaari, Ruoh...  1000  Itämerenkatu 14, 00101 Helsinki, Finland
 1  POINT (24.93166 60.16905)  Kamppi, 1, Kampinkuja, Kamppi, Eteläinen suurp...  1001  Kampinkuja 1, 00100 Helsinki, Finland
 2  POINT (24.94179 60.16989)  Kauppakeskus Citycenter, 8, Kaivokatu, Keskust...  1002  Kaivokatu 8, 00101 Helsinki, Finland
 3  POINT (24.97846 60.19206)  Hermannin rantatie, Verkkosaari, Kalasatama, S...  1003  Hermannin rantatie 1, 00580 Helsinki, Finland
 4  POINT (24.92151 60.15662)  9, Tyynenmerenkatu, Jätkäsaari, Länsisatama, E...  1005  Tyynenmerenkatu 9, 00220 Helsinki, Finland
 5  POINT (25.08174 60.23522)  18, Kontulantie, Kontula, Mellunkylä, Itäinen ...  1006  Kontulantie 18, 00940 Helsinki, Finland
 6  POINT (25.10974 60.22102)  Itäväylä, Vartioharju, Vartiokylä, Itäinen suu...  1007  Itäväylä 3, 00950 Helsinki, Finland
 7  POINT (25.02831 60.27844)  Tapulikaupungintie, Tapulikaupunki, Suutarila,...  1008  Tapulikaupungintie 3, 00750 Helsinki, Finland
 8  POINT (25.02883 60.26326)  Sompionpolku, Fallkullan kiila, Tapanila, Tapa...  1009  Sompionpolku 2, 00730 Helsinki, Finland
 9  POINT (24.87197 60.22244)  5, Atomitie, Strömberg, Pitäjänmäen teollisuus...  1010  Atomitie 5, 00370 Helsinki, Finland
10  POINT (24.94269 60.17118)  Rautatientori, Keskusta, Kluuvi, Eteläinen suu...  1011  Rautatientori 1, 00100 Helsinki, Finland
11  POINT (24.88421 60.23050)  Kuparitie, Lassila, Haaga, Läntinen suurpiiri,...  1012  Kuparitie 8, 00440 Helsinki, Finland
12  POINT (24.87527 60.23890)  Rumpupolku, Kannelmäki, Kaarela, Läntinen suur...  1013  Rumpupolku 8, 00420 Helsinki, Finland
13  POINT (24.94854 60.22196)  Otto. automaatti (ATM), 1, Mäkitorpantie, Pato...  1014  Mäkitorpantie 1, 00620 Helsinki, Finland
14  POINT (25.01295 60.25107)  Yliopiston Apteekki, 15, Malminkaari, Ala-Malm...  1015  Malminkaari 15, 00700 Helsinki, Finland
15  POINT (24.89418 60.21722)  23, Kylätie, Etelä-Haaga, Haaga, Läntinen suur...  1016  Kylätie 23, 00320 Helsinki, Finland
16  POINT (24.86653 60.25131)  Malminkartanontie, Malminkartano, Kaarela, Län...  1017  Malminkartanontie 17, 00410 Helsinki, Finland
17  POINT (24.96566 60.22982)  Oulunkylän tori, Patola, Oulunkylä, Pohjoinen ...  1018  Oulunkylän tori 2b, 00640 Helsinki, Finland
18  POINT (24.93435 60.19857)  6, Ratapihantie, Itä-Pasila, Pasila, Keskinen ...  1019  Ratapihantie 6, 00101 Helsinki, Finland
19  POINT (24.86086 60.22407)  15, Pitäjänmäentie, Reimarla, Pitäjänmäki, Län...  1020  Pitäjänmäentie 15, 00370 Helsinki, Finland
20  POINT (24.99362 60.24365)  K-Market, 2, Eskolantie, Savela, Pukinmäki, Ko...  1021  Eskolantie 2, 00720 Helsinki, Finland
21  POINT (25.02891 60.24244)  Tattariharjuntie, Ala-Malmi, Malmi, Koillinen ...  1022  Tattariharjuntie, 00700 Helsinki, Finland
22  POINT (25.07842 60.20984)  Otto. pankkiautomaatti, 1, Tallinnanaukio, Itä...  1023  Tallinnanaukio 1, 00930 Helsinki, Finland
23  POINT (25.13686 60.20703)  Tyynylaavantie, Keski-Vuosaari, Vuosaari, Itäi...  1024  Tyynylaavantie 7, 00980 Helsinki, Finland
24  POINT (25.07918 60.22320)  Myllypurontie, Myllypuro, Vartiokylä, Itäinen ...  1025  Myllypurontie 5, 00920 Helsinki, Finland
25  POINT (25.10964 60.23788)  Mellunmäenraitio, Mellunmäki, Mellunkylä, Itäi...  1026  Mellunmäenraitio 6, 00970 Helsinki, Finland
26  POINT (24.96108 60.18801)  Vaasanpolku, Kurvi, Harju, Alppiharju, Keskine...  1027  Vaasanpolku 2, 00101 Helsinki, Finland
27  POINT (25.02832 60.19442)  Alko, 2, Hiihtäjäntie, Länsi-Herttoniemi, Hert...  1028  Hiihtäjäntie 2, 00810 Helsinki, Finland
28  POINT (25.00681 60.18872)  Metro Kulosaari, 2, Ukko-Pekan porras, Kulosaa...  1029  Ukko-Pekan porras 2, 00570 Helsinki, Finland
29  POINT (24.94953 60.17954)  Instrumentarium Hakaniemi, 16, Siltasaarenkatu...  1030  Siltasaarenkatu 16, 00530 Helsinki, Finland
30  POINT (24.93312 60.16909)  Kampin keskus, 1, Urho Kekkosen katu, Kamppi, ...  1031  Urho Kekkosen katu 1, 00100 Helsinki, Finland
31  POINT (24.93039 60.16641)  Ruoholahdenkatu, Kamppi, Eteläinen suurpiiri, ...  1032  Ruoholahdenkatu 17, 00101 Helsinki, Finland
32  POINT (24.92121 60.15878)  3, Tyynenmerenkatu, Jätkäsaari, Länsisatama, E...  1033  Tyynenmerenkatu 3, 00220 Helsinki, Finland
33  POINT (24.94694 60.17198)  4, Vilhonkatu, Kaisaniemi, Kluuvi, Eteläinen s...  1034  Vilhonkatu 4, 00101 Helsinki, Finland

The output of join() is a new geopandas.GeoDataFrame:

type(geocoded_addresses_with_id)
geopandas.geodataframe.GeoDataFrame

The new data frame has all original columns plus new columns for the geometry and for a parsed address that can be used to spot-check the results.
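One simple way to spot-check the results is to test whether the street name of each input address appears in the address returned by the geocoder. The sketch below does this for three rows copied from the table above (with the parsed addresses truncated as shown there). The street_name() helper is a hypothetical heuristic written for this illustration, not part of geopandas, and it assumes that an input address starts with the street name, optionally followed by a house number.

```python
import pandas

# A few rows copied from the joined table above
sample = pandas.DataFrame(
    {
        "address": [
            "Ruoholahti, 14, Itämerenkatu, Salmisaari, Ruoh...",
            "Kamppi, 1, Kampinkuja, Kamppi, Eteläinen suurp...",
            "Oulunkylän tori, Patola, Oulunkylä, Pohjoinen ...",
        ],
        "addr": [
            "Itämerenkatu 14, 00101 Helsinki, Finland",
            "Kampinkuja 1, 00100 Helsinki, Finland",
            "Oulunkylän tori 2b, 00640 Helsinki, Finland",
        ],
    }
)


def street_name(addr):
    """Return the street part of an input address without the house number."""
    words = addr.split(",")[0].split()
    if len(words) > 1 and words[-1][0].isdigit():
        words = words[:-1]  # drop a trailing house number such as "14" or "2b"
    return " ".join(words)


# True where the street name of the input appears in the geocoder's answer
sample["street_found"] = [
    street_name(addr).lower() in address.lower()
    for addr, address in zip(sample["addr"], sample["address"])
]
print(sample[["addr", "street_found"]])
```

Rows for which street_found is False would be good candidates for a closer manual look.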

Note

If you did the join the other way around, i.e., addresses.join(geocoded_addresses), the output would be a pandas.DataFrame, not a geopandas.GeoDataFrame.
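The effect of the join direction on the result type can be checked with a small sketch. The two miniature tables are made up for illustration (the coordinates are taken from the output above), and the conversion in the last step is one possible way to turn the plain result back into a GeoDataFrame.

```python
import geopandas
import pandas
import shapely.geometry

# Hypothetical miniature versions of the two tables
points = geopandas.GeoDataFrame(
    {"address": ["Itämerenkatu", "Kampinkuja"]},
    geometry=[
        shapely.geometry.Point(24.91556, 60.16320),
        shapely.geometry.Point(24.93166, 60.16905),
    ],
    crs="EPSG:4326",
)
ids = pandas.DataFrame({"id": [1000, 1001]})

geo_first = points.join(ids)    # called on the GeoDataFrame
plain_first = ids.join(points)  # called on the pandas.DataFrame

print(type(geo_first).__name__)
print(type(plain_first).__name__)

# The plain result can be converted back by naming the geometry column
restored = geopandas.GeoDataFrame(plain_first, geometry="geometry", crs=points.crs)
print(type(restored).__name__)
```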


It’s now easy to save the new data set as a geospatial file, for instance, in GeoPackage format:

geocoded_addresses_with_id.to_file(DATA_DIRECTORY / "addresses.gpkg")