{ "cells": [ { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "# Geocoding in Geopandas\n", "\n", "Geopandas can geocode addresses through its integration with geopy. It has a function called ``geocode()``\n", "that takes a list of addresses (strings) and returns a GeoDataFrame containing the resulting point objects in a ``geometry`` column.\n", "\n", "Nice, isn't it! Let's try this out.\n", "\n", "We will geocode addresses stored in a text file called `addresses.txt`. The addresses are located in the Helsinki Region in Southern Finland.\n", "\n", "The first rows of the data look like this:\n", "\n", "```\n", "id;addr\n", "1000;Itämerenkatu 14, 00101 Helsinki, Finland\n", "1001;Kampinkuja 1, 00100 Helsinki, Finland\n", "1002;Kaivokatu 8, 00101 Helsinki, Finland\n", "1003;Hermannin rantatie 1, 00580 Helsinki, Finland\n", "```\n", "\n", "We have an `id` for each row and an address in the `addr` column.\n", "\n", "- Let's first read the data into a Pandas DataFrame using the `read_csv()` function:\n", "\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "# Import necessary modules\n", "import pandas as pd\n", "import geopandas as gpd\n", "from shapely.geometry import Point\n", "\n", "# Filepath\n", "fp = r\"data/addresses.txt\"\n", "\n", "# Read the data (fields are separated by semicolons)\n", "data = pd.read_csv(fp, sep=';')" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "
<table>\n", "  <thead>\n", "    <tr><th></th><th>id</th><th>addr</th></tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr><th>0</th><td>1000</td><td>Itämerenkatu 14, 00101 Helsinki, Finland</td></tr>\n",
"    <tr><th>1</th><td>1001</td><td>Kampinkuja 1, 00100 Helsinki, Finland</td></tr>\n",
"    <tr><th>2</th><td>1002</td><td>Kaivokatu 8, 00101 Helsinki, Finland</td></tr>\n",
"    <tr><th>3</th><td>1003</td><td>Hermannin rantatie 1, 00580 Helsinki, Finland</td></tr>\n",
"    <tr><th>4</th><td>1005</td><td>Tyynenmerenkatu 9, 00220 Helsinki, Finland</td></tr>\n",
"  </tbody>\n", "</table>
" ], "text/plain": [ " id addr\n", "0 1000 Itämerenkatu 14, 00101 Helsinki, Finland\n", "1 1001 Kampinkuja 1, 00100 Helsinki, Finland\n", "2 1002 Kaivokatu 8, 00101 Helsinki, Finland\n", "3 1003 Hermannin rantatie 1, 00580 Helsinki, Finland\n", "4 1005 Tyynenmerenkatu 9, 00220 Helsinki, Finland" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Let's take a look of the data\n", "data.head()" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Now we have our data in a Pandas DataFrame and we can geocode our addresses." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "
<table>\n", "  <thead>\n", "    <tr><th></th><th>geometry</th><th>address</th></tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr><th>0</th><td>POINT (24.9155624 60.1632015)</td><td>Ruoholahti, 14, Itämerenkatu, Ruoholahti, Läns...</td></tr>\n",
"    <tr><th>1</th><td>POINT (24.9316914 60.1690222)</td><td>Kamppi, 1, Kampinkuja, Kamppi, Eteläinen suurp...</td></tr>\n",
"    <tr><th>2</th><td>POINT (24.9416849 60.1699637)</td><td>Bangkok9, 8, Kaivokatu, Keskusta, Kluuvi, Etel...</td></tr>\n",
"    <tr><th>3</th><td>POINT (24.9655355 60.2008878)</td><td>1, Hermannin rantatie, Hermanninmäki, Hermanni...</td></tr>\n",
"    <tr><th>4</th><td>POINT (24.9216003 60.1566475)</td><td>Hesburger, 9, Tyynenmerenkatu, Jätkäsaari, Län...</td></tr>\n",
"  </tbody>\n", "</table>
" ], "text/plain": [ " geometry \\\n", "0 POINT (24.9155624 60.1632015) \n", "1 POINT (24.9316914 60.1690222) \n", "2 POINT (24.9416849 60.1699637) \n", "3 POINT (24.9655355 60.2008878) \n", "4 POINT (24.9216003 60.1566475) \n", "\n", " address \n", "0 Ruoholahti, 14, Itämerenkatu, Ruoholahti, Läns... \n", "1 Kamppi, 1, Kampinkuja, Kamppi, Eteläinen suurp... \n", "2 Bangkok9, 8, Kaivokatu, Keskusta, Kluuvi, Etel... \n", "3 1, Hermannin rantatie, Hermanninmäki, Hermanni... \n", "4 Hesburger, 9, Tyynenmerenkatu, Jätkäsaari, Län... " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Import the geocoding tool\n", "from geopandas.tools import geocode\n", "\n", "# Geocode addresses with Nominatim backend\n", "geo = geocode(data['addr'], provider = 'nominatim', user_agent = 'autogis_student_xx')\n", "geo.head()" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "And Voilà! As a result we have a GeoDataFrame that contains our original\n", "address and a 'geometry' column containing Shapely Point -objects that\n", "we can use for exporting the addresses to a Shapefile for example.\n", "However, the ``id`` column is not there. Thus, we need to join the\n", "information from ``data`` into our new GeoDataFrame ``geo``, thus making\n", "a **Table Join**." ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "## Table join\n", "\n", "\n", "Table joins are really common procedures when doing GIS analyses. As you might remember from our earlier lessons, combining data from different tables based on common\n", "**key** attribute can be done easily in Pandas/Geopandas using the [.merge()](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html) -function. 
We used this approach in the geo-python course [exercise 6](https://geo-python.github.io/2018/lessons/L6/exercise-6.html#joining-data-from-one-dataframe-to-another).\n", "\n", "However, sometimes it is useful to join two tables based on the **index** of the DataFrames. In such a case, we assume\n", "that both DataFrames have the **same number of records** and that the **records are in the same order** in both.\n", "That is the situation here: the geocoded addresses in the ``geo`` DataFrame are in the same order\n", "as the addresses in our original ``data`` DataFrame.\n", "\n", "Hence, we can join the tables with the ``join()`` function, which combines two DataFrames\n", "based on their index by default." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "
<table>\n", "  <thead>\n", "    <tr><th></th><th>geometry</th><th>address</th><th>id</th><th>addr</th></tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr><th>0</th><td>POINT (24.9155624 60.1632015)</td><td>Ruoholahti, 14, Itämerenkatu, Ruoholahti, Läns...</td><td>1000</td><td>Itämerenkatu 14, 00101 Helsinki, Finland</td></tr>\n",
"    <tr><th>1</th><td>POINT (24.9316914 60.1690222)</td><td>Kamppi, 1, Kampinkuja, Kamppi, Eteläinen suurp...</td><td>1001</td><td>Kampinkuja 1, 00100 Helsinki, Finland</td></tr>\n",
"    <tr><th>2</th><td>POINT (24.9416849 60.1699637)</td><td>Bangkok9, 8, Kaivokatu, Keskusta, Kluuvi, Etel...</td><td>1002</td><td>Kaivokatu 8, 00101 Helsinki, Finland</td></tr>\n",
"    <tr><th>3</th><td>POINT (24.9655355 60.2008878)</td><td>1, Hermannin rantatie, Hermanninmäki, Hermanni...</td><td>1003</td><td>Hermannin rantatie 1, 00580 Helsinki, Finland</td></tr>\n",
"    <tr><th>4</th><td>POINT (24.9216003 60.1566475)</td><td>Hesburger, 9, Tyynenmerenkatu, Jätkäsaari, Län...</td><td>1005</td><td>Tyynenmerenkatu 9, 00220 Helsinki, Finland</td></tr>\n",
"  </tbody>\n", "</table>
" ], "text/plain": [ " geometry \\\n", "0 POINT (24.9155624 60.1632015) \n", "1 POINT (24.9316914 60.1690222) \n", "2 POINT (24.9416849 60.1699637) \n", "3 POINT (24.9655355 60.2008878) \n", "4 POINT (24.9216003 60.1566475) \n", "\n", " address id \\\n", "0 Ruoholahti, 14, Itämerenkatu, Ruoholahti, Läns... 1000 \n", "1 Kamppi, 1, Kampinkuja, Kamppi, Eteläinen suurp... 1001 \n", "2 Bangkok9, 8, Kaivokatu, Keskusta, Kluuvi, Etel... 1002 \n", "3 1, Hermannin rantatie, Hermanninmäki, Hermanni... 1003 \n", "4 Hesburger, 9, Tyynenmerenkatu, Jätkäsaari, Län... 1005 \n", "\n", " addr \n", "0 Itämerenkatu 14, 00101 Helsinki, Finland \n", "1 Kampinkuja 1, 00100 Helsinki, Finland \n", "2 Kaivokatu 8, 00101 Helsinki, Finland \n", "3 Hermannin rantatie 1, 00580 Helsinki, Finland \n", "4 Tyynenmerenkatu 9, 00220 Helsinki, Finland " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "join = geo.join(data)\n", "join.head()" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "- Let's also check the data type of our new ``join`` table." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/plain": [ "geopandas.geodataframe.GeoDataFrame" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(join)" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "As a result we have a new GeoDataFrame called ``join`` that contains\n", "all the original columns plus the ``geometry`` and ``address`` columns from the geocoder.\n", "\n", "- Now it is easy to save our address points into a Shapefile:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "# Output file path\n", "outfp = r\"data/addresses.shp\"\n", "\n", "# Save to Shapefile\n", "join.to_file(outfp)" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "That's it. We have now successfully geocoded the addresses into Points\n", "and saved them as a Shapefile. Easy, isn't it!" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "### Notes about Nominatim\n", "\n", "Nominatim works relatively well if you have well-defined and well-known addresses such as the ones we used in this tutorial. In some cases, however, you might not have such well-defined addresses; you might have e.g. only the name of a museum. In such cases Nominatim might not give good results, and you might want to use e.g. the [Google Geocoding API (V3)](https://developers.google.com/maps/documentation/geocoding/). [Take a look at the materials from a previous year, where we show how to use the Google Geocoding API](https://automating-gis-processes.github.io/2016/Lesson3-geocoding.html#geocoding-in-geopandas) in a similar manner as we used Nominatim here." 
] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }