{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Spatial join\n", "\n", "[Spatial join](http://wiki.gis.com/wiki/index.php/Spatial_Join) is\n", "yet another classic GIS problem. Getting attributes from one layer and\n", "transferring them into another layer based on their spatial relationship\n", "is something you most likely need to do on a regular basis.\n", "\n", "In the previous section we learned how to perform **a Point in Polygon query**.\n", "We could now apply those techniques and create our own function to perform **a spatial join** between two layers based on their\n", "spatial relationship. We could, for example, join the attributes of a polygon layer into a point layer where each point would get the\n", "attributes of a polygon that ``contains`` the point.\n", "\n", "Luckily, [spatial join is already implemented in Geopandas](http://geopandas.org/mergingdata.html#spatial-joins), thus we do not need to create our own function for doing it. There are three possible types of\n", "join that can be applied in a spatial join; they are selected with the ``op`` parameter of the ``gpd.sjoin()`` function (in GeoPandas 0.10 and later this parameter is called ``predicate``):\n", "\n", "- ``\"intersects\"``\n", "- ``\"within\"``\n", "- ``\"contains\"``\n", "\n", "Sounds familiar? Yep, all of those spatial relationships were discussed\n", "in the [Point in Polygon lesson](point-in-polygon.ipynb), thus you should know how they work. \n", "\n", "Furthermore, pay attention to the different options for the type of join via the `how` parameter: \"left\", \"right\" and \"inner\". 
You can read more about these options in the [geopandas sjoin documentation](http://geopandas.org/mergingdata.html#sjoin-arguments) and the pandas guide for [merge, join and concatenate](https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html).\n", "\n", "Let's perform a spatial join between these two layers:\n", "- **Addresses:** the geocoded address points (we created this Shapefile in the geocoding tutorial)\n", "- **Population grid:** 250m x 250m grid polygon layer that contains population information from the Helsinki Region.\n", " - The population grid is a dataset produced by the **Helsinki Region Environmental\n", "Services Authority (HSY)** (see [this page](https://www.hsy.fi/fi/asiantuntijalle/avoindata/Sivut/AvoinData.aspx?dataID=7) to access data from different years).\n", " - You can download the data [from this link](https://www.hsy.fi/sites/AvoinData/AvoinData/SYT/Tietoyhteistyoyksikko/Shape%20(Esri)/V%C3%A4est%C3%B6tietoruudukko/Vaestotietoruudukko_2018_SHP.zip) in the [Helsinki Region Infoshare\n", "(HRI) open data portal](https://hri.fi/en_gb/).\n", "\n", "\n", "\n", "- Here, we will access the data directly from the HSY WFS:\n", "\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import geopandas as gpd\n", "from pyproj import CRS\n", "import requests\n", "import geojson\n", "\n", "# Specify the url for web feature service\n", "url = 'https://kartta.hsy.fi/geoserver/wfs'\n", "\n", "# Specify parameters (read data in json format). 
\n", "# Available feature types in this particular data source: http://geo.stat.fi/geoserver/vaestoruutu/wfs?service=wfs&version=2.0.0&request=describeFeatureType\n", "params = dict(service='WFS', \n", " version='2.0.0', \n", " request='GetFeature', \n", " typeName='asuminen_ja_maankaytto:Vaestotietoruudukko_2018', \n", " outputFormat='json')\n", "\n", "# Fetch data from WFS using requests\n", "r = requests.get(url, params=params)\n", "\n", "# Create GeoDataFrame from geojson\n", "pop = gpd.GeoDataFrame.from_features(geojson.loads(r.content))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Check the result: " ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr style=\"text-align: right;\"><th></th><th>geometry</th><th>index</th><th>asukkaita</th><th>asvaljyys</th><th>ika0_9</th><th>ika10_19</th><th>ika20_29</th><th>ika30_39</th><th>ika40_49</th><th>ika50_59</th><th>ika60_69</th><th>ika70_79</th><th>ika_yli80</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>0</th><td>MULTIPOLYGON Z (((25476499.999 6674248.999 0.0...</td><td>3342</td><td>108</td><td>45</td><td>11</td><td>23</td><td>6</td><td>7</td><td>26</td><td>17</td><td>8</td><td>6</td><td>4</td></tr>\n",
 "    <tr><th>1</th><td>MULTIPOLYGON Z (((25476749.997 6674498.998 0.0...</td><td>3503</td><td>273</td><td>35</td><td>35</td><td>24</td><td>52</td><td>62</td><td>40</td><td>26</td><td>25</td><td>9</td><td>0</td></tr>\n",
 "    <tr><th>2</th><td>MULTIPOLYGON Z (((25476999.994 6675749.004 0.0...</td><td>3660</td><td>239</td><td>34</td><td>46</td><td>24</td><td>24</td><td>45</td><td>33</td><td>30</td><td>25</td><td>10</td><td>2</td></tr>\n",
 "    <tr><th>3</th><td>MULTIPOLYGON Z (((25476999.994 6675499.004 0.0...</td><td>3661</td><td>202</td><td>30</td><td>52</td><td>37</td><td>13</td><td>36</td><td>43</td><td>11</td><td>4</td><td>3</td><td>3</td></tr>\n",
 "    <tr><th>4</th><td>MULTIPOLYGON Z (((25476999.994 6675249.005 0.0...</td><td>3662</td><td>261</td><td>30</td><td>64</td><td>32</td><td>36</td><td>64</td><td>34</td><td>20</td><td>6</td><td>3</td><td>2</td></tr>\n",
 "  </tbody>\n",
 "</table>\n",
 "</div>
" ], "text/plain": [ " geometry index asukkaita \\\n", "0 MULTIPOLYGON Z (((25476499.999 6674248.999 0.0... 3342 108 \n", "1 MULTIPOLYGON Z (((25476749.997 6674498.998 0.0... 3503 273 \n", "2 MULTIPOLYGON Z (((25476999.994 6675749.004 0.0... 3660 239 \n", "3 MULTIPOLYGON Z (((25476999.994 6675499.004 0.0... 3661 202 \n", "4 MULTIPOLYGON Z (((25476999.994 6675249.005 0.0... 3662 261 \n", "\n", " asvaljyys ika0_9 ika10_19 ika20_29 ika30_39 ika40_49 ika50_59 \\\n", "0 45 11 23 6 7 26 17 \n", "1 35 35 24 52 62 40 26 \n", "2 34 46 24 24 45 33 30 \n", "3 30 52 37 13 36 43 11 \n", "4 30 64 32 36 64 34 20 \n", "\n", " ika60_69 ika70_79 ika_yli80 \n", "0 8 6 4 \n", "1 25 9 0 \n", "2 25 10 2 \n", "3 4 3 3 \n", "4 6 3 2 " ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pop.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Okey so we have multiple columns in the dataset but the most important\n", "one here is the column `asukkaita` (\"population\" in Finnish) that\n", "tells the amount of inhabitants living under that polygon.\n", "\n", "- Let's change the name of that column into `pop18` so that it is\n", " more intuitive. As you might remember, we can easily rename (Geo)DataFrame column names using the ``rename()`` function where we pass a dictionary of new column names like this: ``columns={'oldname': 'newname'}``." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['geometry', 'index', 'pop18', 'asvaljyys', 'ika0_9', 'ika10_19',\n", " 'ika20_29', 'ika30_39', 'ika40_49', 'ika50_59', 'ika60_69', 'ika70_79',\n", " 'ika_yli80'],\n", " dtype='object')" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Change the name of a column\n", "pop = pop.rename(columns={'asukkaita': 'pop18'})\n", "\n", "# See the column names and confirm that we now have a column called 'pop18'\n", "pop.columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Let's also get rid of all unnecessary columns by selecting only\n", " the columns that we need, i.e. ``pop18`` and ``geometry``:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# Subset columns\n", "pop = pop[[\"pop18\", \"geometry\"]]" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr style=\"text-align: right;\"><th></th><th>pop18</th><th>geometry</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>0</th><td>108</td><td>MULTIPOLYGON Z (((25476499.999 6674248.999 0.0...</td></tr>\n",
 "    <tr><th>1</th><td>273</td><td>MULTIPOLYGON Z (((25476749.997 6674498.998 0.0...</td></tr>\n",
 "    <tr><th>2</th><td>239</td><td>MULTIPOLYGON Z (((25476999.994 6675749.004 0.0...</td></tr>\n",
 "    <tr><th>3</th><td>202</td><td>MULTIPOLYGON Z (((25476999.994 6675499.004 0.0...</td></tr>\n",
 "    <tr><th>4</th><td>261</td><td>MULTIPOLYGON Z (((25476999.994 6675249.005 0.0...</td></tr>\n",
 "  </tbody>\n",
 "</table>\n",
 "</div>
" ], "text/plain": [ " pop18 geometry\n", "0 108 MULTIPOLYGON Z (((25476499.999 6674248.999 0.0...\n", "1 273 MULTIPOLYGON Z (((25476749.997 6674498.998 0.0...\n", "2 239 MULTIPOLYGON Z (((25476999.994 6675749.004 0.0...\n", "3 202 MULTIPOLYGON Z (((25476999.994 6675499.004 0.0...\n", "4 261 MULTIPOLYGON Z (((25476999.994 6675249.005 0.0..." ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pop.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have cleaned the data and have only those columns that we need\n", "for our analysis." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Join the layers\n", "\n", "Now we are ready to perform the spatial join between the two layers that\n", "we have. The aim here is to get information about **how many people live\n", "in a polygon that contains an individual address-point** . Thus, we want\n", "to join attributes from the population layer we just modified into the\n", "addresses point layer ``addresses.shp`` that we created trough gecoding in the previous section.\n", "\n", "- Read the addresses layer into memory:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# Addresses filpath\n", "addr_fp = r\"data/addresses.shp\"\n", "\n", "# Read data\n", "addresses = gpd.read_file(addr_fp)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr style=\"text-align: right;\"><th></th><th>address</th><th>id</th><th>addr</th><th>geometry</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>0</th><td>Ruoholahti, 14, Itämerenkatu, Ruoholahti, Läns...</td><td>1000</td><td>Itämerenkatu 14, 00101 Helsinki, Finland</td><td>POINT (24.91556 60.16320)</td></tr>\n",
 "    <tr><th>1</th><td>Kamppi, 1, Kampinkuja, Kamppi, Eteläinen suurp...</td><td>1001</td><td>Kampinkuja 1, 00100 Helsinki, Finland</td><td>POINT (24.93169 60.16902)</td></tr>\n",
 "    <tr><th>2</th><td>Bangkok9, 8, Kaivokatu, Keskusta, Kluuvi, Etel...</td><td>1002</td><td>Kaivokatu 8, 00101 Helsinki, Finland</td><td>POINT (24.94168 60.16996)</td></tr>\n",
 "    <tr><th>3</th><td>Hermannin rantatie, Kyläsaari, Hermanni, Helsi...</td><td>1003</td><td>Hermannin rantatie 1, 00580 Helsinki, Finland</td><td>POINT (24.97193 60.19700)</td></tr>\n",
 "    <tr><th>4</th><td>Hesburger, 9, Tyynenmerenkatu, Jätkäsaari, Län...</td><td>1005</td><td>Tyynenmerenkatu 9, 00220 Helsinki, Finland</td><td>POINT (24.92160 60.15665)</td></tr>\n",
 "  </tbody>\n",
 "</table>\n",
 "</div>
" ], "text/plain": [ " address id \\\n", "0 Ruoholahti, 14, Itämerenkatu, Ruoholahti, Läns... 1000 \n", "1 Kamppi, 1, Kampinkuja, Kamppi, Eteläinen suurp... 1001 \n", "2 Bangkok9, 8, Kaivokatu, Keskusta, Kluuvi, Etel... 1002 \n", "3 Hermannin rantatie, Kyläsaari, Hermanni, Helsi... 1003 \n", "4 Hesburger, 9, Tyynenmerenkatu, Jätkäsaari, Län... 1005 \n", "\n", " addr geometry \n", "0 Itämerenkatu 14, 00101 Helsinki, Finland POINT (24.91556 60.16320) \n", "1 Kampinkuja 1, 00100 Helsinki, Finland POINT (24.93169 60.16902) \n", "2 Kaivokatu 8, 00101 Helsinki, Finland POINT (24.94168 60.16996) \n", "3 Hermannin rantatie 1, 00580 Helsinki, Finland POINT (24.97193 60.19700) \n", "4 Tyynenmerenkatu 9, 00220 Helsinki, Finland POINT (24.92160 60.15665) " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Check the head of the file\n", "addresses.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In order to do a spatial join, the layers need to be in the same projection\n", "\n", "- Check the crs of input layers:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'init': 'epsg:4326'}" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "addresses.crs" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "pop.crs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If the crs information is missing from the population grid, we can **define the coordinate reference system** as **ETRS GK-25 (EPSG:3879)** because we know what it is based on the [population grid metadata](https://hri.fi/data/dataset/vaestotietoruudukko). 
" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "# Define crs\n", "pop.crs = CRS.from_epsg(3879).to_wkt()" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'PROJCRS[\"ETRS89 / GK25FIN\",BASEGEOGCRS[\"ETRS89\",DATUM[\"European Terrestrial Reference System 1989\",ELLIPSOID[\"GRS 1980\",6378137,298.257222101,LENGTHUNIT[\"metre\",1]]],PRIMEM[\"Greenwich\",0,ANGLEUNIT[\"degree\",0.0174532925199433]],ID[\"EPSG\",4258]],CONVERSION[\"Finland Gauss-Kruger zone 25\",METHOD[\"Transverse Mercator\",ID[\"EPSG\",9807]],PARAMETER[\"Latitude of natural origin\",0,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8801]],PARAMETER[\"Longitude of natural origin\",25,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8802]],PARAMETER[\"Scale factor at natural origin\",1,SCALEUNIT[\"unity\",1],ID[\"EPSG\",8805]],PARAMETER[\"False easting\",25500000,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8806]],PARAMETER[\"False northing\",0,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8807]]],CS[Cartesian,2],AXIS[\"northing (N)\",north,ORDER[1],LENGTHUNIT[\"metre\",1]],AXIS[\"easting (E)\",east,ORDER[2],LENGTHUNIT[\"metre\",1]],USAGE[SCOPE[\"unknown\"],AREA[\"Finland - 24.5°E to 25.5°E onshore nominal\"],BBOX[59.94,24.5,68.9,25.5]],ID[\"EPSG\",3879]]'" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pop.crs" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Are the layers in the same projection?\n", "addresses.crs == pop.crs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's re-project addresses to the projection of the population layer:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "addresses = addresses.to_crs(pop.crs)" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "- Let's make sure that the coordinate reference system of the layers\n", "are identical" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "PROJCRS[\"ETRS89 / GK25FIN\",BASEGEOGCRS[\"ETRS89\",DATUM[\"European Terrestrial Reference System 1989\",ELLIPSOID[\"GRS 1980\",6378137,298.257222101,LENGTHUNIT[\"metre\",1]]],PRIMEM[\"Greenwich\",0,ANGLEUNIT[\"degree\",0.0174532925199433]],ID[\"EPSG\",4258]],CONVERSION[\"Finland Gauss-Kruger zone 25\",METHOD[\"Transverse Mercator\",ID[\"EPSG\",9807]],PARAMETER[\"Latitude of natural origin\",0,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8801]],PARAMETER[\"Longitude of natural origin\",25,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8802]],PARAMETER[\"Scale factor at natural origin\",1,SCALEUNIT[\"unity\",1],ID[\"EPSG\",8805]],PARAMETER[\"False easting\",25500000,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8806]],PARAMETER[\"False northing\",0,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8807]]],CS[Cartesian,2],AXIS[\"northing (N)\",north,ORDER[1],LENGTHUNIT[\"metre\",1]],AXIS[\"easting (E)\",east,ORDER[2],LENGTHUNIT[\"metre\",1]],USAGE[SCOPE[\"unknown\"],AREA[\"Finland - 24.5°E to 25.5°E onshore nominal\"],BBOX[59.94,24.5,68.9,25.5]],ID[\"EPSG\",3879]]\n", "PROJCRS[\"ETRS89 / GK25FIN\",BASEGEOGCRS[\"ETRS89\",DATUM[\"European Terrestrial Reference System 1989\",ELLIPSOID[\"GRS 1980\",6378137,298.257222101,LENGTHUNIT[\"metre\",1]]],PRIMEM[\"Greenwich\",0,ANGLEUNIT[\"degree\",0.0174532925199433]],ID[\"EPSG\",4258]],CONVERSION[\"Finland Gauss-Kruger zone 25\",METHOD[\"Transverse Mercator\",ID[\"EPSG\",9807]],PARAMETER[\"Latitude of natural origin\",0,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8801]],PARAMETER[\"Longitude of natural origin\",25,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8802]],PARAMETER[\"Scale factor at natural 
origin\",1,SCALEUNIT[\"unity\",1],ID[\"EPSG\",8805]],PARAMETER[\"False easting\",25500000,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8806]],PARAMETER[\"False northing\",0,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8807]]],CS[Cartesian,2],AXIS[\"northing (N)\",north,ORDER[1],LENGTHUNIT[\"metre\",1]],AXIS[\"easting (E)\",east,ORDER[2],LENGTHUNIT[\"metre\",1]],USAGE[SCOPE[\"unknown\"],AREA[\"Finland - 24.5°E to 25.5°E onshore nominal\"],BBOX[59.94,24.5,68.9,25.5]],ID[\"EPSG\",3879]]\n" ] }, { "data": { "text/plain": [ "True" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Check the crs of address points\n", "print(addresses.crs)\n", "\n", "# Check the crs of population layer\n", "print(pop.crs)\n", "\n", "# Do they match now?\n", "addresses.crs == pop.crs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now they should be identical. Thus, we can be sure that when doing spatial\n", "queries between layers the locations match and we get the right results\n", "e.g. from the spatial join that we are conducting here.\n", "\n", "- Let's now join the attributes from ``pop`` GeoDataFrame into\n", " ``addresses`` GeoDataFrame by using ``gpd.sjoin()`` -function:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "# Make a spatial join\n", "join = gpd.sjoin(addresses, pop, how=\"inner\", op=\"within\")" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr style=\"text-align: right;\"><th></th><th>address</th><th>id</th><th>addr</th><th>geometry</th><th>index_right</th><th>pop18</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>0</th><td>Ruoholahti, 14, Itämerenkatu, Ruoholahti, Läns...</td><td>1000</td><td>Itämerenkatu 14, 00101 Helsinki, Finland</td><td>POINT (25495311.608 6672258.695)</td><td>1514</td><td>515</td></tr>\n",
 "    <tr><th>1</th><td>Kamppi, 1, Kampinkuja, Kamppi, Eteläinen suurp...</td><td>1001</td><td>Kampinkuja 1, 00100 Helsinki, Finland</td><td>POINT (25496207.840 6672906.173)</td><td>1600</td><td>182</td></tr>\n",
 "    <tr><th>3</th><td>Hermannin rantatie, Kyläsaari, Hermanni, Helsi...</td><td>1003</td><td>Hermannin rantatie 1, 00580 Helsinki, Finland</td><td>POINT (25498443.209 6676021.310)</td><td>1904</td><td>275</td></tr>\n",
 "    <tr><th>4</th><td>Hesburger, 9, Tyynenmerenkatu, Jätkäsaari, Län...</td><td>1005</td><td>Tyynenmerenkatu 9, 00220 Helsinki, Finland</td><td>POINT (25495645.995 6671528.068)</td><td>1550</td><td>1435</td></tr>\n",
 "    <tr><th>6</th><td>Itäväylä, Vartioharju, Vartiokylä, Helsinki, H...</td><td>1007</td><td>Itäväylä 3, 00950 Helsinki, Finland</td><td>POINT (25506149.985 6678773.518)</td><td>3007</td><td>155</td></tr>\n",
 "  </tbody>\n",
 "</table>\n",
 "</div>
" ], "text/plain": [ " address id \\\n", "0 Ruoholahti, 14, Itämerenkatu, Ruoholahti, Läns... 1000 \n", "1 Kamppi, 1, Kampinkuja, Kamppi, Eteläinen suurp... 1001 \n", "3 Hermannin rantatie, Kyläsaari, Hermanni, Helsi... 1003 \n", "4 Hesburger, 9, Tyynenmerenkatu, Jätkäsaari, Län... 1005 \n", "6 Itäväylä, Vartioharju, Vartiokylä, Helsinki, H... 1007 \n", "\n", " addr \\\n", "0 Itämerenkatu 14, 00101 Helsinki, Finland \n", "1 Kampinkuja 1, 00100 Helsinki, Finland \n", "3 Hermannin rantatie 1, 00580 Helsinki, Finland \n", "4 Tyynenmerenkatu 9, 00220 Helsinki, Finland \n", "6 Itäväylä 3, 00950 Helsinki, Finland \n", "\n", " geometry index_right pop18 \n", "0 POINT (25495311.608 6672258.695) 1514 515 \n", "1 POINT (25496207.840 6672906.173) 1600 182 \n", "3 POINT (25498443.209 6676021.310) 1904 275 \n", "4 POINT (25495645.995 6671528.068) 1550 1435 \n", "6 POINT (25506149.985 6678773.518) 3007 155 " ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "join.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Awesome! Now we have performed a successful spatial join where we got\n", "two new columns into our ``join`` GeoDataFrame, i.e. ``index_right``\n", "that tells the index of the matching polygon in the population grid and\n", "``pop18`` which is the population in the cell where the address-point is\n", "located.\n", "\n", "- Let's still check how many rows of data we have now:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "28" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(join)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Did we lose some data here? 
\n", "\n", "- Check how many addresses we had originally:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "34" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(addresses)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we plot the layers on top of each other, we can observe that some of the points are located outside the populated grid squares (increase figure size if you can't see this properly!)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "\n", "# Create a figure with one subplot\n", "fig, ax = plt.subplots(figsize=(15,8))\n", "\n", "# Plot population grid\n", "pop.plot(ax=ax)\n", "\n", "# Plot points\n", "addresses.plot(ax=ax, color='red', markersize=5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's also visualize the joined output:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Plot the points and use the ``pop18`` column to indicate the color.\n", " ``cmap`` -parameter tells to use a sequential colormap for the\n", " values, ``markersize`` adjusts the size of a point, ``scheme`` parameter can be used to adjust the classification method based on [pysal](http://pysal.readthedocs.io/en/latest/library/esda/mapclassify.html), and ``legend`` tells that we want to have a legend:\n" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Create a figure with one subplot\n", "fig, ax = plt.subplots(figsize=(10,6))\n", "\n", "# Plot the points with population info\n", "join.plot(ax=ax, column='pop18', cmap=\"Reds\", markersize=15, scheme='quantiles', legend=True);\n", "\n", "# Add title\n", "plt.title(\"Amount of inhabitants living close the the point\");\n", "\n", "# Remove white space around the figure\n", "plt.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In a similar way, we can plot the original population grid and check the overall population distribution in Helsinki:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Create a figure with one subplot\n", "fig, ax = plt.subplots(figsize=(10,6))\n", "\n", "# Plot the grid with population info\n", "pop.plot(ax=ax, column='pop18', cmap=\"Reds\", scheme='quantiles', legend=True);\n", "\n", "# Add title\n", "plt.title(\"Population 2018 in 250 x 250 m grid squares\");\n", "\n", "# Remove white space around the figure\n", "plt.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Finally, let's save the result point layer into a file:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "# Output path\n", "outfp = r\"data/addresses_population.shp\"\n", "\n", "# Save to disk\n", "join.to_file(outfp)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }