Case: Employment rate map
Goal: plot an interactive map of employment rates across Finnish regions.
Required modules:
Folium for plotting interactive maps based on leaflet.js
Pandas for handling tabular data
Geopandas for handling spatial data
Matplotlib for adjusting the static figure
# Import required modules:
import folium
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
Employment rate data#
Employment rate refers to “the proportion of the employed among persons aged 15 to 64”. The data come from Statistics Finland (converted from an .xml file to CSV in Excel).
# Read in data
data = pd.read_csv("data/seutukunta_tyollisyys_2013.csv", sep=",")
data.head()
 | seutukunta | seutukunta_nimi | tyollisyys |
---|---|---|---|
0 | SK011 | Helsingin seutukunta | 73.0 |
1 | SK014 | Raaseporin seutukunta | 70.3 |
2 | SK015 | Porvoon seutukunta | 74.3 |
3 | SK016 | Loviisan seutukunta | 71.5 |
4 | SK021 | Åboland-Turunmaan seutukunta | 72.9 |
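Before going further, it can be worth running a few sanity checks on the table. The sketch below inlines the five rows shown above so that it runs without the CSV file; the checks themselves (unique codes, rates within 0 to 100, no missing values) are our own suggestion rather than part of the original workflow.

```python
import io
import pandas as pd

# The five rows printed above, inlined so the sketch runs without the CSV file
csv_text = """seutukunta,seutukunta_nimi,tyollisyys
SK011,Helsingin seutukunta,73.0
SK014,Raaseporin seutukunta,70.3
SK015,Porvoon seutukunta,74.3
SK016,Loviisan seutukunta,71.5
SK021,Åboland-Turunmaan seutukunta,72.9
"""
sample = pd.read_csv(io.StringIO(csv_text), sep=",")

# Each region code should appear only once
codes_unique = sample["seutukunta"].is_unique

# An employment rate is a percentage, so it should fall between 0 and 100
rates_valid = bool(sample["tyollisyys"].between(0, 100).all())

# Missing rates would show up as gaps in the map later on
has_missing = bool(sample["tyollisyys"].isna().any())

print(codes_unique, rates_valid, has_missing)
```

The same three checks can be applied to the full `data` table read from the CSV.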
Sub-regional units#
The spatial data for the sub-regional units (Seutukunnat in Finnish) can be retrieved from the Statistics Finland Web Feature Service http://geo.stat.fi/geoserver/tilastointialueet/wfs
# Alternative: read a layer previously saved to GeoJSON in QGIS
#geodata = gpd.read_file('Seutukunnat_2018.geojson')
# Get features directly from the WFS
url = "http://geo.stat.fi/geoserver/tilastointialueet/wfs?request=GetFeature&typename=tilastointialueet:seutukunta1000k_2020&outputformat=JSON"
geodata = gpd.read_file(url)
geodata.head()
 | id | seutukunta | vuosi | nimi | namn | name | geometry |
---|---|---|---|---|---|---|---|
0 | seutukunta1000k_2020.1 | 011 | 2020 | Helsinki | Helsingfors | Helsinki | MULTIPOLYGON (((409963.522 6681658.341, 409969... |
1 | seutukunta1000k_2020.2 | 014 | 2020 | Raasepori | Raseborg | Raasepori | MULTIPOLYGON (((306616.919 6665438.489, 306668... |
2 | seutukunta1000k_2020.3 | 015 | 2020 | Porvoo | Borgå | Porvoo | MULTIPOLYGON (((427108.141 6694151.025, 427175... |
3 | seutukunta1000k_2020.4 | 016 | 2020 | Loviisa | Lovisa | Loviisa | MULTIPOLYGON (((444038.768 6703649.355, 444155... |
4 | seutukunta1000k_2020.5 | 021 | 2020 | Åboland-Turunmaa | Åboland-Turunmaa | Åboland-Turunmaa | MULTIPOLYGON (((190999.717 6715878.622, 191021... |
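The request URL used above packs three query parameters (`request`, `typename` and `outputformat`) into one long string. As a minimal sketch, the same URL can be assembled from its parts with the standard library, which makes it easier to swap in another layer. The helper name `build_wfs_url` is hypothetical, and only the parameters already present in the URL above are used.

```python
from urllib.parse import urlencode

WFS_BASE = "http://geo.stat.fi/geoserver/tilastointialueet/wfs"


def build_wfs_url(layer):
    """Hypothetical helper: build a GetFeature request URL for one layer."""
    params = {
        "request": "GetFeature",
        "typename": layer,
        "outputformat": "JSON",
    }
    # safe=":" keeps the "workspace:layer" separator unescaped in the query string
    return WFS_BASE + "?" + urlencode(params, safe=":")


url = build_wfs_url("tilastointialueet:seutukunta1000k_2020")
print(url)
```

The resulting string is identical to the hard-coded URL and can be passed to `gpd.read_file` in the same way.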
Join attributes and geometries#
We can join the attribute table and the spatial layer based on the region code (stored in the column ‘seutukunta’). The region codes in the csv carry an additional “SK” prefix, which we need to remove before the join:
data["seutukunta"] = data["seutukunta"].apply(lambda x: x[2:])
data["seutukunta"].head()
0 011
1 014
2 015
3 016
4 021
Name: seutukunta, dtype: object
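The prefix can also be removed with pandas' vectorized string methods instead of `apply`. The sketch below uses a small stand-in Series with the five codes shown earlier and compares a plain slice with a stricter regular-expression variant that only strips “SK” when it really is a prefix.

```python
import pandas as pd

# Stand-in for data["seutukunta"], using the codes shown in the table above
codes = pd.Series(["SK011", "SK014", "SK015", "SK016", "SK021"], name="seutukunta")

# Vectorized slice: drop the first two characters of every value
stripped = codes.str[2:]

# Stricter variant: remove "SK" only where it occurs at the start of the value
stripped_regex = codes.str.replace(r"^SK", "", regex=True)

print(stripped.tolist())
# → ['011', '014', '015', '016', '021']
```

Either way, the codes should stay strings: converting them to integers would drop the leading zero (“011” would become 11) and they would no longer match the codes in the spatial layer.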
Now we can join the data based on the “seutukunta” column. Let’s also check that we have a matching number of records before and after the join:
# Print record counts before the join
print("Count of original attributes:", len(data))
print("Count of original geometries:", len(geodata))
# Merge data
geodata = geodata.merge(data, on="seutukunta")
# Print record count after the join
print("Count after the join:", len(geodata))
geodata.head()
Count of original attributes: 70
Count of original geometries: 70
Count after the join: 70
 | id | seutukunta | vuosi | nimi | namn | name | geometry | seutukunta_nimi | tyollisyys |
---|---|---|---|---|---|---|---|---|---|
0 | seutukunta1000k_2020.1 | 011 | 2020 | Helsinki | Helsingfors | Helsinki | MULTIPOLYGON (((409963.522 6681658.341, 409969... | Helsingin seutukunta | 73.0 |
1 | seutukunta1000k_2020.2 | 014 | 2020 | Raasepori | Raseborg | Raasepori | MULTIPOLYGON (((306616.919 6665438.489, 306668... | Raaseporin seutukunta | 70.3 |
2 | seutukunta1000k_2020.3 | 015 | 2020 | Porvoo | Borgå | Porvoo | MULTIPOLYGON (((427108.141 6694151.025, 427175... | Porvoon seutukunta | 74.3 |
3 | seutukunta1000k_2020.4 | 016 | 2020 | Loviisa | Lovisa | Loviisa | MULTIPOLYGON (((444038.768 6703649.355, 444155... | Loviisan seutukunta | 71.5 |
4 | seutukunta1000k_2020.5 | 021 | 2020 | Åboland-Turunmaa | Åboland-Turunmaa | Åboland-Turunmaa | MULTIPOLYGON (((190999.717 6715878.622, 191021... | Åboland-Turunmaan seutukunta | 72.9 |
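Comparing record counts tells us whether rows were lost in the join, but not which ones. A minimal sketch of a stricter check is shown below, using pandas' `indicator` and `validate` options on two toy tables; the code “999” and the name “Hypothetical” are made up to force a mismatch.

```python
import pandas as pd

# Toy stand-ins for the attribute table and the spatial layer
attributes = pd.DataFrame(
    {"seutukunta": ["011", "014", "015"], "tyollisyys": [73.0, 70.3, 74.3]}
)
regions = pd.DataFrame(
    {"seutukunta": ["011", "014", "999"], "nimi": ["Helsinki", "Raasepori", "Hypothetical"]}
)

# An outer join with indicator=True labels every row as
# 'both', 'left_only' or 'right_only' in a new "_merge" column
check = regions.merge(attributes, on="seutukunta", how="outer", indicator=True)
unmatched = check[check["_merge"] != "both"]
print(unmatched[["seutukunta", "_merge"]])

# validate raises a MergeError if a code is duplicated on either side
joined = regions.merge(attributes, on="seutukunta", validate="one_to_one")
print(len(joined))
```

With the real data, an empty `unmatched` table confirms that every region received an employment rate.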
Create a static map#
Now we have a spatial layer with the employment rate information (in column “tyollisyys”). Let’s create a simple plot based on this data:
# Define which variable to plot
geodata.plot(column="tyollisyys")
<AxesSubplot:>
To adjust the figure, we use matplotlib's pyplot module (imported above as plt):
# Adjust figure size
fig, ax = plt.subplots(1, figsize=(10, 8))
# Adjust colors and add a legend (the scheme parameter requires the mapclassify package)
geodata.plot(ax=ax, column="tyollisyys", scheme="quantiles", cmap="Reds", legend=True)
<AxesSubplot:>
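The `scheme="quantiles"` option groups the regions into classes that each contain roughly the same number of regions; geopandas hands the classification to the mapclassify package (five classes unless `k` is set). The sketch below illustrates the idea with `pd.qcut` on ten illustrative rates that are not the real dataset. It is an approximation of what the plot does, and the exact class breaks chosen by mapclassify may differ slightly.

```python
import pandas as pd

# Ten illustrative employment rates (not the real dataset)
rates = pd.Series([62.1, 64.8, 66.0, 67.5, 68.9, 70.3, 71.5, 72.9, 73.0, 74.3])

# Five quantile classes, labelled 0 (lowest rates) to 4 (highest rates)
classes = pd.qcut(rates, q=5, labels=False)

# Count how many values fall into each class
counts = classes.value_counts().sort_index().tolist()
print(counts)
# → [2, 2, 2, 2, 2]
```

Because every class holds the same number of regions, a quantile map uses all colors evenly, even when most rates are clustered in a narrow range.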
Create an interactive map#
Next, we’ll plot an interactive map based on the same data using the folium library, which enables us to create maps based on the JavaScript library leaflet.js.
# Create a unique identifier for each row, which Folium needs to link geometries to values
geodata['geoid'] = geodata.index.astype(str)
# Create a Map instance
m = folium.Map(location=[60.25, 24.8], tiles = 'cartodbpositron', zoom_start=8, control_scale=True)
folium.Choropleth(
    geo_data=geodata,
    data=geodata,
    columns=['geoid', 'tyollisyys'],
    key_on='feature.id',
    fill_color='RdYlBu',
    line_color='white',
    line_weight=0,
    legend_name='Employment rate in Finland',
).add_to(m)
m
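The `key_on` argument is a dotted path into each GeoJSON feature, and its value is matched against the first entry of `columns`. The sketch below shows this with a hand-written FeatureCollection shaped roughly like what a GeoDataFrame serializes to, where the feature `id` is the row index as a string. The geometries are placeholder squares, and `resolve_key` is a hypothetical helper written for illustration; it is not part of folium.

```python
# A tiny FeatureCollection; geometries are placeholder squares, not real regions
square = [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "id": "0",
            "type": "Feature",
            "properties": {"seutukunta": "011", "tyollisyys": 73.0},
            "geometry": {"type": "Polygon", "coordinates": square},
        },
        {
            "id": "1",
            "type": "Feature",
            "properties": {"seutukunta": "014", "tyollisyys": 70.3},
            "geometry": {"type": "Polygon", "coordinates": square},
        },
    ],
}


def resolve_key(feature, key_on):
    """Hypothetical helper: follow a dotted path such as 'feature.properties.seutukunta'."""
    value = feature
    for part in key_on.split(".")[1:]:  # skip the leading 'feature'
        value = value[part]
    return value


features = feature_collection["features"]
ids = [resolve_key(f, "feature.id") for f in features]
codes = [resolve_key(f, "feature.properties.seutukunta") for f in features]
print(ids, codes)
```

This is why the `geoid` column is built from the index as strings: it has to equal the feature `id` values. Under the same logic, keying on `feature.properties.seutukunta` together with `columns=['seutukunta', 'tyollisyys']` would be an alternative that avoids the extra column.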
We can also add tooltips to the map, which show the value of each feature when hovering over it.
folium.features.GeoJson(
    geodata,
    name='Labels',
    # Make the layer itself invisible so that only the tooltips are added
    style_function=lambda x: {'color': 'transparent', 'fillColor': 'transparent', 'weight': 0},
    tooltip=folium.features.GeoJsonTooltip(
        fields=['tyollisyys'],
        aliases=['Employment rate'],
        labels=True,
        sticky=False,
    ),
).add_to(m)
m