{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating new layers\n", "\n", "Since geopandas takes advantage of Shapely geometric objects, it is possible to create spatial data from scratch by passing Shapely's geometric objects into the GeoDataFrame. This is useful as it makes it easy to convert e.g. a text file that contains coordinates into spatial data layers. Next we will see how to create a new GeoDataFrame from scratch and save it into a Shapefile. Our goal is to define a geometry that represents the outlines of the [Senate square in Helsinki, Finland](https://fi.wikipedia.org/wiki/Senaatintori).\n", "\n", "- Start by importing necessary modules:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "import geopandas as gpd\n", "from shapely.geometry import Point, Polygon\n", "from pyproj import CRS" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Let's create an empty `GeoDataFrame`:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "newdata = gpd.GeoDataFrame()" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Empty GeoDataFrame\n", "Columns: []\n", "Index: []\n" ] } ], "source": [ "print(newdata)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have an empty GeoDataFrame! A GeoDataFrame is basically a pandas DataFrame that should have one column dedicated to geometries. By default, the geometry column should be named `geometry` (geopandas looks for geometries in this column). 
\n", "\n", "- Let's create the `geometry` column:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Add a new column called 'geometry' to the GeoDataFrame\n", "newdata['geometry'] = None" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Empty GeoDataFrame\n", "Columns: [geometry]\n", "Index: []\n" ] } ], "source": [ "print(newdata)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have a `geometry` column in our GeoDataFrame but we still don't have any data.\n", "\n", "- Let's create a Shapely `Polygon` representing the Helsinki Senate square that we can later insert into our GeoDataFrame:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Coordinates of the Helsinki Senate square as (longitude, latitude) in decimal degrees\n", "coordinates = [(24.950899, 60.169158), (24.953492, 60.169158), (24.953510, 60.170104), (24.950958, 60.169990)]" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# Create a Shapely polygon from the coordinate-tuple list\n", "poly = Polygon(coordinates)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "POLYGON ((24.950899 60.169158, 24.953492 60.169158, 24.95351 60.170104, 24.950958 60.16999, 24.950899 60.169158))\n" ] } ], "source": [ "# Check the polygon\n", "print(poly)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Okay, now we have an appropriate `Polygon` object.\n", "\n", "- Let's insert the polygon into the `geometry` column of our GeoDataFrame, on the first row:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], 
"source": [ "# Insert the polygon into the 'geometry' column at row 0\n", "newdata.at[0, 'geometry'] = poly" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " geometry\n", "0 POLYGON ((24.95090 60.16916, 24.95349 60.16916...\n" ] } ], "source": [ "# Let's see what we have now\n", "print(newdata)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Great, now we have a GeoDataFrame with a Polygon that we could already export to a Shapefile. However, typically you might want to include some attribute information with the geometry. \n", "\n", "- Let's add another column to our GeoDataFrame called `location` with text `Senaatintori` that describes the location of the feature." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " geometry location\n", "0 POLYGON ((24.95090 60.16916, 24.95349 60.16916... Senaatintori\n" ] } ], "source": [ "# Add a new column and insert data \n", "newdata.at[0, 'location'] = 'Senaatintori'\n", "\n", "# Let's check the data\n", "print(newdata)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Okay, now we have additional information that is useful for recognizing what the feature represents. 
\n", "\n", "Before exporting the data it is always good (basically necessary) to **define the coordinate reference system (projection) for the GeoDataFrame.** A GeoDataFrame has an attribute called `.crs` that shows the coordinate reference system of the data. It is empty (`None`) in our case, since we are creating the data from scratch (more about projections in the next tutorial):" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "None\n" ] } ], "source": [ "print(newdata.crs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's add a CRS to our GeoDataFrame. We passed the coordinates as longitude and latitude in decimal degrees, so the correct CRS is WGS84 (EPSG code: 4326).\n", "\n", "- Add a CRS definition to `newdata` in WKT format using the pyproj `CRS` class:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "# Set the GeoDataFrame's coordinate system to WGS84 (i.e. EPSG code 4326)\n", "newdata.crs = CRS.from_epsg(4326).to_wkt()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have added coordinate reference system information to our `GeoDataFrame`. The CRS information is necessary for creating a `.prj` file for our output Shapefile. \n", "\n", "- Finally, we can export the GeoDataFrame using the `.to_file()` method. It works much like the export functions in pandas, but here we only need to provide the output path for the Shapefile. 
Easy, isn't it?" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# Define the output path for the Shapefile\n", "outfp = \"L2_data/Senaatintori.shp\"\n", "\n", "# Write the data into that Shapefile\n", "newdata.to_file(outfp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have successfully created a Shapefile from scratch using only Python. A similar approach can be used, for example, to read\n", "coordinates from a text file (e.g. points) and create Shapefiles from those automatically.\n", "\n", "\n", "
\n", "\n", "**TASK**\n", " \n", "Check the output Shapefile by reading it with geopandas and make sure that the attribute table and geometry seem correct.\n", "\n", "
\n", "\n", "
\n", "\n", "**EXTRA TASK**\n", " \n", "Re-project the data to ETRS-TM35FIN (EPSG:3067) and save again!\n", "\n", "
\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }