{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating new layers (extra)\n", "\n", "Since geopandas takes advantage of Shapely geometric objects, it is possible to create spatial data from scratch by passing Shapely's geometric objects into the GeoDataFrame. This is useful as it makes it easy to convert, for example, a text file that contains coordinates into a spatial data layer. Next we will see how to create a new GeoDataFrame from scratch and save it into a Shapefile. Our goal is to define a geometry that represents the outline of the [Senate square in Helsinki, Finland](https://fi.wikipedia.org/wiki/Senaatintori).\n", "\n", "Start by importing the necessary modules:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "import geopandas as gpd\n", "from shapely.geometry import Point, Polygon\n", "from pyproj import CRS" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's create an empty `GeoDataFrame`:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "newdata = gpd.GeoDataFrame()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Empty GeoDataFrame\n", "Columns: []\n", "Index: []\n" ] } ], "source": [ "print(newdata)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have an empty GeoDataFrame! A GeoDataFrame is basically a pandas DataFrame that should have one column dedicated to geometries. By default, the geometry column should be named `geometry` (geopandas looks for geometries in this column). 
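
As a small aside, the relationship described above can be checked on a throwaway example. This is only a sketch: the variable `demo`, the `name` column and the point coordinates are made up for illustration and are not part of the tutorial data.

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical one-row layer: a point given as (longitude, latitude)
demo = gpd.GeoDataFrame({'name': ['demo point']}, geometry=[Point(24.9522, 60.1695)])

# A GeoDataFrame is a subclass of the pandas DataFrame
print(isinstance(demo, pd.DataFrame))

# The active geometry column is named 'geometry' by default
print(demo.geometry.name)
```

Passing the geometries through the `geometry` argument is one way to fill the column at construction time; the steps that follow build the same kind of column by hand instead.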
\n", "\n", "Let's create the `geometry` column:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Create a new column called 'geometry' in the GeoDataFrame\n", "newdata['geometry'] = None" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Empty GeoDataFrame\n", "Columns: [geometry]\n", "Index: []\n" ] } ], "source": [ "print(newdata)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have a `geometry` column in our GeoDataFrame but we still don't have any data.\n", "\n", "Let's create a Shapely `Polygon` representing the Helsinki Senate square that we can later insert into our GeoDataFrame:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Coordinates of the Helsinki Senate square in decimal degrees\n", "coordinates = [(24.950899, 60.169158), (24.953492, 60.169158), (24.953510, 60.170104), (24.950958, 60.169990)]" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "# Create a Shapely polygon from the coordinate-tuple list\n", "poly = Polygon(coordinates)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "POLYGON ((24.950899 60.169158, 24.953492 60.169158, 24.95351 60.170104, 24.950958 60.16999, 24.950899 60.169158))\n" ] } ], "source": [ "# Check the polygon\n", "print(poly)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Okay, now we have an appropriate `Polygon` object.\n", "\n", "Let's insert the polygon into the 'geometry' column of our GeoDataFrame on the first row:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], 
"source": [ "# Insert the polygon into 'geometry' -column at row 0\n", "newdata.at[0, 'geometry'] = poly" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " geometry\n", "0 POLYGON ((24.95090 60.16916, 24.95349 60.16916...\n" ] } ], "source": [ "# Let's see what we have now\n", "print(newdata)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Great, now we have a GeoDataFrame with a Polygon that we could already now export to a Shapefile. However, typically you might want to include some attribute information with the geometry. \n", "\n", "Let's add another column to our GeoDataFrame called `location` with text `Senaatintori` that describes the location of the feature." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " geometry location\n", "0 POLYGON ((24.95090 60.16916, 24.95349 60.16916... Senaatintori\n" ] } ], "source": [ "# Add a new column and insert data \n", "newdata.at[0, 'location'] = 'Senaatintori'\n", "\n", "# Let's check the data\n", "print(newdata)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Okay, now we have additional information that is useful for recognicing what the feature represents. 
\n", "\n", "Before exporting the data it is always good (basically necessary) to **determine the coordinate reference system (projection) for the GeoDataFrame.** A GeoDataFrame has an attribute called `.crs` that shows the coordinate system of the data. It is empty (None) in our case since we are creating the data from scratch (more about projections in the next tutorial):" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "None\n" ] } ], "source": [ "print(newdata.crs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's add a CRS for our GeoDataFrame. We passed the coordinates as longitude and latitude in decimal degrees, so the correct CRS is WGS84 (EPSG code: 4326).\n", "\n", "Add the CRS definition to `newdata` in WKT format using pyproj CRS:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# Set the GeoDataFrame's coordinate system to WGS84 (i.e. epsg code 4326)\n", "newdata.crs = CRS.from_epsg(4326).to_wkt()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have added coordinate reference system information to our `GeoDataFrame`. The CRS information is necessary for creating a `.prj` file for our output Shapefile. \n", "\n", "- Finally, we can export the GeoDataFrame using the `.to_file()` function. The function works much like the export functions in pandas, but here we only need to provide the output path for the Shapefile. 
Easy, isn't it?" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "# Determine the output path for the Shapefile\n", "outfp = \"L2_data/Senaatintori.shp\"\n", "\n", "# Write the data into that Shapefile\n", "newdata.to_file(outfp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have successfully created a Shapefile from scratch using only Python programming. A similar approach can be used, for example, to read\n", "coordinates from a text file (e.g. points) and create Shapefiles from them automatically.\n", "\n", "\n", "#### Check your understanding\n", "\n", "
\n", "\n", " \n", "Check the output Shapefile by reading it with geopandas and make sure that the attribute table and geometry seem correct.\n", "\n", "
\n", "\n", "
\n", " \n", "Re-project the data to ETRS-TM35FIN (EPSG:3067) and save into a new file!\n", "\n", "
\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.6" } }, "nbformat": 4, "nbformat_minor": 4 }