Splitting many polygons into (almost) equal parts using a simple script in PyQGIS, the Python Console of QGIS

Fellow researchers and open-source GIS enthusiasts,

Welcome to my blog!

I’d like to start with a disclaimer – I may be a researcher of this very area but that doesn’t mean everything I do or write here will work for you, in your own desktop configurations and package versions. I have no responsibility if you lose data or mess up your installation. I also do not authorize any copies of my content, except for the gists, which are available under the MIT License.

Whatever is your geo-thing, you may have come across a Polygon which needed to be split in half, three parts, five parts (a hundred parts?). Possible applications of the split polygons are performing zonal statistics and terrain parceling (agriculture).

But wait! There is already a known pipeline used for splitting polygons in GIS Software. It was developed by Darafei Praliaskouski for PostGIS and tested by Paul Ramsey (Thank you!), and is available at Paul Ramsey’s website. The same pipeline was also adapted by Ujaval Gandhi to be used in QGIS (through the User Interface) and can be found in his blog. If you are dealing with a small number of polygons to be divided in QGIS, the aforementioned post from Ujaval Gandhi may be useful to you, especially if you are not familiar with the Python Console of QGIS (PyQGIS).

If you are a Portuguese speaker, you can also check this video on the YouTube channel of Geoaplicada, in which Jocilene Barros uses the Polygon Divider plugin.

By the way, if you are a Portuguese speaker, did you know you could read this exact blog post, and all others, in Portuguese? Just change the language by clicking the globe symbol on the upper right.

Why PyQGIS?

Oh, so many options! If you are still here, I guess you want to use PyQGIS for this task. But why would you want that? I listed four motives below:

  1. You have many Polygons to divide in equal parts each. Doing the same process one-by-one is such a waste of time. Honestly, this was my motive.

  2. You don’t want to install more plugins, but you also don’t want to spend too much time splitting polygons.

  3. You are a Python novice/intermediate/advanced and want to see a working example of how to use PyQGIS.

  4. You want to use only native tools of QGIS. This pipeline here uses only native tools.

A Disclaimer about the usage of the word “Equal”

The resulting polygons generated by pipelines which involve K-means clustering and Voronoi/Thiessen polygons (such as mine) most of the times do not generate a perfect division in terms of area. There is an error associated to each division, which we will be exploring later, on the section “Checking the results and associated errors”. Do not use my pipeline if you need perfectly equal resulting polygons.

My pipeline

For this development, I use a slightly different pipeline from the one described in Paul Ramsey’s website, but I was inspired by that pipeline to build this one.

My pipeline, in general lines:

  1. Preprocessing: Split the original vector file into N other files, each one with one polygon.

  2. In PyQGIS: Get the extents of the vector layer

  3. Generate regular spacing points within those extents

  4. Clip the regular points to the polygon area

  5. Cluster your points by using K-Means algorithm, use the number of clusters as equal to the number of divisions

  6. Calculate mean coordinates for each cluster of points

  7. Calculate Voronoi polygons for the mean coordinates

  8. Clip the polygon based on the Voronoi polygons

  9. Postprocessing: check the results

The example

I start with this Shapefile:

shapefile qgis

This is its attribute table:

shapefile qgis attribute table

And, in this example, I want to divide each of the polygons in three equal parts.

I am using QGIS 3.22 Bialowieza for this tutorial.

Preprocessing details

To make sure the polygons are divided in equal (or almost equal) parts, it is important to know the area of each of them beforehand. On the attribute table, open the field calculator and create a new field called area. I gave my field the type “Decimal number (real)”. The expression is simply “$area”. Generate this column, save, and turn off the editing mode. My attribute table now looks like this:

shapefile qgis attribute table

Important: use a projected coordinate system when performing area calculations.

Another important preprocessing step. To run the PyQGIS code on this post, I need to have each polygon in a different shapefile. If you have all the polygons you want to split in the same shapefile, use “Split Vector Layer” (native:splitvectorlayer) to separate the polygons. In my case, I used the id column to split them.

split vector layer qgis

Save them in a permanent folder, preferentially.

The resulting shapefiles are: split vector layer qgis result

Running the PyQGIS code

This is the gist, hosted on GitHub. Made available under MIT License.

Make the adaptations necessary to each case, namely, coordinate system, number of divisions, spacing of the regular points used to divide the polygons, location of the files in the system, and name of the attribute table column used to split the polygons into different files.

And… run the code!

This is the resulting split for the example polygons:

divided polygons

Checking the results and associated errors

For polygon 4, I have that the total area is: 7080260.638, which means one third of it is 2360086.879.

In the attribute table of the divided polygon, I calculate “$area” (same as I did before). I get the values: 2050355.0129558295, 2675185.5178783312, and 2354716.308374524.

Therefore, we can notice there is a variation in area between the divided parts. This happens mainly because the clusters generated by K-Means Clustering have different sizes. Using Random Points instead of Regular Points does not help and can even induce a higher error, in some cases (like if you use a small number of random points for the area). If you do find a workaround for this issue, please let me know.

Extras

  • When I should NOT be using this script to divide my polygons?

You should not use this script, or any pipeline involving clusters of different sizes, if you need precision in your division.

  • If there are already pipelines to divide polygons into different areas, what exactly have you accomplished here?

The automation of a potentially tedious process, without the need to install any plugins.

  • You said you do not use any plugins, but Python Console is a plugin. Gotcha!

You’re right, “Python Console” is a plugin, as well as “Processing” and “GRASS GIS Provider” which are plugins too. I mentioned there is no need to install extra plugins to run my script.

Luísa Vieira Lucchese
Luísa Vieira Lucchese
Research Assistant Professor

Research Assistant Professor at University of Pittsburgh

Related