Introductionο
raster2poly turns multi-band rasters into classified polygon layers in a single function call. It wraps scikit-learn clustering and classification behind a GIS-native API β you give it a GeoTIFF, you get back a GeoDataFrame of dissolved, filtered polygons ready for QGIS or ArcGIS.
Why this package?ο
Converting a classified raster to usable vector polygons is a common GIS task, but the standard workflow involves half-a-dozen steps: read bands, mask nodata, flatten to feature arrays, run a classifier, reshape, polygonise, dissolve, filter. raster2poly collapses all of that into a single class with three classification methods and clean output.
Supported methodsο
Method |
API |
When to use |
|---|---|---|
KMeans |
|
Quick exploratory clustering, < 10 M pixels |
MiniBatchKMeans |
|
Large rasters (> 10 M pixels), lower RAM |
Random Forest |
|
You have labelled training ROIs (Points or Polygons) |
DN range rules |
|
You know the spectral signature of each class |
All methods return a GeoDataFrame with class_id and geometry
columns. Adjacent same-class polygons are dissolved by default, and a
min_area filter removes speckle.
Utility helpersο
clf.band_stats()Print min / max / mean / std for every band β essential before writing DN-range rules.
clf.available_algorithms()List all supported algorithms with usage examples.
clf.encode_roi(path, label_col="Age")Convert a text label column (e.g. Holocene, Jurassic) to consecutive integer IDs and save the encoded shapefile β no external pandas step required.
Design principlesο
No hardcoded paths β every file path is a function argument.
CRS safety β ROI vectors are always reprojected to the raster CRS, never the reverse.
Nodata β NaN β on load, nodata values are replaced with NaN and excluded from all computation.
Format detection β
clf.save()infers.shp/.gpkg/.geojsonfrom the file extension.