API Reference

class raster2poly.RasterClassifier(raster_path)[source]

Bases: object

Classify a raster and vectorise the result to polygons.

Parameters:

raster_path (path-like) – Multi-band GeoTIFF (or any GDAL-readable raster).

Examples

>>> clf = RasterClassifier("image.tif")
>>> gdf = clf.unsupervised(n_clusters=6)
>>> gdf.to_file("classes.shp")

__init__(raster_path)[source]
Parameters:

raster_path (str | Path)

Return type:

None

unsupervised(n_clusters=5, *, algorithm='kmeans', dissolve=True, min_area=0.0)[source]

K-Means or MiniBatchKMeans clustering.

Parameters:
  • n_clusters (int) – Number of classes.

  • algorithm ("kmeans" | "mini_batch_kmeans") – Clustering algorithm. MiniBatchKMeans is recommended for rasters with more than 10 M pixels.

  • dissolve (bool) – Merge adjacent same-class polygons.

  • min_area (float) – Drop polygons smaller than this (map units²).

Return type:

GeoDataFrame with class_id and geometry columns.
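The size-based recommendation for the algorithm parameter can be turned into a small helper. The sketch below is illustrative only: choose_algorithm and PIXEL_THRESHOLD are hypothetical names, not part of raster2poly, and the threshold simply restates the "10 M pixels" figure from the parameter description.

```python
# Hypothetical helper, not part of raster2poly: pick a value for the
# `algorithm` argument from the raster dimensions.
PIXEL_THRESHOLD = 10_000_000  # "10 M pixels" from the parameter description


def choose_algorithm(n_rows: int, n_cols: int) -> str:
    """Return "mini_batch_kmeans" above the threshold, otherwise "kmeans"."""
    n_pixels = n_rows * n_cols
    return "mini_batch_kmeans" if n_pixels > PIXEL_THRESHOLD else "kmeans"


small = choose_algorithm(1_000, 1_000)  # 1 M pixels
large = choose_algorithm(5_000, 4_000)  # 20 M pixels
print(small, large)
```

The returned string could then be passed on, for example as clf.unsupervised(n_clusters=6, algorithm=choose_algorithm(rows, cols)).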

supervised(roi_path, *, class_col='class_id', n_estimators=100, dissolve=True, min_area=0.0)[source]

Random Forest classification trained from an ROI shapefile.

The ROI file may contain Points or Polygons (or both). For polygons, every pixel inside the geometry is used as a training sample, which is far more robust than a single zonal mean.

Parameters:
  • roi_path (path-like) – Shapefile / GeoPackage with training geometries.

  • class_col (str) – Column holding integer class labels (default "class_id").

  • n_estimators (int) – Number of Random Forest trees.

  • dissolve (bool)

  • min_area (float)

Return type:

Classified GeoDataFrame.
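To illustrate what "every pixel inside the geometry is used as a training sample" means, here is a minimal pure-Python sketch. It is not the library's implementation: the helper names, the nested-list raster layout, the use of pixel coordinates for the polygon, and the pixel-centre inclusion test are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: count polygon edges crossed by a ray going in +x."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def collect_samples(bands, polygon: Sequence[Point], class_id: int) -> List[tuple]:
    """One (features, label) pair per pixel whose centre lies in the polygon."""
    n_rows, n_cols = len(bands[0]), len(bands[0][0])
    samples = []
    for row in range(n_rows):
        for col in range(n_cols):
            if point_in_polygon(col + 0.5, row + 0.5, polygon):
                features = [band[row][col] for band in bands]
                samples.append((features, class_id))
    return samples


# Two 3x3 bands and one training polygon in pixel coordinates (assumed layout).
band_1 = [[10, 11, 12], [13, 14, 15], [16, 17, 18]]
band_2 = [[20, 21, 22], [23, 24, 25], [26, 27, 28]]
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]

samples = collect_samples([band_1, band_2], square, class_id=1)
print(len(samples))  # four pixel centres fall inside the square
```

A zonal mean would reduce the same polygon to a single averaged sample, whereas the per-pixel approach keeps the spread of values within the training area.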

from_dn_ranges(rules, *, dissolve=True, min_area=0.0)[source]

Rule-based classification from digital-number thresholds.

Parameters:
  • rules (dict) – {class_id: [(band, min_dn, max_dn), …], …} Band numbers are 1-based. A pixel must satisfy all conditions in the list to be assigned that class.

  • dissolve (bool)

  • min_area (float)

Return type:

GeoDataFrame
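The rule semantics described above (1-based bands, all conditions must hold) can be sketched for a single pixel as follows. This is an illustration rather than the library's code: classify_pixel is a hypothetical helper, and the inclusive bounds, the first-match-wins order, and the fallback value 0 for unmatched pixels are assumptions.

```python
from typing import Dict, List, Sequence, Tuple

Rule = Tuple[int, float, float]  # (band, min_dn, max_dn); band is 1-based


def classify_pixel(pixel: Sequence[float], rules: Dict[int, List[Rule]]) -> int:
    """Return the first class whose conditions all hold, or 0 if none match."""
    for class_id, conditions in rules.items():
        # Assumption: both bounds are inclusive.
        if all(lo <= pixel[band - 1] <= hi for band, lo, hi in conditions):
            return class_id
    return 0  # assumption: unmatched pixels get 0


rules = {
    1: [(4, 0.15, 1.0), (5, 0.0, 0.10)],
    2: [(5, 0.25, 1.0)],
}

# Five-band pixels; only bands 4 and 5 matter for these rules.
print(classify_pixel([0.0, 0.0, 0.0, 0.20, 0.05], rules))
print(classify_pixel([0.0, 0.0, 0.0, 0.05, 0.50], rules))
print(classify_pixel([0.0, 0.0, 0.0, 0.05, 0.15], rules))
```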

Example

>>> rules = {
...     1: [(4, 0.15, 1.0), (5, 0.0, 0.10)],  # high Red, low NIR
...     2: [(5, 0.25, 1.0)],                     # high NIR
... }
>>> gdf = clf.from_dn_ranges(rules)

static available_algorithms()[source]

Print all supported classification algorithms.

Example

>>> RasterClassifier.available_algorithms()

Return type:

None

band_stats()[source]

Print min / max / mean / std for every band directly from the raster.

Example

>>> clf.band_stats()

Return type:

None
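For reference, the four statistics can be computed for one band with the standard library alone. This sketch only shows what the numbers mean; band_summary is a hypothetical helper, the flat list of values is an assumed input layout, and the population standard deviation is an assumption about which variant is reported.

```python
import statistics
from typing import Dict, Sequence


def band_summary(values: Sequence[float]) -> Dict[str, float]:
    """Min, max, mean and population standard deviation of one band."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),  # assumption: population std
    }


band = [1, 2, 3, 4]  # flattened pixel values of a single band
summary = band_summary(band)
print(summary)
```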

encode_roi(roi_path, label_col, *, output_path=None, id_col='class_id')[source]

Encode a string / categorical label column to consecutive integer IDs and save the result, with no extra scripts required.

Parameters:
  • roi_path (path-like) – Input shapefile / GeoPackage with a text label column.

  • label_col (str) – Column that holds the string class names (e.g. "Age").

  • output_path (path-like, optional) – Destination file. Defaults to <stem>_encoded<suffix> next to the input file.

  • id_col (str) – Name of the new integer-ID column added to the output (default "class_id").

Returns:

  • output_path (Path) – Path of the saved encoded file.

  • mapping (dict) – {label_name: integer_id}. Labels are sorted alphabetically and numbered from 1 (0 is reserved for nodata).

Return type:

Tuple[Path, Dict[str, int]]
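The mapping and default-path rules described above can be sketched in a few lines. Both helpers are hypothetical and only mirror the documented behaviour; note that Python's sorted() orders strings by code point, so uppercase and lowercase labels may sort differently from a locale-aware alphabetical order.

```python
from pathlib import Path
from typing import Dict, Iterable, Union


def build_label_mapping(labels: Iterable[str]) -> Dict[str, int]:
    """Sort the unique labels and number them from 1 (0 stays free for nodata)."""
    return {name: i for i, name in enumerate(sorted(set(labels)), start=1)}


def default_output_path(roi_path: Union[str, Path]) -> Path:
    """Build <stem>_encoded<suffix> next to the input file."""
    roi_path = Path(roi_path)
    return roi_path.with_name(f"{roi_path.stem}_encoded{roi_path.suffix}")


mapping = build_label_mapping(["Jurassic", "Holocene", "Jurassic", "Cretaceous"])
encoded_path = default_output_path("data/ages2.shp")
print(mapping)
print(encoded_path)
```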

Example

>>> out, mapping = clf.encode_roi("ages2.shp", label_col="Age")
>>> print(mapping)   # {'Holocene': 1, 'Jurassic': 2, ...}
>>> gdf_rf = clf.supervised(roi_path=out, class_col="class_id")

save(gdf, path, driver=None)[source]

Write classified polygons to disk.

Format is inferred from extension (.shp, .gpkg, .geojson).

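Parameters:
  • gdf – Classified polygons to write.

  • path – Destination file; the format is inferred from its extension.

  • driver (optional) – Defaults to None.

Return type: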

None
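Inferring the format from the extension could look like the sketch below. infer_driver and EXTENSION_TO_DRIVER are hypothetical names; the driver strings are the usual GDAL/OGR names for these formats, and the assumption that an explicit driver argument takes precedence is not confirmed by this reference.

```python
from pathlib import Path
from typing import Optional, Union

# Usual GDAL/OGR driver names for the three documented extensions.
EXTENSION_TO_DRIVER = {
    ".shp": "ESRI Shapefile",
    ".gpkg": "GPKG",
    ".geojson": "GeoJSON",
}


def infer_driver(path: Union[str, Path], driver: Optional[str] = None) -> str:
    """Return the explicit driver if given, else look it up by file extension."""
    if driver is not None:  # assumption: an explicit driver wins
        return driver
    suffix = Path(path).suffix.lower()
    if suffix not in EXTENSION_TO_DRIVER:
        raise ValueError(f"Cannot infer a driver from extension {suffix!r}")
    return EXTENSION_TO_DRIVER[suffix]


print(infer_driver("classes.shp"))
print(infer_driver("classes.GPKG"))
print(infer_driver("classes.geojson"))
```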