stac

copernicus #

Copernicus Data Space Ecosystem (CDSE) tools and constants.

CopernicusS2Band #

Bases: StrEnum

Copernicus Sentinel-2 Bands for Level-2A.

The value of each member corresponds to the asset key for the band. Base band names (e.g., 'B02') default to their native resolution. Explicit resolution members (e.g., 'B02_20m') are also provided.

base_name property #

base_name: str

Returns the base name of the band (e.g., 'B02').

native_res property #

native_res: int

Returns the native resolution of the band in meters.

Defaults to 10m if band base name is not recognized.

at_res #

at_res(resolution: int | CopernicusS2Resolution) -> str

Returns the asset key for this band at the specified resolution.

Parameters:

Name Type Description Default
resolution int | CopernicusS2Resolution

The resolution to get the key for (e.g., 20 or CopernicusS2Resolution.R20M).

required

Returns:

Type Description
str

The asset key string (e.g., 'B02_20m').

Source code in src/geospatial_tools/stac/copernicus/constants.py
def at_res(self, resolution: int | CopernicusS2Resolution) -> str:
    """
    Returns the asset key for this band at the specified resolution.

    Args:
        resolution: The resolution to get the key for (e.g., 20 or CopernicusS2Resolution.R20M).

    Returns:
        The asset key string (e.g., 'B02_20m').
    """
    res_val = resolution.value if isinstance(resolution, CopernicusS2Resolution) else resolution
    return f"{self.base_name}_{res_val}m"
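
The resolution handling in at_res can be tried out in isolation. The sketch below uses stand-in enums rather than the library classes: the member names R10M, R60M and B02_20M, and the rule that base_name is the text before the first underscore, are assumptions made for illustration. The stand-ins subclass (str, Enum) and (int, Enum) so the snippet also runs on Python versions without StrEnum.

```python
from enum import Enum
from typing import Union


class Resolution(int, Enum):
    """Stand-in for CopernicusS2Resolution (member names are assumed)."""

    R10M = 10
    R20M = 20
    R60M = 60


class Band(str, Enum):
    """Stand-in for CopernicusS2Band with three illustrative members."""

    B02 = "B02"
    B02_20M = "B02_20m"
    B05 = "B05"

    @property
    def base_name(self) -> str:
        # Assumption: the base name is everything before the first underscore.
        return self.value.split("_")[0]

    def at_res(self, resolution: Union[int, Resolution]) -> str:
        # Same logic as the listing above: unwrap the enum, then format the key.
        res_val = resolution.value if isinstance(resolution, Resolution) else resolution
        return f"{self.base_name}_{res_val}m"


key_from_int = Band.B02.at_res(20)
key_from_enum = Band.B02_20M.at_res(Resolution.R60M)
print(key_from_int, key_from_enum)
```

Note that at_res only formats a string. It does not check that the requested resolution actually exists for the band.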

CopernicusS2Collection #

Bases: StrEnum

Copernicus Sentinel-2 Collections.

CopernicusS2Property #

Bases: StrEnum

Copernicus Sentinel-2 STAC query properties.

These are standard STAC properties shared across catalogs. The sortby_field property returns the full JSON path required by the STAC API sortby object.

sortby_field property #

sortby_field: str

Returns the full JSON path prefix required by the STAC API sortby object.
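
As a rough illustration of how such a property can feed a STAC API sortby object, the sketch below uses a hypothetical stand-in enum. The member CLOUD_COVER, its value "eo:cloud_cover", and the "properties." prefix are assumptions for illustration; the actual members and implementation are not shown on this page.

```python
from enum import Enum


class S2Property(str, Enum):
    """Hypothetical stand-in for CopernicusS2Property."""

    CLOUD_COVER = "eo:cloud_cover"

    @property
    def sortby_field(self) -> str:
        # Assumption: item properties are addressed as "properties.<name>".
        return f"properties.{self.value}"


# A sortby object in the shape used by the STAC API sort extension.
sortby = [{"field": S2Property.CLOUD_COVER.sortby_field, "direction": "asc"}]
print(sortby)
```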

CopernicusS2Resolution #

Bases: int, Enum

Copernicus Sentinel-2 Resolutions in meters.

__str__ #

__str__() -> str

Returns the resolution as a string with 'm' suffix.

Source code in src/geospatial_tools/stac/copernicus/constants.py
def __str__(self) -> str:
    """Returns the resolution as a string with 'm' suffix."""
    return f"{self.value}m"

__repr__ #

__repr__() -> str

Returns the resolution value as a string, without the 'm' suffix.

Source code in src/geospatial_tools/stac/copernicus/constants.py
def __repr__(self) -> str:
    """Returns the resolution value as a string, without the 'm' suffix."""
    return f"{self.value}"
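
The difference between the two methods is easiest to see side by side. This is a minimal sketch with a stand-in enum that copies the two method bodies above; the member names are assumed.

```python
from enum import Enum


class Resolution(int, Enum):
    """Stand-in mirroring the two methods shown above."""

    R10M = 10
    R20M = 20

    def __str__(self) -> str:
        return f"{self.value}m"

    def __repr__(self) -> str:
        return f"{self.value}"


res = Resolution.R20M
as_str = str(res)    # value with the 'm' suffix
as_repr = repr(res)  # bare value, no suffix
# Because the enum mixes in int, members also compare and compute like integers.
is_int_like = res == 20 and res * 3 == 60
print(as_str, as_repr, is_int_like)
```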

auth #

This module contains authentication-related functions.

get_copernicus_credentials #

get_copernicus_credentials(logger: Logger = LOGGER) -> tuple[str, str] | None

Retrieves Copernicus credentials from environment variables or prompts the user.

This function first checks for COPERNICUS_USERNAME and COPERNICUS_PASSWORD environment variables. If they are not set, it interactively prompts the user for their username and password.

Using environment variables is recommended for security and to comply with the 12-factor app methodology, which separates configuration from code. This prevents hardcoding sensitive information and makes the application more portable across different environments (development, testing, production).

Parameters:

Name Type Description Default
logger Logger

Logger instance.

LOGGER

Returns:

Type Description
tuple[str, str] | None

A tuple containing the username and password, or None if they could not be obtained.

Source code in src/geospatial_tools/stac/copernicus/auth.py
def get_copernicus_credentials(logger: logging.Logger = LOGGER) -> tuple[str, str] | None:
    """
    Retrieves Copernicus credentials from environment variables or prompts the user.

    This function first checks for `COPERNICUS_USERNAME` and `COPERNICUS_PASSWORD`
    environment variables. If they are not set, it interactively prompts the user
    for their username and password.

    Using environment variables is recommended for security and to comply with the
    12-factor app methodology, which separates configuration from code. This prevents
    hardcoding sensitive information and makes the application more portable across
    different environments (development, testing, production).

    Args:
        logger: Logger instance.

    Returns:
        A tuple containing the username and password, or None if they could not be
        obtained.
    """
    logger.info("Retrieving Copernicus credentials...")
    username = os.environ.get("COPERNICUS_USERNAME")
    password = os.environ.get("COPERNICUS_PASSWORD")

    if not username:
        logger.warning("COPERNICUS_USERNAME environment variable not set.")
        try:
            username = input("Enter your Copernicus username: ")
        except EOFError:
            logger.error("Could not read username from prompt.")
            return None

    if not password:
        logger.warning("COPERNICUS_PASSWORD environment variable not set.")
        try:
            password = getpass.getpass("Enter your Copernicus password: ")
        except EOFError:
            logger.error("Could not read password from prompt.")
            return None

    if not username or not password:
        logger.error("Username or password could not be obtained. Cannot proceed with authentication.")
        return None

    logger.info("Successfully retrieved Copernicus credentials.")
    return username, password

get_copernicus_token #

get_copernicus_token(logger: Logger = LOGGER) -> str | None

Retrieves an access token from the Copernicus Data Space Ecosystem.

This function uses the credentials obtained from get_copernicus_credentials to request an access token from the authentication endpoint.

Parameters:

Name Type Description Default
logger Logger

Logger instance.

LOGGER

Returns:

Type Description
str | None

The access token as a string, or None if authentication fails.

Source code in src/geospatial_tools/stac/copernicus/auth.py
def get_copernicus_token(logger: logging.Logger = LOGGER) -> str | None:
    """
    Retrieves an access token from the Copernicus Data Space Ecosystem.

    This function uses the credentials obtained from `get_copernicus_credentials`
    to request an access token from the authentication endpoint.

    Args:
        logger: Logger instance.

    Returns:
        The access token as a string, or None if authentication fails.
    """
    credentials = get_copernicus_credentials(logger)
    if not credentials:
        return None

    username, password = credentials
    data = {
        "client_id": "cdse-public",
        "username": username,
        "password": password,
        "grant_type": "password",
    }

    try:
        response = requests.post(COPERNICUS_AUTH_URL, data=data, timeout=10)
        response.raise_for_status()
        token_data = response.json()
        access_token = token_data.get("access_token")
        if access_token:
            logger.info("Successfully obtained Copernicus access token.")
            return access_token
        logger.error("Access token not found in response.")
        return None
    except requests.exceptions.RequestException as e:
        logger.error(f"Failed to obtain access token: {e}")
        return None
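
Because get_copernicus_token returns None on failure instead of raising, callers need to check the result before using it. The sketch below shows one way to do that, assuming the token is sent as a standard OAuth 2.0 bearer token. The helper name bearer_headers is hypothetical and not part of the library.

```python
from typing import Dict, Optional


def bearer_headers(token: Optional[str]) -> Dict[str, str]:
    """Build request headers from a token, failing loudly when it is missing."""
    if not token:
        raise RuntimeError("No Copernicus access token available.")
    return {"Authorization": f"Bearer {token}"}


# In real use the token would come from get_copernicus_token(); a dummy is used here.
headers = bearer_headers("dummy-token")
print(headers)
```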

constants #

This module contains Enums for Sentinel-2 on Copernicus Data Space Ecosystem (CDSE).

CopernicusS2Collection #

Bases: StrEnum

Copernicus Sentinel-2 Collections.

CopernicusS2Property #

Bases: StrEnum

Copernicus Sentinel-2 STAC query properties.

These are standard STAC properties shared across catalogs. The sortby_field property returns the full JSON path required by the STAC API sortby object.

sortby_field property #
sortby_field: str

Returns the full JSON path prefix required by the STAC API sortby object.

CopernicusS2Resolution #

Bases: int, Enum

Copernicus Sentinel-2 Resolutions in meters.

__str__ #
__str__() -> str

Returns the resolution as a string with 'm' suffix.

Source code in src/geospatial_tools/stac/copernicus/constants.py
def __str__(self) -> str:
    """Returns the resolution as a string with 'm' suffix."""
    return f"{self.value}m"
__repr__ #
__repr__() -> str

Returns the resolution value as a string, without the 'm' suffix.

Source code in src/geospatial_tools/stac/copernicus/constants.py
def __repr__(self) -> str:
    """Returns the resolution value as a string, without the 'm' suffix."""
    return f"{self.value}"

CopernicusS2Band #

Bases: StrEnum

Copernicus Sentinel-2 Bands for Level-2A.

The value of each member corresponds to the asset key for the band. Base band names (e.g., 'B02') default to their native resolution. Explicit resolution members (e.g., 'B02_20m') are also provided.

base_name property #
base_name: str

Returns the base name of the band (e.g., 'B02').

native_res property #
native_res: int

Returns the native resolution of the band in meters.

Defaults to 10m if band base name is not recognized.

at_res #
at_res(resolution: int | CopernicusS2Resolution) -> str

Returns the asset key for this band at the specified resolution.

Parameters:

Name Type Description Default
resolution int | CopernicusS2Resolution

The resolution to get the key for (e.g., 20 or CopernicusS2Resolution.R20M).

required

Returns:

Type Description
str

The asset key string (e.g., 'B02_20m').

Source code in src/geospatial_tools/stac/copernicus/constants.py
def at_res(self, resolution: int | CopernicusS2Resolution) -> str:
    """
    Returns the asset key for this band at the specified resolution.

    Args:
        resolution: The resolution to get the key for (e.g., 20 or CopernicusS2Resolution.R20M).

    Returns:
        The asset key string (e.g., 'B02_20m').
    """
    res_val = resolution.value if isinstance(resolution, CopernicusS2Resolution) else resolution
    return f"{self.base_name}_{res_val}m"

core #

This module contains functions that are related to STAC API.

AssetSubItem #

AssetSubItem(asset: Item, item_id: str, band: str, filename: str | Path)

Class that represents a STAC asset sub-item.

Generally represents a single satellite image band.

Initializes an AssetSubItem.

Parameters:

Name Type Description Default
asset Item

The pystac Item this asset belongs to.

required
item_id str

The ID of the item.

required
band str

The band name of this sub-item.

required
filename str | Path

The local filename of the downloaded asset.

required
Source code in src/geospatial_tools/stac/core.py
def __init__(self, asset: pystac.Item, item_id: str, band: str, filename: str | Path) -> None:
    """
    Initializes an AssetSubItem.

    Args:
        asset: The pystac Item this asset belongs to.
        item_id: The ID of the item.
        band: The band name of this sub-item.
        filename: The local filename of the downloaded asset.
    """
    if isinstance(filename, str):
        filename = Path(filename)
    self.asset = asset
    self.item_id: str = item_id
    self.band: str = band
    self.filename: Path = filename

Asset #

Asset(
    asset_id: str,
    bands: list[str] | None = None,
    asset_item_list: list[AssetSubItem] | None = None,
    merged_asset_path: str | Path | None = None,
    reprojected_asset: str | Path | None = None,
    logger: Logger = LOGGER,
)

Represents a STAC asset, potentially composed of multiple bands/sub-items.

Initializes an Asset object.

Parameters:

Name Type Description Default
asset_id str

Unique ID for the asset (usually the item ID).

required
bands list[str] | None

List of bands this asset contains.

None
asset_item_list list[AssetSubItem] | None

List of AssetSubItem objects belonging to this asset.

None
merged_asset_path str | Path | None

Path to the merged multi-band raster file.

None
reprojected_asset str | Path | None

Path to the reprojected raster file.

None
logger Logger

Logger instance.

LOGGER
Source code in src/geospatial_tools/stac/core.py
def __init__(
    self,
    asset_id: str,
    bands: list[str] | None = None,
    asset_item_list: list[AssetSubItem] | None = None,
    merged_asset_path: str | Path | None = None,
    reprojected_asset: str | Path | None = None,
    logger: logging.Logger = LOGGER,
) -> None:
    """
    Initializes an Asset object.

    Args:
        asset_id: Unique ID for the asset (usually the item ID).
        bands: List of bands this asset contains.
        asset_item_list: List of AssetSubItem objects belonging to this asset.
        merged_asset_path: Path to the merged multi-band raster file.
        reprojected_asset: Path to the reprojected raster file.
        logger: Logger instance.
    """
    self.asset_id = asset_id
    self.bands = bands
    self.merged_asset_path = Path(merged_asset_path) if isinstance(merged_asset_path, str) else merged_asset_path
    self.reprojected_asset_path = (
        Path(reprojected_asset) if isinstance(reprojected_asset, str) else reprojected_asset
    )
    self.logger = logger

    self._sub_items: list[AssetSubItem] = asset_item_list or []

__iter__ #

__iter__() -> Iterator[AssetSubItem]

Allows direct iteration: for item in asset:

Source code in src/geospatial_tools/stac/core.py
def __iter__(self) -> Iterator[AssetSubItem]:
    """Allows direct iteration: `for item in asset:`"""
    return iter(self._sub_items)

__len__ #

__len__() -> int

Allows checking size: len(asset)

Source code in src/geospatial_tools/stac/core.py
def __len__(self) -> int:
    """Allows checking size: `len(asset)`"""
    return len(self._sub_items)

__contains__ #

__contains__(band_name: str) -> bool

Allows checking for band existence: "B04" in asset

Source code in src/geospatial_tools/stac/core.py
def __contains__(self, band_name: str) -> bool:
    """Allows checking for band existence: `"B04" in asset`"""
    return any(item.band == band_name for item in self._sub_items)

__getitem__ #

__getitem__(index: int) -> AssetSubItem
__getitem__(band_name: str) -> AssetSubItem
__getitem__(key: int | str) -> AssetSubItem

Allows indexing by position or band name: asset[0] or asset["B04"]

Source code in src/geospatial_tools/stac/core.py
def __getitem__(self, key: int | str) -> AssetSubItem:
    """
    Allows indexing by position or band name:
    `asset[0]` or `asset["B04"]`
    """
    if isinstance(key, int):
        return self._sub_items[key]

    if isinstance(key, str):
        for item in self._sub_items:
            if item.band == key:
                return item
        raise KeyError(f"Band '{key}' not found in asset '{self.asset_id}'.")

    raise TypeError(f"Invalid argument type: {type(key)}. Expected int or str.")
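
Taken together, the four special methods above make an asset behave like a small read-only collection of its sub-items. The sketch below reproduces that behavior with hypothetical stand-ins (SubItem and MiniAsset) so it runs without the library or any raster files; only the container logic is copied from the listings above.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SubItem:
    """Hypothetical stand-in for AssetSubItem (band and filename only)."""

    band: str
    filename: Path


class MiniAsset:
    """Hypothetical stand-in reproducing Asset's container methods."""

    def __init__(self, asset_id: str, sub_items: list) -> None:
        self.asset_id = asset_id
        self._sub_items = list(sub_items)

    def __iter__(self):
        return iter(self._sub_items)

    def __len__(self) -> int:
        return len(self._sub_items)

    def __contains__(self, band_name: str) -> bool:
        return any(item.band == band_name for item in self._sub_items)

    def __getitem__(self, key):
        if isinstance(key, int):
            return self._sub_items[key]
        if isinstance(key, str):
            for item in self._sub_items:
                if item.band == key:
                    return item
            raise KeyError(f"Band '{key}' not found in asset '{self.asset_id}'.")
        raise TypeError(f"Invalid argument type: {type(key)}. Expected int or str.")


asset = MiniAsset("S2_tile", [SubItem("B04", Path("b04.tif")), SubItem("B08", Path("b08.tif"))])
bands = [item.band for item in asset]
print(len(asset), bands, "B04" in asset, asset["B08"].filename)
```

A missing band name raises KeyError, while an out-of-range position raises IndexError from the underlying list, so callers that accept both kinds of key may want to handle both.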

add_asset_item #

add_asset_item(asset: AssetSubItem) -> None

Adds an AssetSubItem to the asset.

Parameters:

Name Type Description Default
asset AssetSubItem

The AssetSubItem to add.

required
Source code in src/geospatial_tools/stac/core.py
def add_asset_item(self, asset: AssetSubItem) -> None:
    """
    Adds an AssetSubItem to the asset.

    Args:
      asset: The AssetSubItem to add.
    """
    self._sub_items.append(asset)
    if self.bands is not None and asset.band not in self.bands:
        self.bands.append(asset.band)

show_asset_items #

show_asset_items() -> None

Show items that belong to this asset.

Source code in src/geospatial_tools/stac/core.py
def show_asset_items(self) -> None:
    """Show items that belong to this asset."""
    asset_list = [
        f"ID: [{item.item_id}], Band: [{item.band}], filename: [{item.filename}]" for item in self._sub_items
    ]
    self.logger.info(f"Asset list for asset [{self.asset_id}] :\n\t{asset_list}")

merge_asset #

merge_asset(
    base_directory: str | Path | None = None, delete_sub_items: bool = False
) -> Path | None

Merges individual band rasters into a single multi-band raster file.

Parameters:

Name Type Description Default
base_directory str | Path | None

Directory where the merged file will be saved.

None
delete_sub_items bool

If True, delete individual band files after merging.

False

Returns:

Type Description
Path | None

The Path to the merged file if successful, else None.

Source code in src/geospatial_tools/stac/core.py
def merge_asset(self, base_directory: str | Path | None = None, delete_sub_items: bool = False) -> Path | None:
    """
    Merges individual band rasters into a single multi-band raster file.

    Args:
      base_directory: Directory where the merged file will be saved.
      delete_sub_items: If True, delete individual band files after merging.

    Returns:
        The Path to the merged file if successful, else None.
    """
    if not base_directory:
        base_directory = Path()
    if isinstance(base_directory, str):
        base_directory = Path(base_directory)

    merged_filename = base_directory / f"{self.asset_id}_merged.tif"

    if not self._sub_items:
        self.logger.error(f"No asset items to merge for asset [{self.asset_id}]")
        return None

    asset_filename_list = [asset.filename for asset in self._sub_items]

    meta = self._create_merged_asset_metadata()

    merge_raster_bands(
        merged_filename=merged_filename,
        raster_file_list=asset_filename_list,
        merged_metadata=meta,
        merged_band_names=self.bands,
    )

    if merged_filename.exists():
        self.logger.info(f"Asset [{self.asset_id}] merged successfully")
        self.logger.info(f"Asset location : [{merged_filename}]")
        self.merged_asset_path = merged_filename
        if delete_sub_items:
            self.delete_asset_sub_items()
        return merged_filename
    self.logger.error(f"There was a problem merging asset [{self.asset_id}]")
    return None

reproject_merged_asset #

reproject_merged_asset(
    target_projection: str | int,
    base_directory: str | Path | None = None,
    delete_merged_asset: bool = False,
) -> Path | None

Reprojects the merged multi-band raster to a target projection.

Parameters:

Name Type Description Default
target_projection str | int

The target CRS (EPSG code or string).

required
base_directory str | Path | None

Directory where the reprojected file will be saved.

None
delete_merged_asset bool

If True, delete the merged file after reprojection.

False

Returns:

Type Description
Path | None

The Path to the reprojected file if successful, else None.

Source code in src/geospatial_tools/stac/core.py
def reproject_merged_asset(
    self,
    target_projection: str | int,
    base_directory: str | Path | None = None,
    delete_merged_asset: bool = False,
) -> Path | None:
    """
    Reprojects the merged multi-band raster to a target projection.

    Args:
      target_projection: The target CRS (EPSG code or string).
      base_directory: Directory where the reprojected file will be saved.
      delete_merged_asset: If True, delete the merged file after reprojection.

    Returns:
        The Path to the reprojected file if successful, else None.
    """
    if not base_directory:
        base_directory = Path()
    if isinstance(base_directory, str):
        base_directory = Path(base_directory)
    target_path = base_directory / f"{self.asset_id}_reprojected.tif"
    self.logger.info(f"Reprojecting asset [{self.asset_id}] ...")

    if not self.merged_asset_path:
        self.logger.error(f"Merged asset path is missing for asset [{self.asset_id}]")
        return None

    reprojected_filename = reproject_raster(
        dataset_path=self.merged_asset_path,
        target_path=target_path,
        target_crs=target_projection,
        logger=self.logger,
    )
    if reprojected_filename and reprojected_filename.exists():
        self.logger.info(f"Asset location : [{reprojected_filename}]")
        self.reprojected_asset_path = reprojected_filename
        if delete_merged_asset:
            self.delete_merged_asset()
        return reprojected_filename
    self.logger.error(f"There was a problem reprojecting asset [{self.asset_id}]")
    return None
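
Both merge_asset and reproject_merged_asset derive their output path from asset_id and base_directory alone. The sketch below isolates that naming rule; the helper output_paths is hypothetical and performs no raster work.

```python
from pathlib import Path


def output_paths(asset_id: str, base_directory=None):
    """Return (merged, reprojected) paths using the naming shown above."""
    # An empty or missing base_directory falls back to the current directory.
    base = Path(base_directory) if base_directory else Path()
    return base / f"{asset_id}_merged.tif", base / f"{asset_id}_reprojected.tif"


merged, reprojected = output_paths("S2A_example", "data/out")
print(merged, reprojected)
```

One consequence of this rule is that two assets with the same asset_id written to the same directory map to the same output paths.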

delete_asset_sub_items #

delete_asset_sub_items() -> None

Delete all asset sub items that belong to this asset.

Source code in src/geospatial_tools/stac/core.py
def delete_asset_sub_items(self) -> None:
    """Delete all asset sub items that belong to this asset."""
    self.logger.info(f"Deleting asset sub items from asset [{self.asset_id}]")
    for item in self._sub_items:
        self.logger.info(f"Deleting [{item.filename}] ...")
        item.filename.unlink(missing_ok=True)

delete_merged_asset #

delete_merged_asset() -> None

Delete merged asset.

Source code in src/geospatial_tools/stac/core.py
def delete_merged_asset(self) -> None:
    """Delete merged asset."""
    if self.merged_asset_path:
        self.logger.info(f"Deleting merged asset file for [{self.merged_asset_path}]")
        self.merged_asset_path.unlink(missing_ok=True)

delete_reprojected_asset #

delete_reprojected_asset() -> None

Delete reprojected asset.

Source code in src/geospatial_tools/stac/core.py
def delete_reprojected_asset(self) -> None:
    """Delete reprojected asset."""
    if self.reprojected_asset_path:
        self.logger.info(f"Deleting reprojected asset file for [{self.reprojected_asset_path}]")
        self.reprojected_asset_path.unlink(missing_ok=True)

StacSearch #

StacSearch(catalog_name: str, logger: Logger = LOGGER)

Utility class that facilitates and automates STAC API searches using pystac_client.Client.

Initializes a StacSearch instance.

Parameters:

Name Type Description Default
catalog_name str

Name of the STAC catalog (e.g., 'planetary_computer', 'copernicus').

required
logger Logger

Logger instance.

LOGGER
Source code in src/geospatial_tools/stac/core.py
def __init__(self, catalog_name: str, logger: logging.Logger = LOGGER) -> None:
    """
    Initializes a StacSearch instance.

    Args:
        catalog_name: Name of the STAC catalog (e.g., 'planetary_computer', 'copernicus').
        logger: Logger instance.
    """
    self.catalog_name = catalog_name
    self.catalog: pystac_client.Client | None = catalog_generator(catalog_name=catalog_name)
    self.search_results: list[pystac.Item] | None = None
    self.cloud_cover_sorted_results: list[pystac.Item] | None = None
    self.filtered_results: list[pystac.Item] | None = None
    self.downloaded_search_assets: list[Asset] | None = None
    self.downloaded_cloud_cover_sorted_assets: list[Asset] | None = None
    self.downloaded_best_sorted_asset: Asset | None = None
    self.logger = logger
    self.s3_client: Any | None = None
    if catalog_name == COPERNICUS:
        self.s3_client = utils.get_s3_client()

search #

search(
    date_range: DateLike = None,
    max_items: int | None = None,
    limit: int | None = None,
    ids: list[str] | None = None,
    collections: str | list[str] | None = None,
    bbox: BBoxLike | None = None,
    intersects: IntersectsLike | None = None,
    query: dict[str, Any] | None = None,
    sortby: list[dict[str, str]] | str | list[str] | None = None,
    max_retries: int = 3,
    delay: int = 5,
) -> list[Item]

Runs a STAC API search with the given query and parameters. Essentially a wrapper around pystac_client.Client.

Parameter descriptions taken from pystac docs.

Parameters:

Name Type Description Default
date_range DateLike

Either a single datetime or a datetime range used to filter results. You may express a single datetime using a datetime.datetime instance, an RFC 3339-compliant timestamp (https://tools.ietf.org/html/rfc3339), or a simple date string (see below). Instances of datetime.datetime may be either timezone aware or unaware. Timezone-aware instances are converted to a UTC timestamp before being passed to the endpoint. Timezone-unaware instances are assumed to represent UTC timestamps. You may represent a datetime range using a "/"-separated string as described in the spec, or a list, tuple, or iterator of 2 timestamps or datetime instances. For open-ended ranges, use either ".." ('2020-01-01T00:00:00Z/..', ['2020-01-01T00:00:00Z', '..']) or a value of None (['2020-01-01T00:00:00Z', None]). If using a simple date string, the datetime can be specified in YYYY-mm-dd format, optionally truncated to YYYY-mm or just YYYY. Simple date strings are expanded to include the entire time period, for example: 2017 expands to 2017-01-01T00:00:00Z/2017-12-31T23:59:59Z and 2017-06 expands to 2017-06-01T00:00:00Z/2017-06-30T23:59:59Z. If used in a range, the end of the range expands to the end of that day/month/year, for example: 2017-06-10/2017-06-11 expands to 2017-06-10T00:00:00Z/2017-06-11T23:59:59Z.

None
max_items int | None

The maximum number of items to return from the search, even if there are more matching results.

None
limit int | None

A recommendation to the service as to the number of items to return per page of results.

None
ids list[str] | None

List of one or more Item ids to filter on.

None
collections str | list[str] | None

List of one or more Collection IDs or pystac.Collection instances. Only Items in one of the provided Collections will be searched.

None
bbox BBoxLike | None

A list, tuple, or iterator representing a bounding box of 2D or 3D coordinates. Results will be filtered to only those intersecting the bounding box.

None
intersects IntersectsLike | None

A string or dictionary representing a GeoJSON geometry, or an object that implements a __geo_interface__ property, as supported by several libraries including Shapely, ArcPy, PySAL, and geojson. Results are filtered to only those intersecting the geometry.

None
query dict[str, Any] | None

List or JSON of query parameters as per the STAC API query extension.

None
sortby list[dict[str, str]] | str | list[str] | None

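A single field or list of fields to sort the response by.

None
max_retries int

Maximum number of search attempts; the last APIError is re-raised once they are exhausted.

3
delay int

Seconds to wait between attempts.

5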

Returns:

Type Description
list[Item]

A list of pystac.Item objects matching the search criteria.
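
The simple-date expansion described for date_range can be illustrated with a few lines of standard-library code. This is only a sketch of the rule as stated in the parameter description, not the implementation used by pystac_client or this library; the function name expand_simple_date is hypothetical.

```python
import calendar


def expand_simple_date(value: str) -> str:
    """Expand 'YYYY', 'YYYY-mm' or 'YYYY-mm-dd' into a full start/end range."""
    parts = [int(p) for p in value.split("-")]
    year = parts[0]
    start_month = parts[1] if len(parts) > 1 else 1
    end_month = parts[1] if len(parts) > 1 else 12
    if len(parts) > 2:
        start_day = end_day = parts[2]
    else:
        start_day = 1
        # monthrange returns (weekday of the first day, number of days in the month).
        end_day = calendar.monthrange(year, end_month)[1]
    start = f"{year:04d}-{start_month:02d}-{start_day:02d}T00:00:00Z"
    end = f"{year:04d}-{end_month:02d}-{end_day:02d}T23:59:59Z"
    return f"{start}/{end}"


for simple in ("2017", "2017-06", "2017-06-10"):
    print(simple, "->", expand_simple_date(simple))
```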

Source code in src/geospatial_tools/stac/core.py
def search(
    self,
    date_range: DateLike = None,
    max_items: int | None = None,
    limit: int | None = None,
    ids: list[str] | None = None,
    collections: str | list[str] | None = None,
    bbox: geotools_types.BBoxLike | None = None,
    intersects: geotools_types.IntersectsLike | None = None,
    query: dict[str, Any] | None = None,
    sortby: list[dict[str, str]] | str | list[str] | None = None,
    max_retries: int = 3,
    delay: int = 5,
) -> list[pystac.Item]:
    """
    Runs a STAC API search with the given query and parameters. Essentially a wrapper around `pystac_client.Client`.

    Parameter descriptions taken from pystac docs.

    Args:
      date_range: Either a single datetime or datetime range used to filter results.
            You may express a single datetime using a :class:`datetime.datetime`
            instance, a `RFC 3339-compliant <https://tools.ietf.org/html/rfc3339>`__
            timestamp, or a simple date string (see below). Instances of
            :class:`datetime.datetime` may be either
            timezone aware or unaware. Timezone aware instances will be converted to
            a UTC timestamp before being passed
            to the endpoint. Timezone unaware instances are assumed to represent UTC
            timestamps. You may represent a
            datetime range using a ``"/"`` separated string as described in the
            spec, or a list, tuple, or iterator
            of 2 timestamps or datetime instances. For open-ended ranges, use either
            ``".."`` (``'2020-01-01T00:00:00Z/..'``,
            ``['2020-01-01T00:00:00Z', '..']``) or a value of ``None``
            (``['2020-01-01T00:00:00Z', None]``).
            If using a simple date string, the datetime can be specified in
            ``YYYY-mm-dd`` format, optionally truncating
            to ``YYYY-mm`` or just ``YYYY``. Simple date strings will be expanded to
            include the entire time period, for example: ``2017`` expands to
            ``2017-01-01T00:00:00Z/2017-12-31T23:59:59Z`` and ``2017-06`` expands
            to ``2017-06-01T00:00:00Z/2017-06-30T23:59:59Z``.
            If used in a range, the end of the range expands to the end of that
            day/month/year, for example: ``2017-06-10/2017-06-11`` expands to
            ``2017-06-10T00:00:00Z/2017-06-11T23:59:59Z``.
      max_items: The maximum number of items to return from the search, even if there are
        more matching results.
      limit: A recommendation to the service as to the number of items to return per
        page of results.
      ids: List of one or more Item ids to filter on.
      collections: List of one or more Collection IDs or pystac.Collection instances. Only Items in one of the
        provided Collections will be searched.
      bbox: A list, tuple, or iterator representing a bounding box of 2D or 3D coordinates. Results will be filtered
        to only those intersecting the bounding box.
      intersects: A string or dictionary representing a GeoJSON geometry, or an object that implements a
        __geo_interface__ property, as supported by several libraries including Shapely, ArcPy, PySAL, and geojson.
        Results filtered to only those intersecting the geometry.
      query: List or JSON of query parameters as per the STAC API query extension.
      sortby: A single field or list of fields to sort the response by
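      max_retries: Maximum number of search attempts; the last APIError is re-raised once they are exhausted.
      delay: Seconds to wait between attempts.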

    Returns:
        A list of pystac.Item objects matching the search criteria.
    """
    if isinstance(collections, str):
        collections = [collections]
    if isinstance(sortby, dict):
        sortby = [sortby]

    if not self.catalog:
        self.logger.error("STAC client is not initialized.")
        return []

    intro_log = "Initiating STAC API search"
    if query:
        intro_log = f"{intro_log} \n\tQuery : [{query}]"
    self.logger.info(intro_log)
    items: list[pystac.Item] = []
    for attempt in range(1, max_retries + 1):
        try:
            items = self._base_catalog_search(
                date_range=date_range,
                max_items=max_items,
                limit=limit,
                ids=ids,
                collections=collections,
                bbox=bbox,
                intersects=intersects,
                query=query,
                sortby=sortby,
            )
            break
        except APIError as e:  # pylint: disable=W0718
            self.logger.error(f"Attempt {attempt} failed: {e}")
            if attempt < max_retries:
                time.sleep(delay)
            else:
                raise e

    self.search_results = items
    return items
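
The retry loop above can be sketched on its own. This is a minimal illustration, not the library's code: `call_with_retries` and `flaky_search` are hypothetical names, and `RuntimeError` stands in for `APIError`. Unlike the method above, the sketch raises `ValueError` when `max_retries` is below 1 instead of returning an empty result.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(func: Callable[[], T], max_retries: int = 3, delay: float = 0.0) -> T:
    """Call func up to max_retries times, sleeping delay seconds after each failed attempt."""
    for attempt in range(1, max_retries + 1):
        try:
            return func()
        except RuntimeError:  # stand-in for the APIError caught in the method above
            if attempt == max_retries:
                raise  # out of attempts: surface the last error to the caller
            time.sleep(delay)
    raise ValueError("max_retries must be at least 1")


calls = {"count": 0}


def flaky_search() -> list:
    """Hypothetical search that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return ["item-a", "item-b"]


result = call_with_retries(flaky_search, max_retries=3, delay=0.0)
```

With `max_retries=3` the third attempt succeeds; with `max_retries=2` the second `RuntimeError` would propagate to the caller.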

search_for_date_ranges #

search_for_date_ranges(
    date_ranges: Sequence[DateLike],
    max_items: int | None = None,
    limit: int | None = None,
    collections: str | list[str] | None = None,
    bbox: BBoxLike | None = None,
    intersects: IntersectsLike | None = None,
    query: dict[str, Any] | None = None,
    sortby: list[dict[str, str]] | str | list[str] | None = None,
    max_retries: int = 3,
    delay: int = 5,
) -> list[Item]

STAC API search that will use search query and parameters for each date range in given list of date_ranges.

Date ranges can be generated with the help of the geospatial_tools.utils.create_date_range_for_specific_period function for more complex ranges.

Parameters:

Name Type Description Default
date_ranges Sequence[DateLike]

List containing datetime date ranges

required
max_items int | None

The maximum number of items to return from the search, even if there are more matching results

None
limit int | None

A recommendation to the service as to the number of items to return per page of results.

None
collections str | list[str] | None

List of one or more Collection IDs or pystac.Collection instances. Only Items in one of the provided Collections will be searched.

None
bbox BBoxLike | None

A list, tuple, or iterator representing a bounding box of 2D or 3D coordinates. Results will be filtered to only those intersecting the bounding box.

None
intersects IntersectsLike | None

A string or dictionary representing a GeoJSON geometry, or an object that implements a __geo_interface__ property, as supported by several libraries including Shapely, ArcPy, PySAL, and geojson. Results are filtered to only those intersecting the geometry.

None
query dict[str, Any] | None

List or JSON of query parameters as per the STAC API query extension.

None
sortby list[dict[str, str]] | str | list[str] | None

A single field or list of fields to sort the response by

None
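max_retries int

Maximum number of attempts before the last APIError is re-raised.

3
delay int

Seconds to wait between attempts.

5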

Returns:

Type Description
list[Item]

A list of pystac.Item objects.

Source code in src/geospatial_tools/stac/core.py
def search_for_date_ranges(
    self,
    date_ranges: Sequence[DateLike],
    max_items: int | None = None,
    limit: int | None = None,
    collections: str | list[str] | None = None,
    bbox: geotools_types.BBoxLike | None = None,
    intersects: geotools_types.IntersectsLike | None = None,
    query: dict[str, Any] | None = None,
    sortby: list[dict[str, str]] | str | list[str] | None = None,
    max_retries: int = 3,
    delay: int = 5,
) -> list[pystac.Item]:
    """
    STAC API search that will use search query and parameters for each date range in given list of `date_ranges`.

    Date ranges can be generated with the help of the `geospatial_tools.utils.create_date_range_for_specific_period`
    function for more complex ranges.

    Args:
      date_ranges: List containing datetime date ranges
      max_items: The maximum number of items to return from the search, even if there are more matching results
      limit: A recommendation to the service as to the number of items to return per page of results.
      collections: List of one or more Collection IDs or pystac.Collection instances. Only Items in one of the
        provided Collections will be searched.
      bbox: A list, tuple, or iterator representing a bounding box of 2D or 3D coordinates. Results will be
        filtered to only those intersecting the bounding box.
      intersects: A string or dictionary representing a GeoJSON geometry, or an object that implements
        a __geo_interface__ property, as supported by several libraries including Shapely, ArcPy, PySAL, and
        geojson. Results filtered to only those intersecting the geometry.
      query: List or JSON of query parameters as per the STAC API query extension.
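      sortby: A single field or list of fields to sort the response by.
      max_retries: Maximum number of attempts before the last APIError is re-raised.
      delay: Seconds to wait between attempts.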

    Returns:
        A list of pystac.Item objects.
    """
    results: list[pystac.Item] = []
    if isinstance(collections, str):
        collections = [collections]
    if isinstance(sortby, dict):
        sortby = [sortby]

    if not self.catalog:
        self.logger.error("STAC client is not initialized.")
        return []

    intro_log = f"Running STAC API search for the following parameters: \n\tDate ranges : {date_ranges}"
    if query:
        intro_log = f"{intro_log} \n\tQuery : {query}"
    self.logger.info(intro_log)

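    for attempt in range(1, max_retries + 1):
        # Start every attempt from an empty list so that a partially completed,
        # failed attempt cannot leave duplicate items in the final results.
        results = []
        try:
            for date_range in date_ranges: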
                items = self._base_catalog_search(
                    date_range=date_range,
                    max_items=max_items,
                    limit=limit,
                    collections=collections,
                    bbox=bbox,
                    intersects=intersects,
                    query=query,
                    sortby=sortby,
                )
                results.extend(items)
            break
        except APIError as e:  # pylint: disable=W0718
            self.logger.error(f"Attempt {attempt} failed: {e}")
            if attempt < max_retries:
                time.sleep(delay)
            else:
                raise e

    if not results:
        self.logger.warning(f"Search for date ranges [{date_ranges}] found no results!")

    # An empty search leaves search_results as an empty list, matching the declared return type.
    self.search_results = results
    return results
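
As a rough sketch of what `date_ranges` can contain, the snippet below builds one interval per year. It assumes that the `start/end` interval strings pystac-client accepts for its `datetime` argument are also valid `DateLike` values here. `yearly_period_ranges` is a hypothetical helper, not the library's `create_date_range_for_specific_period`, whose signature is not shown in this section.

```python
from datetime import date, timedelta


def yearly_period_ranges(start_year: int, end_year: int, start_month: int, end_month: int) -> list:
    """Build one 'YYYY-MM-DD/YYYY-MM-DD' interval per year, from start_month through end_month."""
    ranges = []
    for year in range(start_year, end_year + 1):
        first_day = date(year, start_month, 1)
        if end_month == 12:
            last_day = date(year, 12, 31)
        else:
            # Last day of end_month: first day of the following month, minus one day.
            last_day = date(year, end_month + 1, 1) - timedelta(days=1)
        ranges.append(f"{first_day.isoformat()}/{last_day.isoformat()}")
    return ranges


# Three summers (June through August), one interval each.
date_ranges = yearly_period_ranges(2020, 2022, start_month=6, end_month=8)
# → ['2020-06-01/2020-08-31', '2021-06-01/2021-08-31', '2022-06-01/2022-08-31']
```

Each interval is searched with the same `max_items`, so the combined result can hold up to `max_items` items per date range.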

sort_results_by_cloud_coverage #

sort_results_by_cloud_coverage() -> list[Item] | None

Sorts the search results by cloud coverage (ascending).

Returns:

Type Description
list[Item] | None

A list of sorted pystac.Item objects, or None if no results exist.

Source code in src/geospatial_tools/stac/core.py
def sort_results_by_cloud_coverage(self) -> list[pystac.Item] | None:
    """
    Sorts the search results by cloud coverage (ascending).

    Returns:
        A list of sorted pystac.Item objects, or None if no results exist.
    """
    if self.search_results:
        self.logger.debug("Sorting results by cloud cover (from least to most)")
        self.cloud_cover_sorted_results = sorted(
            self.search_results, key=lambda item: item.properties.get("eo:cloud_cover", float("inf"))
        )
        return self.cloud_cover_sorted_results
    self.logger.warning("No results found: please run a search before trying to sort results")
    return None
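
The sort key can be tried in isolation. In this small sketch, plain dictionaries stand in for `pystac.Item` objects (an assumption made to keep the example self-contained); the key function is the one used above.

```python
# Dictionaries stand in for pystac.Item objects; only "properties" matters here.
items = [
    {"id": "a", "properties": {"eo:cloud_cover": 42.0}},
    {"id": "b", "properties": {}},  # no cloud cover reported
    {"id": "c", "properties": {"eo:cloud_cover": 3.5}},
]

# A missing "eo:cloud_cover" maps to +infinity, so such items sort last.
sorted_items = sorted(items, key=lambda item: item["properties"].get("eo:cloud_cover", float("inf")))
sorted_ids = [item["id"] for item in sorted_items]
# → ['c', 'a', 'b']
```

Defaulting to infinity means an item with unknown cloud cover is never ranked ahead of an item with a reported value.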

filter_no_data #

filter_no_data(property_name: str, max_no_data_value: int = 5) -> list[Item] | None

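Keeps only the results whose nodata percentage is below the threshold; results at or above it are dropped. Items that lack the property are treated as having 0% nodata and are kept.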

Parameters:

Name Type Description Default
property_name str

Name of the property containing nodata percentage.

required
max_no_data_value int

Max allowed percentage of nodata. (Default value = 5)

5

Returns:

Type Description
list[Item] | None

Filtered list of pystac.Item objects, or None if there are no results to filter.

Source code in src/geospatial_tools/stac/core.py
def filter_no_data(self, property_name: str, max_no_data_value: int = 5) -> list[pystac.Item] | None:
    """
    Keep only the results whose nodata percentage is below a threshold.

    Args:
      property_name: Name of the property containing nodata percentage.
      max_no_data_value: Max allowed percentage of nodata. (Default value = 5)

    Returns:
        Filtered list of pystac.Item objects, or None if there are no results to filter.
    """
    sorted_results = self.cloud_cover_sorted_results
    if not sorted_results:
        sorted_results = self.sort_results_by_cloud_coverage()
    if not sorted_results:
        return None

    filtered_results = []
    for item in sorted_results:
        if item.properties.get(property_name, 0) < max_no_data_value:
            filtered_results.append(item)
    self.filtered_results = filtered_results

    return filtered_results
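
The threshold comparison has two edge cases worth seeing: the test is a strict less-than, and a missing property counts as 0. The sketch below uses dictionaries in place of `pystac.Item` objects. `keep_below_threshold` is a hypothetical helper and the property name is only illustrative.

```python
def keep_below_threshold(items: list, property_name: str, max_value: float = 5) -> list:
    """Keep items whose property value is strictly below max_value (missing values count as 0)."""
    return [item for item in items if item["properties"].get(property_name, 0) < max_value]


NODATA = "s2:nodata_pixel_percentage"  # illustrative property name
items = [
    {"id": "a", "properties": {NODATA: 0.2}},
    {"id": "b", "properties": {NODATA: 5}},     # equal to the threshold: dropped
    {"id": "c", "properties": {}},              # property missing: treated as 0, kept
    {"id": "d", "properties": {NODATA: 37.0}},
]

kept_ids = [item["id"] for item in keep_below_threshold(items, NODATA, max_value=5)]
# → ['a', 'c']
```

Because missing values default to 0, a misspelled `property_name` filters nothing out, so it is worth checking the name against the item properties first.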

download_search_results #

download_search_results(bands: list[str], base_directory: str | Path) -> list[Asset]

Downloads assets for all search results.

Parameters:

Name Type Description Default
bands list[str]

List of bands to download.

required
base_directory str | Path

The base directory for downloads.

required

Returns:

Type Description
list[Asset]

A list of Asset objects for the downloaded search results.

Source code in src/geospatial_tools/stac/core.py
def download_search_results(self, bands: list[str], base_directory: str | Path) -> list[Asset]:
    """
    Downloads assets for all search results.

    Args:
        bands: List of bands to download.
        base_directory: The base directory for downloads.

    Returns:
        A list of Asset objects for the downloaded search results.
    """
    downloaded_search_results = self._download_results(
        results=self.search_results, bands=bands, base_directory=base_directory
    )
    self.downloaded_search_assets = downloaded_search_results
    return downloaded_search_results

download_sorted_by_cloud_cover_search_results #

download_sorted_by_cloud_cover_search_results(
    bands: list[str],
    base_directory: str | Path,
    first_x_num_of_items: int | None = None,
) -> list[Asset]

Downloads the search results sorted by cloud cover, optionally only the first first_x_num_of_items of them.

Parameters:

Name Type Description Default
bands list[str]

List of bands to download.

required
base_directory str | Path

The base directory for downloads.

required
first_x_num_of_items int | None

Optional number of top items to download.

None

Returns:

Type Description
list[Asset]

A list of Asset objects for the downloaded items.

Source code in src/geospatial_tools/stac/core.py
def download_sorted_by_cloud_cover_search_results(
    self, bands: list[str], base_directory: str | Path, first_x_num_of_items: int | None = None
) -> list[Asset]:
    """
    Downloads the search results sorted by cloud cover, optionally only the first `first_x_num_of_items` of them.

    Args:
        bands: List of bands to download.
        base_directory: The base directory for downloads.
        first_x_num_of_items: Optional number of top items to download.

    Returns:
        A list of Asset objects for the downloaded items.
    """
    results = self._generate_best_results()
    if not results:
        return []
    if first_x_num_of_items:
        results = results[:first_x_num_of_items]
    downloaded_search_results = self._download_results(results=results, bands=bands, base_directory=base_directory)
    self.downloaded_cloud_cover_sorted_assets = downloaded_search_results
    return downloaded_search_results

download_best_cloud_cover_result #

download_best_cloud_cover_result(
    bands: list[str], base_directory: str | Path
) -> Asset | None

Downloads the single best result based on cloud cover.

Parameters:

Name Type Description Default
bands list[str]

List of bands to download.

required
base_directory str | Path

The base directory for downloads.

required

Returns:

Type Description
Asset | None

The Asset object for the best result, or None if no results available.

Source code in src/geospatial_tools/stac/core.py
def download_best_cloud_cover_result(self, bands: list[str], base_directory: str | Path) -> Asset | None:
    """
    Downloads the single best result based on cloud cover.

    Args:
        bands: List of bands to download.
        base_directory: The base directory for downloads.

    Returns:
        The Asset object for the best result, or None if no results available.
    """
    results = self._generate_best_results()
    if not results:
        return None
    best_result = [results[0]]

    if self.downloaded_cloud_cover_sorted_assets:
        self.logger.info(f"Asset [{best_result[0].id}] is already downloaded")
        self.downloaded_best_sorted_asset = self.downloaded_cloud_cover_sorted_assets[0]
        return self.downloaded_cloud_cover_sorted_assets[0]

    downloaded_search_results = self._download_results(
        results=best_result, bands=bands, base_directory=base_directory
    )
    if downloaded_search_results:
        self.downloaded_best_sorted_asset = downloaded_search_results[0]
        return downloaded_search_results[0]
    return None

AbstractStacWrapper #

AbstractStacWrapper(
    collection: str | None = None,
    date_range: DateLike = None,
    bbox: BBoxLike | None = None,
    intersects: IntersectsLike | None = None,
    logger: Logger = LOGGER,
)

Bases: ABC

Abstract base class for STAC search wrappers using a Facade + Proxy pattern.

This class provides a common interface and shared logic for different STAC collections (e.g., Sentinel-1, Sentinel-2). It delegates actual STAC operations to an underlying StacSearch client and exposes results via proxy properties.

Initialize the STAC wrapper.

Parameters:

Name Type Description Default
collection str | None

The STAC collection ID to search.

None
date_range DateLike

Temporal filter for the search.

None
bbox BBoxLike | None

Spatial bounding box filter.

None
intersects IntersectsLike | None

Spatial GeoJSON geometry filter.

None
logger Logger

Logger instance.

LOGGER
Source code in src/geospatial_tools/stac/core.py
def __init__(
    self,
    collection: str | None = None,
    date_range: DateLike = None,
    bbox: BBoxLike | None = None,
    intersects: IntersectsLike | None = None,
    logger: logging.Logger = LOGGER,
):
    """
    Initialize the STAC wrapper.

    Args:
        collection: The STAC collection ID to search.
        date_range: Temporal filter for the search.
        bbox: Spatial bounding box filter.
        intersects: Spatial GeoJSON geometry filter.
        logger: Logger instance.
    """
    self.client: StacSearch = StacSearch(PLANETARY_COMPUTER)
    self.collection = collection
    self.date_range = date_range
    self.bbox = bbox
    self.intersects = intersects
    self.logger = logger
    self.custom_query_params: dict[str, Any] = {}

search_results property #

search_results: list[Item] | None

Proxy property for STAC search results from the underlying client.

downloaded_assets property #

downloaded_assets: list[Asset] | None

Proxy property for downloaded assets from the underlying client.

with_custom_query #

with_custom_query(query_params: dict[str, Any]) -> Self

Merge custom STAC query parameters and invalidate current state.

Parameters:

Name Type Description Default
query_params dict[str, Any]

Dictionary of custom STAC query parameters.

required

Returns:

Type Description
Self

The instance itself (Self) for fluent chaining.

Source code in src/geospatial_tools/stac/core.py
def with_custom_query(self, query_params: dict[str, Any]) -> Self:
    """
    Merge custom STAC query parameters and invalidate current state.

    Args:
        query_params: Dictionary of custom STAC query parameters.

    Returns:
        The instance itself (Self) for fluent chaining.
    """
    self._invalidate_state()
    self.custom_query_params.update(query_params)
    return self
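
The fluent, state-invalidating style used by `with_custom_query` can be sketched with a small stand-in class. `QueryBuilder` and its fake `search` are hypothetical and exist only for illustration; the query values follow the operator form of the STAC API query extension.

```python
from typing import Optional


class QueryBuilder:
    """Hypothetical stand-in for a wrapper with fluent configuration and cached results."""

    def __init__(self) -> None:
        self.custom_query_params: dict = {}
        self.results: Optional[list] = None

    def with_custom_query(self, query_params: dict) -> "QueryBuilder":
        self.results = None  # cached results no longer match the query
        self.custom_query_params.update(query_params)  # shallow merge, later keys win
        return self  # returning self is what allows chaining

    def search(self) -> list:
        # Pretend search: just report which properties were queried.
        self.results = sorted(self.custom_query_params)
        return self.results


builder = QueryBuilder()
builder.with_custom_query({"eo:cloud_cover": {"lt": 20}}).with_custom_query({"platform": {"eq": "Sentinel-2A"}})
first = builder.search()

# Changing the query afterwards clears the cached results.
builder.with_custom_query({"eo:cloud_cover": {"lt": 10}})
```

Because `dict.update` is a shallow merge, the second `eo:cloud_cover` entry replaces the first one as a whole instead of combining the two operators.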

search #

search() -> list[Item] | None

Execute the STAC search using the built query and parameters.

Returns:

Type Description
list[Item] | None

List of matched pystac Items, or None if no results.

Source code in src/geospatial_tools/stac/core.py
def search(self) -> list[pystac.Item] | None:
    """
    Execute the STAC search using the built query and parameters.

    Returns:
        List of matched pystac Items, or None if no results.
    """
    query = self._build_collection_query()
    query.update(self.custom_query_params)

    collections = [self.collection] if self.collection else None

    self.client.search(
        collections=collections,
        bbox=self.bbox,
        intersects=self.intersects,
        date_range=self.date_range,
        query=query if query else None,
    )
    return self.search_results

download #

download(bands: list[str], base_directory: str | Path) -> list[Asset] | None

Download assets for the matched search results.

Triggers a search if no results are currently available.

Parameters:

Name Type Description Default
bands list[str]

List of asset keys (bands) to download.

required
base_directory str | Path

Local directory where assets will be saved.

required

Returns:

Type Description
list[Asset] | None

List of downloaded Asset objects.

Source code in src/geospatial_tools/stac/core.py
def download(self, bands: list[str], base_directory: str | Path) -> list[Asset] | None:
    """
    Download assets for the matched search results.

    Triggers a search if no results are currently available.

    Args:
        bands: List of asset keys (bands) to download.
        base_directory: Local directory where assets will be saved.

    Returns:
        List of downloaded Asset objects.
    """
    if self.search_results is None:
        self.search()

    # S1 might need a small override here for lowercase logic,
    # but the core logic stays here.
    self.client.download_search_results(bands=bands, base_directory=Path(base_directory))
    return self.downloaded_assets

create_planetary_computer_catalog #

create_planetary_computer_catalog(
    max_retries: int = 3, delay: int = 5, logger: Logger = LOGGER
) -> Client | None

Creates a Planetary Computer Catalog Client.

Parameters:

Name Type Description Default
max_retries int

The maximum number of retries for the API connection. (Default value = 3)

3
delay int

The delay between retry attempts in seconds. (Default value = 5)

5
logger Logger

The logger instance to use. (Default value = LOGGER)

LOGGER

Returns:

Type Description
Client | None

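A pystac_client.Client instance if the connection succeeds. If every attempt fails, the last exception is re-raised; None is returned only when max_retries is less than 1.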

Source code in src/geospatial_tools/stac/core.py
def create_planetary_computer_catalog(
    max_retries: int = 3, delay: int = 5, logger: logging.Logger = LOGGER
) -> pystac_client.Client | None:
    """
    Creates a Planetary Computer Catalog Client.

    Args:
      max_retries: The maximum number of retries for the API connection. (Default value = 3)
      delay: The delay between retry attempts in seconds. (Default value = 5)
      logger: The logger instance to use. (Default value = LOGGER)

    Returns:
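        A pystac_client.Client instance if the connection succeeds. If every attempt fails, the last
        exception is re-raised; None is returned only when max_retries is less than 1.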
    """
    for attempt in range(1, max_retries + 1):
        try:
            client = pystac_client.Client.open(PLANETARY_COMPUTER_API, modifier=sign_inplace)
            logger.debug("Successfully connected to the API.")
            return client
        except Exception as e:  # pylint: disable=W0718
            logger.error(f"Attempt {attempt} failed: {e}")
            if attempt < max_retries:
                time.sleep(delay)
            else:
                logger.error(e)
                raise e
    return None

create_copernicus_catalog #

create_copernicus_catalog(
    max_retries: int = 3, delay: int = 5, logger: Logger = LOGGER
) -> Client | None

Creates a Copernicus Data Space Ecosystem Catalog Client.

Parameters:

Name Type Description Default
max_retries int

The maximum number of retries for the API connection. (Default value = 3)

3
delay int

The delay between retry attempts in seconds. (Default value = 5)

5
logger Logger

The logger instance to use. (Default value = LOGGER)

LOGGER

Returns:

Type Description
Client | None

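A pystac_client.Client instance if the connection succeeds. If every attempt fails, the last exception is re-raised; None is returned only when max_retries is less than 1.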

Source code in src/geospatial_tools/stac/core.py
def create_copernicus_catalog(
    max_retries: int = 3, delay: int = 5, logger: logging.Logger = LOGGER
) -> pystac_client.Client | None:
    """
    Creates a Copernicus Data Space Ecosystem Catalog Client.

    Args:
      max_retries: The maximum number of retries for the API connection. (Default value = 3)
      delay: The delay between retry attempts in seconds. (Default value = 5)
      logger: The logger instance to use. (Default value = LOGGER)

    Returns:
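        A pystac_client.Client instance if the connection succeeds. If every attempt fails, the last
        exception is re-raised; None is returned only when max_retries is less than 1.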
    """
    for attempt in range(1, max_retries + 1):
        try:
            client = pystac_client.Client.open(COPERNICUS_API)
            logger.debug("Successfully connected to the API.")
            return client
        except Exception as e:  # pylint: disable=W0718
            logger.error(f"Attempt {attempt} failed: {e}")
            if attempt < max_retries:
                time.sleep(delay)
            else:
                logger.error(e)
                raise e
    return None

catalog_generator #

catalog_generator(catalog_name: str, logger: Logger = LOGGER) -> Client | None

Generates a STAC Client for the specified catalog.

Parameters:

Name Type Description Default
catalog_name str

The name of the catalog (e.g., 'planetary_computer', 'copernicus').

required
logger Logger

The logger instance to use.

LOGGER

Returns:

Type Description
Client | None

A pystac_client.Client instance for the requested catalog if supported, else None.

Source code in src/geospatial_tools/stac/core.py
def catalog_generator(catalog_name: str, logger: logging.Logger = LOGGER) -> pystac_client.Client | None:
    """
    Generates a STAC Client for the specified catalog.

    Args:
      catalog_name: The name of the catalog (e.g., 'planetary_computer', 'copernicus').
      logger: The logger instance to use.

    Returns:
        A pystac_client.Client instance for the requested catalog if supported, else None.
    """
    catalog_dict = {
        PLANETARY_COMPUTER: create_planetary_computer_catalog,
        COPERNICUS: create_copernicus_catalog,
    }
    if catalog_name not in catalog_dict:
        logger.error(f"Unsupported catalog name: {catalog_name}")
        return None

    catalog = catalog_dict[catalog_name]()

    return catalog
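
The lookup-then-call dispatch used by `catalog_generator` can be sketched without any network access. The factory functions and catalog names below are hypothetical placeholders for the real client constructors.

```python
from typing import Callable, Optional


def open_catalog_a() -> str:
    """Hypothetical factory; a real one would open a STAC client."""
    return "client-for-catalog-a"


def open_catalog_b() -> str:
    """Hypothetical factory for a second catalog."""
    return "client-for-catalog-b"


# Map each supported name to the factory (not the result), so nothing is opened until requested.
FACTORIES: dict = {
    "catalog_a": open_catalog_a,
    "catalog_b": open_catalog_b,
}


def open_catalog(name: str) -> Optional[str]:
    """Return the client for name, or None when the name is not supported."""
    factory: Optional[Callable[[], str]] = FACTORIES.get(name)
    if factory is None:
        return None
    return factory()


known = open_catalog("catalog_a")
unknown = open_catalog("catalog_z")
```

Storing callables instead of clients keeps the lookup table cheap to build: only the requested catalog is ever opened.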

list_available_catalogs #

list_available_catalogs(logger: Logger = LOGGER) -> frozenset[str]

Lists all available STAC catalogs.

Parameters:

Name Type Description Default
logger Logger

The logger instance to use.

LOGGER

Returns:

Type Description
frozenset[str]

A frozenset of available catalog names.

Source code in src/geospatial_tools/stac/core.py
def list_available_catalogs(logger: logging.Logger = LOGGER) -> frozenset[str]:
    """
    Lists all available STAC catalogs.

    Args:
      logger: The logger instance to use.

    Returns:
        A frozenset of available catalog names.
    """
    logger.info("Available catalogs")
    return CATALOG_NAME_LIST

download_stac_asset #

download_stac_asset(
    asset_url: str,
    destination: Path,
    method: str = "http",
    headers: dict[str, str] | None = None,
    s3_client: Any | None = None,
    logger: Logger = LOGGER,
) -> Path | None

Generic dispatcher for downloading STAC assets via HTTP or S3.

Parameters:

Name Type Description Default
asset_url str

URL/HREF of the asset to download.

required
destination Path

Path where the file will be saved.

required
method str

Download method ('http' or 's3').

'http'
headers dict[str, str] | None

Headers for HTTP request.

None
s3_client Any | None

Boto3 S3 client (required for 's3' method).

None
logger Logger

Logger instance.

LOGGER

Returns:

Type Description
Path | None

The Path to the downloaded file if successful, else None.

Source code in src/geospatial_tools/stac/core.py
def download_stac_asset(
    asset_url: str,
    destination: Path,
    method: str = "http",
    headers: dict[str, str] | None = None,
    s3_client: Any | None = None,
    logger: logging.Logger = LOGGER,
) -> Path | None:
    """
    Generic dispatcher for downloading STAC assets via HTTP or S3.

    Args:
        asset_url: URL/HREF of the asset to download.
        destination: Path where the file will be saved.
        method: Download method ('http' or 's3').
        headers: Headers for HTTP request.
        s3_client: Boto3 S3 client (required for 's3' method).
        logger: Logger instance.

    Returns:
        The Path to the downloaded file if successful, else None.
    """
    if method == "s3":
        file_path = utils.download_url_s3(
            asset_url=asset_url, destination=destination, s3_client=s3_client, logger=logger
        )
        return file_path
    # Default to HTTP
    file_path = download_url(url=asset_url, filename=destination, headers=headers, logger=logger)
    return file_path

planetary_computer #

PlanetaryComputerS1Band #

Bases: StrEnum

Planetary Computer Sentinel-1 asset band keys.

Used to fetch assets from the STAC item.

PlanetaryComputerS1Collection #

Bases: StrEnum

Planetary Computer Sentinel-1 Collections.

PlanetaryComputerS1InstrumentMode #

Bases: StrEnum

Planetary Computer Sentinel-1 instrument modes.

Used for STAC queries.

PlanetaryComputerS1OrbitState #

Bases: StrEnum

Planetary Computer Sentinel-1 orbit states.

Used for STAC queries.

PlanetaryComputerS1Polarization #

Bases: StrEnum

Planetary Computer Sentinel-1 polarizations.

Used for STAC queries.

PlanetaryComputerS1Property #

Bases: StrEnum

Planetary Computer Sentinel-1 STAC query properties.

PlanetaryComputerS2Band #

Bases: StrEnum

Planetary Computer Sentinel-2 asset band keys.

Planetary Computer uses plain base names (e.g., "B02") as asset keys, unlike Copernicus which appends resolution suffixes.
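
The difference between the two asset-key conventions can be sketched as follows. `PlainBand` is a hypothetical enum, written with a `str` mixin so it also runs on Python versions without `StrEnum`, and `suffixed_key` mirrors the `"{base_name}_{resolution}m"` format that `CopernicusS2Band.at_res` returns.

```python
from enum import Enum


class PlainBand(str, Enum):
    """Hypothetical band enum whose values are plain asset keys."""

    B02 = "B02"
    B8A = "B8A"


def suffixed_key(base_name: str, resolution_m: int) -> str:
    """Build a resolution-suffixed asset key such as 'B02_20m'."""
    return f"{base_name}_{resolution_m}m"


plain_key = PlainBand.B02.value                        # plain base name used as the asset key
suffixed = suffixed_key(PlainBand.B02.value, 20)       # same band with a resolution suffix
# plain_key → 'B02', suffixed → 'B02_20m'
```

Band lists are therefore not interchangeable between the two catalogs: the same band needs a different asset key in each.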

PlanetaryComputerS2Collection #

Bases: StrEnum

Planetary Computer Sentinel-2 Collections.

PlanetaryComputerS2Property #

Bases: StrEnum

Planetary Computer Sentinel-2 STAC query properties.

sortby_field property #

sortby_field: str

Returns the full JSON path prefix required by the STAC API sortby object.

PlanetaryComputerS3Band #

Bases: StrEnum

Planetary Computer Sentinel-3 asset band keys.

PlanetaryComputerS3Collection #

Bases: StrEnum

Planetary Computer Sentinel-3 Collections.

PlanetaryComputerS3Property #

Bases: StrEnum

Planetary Computer Sentinel-3 STAC query properties.

constants #

Constants for Planetary Computer Sentinel-2 STAC catalog.

PlanetaryComputerS2Collection #

Bases: StrEnum

Planetary Computer Sentinel-2 Collections.

PlanetaryComputerS2Property #

Bases: StrEnum

Planetary Computer Sentinel-2 STAC query properties.

sortby_field property #
sortby_field: str

Returns the full JSON path prefix required by the STAC API sortby object.

PlanetaryComputerS2Band #

Bases: StrEnum

Planetary Computer Sentinel-2 asset band keys.

Planetary Computer uses plain base names (e.g., "B02") as asset keys, unlike Copernicus which appends resolution suffixes.

PlanetaryComputerS1Collection #

Bases: StrEnum

Planetary Computer Sentinel-1 Collections.

PlanetaryComputerS1Property #

Bases: StrEnum

Planetary Computer Sentinel-1 STAC query properties.

PlanetaryComputerS1Band #

Bases: StrEnum

Planetary Computer Sentinel-1 asset band keys.

Used to fetch assets from the STAC item.

PlanetaryComputerS1InstrumentMode #

Bases: StrEnum

Planetary Computer Sentinel-1 instrument modes.

Used for STAC queries.

PlanetaryComputerS1Polarization #

Bases: StrEnum

Planetary Computer Sentinel-1 polarizations.

Used for STAC queries.

PlanetaryComputerS1OrbitState #

Bases: StrEnum

Planetary Computer Sentinel-1 orbit states.

Used for STAC queries.

PlanetaryComputerS3Collection #

Bases: StrEnum

Planetary Computer Sentinel-3 Collections.

PlanetaryComputerS3Property #

Bases: StrEnum

Planetary Computer Sentinel-3 STAC query properties.

PlanetaryComputerS3OrbitState #

Bases: StrEnum

Planetary Computer Sentinel-3 orbit states.

Used for STAC queries.

PlanetaryComputerS3Band #

Bases: StrEnum

Planetary Computer Sentinel-3 asset band keys.

sentinel_1 #

Sentinel1Search #

Sentinel1Search(
    collection: PlanetaryComputerS1Collection | str = GRD,
    date_range: DateLike = None,
    bbox: BBoxLike | None = None,
    intersects: IntersectsLike | None = None,
    logger: Logger = LOGGER,
)

Bases: AbstractStacWrapper

Executable wrapper for Sentinel-1 GRD data on Planetary Computer.

Implements a fluent builder pattern to construct STAC queries for SAR data. Execution and result storage are delegated to an underlying StacSearch client via proxy properties.

Initialize Sentinel1Search.

Parameters:

Name Type Description Default
collection PlanetaryComputerS1Collection | str

The Sentinel-1 STAC collection (default: sentinel-1-grd).

GRD
date_range DateLike

Temporal filter, native pystac DateLike.

None
bbox BBoxLike | None

Spatial bounding box filter.

None
intersects IntersectsLike | None

Spatial GeoJSON geometry filter.

None
logger Logger

Custom logger instance.

LOGGER
Source code in src/geospatial_tools/stac/planetary_computer/sentinel_1.py
def __init__(
    self,
    collection: PlanetaryComputerS1Collection | str = PlanetaryComputerS1Collection.GRD,
    date_range: DateLike = None,
    bbox: BBoxLike | None = None,
    intersects: IntersectsLike | None = None,
    logger: logging.Logger = LOGGER,
) -> None:
    """
    Initialize Sentinel1Search.

    Args:
        collection: The Sentinel-1 STAC collection (default: sentinel-1-grd).
        date_range: Temporal filter, native pystac DateLike.
        bbox: Spatial bounding box filter.
        intersects: Spatial GeoJSON geometry filter.
        logger: Custom logger instance.
    """
    super().__init__(collection=collection, date_range=date_range, bbox=bbox, intersects=intersects, logger=logger)

    self.instrument_modes: list[PlanetaryComputerS1InstrumentMode] | None = None
    self.polarizations: list[PlanetaryComputerS1Polarization] | None = None
    self.orbit_states: list[PlanetaryComputerS1OrbitState] | None = None
    self.custom_query_params: dict[str, Any] = {}
search_results property #
search_results: list[Item] | None

Proxy property for STAC search results from the underlying client.

downloaded_assets property #
downloaded_assets: list[Asset] | None

Proxy property for downloaded assets from the underlying client.

filter_by_instrument_mode #
filter_by_instrument_mode(
    modes: list[PlanetaryComputerS1InstrumentMode] | PlanetaryComputerS1InstrumentMode,
) -> Self

Filter SAR products by instrument mode (e.g., IW, EW).

Invalidates current search results.

Parameters:

Name Type Description Default
modes list[PlanetaryComputerS1InstrumentMode] | PlanetaryComputerS1InstrumentMode

Single mode or list of PlanetaryComputerS1InstrumentMode.

required

Returns:

Type Description
Self

The instance itself (Self) for fluent chaining.

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_1.py
def filter_by_instrument_mode(
    self, modes: list[PlanetaryComputerS1InstrumentMode] | PlanetaryComputerS1InstrumentMode
) -> Self:
    """
    Filter SAR products by instrument mode (e.g., IW, EW).

    Invalidates current search results.

    Args:
        modes: Single mode or list of `PlanetaryComputerS1InstrumentMode`.

    Returns:
        The instance itself (Self) for fluent chaining.
    """
    self._invalidate_state()
    if isinstance(modes, list):
        self.instrument_modes = modes
    else:
        self.instrument_modes = [modes]
    return self
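
The scalar-or-list handling in the source above could be exercised as follows. This is a sketch only: `Mode` and `SearchSketch` are hypothetical stand-ins for `PlanetaryComputerS1InstrumentMode` and `Sentinel1Search`, and the member values are assumptions.

```python
from __future__ import annotations

from enum import Enum


class Mode(str, Enum):
    """Hypothetical stand-in for PlanetaryComputerS1InstrumentMode."""

    IW = "IW"
    EW = "EW"


class SearchSketch:
    """Hypothetical builder, not the library class."""

    def __init__(self) -> None:
        self.instrument_modes: list[Mode] | None = None

    def filter_by_instrument_mode(self, modes: list[Mode] | Mode) -> "SearchSketch":
        # Wrap a single value so the attribute is always a list.
        self.instrument_modes = modes if isinstance(modes, list) else [modes]
        return self  # returning self is what enables chaining


single = SearchSketch().filter_by_instrument_mode(Mode.IW)
several = SearchSketch().filter_by_instrument_mode([Mode.IW, Mode.EW])
```

Storing a list in both cases means downstream query-building code only has to handle one shape.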
filter_by_polarization #
filter_by_polarization(
    polarizations: (
        list[PlanetaryComputerS1Polarization] | PlanetaryComputerS1Polarization
    ),
) -> Self

Filter SAR products by polarization (e.g., VV, VH).

Invalidates current search results. Note: PC STAC requires an exact array match for sar:polarizations.

Parameters:

Name Type Description Default
polarizations list[PlanetaryComputerS1Polarization] | PlanetaryComputerS1Polarization

Single polarization or list of PlanetaryComputerS1Polarization.

required

Returns:

Type Description
Self

The instance itself (Self) for fluent chaining.

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_1.py
def filter_by_polarization(
    self, polarizations: list[PlanetaryComputerS1Polarization] | PlanetaryComputerS1Polarization
) -> Self:
    """
    Filter SAR products by polarization (e.g., VV, VH).

    Invalidates current search results. Note: PC STAC requires an exact array match
    for `sar:polarizations`.

    Args:
        polarizations: Single polarization or list of `PlanetaryComputerS1Polarization`.

    Returns:
        The instance itself (Self) for fluent chaining.
    """
    self._invalidate_state()
    if isinstance(polarizations, list):
        self.polarizations = polarizations
    else:
        self.polarizations = [polarizations]
    return self
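
The exact-array-match note has a practical consequence: requesting `["VV"]` would not match a product whose `sar:polarizations` is `["VV", "VH"]`. A small sketch of that behavior follows, assuming array equality compares the same values in the same order; the `query` dictionary shape is also an assumption about what the wrapper might send, not something confirmed by the source on this page.

```python
def matches_exactly(item_polarizations: list[str], requested: list[str]) -> bool:
    # Exact array match: equality of the whole list, not a subset test.
    return item_polarizations == requested


# Assumed shape of an equality filter on the polarization array.
query = {"sar:polarizations": {"eq": ["VV", "VH"]}}

dual_pol_item = ["VV", "VH"]
full_match = matches_exactly(dual_pol_item, query["sar:polarizations"]["eq"])
partial_match = matches_exactly(dual_pol_item, ["VV"])
```

In practice this suggests passing the complete polarization list of the products of interest rather than a single channel.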
filter_by_orbit_state #
filter_by_orbit_state(
    states: list[PlanetaryComputerS1OrbitState] | PlanetaryComputerS1OrbitState,
) -> Self

Filter SAR products by orbit state (ascending or descending).

Invalidates current search results.

Parameters:

Name Type Description Default
states list[PlanetaryComputerS1OrbitState] | PlanetaryComputerS1OrbitState

Single state or list of PlanetaryComputerS1OrbitState.

required

Returns:

Type Description
Self

The instance itself (Self) for fluent chaining.

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_1.py
def filter_by_orbit_state(
    self, states: list[PlanetaryComputerS1OrbitState] | PlanetaryComputerS1OrbitState
) -> Self:
    """
    Filter SAR products by orbit state (ascending or descending).

    Invalidates current search results.

    Args:
        states: Single state or list of `PlanetaryComputerS1OrbitState`.

    Returns:
        The instance itself (Self) for fluent chaining.
    """
    self._invalidate_state()
    if isinstance(states, list):
        self.orbit_states = states
    else:
        self.orbit_states = [states]
    return self
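
Every filter method calls `_invalidate_state()` before changing a filter. Its implementation is not shown on this page; the sketch below assumes it simply drops cached results so that a later read cannot return data from an outdated query. `CachedSearchSketch` is a hypothetical class written for illustration.

```python
from __future__ import annotations


class CachedSearchSketch:
    """Hypothetical builder that caches results between searches."""

    def __init__(self) -> None:
        self.orbit_states: list[str] | None = None
        self.results: list[str] | None = None

    def _invalidate_state(self) -> None:
        # Assumption: invalidation drops cached results so they cannot go stale.
        self.results = None

    def filter_by_orbit_state(self, states: list[str] | str) -> "CachedSearchSketch":
        self._invalidate_state()
        self.orbit_states = states if isinstance(states, list) else [states]
        return self

    def search(self) -> list[str]:
        # Fake search: one result per selected orbit state.
        self.results = [f"item-{state}" for state in self.orbit_states or []]
        return self.results


sketch = CachedSearchSketch().filter_by_orbit_state("ascending")
first = sketch.search()
sketch.filter_by_orbit_state("descending")  # changing a filter clears the cache
stale_cleared = sketch.results is None
```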
download #
download(
    bands: list[PlanetaryComputerS1Band | str], base_directory: str | Path
) -> list[Asset] | None

Download Sentinel-1 assets with lowercase band key normalization.

Parameters:

Name Type Description Default
bands list[PlanetaryComputerS1Band | str]

List of bands to download.

required
base_directory str | Path

Local directory where assets will be saved.

required

Returns:

Type Description
list[Asset] | None

List of downloaded Asset objects.

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_1.py
def download(self, bands: list[PlanetaryComputerS1Band | str], base_directory: str | Path) -> list[Asset] | None:
    """
    Download Sentinel-1 assets with lowercase band key normalization.

    Args:
        bands: List of bands to download.
        base_directory: Local directory where assets will be saved.

    Returns:
        List of downloaded Asset objects.
    """
    # Small specialized override for S1 casing requirements
    lower_bands = [str(b).lower() for b in bands]
    return super().download(bands=lower_bands, base_directory=base_directory)
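
The lowercase normalization in the override above accepts enum members and plain strings alike. A minimal sketch, assuming a hypothetical `Band` enum with uppercase values in place of `PlanetaryComputerS1Band` (the real member values are not shown here):

```python
from enum import Enum


class Band(str, Enum):
    """Hypothetical stand-in for PlanetaryComputerS1Band."""

    VV = "VV"
    VH = "VH"

    def __str__(self) -> str:
        # Mimic StrEnum: str(member) yields the raw value.
        return self.value


def normalize_bands(bands: list) -> list:
    # Same expression as the override: stringify, then lowercase.
    return [str(band).lower() for band in bands]


normalized = normalize_bands([Band.VV, "VH", "vv"])
```

Normalizing at the boundary lets callers mix enum members and strings without worrying about the casing of the asset keys.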
with_custom_query #
with_custom_query(query_params: dict[str, Any]) -> Self

Merge custom STAC query parameters and invalidate current state.

Parameters:

Name Type Description Default
query_params dict[str, Any]

Dictionary of custom STAC query parameters.

required

Returns:

Type Description
Self

The instance itself (Self) for fluent chaining.

Source code in src/geospatial_tools/stac/core.py
def with_custom_query(self, query_params: dict[str, Any]) -> Self:
    """
    Merge custom STAC query parameters and invalidate current state.

    Args:
        query_params: Dictionary of custom STAC query parameters.

    Returns:
        The instance itself (Self) for fluent chaining.
    """
    self._invalidate_state()
    self.custom_query_params.update(query_params)
    return self
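
Because the method relies on `dict.update`, repeated calls accumulate keys, and a later call replaces the value of a key set earlier. The sketch below shows only that merging behavior on a plain dictionary; the property names are illustrative choices, not values taken from the library.

```python
custom_query_params: dict = {}

# First call: one filter.
custom_query_params.update({"platform": {"eq": "SENTINEL-1A"}})
# Second call: a new key is added alongside the first.
custom_query_params.update({"sat:relative_orbit": {"eq": 37}})
# Third call: the same key again, so the earlier value is replaced.
custom_query_params.update({"platform": {"eq": "SENTINEL-1B"}})
```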
search #
search() -> list[Item] | None

Execute the STAC search using the built query and parameters.

Returns:

Type Description
list[Item] | None

List of matched pystac Items, or None if no results.

Source code in src/geospatial_tools/stac/core.py
def search(self) -> list[pystac.Item] | None:
    """
    Execute the STAC search using the built query and parameters.

    Returns:
        List of matched pystac Items, or None if no results.
    """
    query = self._build_collection_query()
    query.update(self.custom_query_params)

    collections = [self.collection] if self.collection else None

    self.client.search(
        collections=collections,
        bbox=self.bbox,
        intersects=self.intersects,
        date_range=self.date_range,
        query=query if query else None,
    )
    return self.search_results
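
Two details of the source above are easy to miss: custom parameters are applied after the built query, so they win on key collisions, and an empty query is passed to the client as `None`. A small sketch of just that assembly step, using a hypothetical helper `assemble_query`:

```python
from __future__ import annotations

from typing import Any


def assemble_query(built: dict[str, Any], custom: dict[str, Any]) -> dict[str, Any] | None:
    query = dict(built)   # copy so the inputs stay untouched
    query.update(custom)  # custom keys overwrite built keys
    return query if query else None  # an empty dict becomes None


overridden = assemble_query({"eo:cloud_cover": {"lt": 10}}, {"eo:cloud_cover": {"lt": 5}})
empty = assemble_query({}, {})
```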

sentinel_2 #

Sentinel2Search #

Sentinel2Search(
    collection: PlanetaryComputerS2Collection | str = L2A,
    date_range: DateLike = None,
    bbox: BBoxLike | None = None,
    intersects: IntersectsLike | None = None,
    logger: Logger = LOGGER,
)

Bases: AbstractStacWrapper

Executable wrapper for Sentinel-2 L2A data on Planetary Computer.

Implements a fluent builder pattern to construct STAC queries for optical data. Execution and result storage are delegated to an underlying StacSearch client via proxy properties.

Initialize Sentinel2Search.

Parameters:

Name Type Description Default
collection PlanetaryComputerS2Collection | str

Collection used to search for Sentinel 2 products.

L2A
date_range DateLike

Temporal filter, native pystac DateLike.

None
bbox BBoxLike | None

Spatial bounding box filter.

None
intersects IntersectsLike | None

Spatial GeoJSON geometry filter.

None
logger Logger

Custom logger instance.

LOGGER
Source code in src/geospatial_tools/stac/planetary_computer/sentinel_2.py
def __init__(
    self,
    collection: PlanetaryComputerS2Collection | str = PlanetaryComputerS2Collection.L2A,
    date_range: DateLike = None,
    bbox: BBoxLike | None = None,
    intersects: IntersectsLike | None = None,
    logger: logging.Logger = LOGGER,
) -> None:
    """
    Initialize Sentinel2Search.

    Args:
        collection: Collection used to search for Sentinel 2 products.
        date_range: Temporal filter, native pystac DateLike.
        bbox: Spatial bounding box filter.
        intersects: Spatial GeoJSON geometry filter.
        logger: Custom logger instance.
    """
    super().__init__(collection=collection, date_range=date_range, bbox=bbox, intersects=intersects, logger=logger)

    self.max_cloud_cover: int | None = None
    self.max_no_data_value: int | None = None
    self.mgrs_tiles: list[str] | None = None
    self.custom_query_params: dict[str, Any] = {}
search_results property #
search_results: list[Item] | None

Proxy property for STAC search results from the underlying client.

downloaded_assets property #
downloaded_assets: list[Asset] | None

Proxy property for downloaded assets from the underlying client.

filter_by_cloud_cover #
filter_by_cloud_cover(max_cloud_cover: int) -> Self

Filter by maximum cloud cover percentage.

Invalidates current search results.

Parameters:

Name Type Description Default
max_cloud_cover int

Maximum percentage of cloud cover allowed.

required

Returns:

Type Description
Self

The instance itself (Self) for fluent chaining.

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_2.py
def filter_by_cloud_cover(self, max_cloud_cover: int) -> Self:
    """
    Filter by maximum cloud cover percentage.

    Invalidates current search results.

    Args:
        max_cloud_cover: Maximum percentage of cloud cover allowed.

    Returns:
        The instance itself (Self) for fluent chaining.
    """
    self._invalidate_state()
    self.max_cloud_cover = max_cloud_cover
    return self
filter_by_nodata_pixel_percentage #
filter_by_nodata_pixel_percentage(max_no_data_value: int) -> Self

Filter by maximum no-data pixel percentage.

Invalidates current search results.

Parameters:

Name Type Description Default
max_no_data_value int

Maximum percentage of no-data pixels allowed.

required

Returns:

Type Description
Self

The instance itself (Self) for fluent chaining.

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_2.py
def filter_by_nodata_pixel_percentage(self, max_no_data_value: int) -> Self:
    """
    Filter by maximum no-data pixel percentage.

    Invalidates current search results.

    Args:
        max_no_data_value: Maximum percentage of no-data pixels allowed.

    Returns:
        The instance itself (Self) for fluent chaining.
    """
    self._invalidate_state()
    self.max_no_data_value = max_no_data_value
    return self
filter_by_mgrs_tile #
filter_by_mgrs_tile(mgrs_tiles: list[str] | str) -> Self

Filter by MGRS tile ID(s).

Invalidates current search results.

Parameters:

Name Type Description Default
mgrs_tiles list[str] | str

Single MGRS tile ID or list of IDs.

required

Returns:

Type Description
Self

The instance itself (Self) for fluent chaining.

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_2.py
def filter_by_mgrs_tile(self, mgrs_tiles: list[str] | str) -> Self:
    """
    Filter by MGRS tile ID(s).

    Invalidates current search results.

    Args:
        mgrs_tiles: Single MGRS tile ID or list of IDs.

    Returns:
        The instance itself (Self) for fluent chaining.
    """
    self._invalidate_state()
    if isinstance(mgrs_tiles, list):
        self.mgrs_tiles = mgrs_tiles
    else:
        self.mgrs_tiles = [mgrs_tiles]
    return self
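
The three filters only store values; the query itself is assembled later by `_build_collection_query`, which is not shown on this page. The sketch below is one plausible way such a query could be built, not the library's implementation: `build_s2_query` is a hypothetical helper, the property keys are the usual Planetary Computer names, and the `lt` and `in` operators mirror the ones used by `sentinel_2_complete_tile_search` on this page.

```python
from __future__ import annotations

from typing import Any


def build_s2_query(
    max_cloud_cover: int | None = None,
    max_no_data_value: int | None = None,
    mgrs_tiles: list[str] | None = None,
) -> dict[str, Any]:
    query: dict[str, Any] = {}
    # Only filters that were actually set contribute a clause.
    if max_cloud_cover is not None:
        query["eo:cloud_cover"] = {"lt": max_cloud_cover}
    if max_no_data_value is not None:
        query["s2:nodata_pixel_percentage"] = {"lt": max_no_data_value}
    if mgrs_tiles:
        query["s2:mgrs_tile"] = {"in": mgrs_tiles}
    return query


cloud_and_tile = build_s2_query(max_cloud_cover=20, mgrs_tiles=["10SDJ"])
no_filters = build_s2_query()
```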
with_custom_query #
with_custom_query(query_params: dict[str, Any]) -> Self

Merge custom STAC query parameters and invalidate current state.

Parameters:

Name Type Description Default
query_params dict[str, Any]

Dictionary of custom STAC query parameters.

required

Returns:

Type Description
Self

The instance itself (Self) for fluent chaining.

Source code in src/geospatial_tools/stac/core.py
def with_custom_query(self, query_params: dict[str, Any]) -> Self:
    """
    Merge custom STAC query parameters and invalidate current state.

    Args:
        query_params: Dictionary of custom STAC query parameters.

    Returns:
        The instance itself (Self) for fluent chaining.
    """
    self._invalidate_state()
    self.custom_query_params.update(query_params)
    return self
search #
search() -> list[Item] | None

Execute the STAC search using the built query and parameters.

Returns:

Type Description
list[Item] | None

List of matched pystac Items, or None if no results.

Source code in src/geospatial_tools/stac/core.py
def search(self) -> list[pystac.Item] | None:
    """
    Execute the STAC search using the built query and parameters.

    Returns:
        List of matched pystac Items, or None if no results.
    """
    query = self._build_collection_query()
    query.update(self.custom_query_params)

    collections = [self.collection] if self.collection else None

    self.client.search(
        collections=collections,
        bbox=self.bbox,
        intersects=self.intersects,
        date_range=self.date_range,
        query=query if query else None,
    )
    return self.search_results
download #
download(bands: list[str], base_directory: str | Path) -> list[Asset] | None

Download assets for the matched search results.

Triggers a search if no results are currently available.

Parameters:

Name Type Description Default
bands list[str]

List of asset keys (bands) to download.

required
base_directory str | Path

Local directory where assets will be saved.

required

Returns:

Type Description
list[Asset] | None

List of downloaded Asset objects.

Source code in src/geospatial_tools/stac/core.py
def download(self, bands: list[str], base_directory: str | Path) -> list[Asset] | None:
    """
    Download assets for the matched search results.

    Triggers a search if no results are currently available.

    Args:
        bands: List of asset keys (bands) to download.
        base_directory: Local directory where assets will be saved.

    Returns:
        List of downloaded Asset objects.
    """
    if self.search_results is None:
        self.search()

    # S1 might need a small override here for lowercase logic,
    # but the core logic stays here.
    self.client.download_search_results(bands=bands, base_directory=Path(base_directory))
    return self.downloaded_assets
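
The "search if needed" behavior of `download` can be illustrated with a stand-in class. `LazyDownloadSketch` is hypothetical, and the file naming is invented for the example; only the check on `search_results` follows the source above.

```python
from __future__ import annotations

from pathlib import Path


class LazyDownloadSketch:
    """Hypothetical wrapper that searches only when no results are cached."""

    def __init__(self) -> None:
        self.search_results: list[str] | None = None
        self.search_calls = 0

    def search(self) -> list[str]:
        self.search_calls += 1
        self.search_results = ["item-1"]
        return self.search_results

    def download(self, bands: list[str], base_directory: str | Path) -> list[Path]:
        if self.search_results is None:
            self.search()  # lazily trigger the search once
        target = Path(base_directory)  # accept str or Path, as in the source
        return [target / f"{item}_{band}.tif" for item in self.search_results or [] for band in bands]


sketch = LazyDownloadSketch()
first_paths = sketch.download(["B02"], "out")
second_paths = sketch.download(["B03"], "out")  # reuses the cached results
```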

BestProductsForFeatures #

BestProductsForFeatures(
    sentinel2_tiling_grid: GeoDataFrame,
    sentinel2_tiling_grid_column: str,
    vector_features: GeoDataFrame,
    vector_features_column: str,
    collection: PlanetaryComputerS2Collection | str = L2A,
    date_ranges: list[str] | None = None,
    max_cloud_cover: int = 5,
    max_no_data_value: int = 5,
    logger: Logger = LOGGER,
)

Facilitates and automates searching for Sentinel 2 products, using the Sentinel 2 tiling grid as a reference.

Current limitation: each vector feature must be completely contained inside a single Sentinel 2 tile.

For larger features, a mosaic of products will be necessary.

This class was designed first and foremost for numerous smaller vector features, such as polygon grids created with geospatial_tools.vector.create_vector_grid.

Parameters:

Name Type Description Default
sentinel2_tiling_grid GeoDataFrame

GeoDataFrame containing Sentinel 2 tiling grid

required
sentinel2_tiling_grid_column str

Name of the column in sentinel2_tiling_grid that contains the tile names (ex tile name: 10SDJ)

required
vector_features GeoDataFrame

GeoDataFrame containing the vector features for which the best Sentinel 2 products will be chosen.

required
vector_features_column str

Name of the column in vector_features where the best Sentinel 2 products will be written to

required
date_ranges list[str] | None

Date ranges used to search for Sentinel 2 products. Should be created using geospatial_tools.utils.create_date_range_for_specific_period separately, or BestProductsForFeatures.create_date_ranges after initialization.

None
max_cloud_cover int

Maximum cloud cover used to search for Sentinel 2 products.

5
logger Logger

Logger instance

LOGGER
Source code in src/geospatial_tools/stac/planetary_computer/sentinel_2.py
def __init__(
    self,
    sentinel2_tiling_grid: GeoDataFrame,
    sentinel2_tiling_grid_column: str,
    vector_features: GeoDataFrame,
    vector_features_column: str,
    collection: PlanetaryComputerS2Collection | str = PlanetaryComputerS2Collection.L2A,
    date_ranges: list[str] | None = None,
    max_cloud_cover: int = 5,
    max_no_data_value: int = 5,
    logger: logging.Logger = LOGGER,
):
    """

    Args:
        sentinel2_tiling_grid: GeoDataFrame containing Sentinel 2 tiling grid
        sentinel2_tiling_grid_column: Name of the column in `sentinel2_tiling_grid` that contains the tile names
            (ex tile name: 10SDJ)
        vector_features: GeoDataFrame containing the vector features for which the best Sentinel 2
            products will be chosen.
        vector_features_column: Name of the column in `vector_features` where the best Sentinel 2 products
            will be written to
        date_ranges: Date ranges used to search for Sentinel 2 products. Should be created using
            `geospatial_tools.utils.create_date_range_for_specific_period` separately,
            or `BestProductsForFeatures.create_date_ranges` after initialization.
        max_cloud_cover: Maximum cloud cover used to search for Sentinel 2 products.
        logger: Logger instance
    """
    self.logger = logger
    self.collection = collection
    self.date_ranges = date_ranges
    self._max_cloud_cover = max_cloud_cover
    self.max_no_data_value = max_no_data_value

    self.sentinel2_tiling_grid = sentinel2_tiling_grid
    self.sentinel2_tiling_grid_column = sentinel2_tiling_grid_column
    self.sentinel2_tile_list = sentinel2_tiling_grid["name"].to_list()
    self.vector_features = vector_features
    self.vector_features_column = vector_features_column
    self.vector_features_best_product_column = "best_s2_product_id"
    self.vector_features_with_products = None
    self.successful_results: dict[Any, Any] = {}
    self.incomplete_results: list[Any] = []
    self.error_results: list[Any] = []
max_cloud_cover property #
max_cloud_cover

Max % of cloud cover used for Sentinel 2 product search.

create_date_ranges #
create_date_ranges(
    start_year: int, end_year: int, start_month: int, end_month: int
) -> list[str] | None

Creates a list of date ranges.

For example, to create date ranges for 2020 and 2021, but only for the months from March to May, the expected result is 2 ranges: [2020-03-01 to 2020-05-31, 2021-03-01 to 2021-05-31].

Automatically determines the last day of the end month, and handles periods that cross over years.

For example, to create date ranges from 2020 to 2022, but only for the months from November to January, the expected result is 2 ranges: [2020-11-01 to 2021-01-31, 2021-11-01 to 2022-01-31].

Parameters:

Name Type Description Default
start_year int

Start year for ranges

required
end_year int

End year for ranges

required
start_month int

Starting month for each period

required
end_month int

End month for each period (inclusive)

required

Returns:

Type Description
list[str] | None

List of date ranges

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_2.py
def create_date_ranges(self, start_year: int, end_year: int, start_month: int, end_month: int) -> list[str] | None:
    """
    Creates a list of date ranges.

    For example, to create date ranges for 2020 and 2021, but only for the months from March to May,
    the expected result is 2 ranges: [2020-03-01 to 2020-05-31, 2021-03-01 to 2021-05-31].

    Automatically determines the last day of the end month, and handles periods that cross over years.

    For example, to create date ranges from 2020 to 2022, but only for the months from November to January,
    the expected result is 2 ranges: [2020-11-01 to 2021-01-31, 2021-11-01 to 2022-01-31].

    Args:
      start_year: Start year for ranges
      end_year: End year for ranges
      start_month: Starting month for each period
      end_month: End month for each period (inclusive)

    Returns:
        List of date ranges
    """
    self.date_ranges = create_date_range_for_specific_period(
        start_year=start_year, end_year=end_year, start_month_range=start_month, end_month_range=end_month
    )
    return self.date_ranges
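
The month-window logic described in the docstring can be sketched in a few lines. This is not the implementation of `create_date_range_for_specific_period`, which is not shown on this page: `month_window_ranges` is a hypothetical helper, and the `start/end` string format is an assumption chosen for readability.

```python
import calendar


def month_window_ranges(start_year: int, end_year: int, start_month: int, end_month: int) -> list:
    # A window such as November to January ends in the following year.
    crosses_year = end_month < start_month
    last_start_year = end_year - 1 if crosses_year else end_year
    ranges = []
    for year in range(start_year, last_start_year + 1):
        stop_year = year + 1 if crosses_year else year
        # monthrange returns (weekday of first day, number of days in month).
        last_day = calendar.monthrange(stop_year, end_month)[1]
        ranges.append(f"{year}-{start_month:02d}-01/{stop_year}-{end_month:02d}-{last_day:02d}")
    return ranges


spring = month_window_ranges(2020, 2021, 3, 5)
winter = month_window_ranges(2020, 2022, 11, 1)
```

Using `calendar.monthrange` avoids hard-coding month lengths and handles leap years when the end month is February.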
find_best_complete_products #
find_best_complete_products(
    max_cloud_cover: int | None = None, max_no_data_value: int = 5
) -> dict

Finds the best complete product for each Sentinel 2 tile. This function filters out all products that have more than max_no_data_value percent (5% by default) of no-data values.

Tiles that were filtered out are stored in self.incomplete_results, and tiles for which the search found no results are stored in self.error_results.

Parameters:

Name Type Description Default
max_cloud_cover int | None

Max percentage of cloud cover allowed used for the search (Default value = None)

None
max_no_data_value int

Max percentage of no-data coverage by individual Sentinel 2 product (Default value = 5)

5

Returns:

Type Description
dict

Dictionary of product IDs and their corresponding Sentinel 2 tile names.

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_2.py
def find_best_complete_products(self, max_cloud_cover: int | None = None, max_no_data_value: int = 5) -> dict:
    """
    Finds the best complete product for each Sentinel 2 tile. This function filters out all products that have
    more than `max_no_data_value` percent (5% by default) of no-data values.

    Tiles that were filtered out are stored in `self.incomplete_results`, and tiles for which
    the search found no results are stored in `self.error_results`.

    Args:
      max_cloud_cover: Max percentage of cloud cover allowed used for the search  (Default value = None)
      max_no_data_value: Max percentage of no-data coverage by individual Sentinel 2 product  (Default value = 5)

    Returns:
        Dictionary of product IDs and their corresponding Sentinel 2 tile names.
    """
    cloud_cover = self.max_cloud_cover
    if max_cloud_cover:
        cloud_cover = max_cloud_cover
    no_data_value = self.max_no_data_value
    if max_no_data_value:
        no_data_value = max_no_data_value
    if not self.date_ranges:
        raise ValueError("date_ranges must be set before searching")

    tile_dict, incomplete_list, error_list = find_best_product_per_s2_tile(
        collection=self.collection,
        date_ranges=self.date_ranges,
        max_cloud_cover=cloud_cover,
        s2_tile_grid_list=self.sentinel2_tile_list,
        num_of_workers=4,
        max_no_data_value=no_data_value,
    )
    self.successful_results = tile_dict
    self.incomplete_results = incomplete_list
    if incomplete_list:
        self.logger.warning(
            "Warning, some of the input Sentinel 2 tiles do not have products covering the entire tile. "
            "These tiles will need to be handled differently (ex. creating a mosaic with multiple products)"
        )
        self.logger.warning(f"Incomplete list: {incomplete_list}")
    self.error_results = error_list
    if error_list:
        self.logger.warning(
            "Warning, products for some Sentinel 2 tiles could not be found. "
            "Consider either extending date range input or max cloud cover"
        )
        self.logger.warning(f"Error list: {error_list}")
    return self.successful_results
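
The source above selects its thresholds with truthiness checks (`if max_cloud_cover:`), so a falsy argument such as `None` or `0` leaves the instance value in place. Since `max_no_data_value` defaults to 5, that argument normally takes precedence over the instance attribute. A minimal sketch of the pattern, using a hypothetical helper `resolve_override`:

```python
from __future__ import annotations


def resolve_override(argument: int | None, instance_default: int) -> int:
    # Mirrors the `if argument:` pattern: None and 0 both keep the default.
    return argument if argument else instance_default


kept_default = resolve_override(None, 5)
zero_is_ignored = resolve_override(0, 5)
explicit_value = resolve_override(20, 5)
```

A caller who wants a strict threshold of 0 would therefore not get it through the argument, which is worth keeping in mind when tuning searches.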
select_best_products_per_feature #
select_best_products_per_feature() -> GeoDataFrame

Return a GeoDataFrame of the vector features, with the best Sentinel 2 product for each feature written to the best_s2_product_id column.

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_2.py
def select_best_products_per_feature(self) -> GeoDataFrame:
    """Return a GeoDataFrame containing the best products for each Sentinel 2 tile."""
    spatial_join_results = spatial_join_within(
        polygon_features=self.sentinel2_tiling_grid,
        polygon_column=self.sentinel2_tiling_grid_column,
        vector_features=self.vector_features,
        vector_column_name=self.vector_features_column,
    )
    write_best_product_ids_to_dataframe(
        spatial_join_results=spatial_join_results,
        tile_dictionary=self.successful_results,
        best_product_column=self.vector_features_best_product_column,
        s2_tiles_column=self.vector_features_column,
    )
    self.vector_features_with_products = spatial_join_results
    return self.vector_features_with_products
to_file #
to_file(output_dir: str | Path) -> None

Parameters:

Name Type Description Default
output_dir str | Path

Output directory used to write to file

required
Source code in src/geospatial_tools/stac/planetary_computer/sentinel_2.py
def to_file(self, output_dir: str | pathlib.Path) -> None:
    """

    Args:
      output_dir: Output directory used to write to file
    """
    write_results_to_file(
        cloud_cover=self.max_cloud_cover,
        successful_results=self.successful_results,
        incomplete_results=self.incomplete_results,
        error_results=self.error_results,
        output_dir=output_dir,
    )

sentinel_2_complete_tile_search #

sentinel_2_complete_tile_search(
    tile_id: int,
    collection: str,
    date_ranges: list[str],
    max_cloud_cover: int,
    max_no_data_value: int = 5,
) -> tuple[int, str, float | None, float | None] | None

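Parameters:

Name Type Description Default
collection str

STAC collection to search.

required
tile_id int

ID of the Sentinel 2 tile to search for.

required
date_ranges list[str]

Date ranges to search within.

required
max_cloud_cover int

Maximum cloud cover percentage; only products below this value are matched.

required
max_no_data_value int

Maximum percentage of no-data pixels allowed in a product.

5

Returns:

Type Description
tuple[int, str, float | None, float | None] | None

Tuple of (tile ID, product ID, cloud cover, no-data pixel percentage). When no suitable product is found, the second element is a message starting with "error:" or "incomplete:" and the last two elements are None.
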
Source code in src/geospatial_tools/stac/planetary_computer/sentinel_2.py
def sentinel_2_complete_tile_search(
    tile_id: int,
    collection: str,
    date_ranges: list[str],
    max_cloud_cover: int,
    max_no_data_value: int = 5,
) -> tuple[int, str, float | None, float | None] | None:
    """

    Args:
      collection:
      tile_id:
      date_ranges:
      max_cloud_cover:
      max_no_data_value: (Default value = 5)

    Returns:


    """
    client = StacSearch(PLANETARY_COMPUTER)
    tile_ids = [tile_id]
    query: dict[str, Any] = {
        PlanetaryComputerS2Property.CLOUD_COVER: {"lt": max_cloud_cover},
        PlanetaryComputerS2Property.MGRS_TILE: {"in": tile_ids},
    }
    sortby = [{"field": PlanetaryComputerS2Property.CLOUD_COVER.sortby_field, "direction": "asc"}]

    client.search_for_date_ranges(
        date_ranges=date_ranges, collections=collection, query=query, sortby=sortby, limit=100
    )
    try:
        sorted_items = client.sort_results_by_cloud_coverage()
        if not sorted_items:
            return tile_id, "error: No results found", None, None
        filtered_items = client.filter_no_data(
            property_name=PlanetaryComputerS2Property.NODATA_PIXEL_PERCENTAGE,
            max_no_data_value=max_no_data_value,
        )
        if not filtered_items:
            return tile_id, "incomplete: No results found that cover the entire tile", None, None
        optimal_result = filtered_items[0]
        if optimal_result:
            return (
                tile_id,
                optimal_result.id,
                optimal_result.properties[PlanetaryComputerS2Property.CLOUD_COVER],
                optimal_result.properties[PlanetaryComputerS2Property.NODATA_PIXEL_PERCENTAGE],
            )

    except (IndexError, TypeError) as error:
        LOGGER.warning(str(error))
        return tile_id, f"error: {error}", None, None
    return None
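
The selection logic above (rank by cloud cover, drop products with too much no-data, take the first survivor) can be sketched without a STAC client. `pick_best` is a hypothetical helper working on plain dictionaries, and the strict `<` comparison for no-data is an assumption, since `filter_no_data` is not shown on this page.

```python
from __future__ import annotations


def pick_best(tile_id: str, items: list[dict], max_no_data: float) -> tuple:
    if not items:
        return tile_id, "error: No results found", None, None
    # Lowest cloud cover first.
    ranked = sorted(items, key=lambda item: item["cloud_cover"])
    # Assumption: a product is kept when its no-data share is below the limit.
    complete = [item for item in ranked if item["no_data"] < max_no_data]
    if not complete:
        return tile_id, "incomplete: No results found that cover the entire tile", None, None
    best = complete[0]
    return tile_id, best["id"], best["cloud_cover"], best["no_data"]


items = [
    {"id": "a", "cloud_cover": 3.0, "no_data": 40.0},  # clearest, but mostly empty
    {"id": "b", "cloud_cover": 4.5, "no_data": 0.0},
]
best = pick_best("10SDJ", items, max_no_data=5)
```

Encoding the failure reason as a prefixed string in the second element keeps the return shape uniform, at the cost of string checks in the caller.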

find_best_product_per_s2_tile #

find_best_product_per_s2_tile(
    collection: str,
    date_ranges: list[str],
    max_cloud_cover: int,
    s2_tile_grid_list: list,
    max_no_data_value: int = 5,
    num_of_workers: int = 4,
)

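Parameters:

Name Type Description Default
collection str

STAC collection to search.

required
date_ranges list[str]

Date ranges to search within.

required
max_cloud_cover int

Maximum cloud cover percentage used for the search.

required
s2_tile_grid_list list

List of Sentinel 2 tile IDs to search for.

required
max_no_data_value int

Maximum percentage of no-data pixels allowed in a product.

5
num_of_workers int

Number of worker threads used to run the tile searches.

4

Returns:

Tuple of (dictionary of successful results keyed by tile ID, list of tile IDs with incomplete coverage, list of tile IDs with errors or no results).
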
Source code in src/geospatial_tools/stac/planetary_computer/sentinel_2.py
def find_best_product_per_s2_tile(
    collection: str,
    date_ranges: list[str],
    max_cloud_cover: int,
    s2_tile_grid_list: list,
    max_no_data_value: int = 5,
    num_of_workers: int = 4,
):
    """

    Args:
      collection:
      date_ranges:
      max_cloud_cover:
      s2_tile_grid_list:
      max_no_data_value:  (Default value = 5)
      num_of_workers: (Default value = 4)

    Returns:


    """
    successful_results: dict[Any, Any] = {}
    for tile in s2_tile_grid_list:
        successful_results[tile] = ""
    incomplete_results = []
    error_results = []
    with ThreadPoolExecutor(max_workers=num_of_workers) as executor:
        future_to_tile = {
            executor.submit(
                sentinel_2_complete_tile_search,
                tile_id=tile,
                collection=collection,
                date_ranges=date_ranges,
                max_cloud_cover=max_cloud_cover,
                max_no_data_value=max_no_data_value,
            ): tile
            for tile in s2_tile_grid_list
        }

        for future in as_completed(future_to_tile):
            result = future.result()
            if result is None:
                continue
            tile_id, optimal_result_id, result_cloud_cover, no_data = result
            if optimal_result_id.startswith("error:"):
                error_results.append(tile_id)
                continue
            if optimal_result_id.startswith("incomplete:"):
                incomplete_results.append(tile_id)
                continue
            successful_results[tile_id] = {
                "id": optimal_result_id,
                "cloud_cover": result_cloud_cover,
                "no_data": no_data,
            }
        cleaned_successful_results = {k: v for k, v in successful_results.items() if v != ""}
    return cleaned_successful_results, incomplete_results, error_results
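
The fan-out pattern above, one search per tile submitted to a thread pool and results sorted into three buckets as they complete, can be tried without network access. In this sketch `fake_tile_search` is a hypothetical stand-in for `sentinel_2_complete_tile_search`, and the tile and product IDs are invented.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_tile_search(tile_id: str) -> tuple:
    """Hypothetical stand-in returning canned outcomes per tile."""
    outcomes = {
        "10SDJ": (tile_id, "S2A_PRODUCT_1", 1.2, 0.0),
        "10SEJ": (tile_id, "incomplete: No results found that cover the entire tile", None, None),
    }
    return outcomes.get(tile_id, (tile_id, "error: No results found", None, None))


def search_tiles(tiles: list[str], workers: int = 4) -> tuple:
    successes: dict = {}
    incomplete: list = []
    errors: list = []
    with ThreadPoolExecutor(max_workers=workers) as executor:
        futures = {executor.submit(fake_tile_search, tile): tile for tile in tiles}
        for future in as_completed(futures):  # yields futures as they finish
            tile_id, product_id, cloud_cover, no_data = future.result()
            if product_id.startswith("error:"):
                errors.append(tile_id)
            elif product_id.startswith("incomplete:"):
                incomplete.append(tile_id)
            else:
                successes[tile_id] = {"id": product_id, "cloud_cover": cloud_cover, "no_data": no_data}
    return successes, incomplete, errors


successes, incomplete, errors = search_tiles(["10SDJ", "10SEJ", "10SFJ"])
```

Threads suit this workload because each search spends most of its time waiting on network I/O; note that `as_completed` does not preserve submission order, so the lists may be ordered differently from the input when several tiles land in the same bucket.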

write_best_product_ids_to_dataframe #

write_best_product_ids_to_dataframe(
    spatial_join_results: GeoDataFrame,
    tile_dictionary: dict,
    best_product_column: str = "best_s2_product_id",
    s2_tiles_column: str = "s2_tiles",
    logger: Logger = LOGGER,
) -> None

Parameters:

Name Type Description Default
spatial_join_results GeoDataFrame
required
tile_dictionary dict
required
best_product_column str
'best_s2_product_id'
s2_tiles_column str
's2_tiles'
logger Logger
LOGGER

Returns:

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_2.py
def write_best_product_ids_to_dataframe(
    spatial_join_results: GeoDataFrame,
    tile_dictionary: dict,
    best_product_column: str = "best_s2_product_id",
    s2_tiles_column: str = "s2_tiles",
    logger: logging.Logger = LOGGER,
) -> None:
    """

    Args:
      spatial_join_results:
      tile_dictionary:
      best_product_column:
      s2_tiles_column:
      logger:

    Returns:


    """
    logger.info("Writing best product IDs to dataframe")
    spatial_join_results[best_product_column] = spatial_join_results[s2_tiles_column].apply(
        lambda x: _get_best_product_id_for_each_grid_tile(s2_tile_search_results=tile_dictionary, feature_s2_tiles=x)
    )

write_results_to_file #

write_results_to_file(
    cloud_cover: int,
    successful_results: dict,
    incomplete_results: list | None = None,
    error_results: list | None = None,
    output_dir: str | Path = DATA_DIR,
    logger: Logger = LOGGER,
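) -> dict

Writes the search results to JSON files in output_dir and returns the paths of the files that were written.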

Parameters:

Name Type Description Default
cloud_cover int
required
successful_results dict
required
incomplete_results list | None
None
error_results list | None
None
output_dir str | Path
DATA_DIR
logger Logger
LOGGER

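Returns:

Type Description
dict

Dictionary with the keys tile_filename, incomplete_filename and errors_filename. The last two are None when the corresponding file was not written.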

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_2.py
def write_results_to_file(
    cloud_cover: int,
    successful_results: dict,
    incomplete_results: list | None = None,
    error_results: list | None = None,
    output_dir: str | pathlib.Path = DATA_DIR,
    logger: logging.Logger = LOGGER,
) -> dict:
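    """
    Write search results to JSON files in `output_dir`.

    Successful results are always written. Incomplete and error results are written to their own
    files only when they are non-empty.

    Args:
      cloud_cover: Cloud cover value used in the output file names (e.g. `data_lt15cc.json`)
      successful_results: Successful results, keyed by tile ID
      incomplete_results: Tile IDs with incomplete results
      error_results: Tile IDs whose search ended in an error
      output_dir: Directory where the JSON files are written
      logger: Logger instance

    Returns:
        Dictionary with the keys `tile_filename`, `incomplete_filename` and `errors_filename`.
        The last two are `None` when the corresponding file was not written.
    """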
    if isinstance(output_dir, str):
        output_dir = pathlib.Path(output_dir)
    tile_filename = output_dir / f"data_lt{cloud_cover}cc.json"
    with open(tile_filename, "w", encoding="utf-8") as json_file:
        json.dump(successful_results, json_file, indent=4)
    logger.info(f"Results have been written to {tile_filename}")

    incomplete_filename: pathlib.Path | None = None
    if incomplete_results:
        incomplete_dict = {"incomplete": incomplete_results}
        incomplete_filename = output_dir / f"incomplete_lt{cloud_cover}cc.json"
        with open(incomplete_filename, "w", encoding="utf-8") as json_file:
            json.dump(incomplete_dict, json_file, indent=4)
        logger.info(f"Incomplete results have been written to {incomplete_filename}")

    error_filename: pathlib.Path | None = None
    if error_results:
        error_dict = {"errors": error_results}
        error_filename = output_dir / f"errors_lt{cloud_cover}cc.json"
        with open(error_filename, "w", encoding="utf-8") as json_file:
            json.dump(error_dict, json_file, indent=4)
        logger.info(f"Error results have been written to {error_filename}")

    return {
        "tile_filename": tile_filename,
        "incomplete_filename": incomplete_filename,
        "errors_filename": error_filename,
    }
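
The file naming and JSON layout used above can be tried out in isolation. The sketch below writes into a temporary directory and uses made-up tile IDs and a made-up cloud cover value of 15; it reproduces only the naming scheme and the `{"errors": [...]}` wrapper from the listing, not the function itself.

```python
import json
import pathlib
import tempfile

cloud_cover = 15  # hypothetical value, for illustration only
successful_results = {"10SDJ": {"id": "product-a", "cloud_cover": 3.2, "no_data": 0.0}}
error_results = ["10SEJ"]

with tempfile.TemporaryDirectory() as tmp:
    output_dir = pathlib.Path(tmp)

    # Successful results are always written.
    tile_filename = output_dir / f"data_lt{cloud_cover}cc.json"
    tile_filename.write_text(json.dumps(successful_results, indent=4), encoding="utf-8")

    # Error results are wrapped in a single-key dictionary.
    error_filename = output_dir / f"errors_lt{cloud_cover}cc.json"
    error_filename.write_text(json.dumps({"errors": error_results}, indent=4), encoding="utf-8")

    written = sorted(p.name for p in output_dir.iterdir())
    reloaded = json.loads(tile_filename.read_text(encoding="utf-8"))

print(written)  # ['data_lt15cc.json', 'errors_lt15cc.json']
```

No `incomplete_lt15cc.json` appears because the sketch has no incomplete results, which matches the conditional writes in the listing.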

download_and_process_sentinel2_asset #

download_and_process_sentinel2_asset(
    product_id: str,
    product_bands: list[str],
    collections: PlanetaryComputerS2Collection | str = L2A,
    target_projection: int | str | None = None,
    base_directory: str | Path = DATA_DIR,
    delete_intermediate_files: bool = False,
    logger: Logger = LOGGER,
) -> Asset

This function downloads a Sentinel 2 product based on the product ID provided.

It downloads the individual asset bands listed in the product_bands argument, merges them all into a single TIFF file and then, if target_projection is set, reprojects the result to the target CRS.
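
The function also reuses files left on disk by earlier runs. The sketch below summarizes that decision logic as read from the source listing further down. `plan_processing` and its step labels are hypothetical names introduced for illustration; the real function performs the steps rather than listing them.

```python
from __future__ import annotations


def plan_processing(
    merged_exists: bool, reprojected_exists: bool, target_projection: int | str | None
) -> list[str]:
    """Return the steps that would run, given which output files already exist."""
    if reprojected_exists:
        # A reprojected file short-circuits everything else.
        return ["reuse reprojected file"]
    if merged_exists:
        steps = ["reuse merged file"]
    else:
        steps = ["download bands", "merge bands"]
    if target_projection:
        steps.append("reproject")
    return steps


fresh_run = plan_processing(False, False, 4326)
cached_merge = plan_processing(True, False, None)
cached_reprojection = plan_processing(True, True, 4326)
print(fresh_run)
print(cached_merge)
print(cached_reprojection)
```

Note that an existing reprojected file is returned as is, without checking which CRS it was reprojected to.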

Parameters:

Name Type Description Default
product_id str

ID of the Sentinel 2 product to be downloaded

required
product_bands list[str]

List of the product bands to be downloaded

required
collections PlanetaryComputerS2Collection | str

Collections to be downloaded from. Defaults to sentinel-2-l2a

L2A
target_projection int | str | None

The target CRS for the end product. If None, the reprojection step is skipped

None
stac_client

StacSearch client to use. This argument is documented but is not part of the current signature; the function always creates a new StacSearch client

required
base_directory str | Path

The base directory path where the downloaded files will be stored

DATA_DIR
delete_intermediate_files bool

Flag to determine if intermediate files should be deleted. Defaults to False

False
logger Logger

Logger instance

LOGGER

Returns:

Type Description
Asset

Asset instance

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_2.py
def download_and_process_sentinel2_asset(
    product_id: str,
    product_bands: list[str],
    collections: PlanetaryComputerS2Collection | str = PlanetaryComputerS2Collection.L2A,
    target_projection: int | str | None = None,
    base_directory: str | pathlib.Path = DATA_DIR,
    delete_intermediate_files: bool = False,
    logger: logging.Logger = LOGGER,
) -> Asset:
    """
    This function downloads a Sentinel 2 product based on the product ID provided.

    It will download the individual asset bands provided in the `product_bands` argument,
    merge them all into a single tif and then reproject the result to the target CRS.

    Args:
      product_id: ID of the Sentinel 2 product to be downloaded
      product_bands: List of the product bands to be downloaded
      collections: Collections to be downloaded from. Defaults to `sentinel-2-l2a`
      target_projection: The target CRS for the end product. If `None`, the reprojection step will be
        skipped
      stac_client: StacSearch client to use. Not part of the current signature; a new client is
        always created
      base_directory: The base directory path where the downloaded files will be stored
      delete_intermediate_files: Flag to determine if intermediate files should be deleted. Defaults to False
      logger: Logger instance

    Returns:
        Asset instance
    """
    base_file_name = f"{base_directory}/{product_id}"
    merged_file = f"{base_file_name}_merged.tif"
    reprojected_file = f"{base_file_name}_reprojected.tif"

    merged_file_exists = pathlib.Path(merged_file).exists()
    reprojected_file_exists = pathlib.Path(reprojected_file).exists()

    if reprojected_file_exists:
        logger.info(f"Reprojected file [{reprojected_file}] already exists")
        asset = Asset(asset_id=product_id, bands=product_bands, reprojected_asset=reprojected_file)
        return asset

    if merged_file_exists:
        logger.info(f"Merged file [{merged_file}] already exists")
        asset = Asset(asset_id=product_id, bands=product_bands, merged_asset_path=merged_file)
        if target_projection:
            logger.info(f"Reprojecting merged file [{merged_file}]")
            asset.reproject_merged_asset(
                base_directory=base_directory,
                target_projection=target_projection,
                delete_merged_asset=delete_intermediate_files,
            )
        return asset

    stac_client = StacSearch(catalog_name=PLANETARY_COMPUTER)
    items = stac_client.search(collections=collections, ids=[product_id])
    logger.info(items)
    asset_list = stac_client.download_search_results(bands=product_bands, base_directory=base_directory)
    logger.info(asset_list)
    asset = asset_list[0]
    asset.merge_asset(base_directory=base_directory, delete_sub_items=delete_intermediate_files)
    if not target_projection:
        logger.info("Skipping reprojection")
        return asset
    # A target projection was given, so reproject the merged asset.
    asset.reproject_merged_asset(
        target_projection=target_projection,
        base_directory=base_directory,
        delete_merged_asset=delete_intermediate_files,
    )
    return asset

sentinel_3 #

Sentinel3Search #

Sentinel3Search(
    collection: PlanetaryComputerS3Collection | str = OLCI_WFR,
    date_range: DateLike = None,
    bbox: BBoxLike | None = None,
    intersects: IntersectsLike | None = None,
    logger: Logger = LOGGER,
)

Bases: AbstractStacWrapper

Executable wrapper for Sentinel-3 OLCI data on Planetary Computer.

Implements a fluent builder pattern to construct STAC queries. Execution and result storage are delegated to an underlying StacSearch client via proxy properties.
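
To illustrate the fluent builder pattern described here, the sketch below uses a hypothetical stand-in class, `FluentSearchSketch`. It assumes that invalidating state means discarding earlier search results, and it replaces the real STAC request with a placeholder, so it shows the chaining style only, not the actual wrapper.

```python
from __future__ import annotations

from typing import Any


class FluentSearchSketch:
    """Hypothetical stand-in illustrating a builder that invalidates stale results."""

    def __init__(self) -> None:
        self.orbit_states: list[str] | None = None
        self.custom_query_params: dict[str, Any] = {}
        self.search_results: list[str] | None = None

    def _invalidate_state(self) -> None:
        # Assumption: any change to the query makes earlier results stale.
        self.search_results = None

    def filter_by_orbit_state(self, states: list[str] | str) -> "FluentSearchSketch":
        self._invalidate_state()
        self.orbit_states = states if isinstance(states, list) else [states]
        return self

    def with_custom_query(self, query_params: dict[str, Any]) -> "FluentSearchSketch":
        self._invalidate_state()
        self.custom_query_params.update(query_params)
        return self

    def search(self) -> list[str] | None:
        # Placeholder for the real STAC request.
        self.search_results = [f"item-for-{state}" for state in self.orbit_states or []]
        return self.search_results


search = FluentSearchSketch().filter_by_orbit_state("descending").with_custom_query({"eo:cloud_cover": {"lt": 10}})
first = search.search()
search.filter_by_orbit_state(["ascending", "descending"])  # earlier results are now stale
print(first, search.search_results)
```

Because every builder method returns the instance, calls can be chained, and because each one invalidates state first, results from an outdated query cannot be read by mistake.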

Initialize Sentinel3Search.

Parameters:

Name Type Description Default
collection PlanetaryComputerS3Collection | str

The Sentinel-3 STAC collection (default: sentinel-3-olci-wfr-l2-netcdf).

OLCI_WFR
date_range DateLike

Temporal filter, native pystac DateLike.

None
bbox BBoxLike | None

Spatial bounding box filter.

None
intersects IntersectsLike | None

Spatial GeoJSON geometry filter.

None
logger Logger

Custom logger instance.

LOGGER

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_3.py
def __init__(
    self,
    collection: PlanetaryComputerS3Collection | str = PlanetaryComputerS3Collection.OLCI_WFR,
    date_range: DateLike = None,
    bbox: BBoxLike | None = None,
    intersects: IntersectsLike | None = None,
    logger: logging.Logger = LOGGER,
) -> None:
    """
    Initialize Sentinel3Search.

    Args:
        collection: The Sentinel-3 STAC collection (default: sentinel-3-olci-wfr-l2-netcdf).
        date_range: Temporal filter, native pystac DateLike.
        bbox: Spatial bounding box filter.
        intersects: Spatial GeoJSON geometry filter.
        logger: Custom logger instance.
    """
    super().__init__(collection=collection, date_range=date_range, bbox=bbox, intersects=intersects, logger=logger)

    self.orbit_states: list[PlanetaryComputerS3OrbitState] | None = None
    self.custom_query_params: dict[str, Any] = {}

search_results property #

search_results: list[Item] | None

Proxy property for STAC search results from the underlying client.

downloaded_assets property #

downloaded_assets: list[Asset] | None

Proxy property for downloaded assets from the underlying client.

filter_by_orbit_state #

filter_by_orbit_state(
    states: list[PlanetaryComputerS3OrbitState] | PlanetaryComputerS3OrbitState,
) -> Self

Filter products by orbit state (ascending or descending).

Invalidates current search results.

Parameters:

Name Type Description Default
states list[PlanetaryComputerS3OrbitState] | PlanetaryComputerS3OrbitState

Single state or list of PlanetaryComputerS3OrbitState.

required

Returns:

Type Description
Self

The instance itself (Self) for fluent chaining.

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_3.py
def filter_by_orbit_state(
    self, states: list[PlanetaryComputerS3OrbitState] | PlanetaryComputerS3OrbitState
) -> Self:
    """
    Filter products by orbit state (ascending or descending).

    Invalidates current search results.

    Args:
        states: Single state or list of `PlanetaryComputerS3OrbitState`.

    Returns:
        The instance itself (Self) for fluent chaining.
    """
    self._invalidate_state()
    if isinstance(states, list):
        self.orbit_states = states
    else:
        self.orbit_states = [states]
    return self

download #

download(
    bands: list[PlanetaryComputerS3Band | str], base_directory: str | Path
) -> list[Asset] | None

Download Sentinel-3 assets with lowercase band key normalization.
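
The normalization step can be sketched as follows. `DemoBand` and its value are made up for illustration, and the sketch assumes the real band enum behaves like a `StrEnum`, where `str()` yields the member's value.

```python
from enum import Enum


class DemoBand(str, Enum):
    """Hypothetical band enum; the member name and value are made up."""

    OA01 = "Oa01_Demo"

    def __str__(self) -> str:
        # Mimic StrEnum, where str() yields the member's value.
        return self.value


# Enum members and plain strings can be mixed in the same list.
bands = [DemoBand.OA01, "OA02_DEMO"]
lower_bands = [str(b).lower() for b in bands]
print(lower_bands)  # ['oa01_demo', 'oa02_demo']
```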

Parameters:

Name Type Description Default
bands list[PlanetaryComputerS3Band | str]

List of bands to download.

required
base_directory str | Path

Local directory where assets will be saved.

required

Returns:

Type Description
list[Asset] | None

List of downloaded Asset objects.

Source code in src/geospatial_tools/stac/planetary_computer/sentinel_3.py
def download(self, bands: list[PlanetaryComputerS3Band | str], base_directory: str | Path) -> list[Asset] | None:
    """
    Download Sentinel-3 assets with lowercase band key normalization.

    Args:
        bands: List of bands to download.
        base_directory: Local directory where assets will be saved.

    Returns:
        List of downloaded Asset objects.
    """
    lower_bands = [str(b).lower() for b in bands]
    return super().download(bands=lower_bands, base_directory=base_directory)

with_custom_query #

with_custom_query(query_params: dict[str, Any]) -> Self

Merge custom STAC query parameters and invalidate current state.

Parameters:

Name Type Description Default
query_params dict[str, Any]

Dictionary of custom STAC query parameters.

required

Returns:

Type Description
Self

The instance itself (Self) for fluent chaining.

Source code in src/geospatial_tools/stac/core.py
def with_custom_query(self, query_params: dict[str, Any]) -> Self:
    """
    Merge custom STAC query parameters and invalidate current state.

    Args:
        query_params: Dictionary of custom STAC query parameters.

    Returns:
        The instance itself (Self) for fluent chaining.
    """
    self._invalidate_state()
    self.custom_query_params.update(query_params)
    return self

search #

search() -> list[Item] | None

Execute the STAC search using the built query and parameters.
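
According to the source listing below, the final query is the collection-specific query updated with the custom parameters, and an empty query is passed as None. The sketch below shows what that implies for key collisions. The contents of `built_query` are an assumption, since the output of the query-building step is not shown here.

```python
# Assumed output of the query-building step; the real keys depend on the wrapper.
built_query = {"sat:orbit_state": {"in": ["descending"]}}
custom_query_params = {"sat:orbit_state": {"eq": "ascending"}, "eo:cloud_cover": {"lt": 10}}

query = dict(built_query)
query.update(custom_query_params)  # custom entries win on key collisions

# An empty query is sent as None rather than as an empty dictionary.
empty_query: dict = {}
sent_when_empty = empty_query if empty_query else None

print(query)
print(sent_when_empty)
```

A custom parameter that reuses a key set by a builder method therefore replaces the builder's filter instead of being combined with it.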

Returns:

Type Description
list[Item] | None

List of matched pystac Items, or None if no results.

Source code in src/geospatial_tools/stac/core.py
def search(self) -> list[pystac.Item] | None:
    """
    Execute the STAC search using the built query and parameters.

    Returns:
        List of matched pystac Items, or None if no results.
    """
    query = self._build_collection_query()
    query.update(self.custom_query_params)

    collections = [self.collection] if self.collection else None

    self.client.search(
        collections=collections,
        bbox=self.bbox,
        intersects=self.intersects,
        date_range=self.date_range,
        query=query if query else None,
    )
    return self.search_results

utils #

create_date_range_for_specific_period #

create_date_range_for_specific_period(
    start_year: int, end_year: int, start_month_range: int, end_month_range: int
) -> list[str]

This function creates a list of date ranges.

For example, I want to create date ranges for 2020 and 2021, but only for the months from March to May. I therefore expect to have 2 ranges: [2020-03-01 to 2020-05-31, 2021-03-01 to 2021-05-31].

It automatically determines the last day of the end month and handles periods that cross over into the following year.

For example, I want to create date ranges from 2020 to 2022, but only for the months from November to January. I therefore expect to have 2 ranges: [2020-11-01 to 2021-01-31, 2021-11-01 to 2022-01-31].
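
Both examples can be checked with a standalone sketch. `date_ranges_sketch` is a hypothetical local copy of the logic in the source listing below, written so that it runs without the package installed.

```python
import calendar
import datetime


def date_ranges_sketch(start_year: int, end_year: int, start_month: int, end_month: int) -> list[str]:
    """Standalone mirror of the date range logic, for illustration only."""
    # A start month later than the end month means the period wraps into the next year.
    year_bump = 1 if start_month > end_month else 0
    ranges = []
    for year in range(start_year, end_year + 1 - year_bump):
        start = datetime.datetime(year, start_month, 1)
        last_day = calendar.monthrange(year + year_bump, end_month)[1]
        end = datetime.datetime(year + year_bump, end_month, last_day, 23, 59, 59)
        ranges.append(f"{start.isoformat()}Z/{end.isoformat()}Z")
    return ranges


spring = date_ranges_sketch(2020, 2021, 3, 5)
winter = date_ranges_sketch(2020, 2022, 11, 1)
print(spring)
print(winter)
```

Each range is a single string of the form `2020-03-01T00:00:00Z/2020-05-31T23:59:59Z`, so the end of a period is the last second of the last day of the end month.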

Parameters:

Name Type Description Default
start_year int

Start year for ranges

required
end_year int

End year for ranges

required
start_month_range int

Starting month for each period

required
end_month_range int

End month for each period (inclusive)

required

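Returns:

Type Description
list[str]

List of date ranges, each a "start/end" string of ISO 8601 timestamps with a Z suffix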

Source code in src/geospatial_tools/stac/utils.py
def create_date_range_for_specific_period(
    start_year: int, end_year: int, start_month_range: int, end_month_range: int
) -> list[str]:
    """
    This function creates a list of date ranges.

    For example, I want to create date ranges for 2020 and 2021, but only for the months from March to May.
    I therefore expect to have 2 ranges: [2020-03-01 to 2020-05-31, 2021-03-01 to 2021-05-31].

    It automatically determines the last day of the end month and handles periods that cross over into
    the following year.

    For example, I want to create date ranges from 2020 to 2022, but only for the months from November to January.
    I therefore expect to have 2 ranges: [2020-11-01 to 2021-01-31, 2021-11-01 to 2022-01-31].

    Args:
      start_year: Start year for ranges
      end_year: End year for ranges
      start_month_range: Starting month for each period
      end_month_range: End month for each period (inclusive)

    Returns:
        List of date ranges, each a "start/end" string of ISO 8601 timestamps with a `Z` suffix
    """
    date_ranges = []
    year_bump = 0
    if start_month_range > end_month_range:
        year_bump = 1
    range_end_year = end_year + 1 - year_bump
    for year in range(start_year, range_end_year):
        start_date = datetime.datetime(year, start_month_range, 1)
        last_day = calendar.monthrange(year + year_bump, end_month_range)[1]
        end_date = datetime.datetime(year + year_bump, end_month_range, last_day, 23, 59, 59)
        date_ranges.append(f"{start_date.isoformat()}Z/{end_date.isoformat()}Z")
    return date_ranges

get_s3_client #

get_s3_client(endpoint_url: str | None = None) -> client

Creates and returns a boto3 S3 client.

Parameters:

Name Type Description Default
endpoint_url str | None

The S3 endpoint URL. If None, it attempts to use the COPERNICUS_S3_ENDPOINT environment variable.

None

Returns:

Type Description
client

A boto3 S3 client.
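
How the endpoint and credentials are resolved can be sketched without boto3. `resolve_s3_settings` is a hypothetical helper that takes the environment as a plain mapping and returns the keyword arguments that the source listing below passes to `boto3.client("s3", ...)`. The example URLs are placeholders.

```python
from __future__ import annotations

from typing import Mapping

DEFAULT_ENDPOINT = "https://eodata.dataspace.copernicus.eu"


def resolve_s3_settings(endpoint_url: str | None, env: Mapping[str, str]) -> dict[str, str | None]:
    """Hypothetical helper: compute the settings that would be passed to the S3 client."""
    if not endpoint_url:
        # Fall back to the environment variable, then to the CDSE default.
        endpoint_url = env.get("COPERNICUS_S3_ENDPOINT", DEFAULT_ENDPOINT)
    return {
        "endpoint_url": endpoint_url,
        # Empty strings are treated the same as missing credentials.
        "aws_access_key_id": env.get("AWS_ACCESS_KEY_ID") or None,
        "aws_secret_access_key": env.get("AWS_SECRET_ACCESS_KEY") or None,
    }


explicit = resolve_s3_settings("https://example.invalid", {})
from_env = resolve_s3_settings(None, {"COPERNICUS_S3_ENDPOINT": "https://s3.example.invalid", "AWS_ACCESS_KEY_ID": ""})
fallback = resolve_s3_settings(None, {})
print(explicit["endpoint_url"], from_env["endpoint_url"], fallback["endpoint_url"])
```

An explicit `endpoint_url` always takes precedence over the environment variable, and missing credentials only trigger a warning in the real function rather than an error.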

Source code in src/geospatial_tools/stac/utils.py
def get_s3_client(endpoint_url: str | None = None) -> boto3.client:
    """
    Creates and returns a boto3 S3 client.

    Args:
        endpoint_url: The S3 endpoint URL. If None, it attempts to use
                      the COPERNICUS_S3_ENDPOINT environment variable.

    Returns:
        A boto3 S3 client.
    """
    if not endpoint_url:
        endpoint_url = os.environ.get("COPERNICUS_S3_ENDPOINT", "https://eodata.dataspace.copernicus.eu")

    access_key = os.environ.get("AWS_ACCESS_KEY_ID") or None
    secret_key = os.environ.get("AWS_SECRET_ACCESS_KEY") or None

    if not access_key or not secret_key:
        LOGGER.warning("AWS_ACCESS_KEY_ID or AWS_SECRET_ACCESS_KEY not found in environment.")

    LOGGER.info(f"Creating S3 client with endpoint: [{endpoint_url}]")

    # Note: boto3 automatically picks up AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
    # from the environment, but we can also pass them explicitly if needed.
    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )

parse_s3_url #

parse_s3_url(url: str) -> tuple[str, str]

Parses an S3 URL or a CDSE STAC href to extract the bucket and key.

Expected formats:

- s3://bucket/key
- https://eodata.dataspace.copernicus.eu/bucket/key
- https://zipper.dataspace.copernicus.eu/download/uuid (this might not be a direct S3 key)

Parameters:

Name Type Description Default
url str

The URL to parse.

required

Returns:

Type Description
tuple[str, str]

A tuple of (bucket, key).

Raises:

Type Description
ValueError

If the URL cannot be parsed into a bucket and key.
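
The parsing rules can be tried on the three expected formats with a standalone sketch. `split_bucket_and_key` is a hypothetical local copy of the logic in the source listing below, so that the example runs without the package installed.

```python
from urllib.parse import urlparse


def split_bucket_and_key(url: str) -> tuple[str, str]:
    """Standalone mirror of the bucket/key parsing rules, for illustration only."""
    parsed = urlparse(url)
    if parsed.scheme == "s3":
        # s3://bucket/key: the bucket is the network location.
        return parsed.netloc, parsed.path.lstrip("/")
    if parsed.scheme in ("http", "https"):
        # HTTP(S): the first path segment is taken as the bucket.
        parts = parsed.path.lstrip("/").split("/")
        if len(parts) < 2:
            raise ValueError(f"Not enough path parts in URL: {url}")
        return parts[0], "/".join(parts[1:])
    raise ValueError(f"Unsupported URL scheme: {parsed.scheme} in URL: {url}")


from_s3 = split_bucket_and_key("s3://bucket/path/to/key")
from_https = split_bucket_and_key("https://eodata.dataspace.copernicus.eu/bucket/path/to/key")
from_zipper = split_bucket_and_key("https://zipper.dataspace.copernicus.eu/download/uuid")
print(from_s3)
print(from_https)
print(from_zipper)
```

The zipper URL parses without error but yields `("download", "uuid")`, which is why the format list warns that it might not be a direct S3 key.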

Source code in src/geospatial_tools/stac/utils.py
def parse_s3_url(url: str) -> tuple[str, str]:
    """
    Parses an S3 URL or a CDSE STAC href to extract the bucket and key.

    Expected formats:
    - s3://bucket/key
    - https://eodata.dataspace.copernicus.eu/bucket/key
    - https://zipper.dataspace.copernicus.eu/download/uuid (this might not be a direct S3 key)

    Args:
        url: The URL to parse.

    Returns:
        A tuple of (bucket, key).

    Raises:
        ValueError: If the URL cannot be parsed into a bucket and key.
    """
    parsed = urlparse(url)

    if parsed.scheme == "s3":
        bucket = parsed.netloc
        key = parsed.path.lstrip("/")
        return bucket, key

    if parsed.scheme in ["http", "https"]:
        # For CDSE eodata endpoint, the path starts with the bucket
        # e.g., /Sentinel-2/MSI/L2A/2023/01/01/...
        path_parts = parsed.path.lstrip("/").split("/")
        if len(path_parts) < 2:
            raise ValueError(f"URL path does not contain enough parts to determine bucket and key: {url}")

        bucket = path_parts[0]
        key = "/".join(path_parts[1:])
        return bucket, key

    raise ValueError(f"Unsupported URL scheme: {parsed.scheme} in URL: {url}")