pandas_xyz.algorithms module

Algorithm functions that take pandas.Series as input.

These functions are then delegated to become methods of the .xyz DataFrame accessor.

pandas_xyz.algorithms.resample_dist(df, on='distance', sample_len=5.0, bound_lo=None, bound_hi=None)[source]
Parameters

on (str) – column to use instead of index for resampling. Column must be numeric and monotonic increasing.

pandas_xyz.algorithms.ds_from_xy(lat, lon)[source]

Calculate point-to-point displacements from GPS coordinates.

The chosen scheme: displacement at [i] represents the distance from [i-1] to [i].

Parameters
  • lat (pandas.Series) – latitude coordinates along the route in degrees N (-90, 90). Must be numeric dtype.

  • lon (pandas.Series) – longitude coordinates along the route in degrees E (-180, 180). Must be numeric dtype.

Returns

point-to-point displacements along the route in meters.

Return type

pandas.Series

Assumptions:
  • Earth is a perfect sphere, with radius = pandas_xyz.scalar.EARTH_RADIUS_METERS.

  • The point-to-point distances are sufficiently short (so that latitude distortion and curvature have limited effects).
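As a rough illustration of the spherical-earth assumption above, here is a minimal plain-Python sketch using the haversine formula. It is not the library's implementation: the formula choice, the radius value, the helper names, and the 0.0 placed at index 0 are all assumptions made for this example.

```python
import math

# Assumed value for illustration only; the library takes its radius from
# pandas_xyz.scalar.EARTH_RADIUS_METERS, whose value is not stated here.
EARTH_RADIUS_M = 6371000.0


def haversine_m(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_M):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def displacements(lats, lons):
    """Displacement at [i] is the distance from [i-1] to [i].

    Index 0 has no predecessor; using 0.0 there is an assumption of this sketch.
    """
    if not lats:
        return []
    out = [0.0]
    for i in range(1, len(lats)):
        out.append(haversine_m(lats[i - 1], lons[i - 1], lats[i], lons[i]))
    return out


lats = [40.0, 40.0, 40.001]
lons = [-105.0, -105.001, -105.001]
ds = displacements(lats, lons)
# ds[1] is about 85 m (0.001 degrees of longitude at 40 degrees N)
# ds[2] is about 111 m (0.001 degrees of latitude)
```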

pandas_xyz.algorithms.s_from_xy(lat, lon)[source]

Calculate cumulative distances from GPS coordinates.

Parameters
  • lat (pandas.Series) – latitude coordinates along the route in degrees N (-90, 90). Must be numeric dtype.

  • lon (pandas.Series) – longitude coordinates along the route in degrees E (-180, 180). Must be numeric dtype.

Returns

cumulative distances along the route in meters.

Return type

pandas.Series

pandas_xyz.algorithms.ds_from_s(distance)[source]

Calculate point-to-point displacements from cumulative distances.

The chosen scheme: displacement at [i] represents the distance from [i-1] to [i].

Parameters

distance (pandas.Series) – cumulative distances along the route in meters. Must be numeric dtype.

Returns

point-to-point displacements along the route in meters.

Return type

pandas.Series
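The indexing scheme described for ds_from_s can be sketched in plain Python as a first difference. This is an illustration only; the helper name is hypothetical, and the 0.0 used for the first element (which has no predecessor) is an assumption, not something this page specifies.

```python
def ds_from_cumulative(distances):
    """Displacement at [i] is distances[i] - distances[i-1].

    Assumption of this sketch: the first displacement is 0.0.
    """
    if not distances:
        return []
    return [0.0] + [b - a for a, b in zip(distances, distances[1:])]


s = [0.0, 5.0, 12.0, 20.0]
ds = ds_from_cumulative(s)  # [0.0, 5.0, 7.0, 8.0]
```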

pandas_xyz.algorithms.s_from_ds(displacement)[source]

Calculate cumulative distances from point-to-point displacements.

The chosen scheme: displacement at [i] represents the distance from [i-1] to [i]. This scheme means converting displacements to cumulative distances does not require any extrapolation.

Parameters

displacement (pandas.Series) – point-to-point displacements along the route in meters. Must be numeric dtype.

Returns

cumulative distances along the route in meters.

Return type

pandas.Series
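Because each displacement looks backward (from [i-1] to [i]), a running sum yields a cumulative distance for every index with nothing left to extrapolate. A minimal plain-Python sketch of that idea, not the library's code:

```python
from itertools import accumulate

# Displacement at [i] is the distance from [i-1] to [i].
ds = [0.0, 5.0, 7.0, 8.0]

# A running total gives one cumulative distance per input element.
s = list(accumulate(ds))  # [0.0, 5.0, 12.0, 20.0]
```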

pandas_xyz.algorithms.s_from_v(speed, time=None)[source]

Calculate cumulative distances from speed.

The chosen scheme: speed at [i] represents the average speed over the interval from [i] to [i+1]. This means distance.diff() and time.diff() are shifted by one index from speed. I have chosen to extrapolate the position at the first index by assuming we start at a cumulative distance of 0.

Parameters
  • speed (pandas.Series) – speed along the route in meters per second. Must be numeric dtype.

  • time (pandas.Series) – cumulative time from start along the route in seconds. Must be numeric dtype. Default None.

Returns

cumulative distances along the route in meters.

Return type

pandas.Series
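The forward-looking speed scheme for s_from_v can be sketched in plain Python as follows. The helper name is hypothetical, and treating time=None as evenly spaced 1-second samples is an assumption of this sketch; this page does not say how the library handles a missing time series.

```python
def cumulative_distance(speeds, times=None):
    """Integrate speed, where speeds[i] applies from index i to index i+1."""
    n = len(speeds)
    if n == 0:
        return []
    if times is None:
        # Assumption: samples are 1 second apart when no times are given.
        times = list(range(n))
    out = [0.0]  # start at a cumulative distance of 0
    for i in range(n - 1):
        out.append(out[-1] + speeds[i] * (times[i + 1] - times[i]))
    return out


v = [2.0, 3.0, 3.0, 1.0]
t = [0, 1, 3, 4]
s = cumulative_distance(v, t)  # [0.0, 2.0, 8.0, 11.0]
```

Note that the final speed value is never used: it describes an interval that begins at the last recorded point.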

pandas_xyz.algorithms.v_from_s(distance, time=None)[source]

Calculate speed from cumulative distances.

The chosen scheme: speed at [i] represents the average speed over the interval from [i] to [i+1]. This means distance.diff() and time.diff() are shifted by one index from speed. I have chosen to extrapolate the speed at the final position by ffill.

Parameters
  • distance (pandas.Series) – cumulative distances along the route in meters. Must be numeric dtype.

  • time (pandas.Series) – cumulative time from start along the route in seconds. Must be numeric dtype. Default None.

Returns

speed along the route in meters per second.

Return type

pandas.Series
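A minimal plain-Python sketch of the v_from_s scheme: take forward differences, then repeat the last computed speed at the final index (the forward fill). The helper name is hypothetical, and 1-second spacing for time=None is an assumption of this sketch. Times are assumed strictly increasing.

```python
def speed_from_distance(distances, times=None):
    """speeds[i] is the average speed from index i to index i+1."""
    n = len(distances)
    if n == 0:
        return []
    if times is None:
        # Assumption: samples are 1 second apart when no times are given.
        times = list(range(n))
    v = [
        (distances[i + 1] - distances[i]) / (times[i + 1] - times[i])
        for i in range(n - 1)
    ]
    # No interval follows the last point, so forward-fill the last speed.
    v.append(v[-1] if v else float("nan"))
    return v


s = [0.0, 2.0, 8.0, 11.0]
t = [0, 1, 3, 4]
v = speed_from_distance(s, t)  # [2.0, 3.0, 3.0, 3.0]
```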

pandas_xyz.algorithms.v_from_ds(displacement, time=None)[source]

Calculate speed from point-to-point displacements.

The chosen scheme: displacement at [i] represents the distance from [i-1] to [i], while speed at [i] represents the average speed over the interval from [i] to [i+1]. This means displacements and time.diff() are shifted by one index from speed. I have chosen to extrapolate the speed at the final position by ffill.

Parameters
  • displacement (pandas.Series) – point-to-point displacements along the route in meters. Must be numeric dtype.

  • time (pandas.Series) – cumulative time from start along the route in seconds. Must be numeric dtype. Default None.

Returns

speed along the route in meters per second.

Return type

pandas.Series
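The one-index shift between displacement and speed can be made concrete with a short plain-Python sketch. The helper name is hypothetical, and 1-second spacing for time=None is an assumption of this sketch rather than documented behavior. Times are assumed strictly increasing.

```python
def speed_from_displacement(displacements, times=None):
    """speeds[i] uses displacements[i+1], the step from index i to i+1."""
    n = len(displacements)
    if n == 0:
        return []
    if times is None:
        # Assumption: samples are 1 second apart when no times are given.
        times = list(range(n))
    v = [
        displacements[i + 1] / (times[i + 1] - times[i])
        for i in range(n - 1)
    ]
    # Forward-fill the final index, which has no following interval.
    v.append(v[-1] if v else float("nan"))
    return v


ds = [0.0, 2.0, 6.0, 3.0]
t = [0, 1, 3, 4]
v = speed_from_displacement(ds, t)  # [2.0, 3.0, 3.0, 3.0]
```

The displacement at index 0 never enters the result, since it describes a step that ends at the first recorded point.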

pandas_xyz.algorithms.reduced_point_index(lat, lon, min_dist=15.0)[source]

Detect GPS coordinates that are too close together.

Returns a boolean list, the same size as the input, that marks a subsample of GPS coordinates spaced at least min_dist apart.

No matter how closely spaced the points, always returns the start and end points.

Originally developed in my mapmatching package; an old version still exists there.

Parameters
  • lat (pandas.Series) – latitude coordinates along the route in degrees N (-90, 90). Must be numeric dtype.

  • lon (pandas.Series) – longitude coordinates along the route in degrees E (-180, 180). Must be numeric dtype.

  • min_dist (float) – The minimum distance (meters) between the resulting downsampled GPS coordinates. Default 15.

Returns

A boolean list that can be used to subsample a pandas.Series corresponding to this GPS trace, based on this minimum distance scheme.

Return type

list
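One way such a mask could be built is a greedy pass that keeps a point only once it is at least min_dist from the last kept point. The sketch below is plain Python and is not the library's algorithm; the greedy strategy, the equirectangular distance approximation, the radius value, and the helper names are all assumptions for illustration.

```python
import math


def _dist_m(lat1, lon1, lat2, lon2, radius=6371000.0):
    """Approximate distance in meters (equirectangular, short distances)."""
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return radius * math.hypot(x, y)


def reduced_mask(lats, lons, min_dist=15.0):
    """Boolean mask of points at least min_dist from the last kept point."""
    n = len(lats)
    if n == 0:
        return []
    mask = [False] * n
    mask[0] = True  # always keep the start point
    last = 0
    for i in range(1, n):
        if _dist_m(lats[last], lons[last], lats[i], lons[i]) >= min_dist:
            mask[i] = True
            last = i
    mask[-1] = True  # always keep the end point
    return mask


# Seven points heading north in steps of roughly 5.6 m.
lats = [40.0 + k * 0.00005 for k in range(7)]
lons = [-105.0] * 7
mask = reduced_mask(lats, lons)
# [True, False, False, True, False, False, True]

kept_lats = [lat for lat, keep in zip(lats, mask) if keep]
```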

pandas_xyz.algorithms.z_filter_threshold(elevation, threshold=5.0)[source]

Filter elevation coordinates by ignoring changes smaller than some threshold value.

The resulting series of values will look like a staircase with varying tread lengths; the same reference value persists until the unfiltered coordinate series attains a value that has changed from that baseline by at least the threshold value, then that new coordinate value becomes the reference value.

Parameters
  • elevation (pandas.Series) – elevation coordinates along the route in meters above sea level. Must be numeric dtype.

  • threshold (float) – threshold, in meters, beyond which a change in elevation is registered by the algorithm. Default 5.0.

Returns

elevation coordinates filtered by the threshold value.

Return type

pandas.Series
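The staircase behavior described for z_filter_threshold can be sketched in plain Python. This follows the description above (hold a reference value until the raw value differs from it by at least the threshold) but is an illustration, not the library's code; the helper name is hypothetical.

```python
def filter_threshold(elevations, threshold=5.0):
    """Hold a reference elevation until the input moves >= threshold away."""
    if not elevations:
        return []
    ref = elevations[0]
    out = []
    for z in elevations:
        if abs(z - ref) >= threshold:
            ref = z  # the new value becomes the reference
        out.append(ref)
    return out


z = [100.0, 102.0, 104.0, 106.0, 103.0, 100.0, 99.0]
filtered = filter_threshold(z)
# [100.0, 100.0, 100.0, 106.0, 106.0, 100.0, 100.0]
```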

pandas_xyz.algorithms.z_smooth_time(elevation, sample_len=1, window_len=21, polyorder=2)[source]

Smooths noisy elevation time series.

Because of GPS and DEM inaccuracy, elevation data is not smooth. Calculations involving terrain slope (the derivative of elevation with respect to distance, dy/dx) will not yield reasonable values unless the data is smoothed.

This method’s approach follows the overview outlined in the NREL paper cited in README. However, unlike the algorithm in the paper, which samples regularly over distance, this algorithm samples regularly over time (more precisely, it presumes the elevation values are sampled at even 1-second intervals). The body only cares about energy use over time, not over distance.

The noisy elevation data is downsampled and passed through a Savitzky-Golay (SG) filter. Parameters for the filters were not described in the paper, so they must be tuned to yield intended results when applied to a particular type of data. Because the assumptions about user behavior depend on the activity being performed, the parameters will likely differ for a road run, a trail run, or a trail hike.

Parameters
  • elevation (pandas.Series) – elevation coordinates along the route in meters above sea level. Must be numeric dtype. Assumed 1-second interval.

  • sample_len (int) – time (in seconds) between desired resampled data points. Default is 1.

  • window_len (int) – length of the window used in the SG filter. Must be positive odd integer. Default 21.

  • polyorder (int) – order of the polynomial used in the SG filter. Must be less than window_len. Default 2.

Returns

elevation coordinates that result from this smoothing algorithm.

Return type

pandas.Series

pandas_xyz.algorithms.z_smooth_distance(distance, elevation, sample_len=5.0, window_len=7, polyorder=2)[source]

Like z_smooth_time(), but sampled over distance instead of time.

Parameters
  • distance (pandas.Series) – cumulative distances along the route in meters. Must be numeric dtype.

  • elevation (pandas.Series) – elevation coordinates along the route in meters above sea level. Must be numeric dtype.

  • sample_len (float) – desired distance (meters) between resampled data points. Default 5.0.

  • window_len (int) – length of the window used in the SG filter. Must be positive odd integer. Default 7.

  • polyorder (int) – order of the polynomial used in the SG filter. Must be less than window_len. Default 2.

Returns

elevation coordinates that result from this smoothing algorithm.

Return type

pandas.Series

pandas_xyz.algorithms.z_flatten(elevation)[source]

Return a series of elevation coordinates with no changes in elevation.

Parameters

elevation (pandas.Series) – elevation coordinates along the route in meters above sea level. Must be numeric dtype.

Returns

flat elevation coordinates located at the mean value of the input elevation records.

Return type

pandas.Series

pandas_xyz.algorithms.z_gain_naive(elevation)[source]

Calculate elevation gain (scalar).

This is the most generous elevation gain algorithm there is: it counts every little rise in the trail towards your total.

Parameters

elevation (pandas.Series) – elevation coordinates along the route in meters above sea level. Must be numeric dtype.

Returns

total elevation gain along the route in meters.

Return type

float
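Counting "every little rise" amounts to summing the positive point-to-point elevation changes. A minimal plain-Python sketch follows; the helper names are hypothetical, and the loss counterpart is shown as the mirror image, where reporting loss as a positive number is an assumption of this sketch.

```python
def gain_naive(elevations):
    """Sum of all upward steps between consecutive points."""
    return sum(max(b - a, 0.0) for a, b in zip(elevations, elevations[1:]))


def loss_naive(elevations):
    """Sum of all downward steps, reported as a positive number (assumed)."""
    return sum(max(a - b, 0.0) for a, b in zip(elevations, elevations[1:]))


z = [100.0, 102.0, 101.0, 104.0, 103.0]
gain = gain_naive(z)  # 5.0 (rises of 2 and 3)
loss = loss_naive(z)  # 2.0 (drops of 1 and 1)
```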

pandas_xyz.algorithms.z_loss_naive(elevation)[source]

Calculate elevation loss (scalar).

See z_gain_naive().

Parameters

elevation (pandas.Series) – elevation coordinates along the route in meters above sea level. Must be numeric dtype.

Returns

total elevation loss along the route in meters.

Return type

float

pandas_xyz.algorithms.z_gain_threshold(elevation, threshold=5.0)[source]

Conservatively calculate elevation gain from a series of coordinates.

This algorithm doesn’t count elevation gain until the elevation coordinates rise by at least a threshold value from their prior reference location.

See z_filter_threshold().

Parameters
  • elevation (pandas.Series) – elevation coordinates along the route in meters above sea level. Must be numeric dtype.

  • threshold (float) – the value, in meters, by which an elevation coordinate must exceed the reference elevation coordinate in order to count toward the total. Default 5.0.

Returns

total elevation gain along the route in meters.

Return type

float
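To show how a threshold makes the gain estimate more conservative, the plain-Python sketch below filters the elevations with a reference-value staircase and then sums the upward steps of the filtered series. Combining the two steps this way is an assumption based on the description above, not a statement of how the library computes the result; both helper names are hypothetical.

```python
def filter_threshold(elevations, threshold=5.0):
    """Hold a reference elevation until the input moves >= threshold away."""
    if not elevations:
        return []
    ref = elevations[0]
    out = []
    for z in elevations:
        if abs(z - ref) >= threshold:
            ref = z
        out.append(ref)
    return out


def gain_threshold(elevations, threshold=5.0):
    """Sum the upward steps of the threshold-filtered series."""
    f = filter_threshold(elevations, threshold)
    return sum(max(b - a, 0.0) for a, b in zip(f, f[1:]))


# Noisy 3 m bumps followed by a genuine 6 m climb.
z = [100.0, 103.0, 100.0, 103.0, 100.0, 106.0, 106.0]
naive = sum(max(b - a, 0.0) for a, b in zip(z, z[1:]))  # 12.0
conservative = gain_threshold(z)  # 6.0
```

The 3 m bumps never move the reference value, so only the final climb counts toward the total.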