Utils
Warning
This library is under development; none of the provided solutions are available for download.
This module contains auxiliary functions that may be useful to users during the processing and analysis of forest data.
Available Functions
stats_summary
Generates a statistical summary for the specified numeric columns of a DataFrame.
Parameters:
- df: Pandas DataFrame with the input data.
- *args: Names of the numeric columns to be summarized.
- ignore_zeros (bool): If True, zero values are ignored in the calculations.
- language (str): Sets the output column language. Accepts "en" or "pt-br".
Output:
- DataFrame with statistics: mean, minimum, maximum, standard deviation, coefficient of variation (CV), quartiles (Q1, Q2, Q3), and interquartile range (IQR).
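Because the library is not yet available for download, the summary described above can be approximated with plain pandas. The following is a minimal sketch, not the library's implementation: the function name stats_summary_sketch, the output column names, the sample data, the use of the sample standard deviation, and the CV expressed as a percentage are all assumptions, and the language parameter is left out.

```python
import pandas as pd


def stats_summary_sketch(df, *columns, ignore_zeros=False):
    """Hypothetical stand-in for the summary described above."""
    rows = {}
    for col in columns:
        values = df[col].dropna()
        if ignore_zeros:
            # Drop zero values before computing any statistic.
            values = values[values != 0]
        mean = values.mean()
        std = values.std()  # sample standard deviation (ddof=1), an assumption
        q1, q2, q3 = values.quantile([0.25, 0.5, 0.75])
        rows[col] = {
            "mean": mean,
            "min": values.min(),
            "max": values.max(),
            "std": std,
            "cv_percent": std / mean * 100,  # CV as a percentage, an assumption
            "q1": q1,
            "q2": q2,
            "q3": q3,
            "iqr": q3 - q1,
        }
    # One row per summarized column, one column per statistic.
    return pd.DataFrame.from_dict(rows, orient="index")


# Illustrative data: zeros stand for trees that were not measured.
plots = pd.DataFrame(
    {"dbh_cm": [0.0, 10.0, 20.0, 30.0], "height_m": [0.0, 8.0, 12.0, 16.0]}
)
summary = stats_summary_sketch(plots, "dbh_cm", "height_m", ignore_zeros=True)
print(summary)
```

With ignore_zeros=True the unmeasured trees do not pull the mean and minimum toward zero, which is the main reason for exposing such a flag.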
get_metrics
Calculates evaluation metrics for predictive models.
Parameters:
- real_y: List or array with the actual values.
- predicted_y: List or array with the predicted values.
Calculated metrics:
- MAE: Mean Absolute Error.
- MAPE: Mean Absolute Percentage Error.
- MSE: Mean Squared Error.
- RMSE: Root Mean Squared Error.
- R²: Coefficient of determination.
- Explained variance.
- Mean error (model bias).
Output:
- Tuple with the metric values in the following order: (mae, mape, mse, rmse, r_squared, explained_variance, mean_error).
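The metrics listed above follow standard definitions, so they can be sketched with NumPy. This is an illustrative approximation, not the library's code: the name get_metrics_sketch is hypothetical, MAPE is assumed to be reported as a percentage, and the mean error is assumed to be predicted minus actual, so that a positive value means overestimation.

```python
import numpy as np


def get_metrics_sketch(real_y, predicted_y):
    """Hypothetical stand-in returning the seven metrics in the documented order."""
    real = np.asarray(real_y, dtype=float)
    pred = np.asarray(predicted_y, dtype=float)
    residuals = real - pred

    mae = np.mean(np.abs(residuals))
    # Percentage error is undefined when an actual value is zero.
    mape = np.mean(np.abs(residuals / real)) * 100
    mse = np.mean(residuals ** 2)
    rmse = np.sqrt(mse)
    # R²: 1 minus residual sum of squares over total sum of squares.
    r_squared = 1 - np.sum(residuals ** 2) / np.sum((real - real.mean()) ** 2)
    # Explained variance ignores a constant offset in the residuals.
    explained_variance = 1 - np.var(residuals) / np.var(real)
    # Sign convention (assumption): positive means the model overestimates.
    mean_error = np.mean(pred - real)
    return mae, mape, mse, rmse, r_squared, explained_variance, mean_error


# Illustrative values only.
metrics = get_metrics_sketch([10, 20, 30, 40], [12, 18, 33, 39])
mae, mape, mse, rmse, r_squared, explained_variance, mean_error = metrics
print(f"MAE={mae:.3f} RMSE={rmse:.3f} R2={r_squared:.3f} bias={mean_error:.3f}")
```

R² and explained variance differ only when the residuals have a non-zero mean, which is why reporting the mean error next to them is useful.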
plot_x_y
Generates a scatter plot for one variable x and one or more variables y.
Parameters:
- x: List or array with the X-axis values.
- *ys: One or more lists or arrays with the Y-axis values.
Behavior:
- Each y series is drawn with a unique combination of marker and color.
- Both axes start at zero, a grid is shown in the background, and each y series has a legend entry.
Output:
- Displays the plot on screen.
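The behavior described above can be reproduced with matplotlib. This is a rough sketch under stated assumptions, not the library's implementation: the name plot_x_y_sketch, the labels argument, the marker and color lists, and the sample data are all hypothetical. The sketch returns the figure and axes instead of showing them, so that the caller decides when to display the plot.

```python
import itertools

import matplotlib.pyplot as plt


def plot_x_y_sketch(x, *ys, labels=None):
    """Hypothetical stand-in: scatter one x series against several y series."""
    # Pairing the two cycles gives distinct marker/color combinations
    # for up to six series; beyond that the combinations repeat.
    markers = itertools.cycle(["o", "s", "^", "D", "v", "x"])
    colors = itertools.cycle(
        ["tab:blue", "tab:orange", "tab:green", "tab:red", "tab:purple", "tab:brown"]
    )
    fig, ax = plt.subplots()
    for i, y in enumerate(ys):
        label = labels[i] if labels else f"y{i + 1}"
        ax.scatter(x, y, marker=next(markers), color=next(colors), label=label)
    ax.set_xlim(left=0)    # X axis starts at zero
    ax.set_ylim(bottom=0)  # Y axis starts at zero
    ax.grid(True)
    ax.set_axisbelow(True)  # keep the grid behind the points
    ax.legend()
    return fig, ax


# Illustrative values only.
ages = [1, 2, 3, 4]
fig, ax = plot_x_y_sketch(
    ages, [2.0, 4.5, 6.0, 7.5], [1.5, 3.0, 5.0, 6.5], labels=["height", "diameter"]
)
# Call plt.show() to display the plot on screen.
```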