Utils

This module contains auxiliary functions for processing and analyzing forest data.

Available Functions

from fptools.utils import stats_summary, get_metrics, plot_x_y

`stats_summary`

Generates a statistical summary for the specified numeric columns of a DataFrame.

Parameters:
- df: Pandas DataFrame with the input data.
- *args: Names of the numeric columns to be summarized.
- ignore_zeros (bool): If True, zero values are ignored in the calculations.
- language (str): Sets the output column language. Accepts "en" or "pt-br".

Output:
- DataFrame with statistics: mean, minimum, maximum, standard deviation, coefficient of variation (CV), quartiles (Q1, Q2, Q3), and interquartile range (IQR).
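
To make the listed statistics concrete, here is a minimal sketch of how they could be computed for a single column using only the Python standard library. It is not the `fptools` implementation: the helper name `summarize`, the use of the sample standard deviation, the CV expressed as a percentage, and the "inclusive" quartile method are all assumptions made for illustration.

```python
import statistics


def summarize(values, ignore_zeros=False):
    """Hypothetical helper: summary statistics for one numeric column."""
    # Optionally drop zeros, mirroring the ignore_zeros flag described above.
    data = [v for v in values if not (ignore_zeros and v == 0)]
    mean = statistics.mean(data)
    std = statistics.stdev(data)  # sample standard deviation (assumption)
    # Quartiles by linear interpolation over the data range (assumption).
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "mean": mean,
        "min": min(data),
        "max": max(data),
        "std": std,
        "cv_percent": std / mean * 100,  # CV as a percentage (assumption)
        "q1": q1,
        "q2": q2,
        "q3": q3,
        "iqr": q3 - q1,
    }


# Illustrative diameters; the leading zero stands in for a missing measurement.
dbh = [0, 12.0, 14.0, 16.0, 18.0, 20.0]
summary = summarize(dbh, ignore_zeros=True)
print(summary)
```

With `ignore_zeros=True` the zero is excluded before any statistic is computed, so it affects neither the minimum nor the mean.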

`get_metrics`

Calculates evaluation metrics for predictive models.

Parameters:
- real_y: List or array with the actual values.
- predicted_y: List or array with the predicted values.

Calculated metrics:
- MAE: Mean Absolute Error.
- MAPE: Mean Absolute Percentage Error.
- MSE: Mean Squared Error.
- RMSE: Root Mean Squared Error.
- R²: Coefficient of determination.
- Explained variance.
- Mean error (model bias).

Output:
- Tuple with the metric values in the following order: (mae, mape, mse, rmse, r_squared, explained_variance, mean_error).
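
The metrics above follow standard definitions, which the sketch below spells out in plain Python. It is an illustration, not the library's code: the function name `regression_metrics` is hypothetical, and the sign of the mean error (predicted minus real) and the MAPE being expressed as a percentage are assumptions that may differ from what `get_metrics` does.

```python
import math


def regression_metrics(real_y, predicted_y):
    """Hypothetical helper returning the metrics in the order listed above."""
    n = len(real_y)
    # Error taken as predicted minus real (assumed sign convention).
    errors = [p - r for r, p in zip(real_y, predicted_y)]

    mae = sum(abs(e) for e in errors) / n
    # MAPE as a percentage (assumption); undefined when a real value is zero.
    mape = sum(abs(e / r) for e, r in zip(errors, real_y)) / n * 100
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    mean_real = sum(real_y) / n
    ss_tot = sum((r - mean_real) ** 2 for r in real_y)
    r_squared = 1 - sum(e * e for e in errors) / ss_tot

    mean_error = sum(errors) / n  # model bias
    error_variance = sum((e - mean_error) ** 2 for e in errors) / n
    explained_variance = 1 - error_variance / (ss_tot / n)

    return mae, mape, mse, rmse, r_squared, explained_variance, mean_error


# Illustrative values only.
real = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
mae, mape, mse, rmse, r_squared, explained_variance, mean_error = regression_metrics(
    real, predicted
)
```

Explained variance and R² differ only when the mean error is nonzero: R² penalizes a systematic bias, while explained variance removes it before comparing variances.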

`plot_x_y`

Generates a scatter plot of one x variable against one or more y variables.

Parameters:
- x: List or array with the X-axis values.
- *ys: One or more lists or arrays with the Y-axis values.

Behavior:
- Each y series is represented with a unique combination of marker and color.
- Both axes start at zero, a grid is drawn in the background, and each y series has its own legend entry.

Output:
- Displays the plot on screen.
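
One way to give every y series a unique marker and color combination is sketched below. This is only an illustration of the idea: the function name `assign_styles` and the specific marker and color lists (written in Matplotlib-style notation) are assumptions, and `plot_x_y` may choose its styles differently.

```python
# Matplotlib-style marker and color names, chosen here for illustration only.
MARKERS = ["o", "s", "^", "D"]
COLORS = ["tab:blue", "tab:orange", "tab:green", "tab:red"]


def assign_styles(n_series):
    """Hypothetical helper: one distinct (marker, color) pair per y series."""
    limit = len(MARKERS) * len(COLORS)
    if n_series > limit:
        raise ValueError(f"at most {limit} series can receive a unique style")
    styles = []
    for i in range(n_series):
        # Step through the markers; shift the color on each full pass
        # so that no (marker, color) pair is repeated.
        cycle, pos = divmod(i, len(MARKERS))
        styles.append((MARKERS[pos], COLORS[(cycle + pos) % len(COLORS)]))
    return styles


styles = assign_styles(6)
for index, (marker, color) in enumerate(styles, start=1):
    print(f"series {index}: marker={marker!r}, color={color!r}")
```

Varying the marker as well as the color keeps the series distinguishable in grayscale prints.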