Regression
ELASTIC_NET
Elastic net regression combines L1 and L2 regularization, balancing sparse feature selection against coefficient stability. It is a useful compromise when pure lasso is unstable on correlated predictors.
The model estimates coefficients w by minimizing the following objective function:
\frac{1}{2 n_{\text{samples}}} ||y - Xw||^2_2 + \alpha \rho ||w||_1 + \frac{\alpha(1-\rho)}{2} ||w||^2_2
where \alpha controls the overall regularization strength and \rho is the L1 mixing ratio.
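For intuition, the objective is straightforward to evaluate directly; the sketch below (an illustrative helper, not part of the wrapper code further down) computes it in numpy for a candidate coefficient vector w. Setting rho to 1 recovers the lasso penalty and rho to 0 leaves only the L2 penalty.
import numpy as np

def elastic_net_objective(X, y, w, alpha=1.0, rho=0.5):
    # 1/(2 n_samples) * ||y - Xw||^2_2 + alpha*rho*||w||_1 + alpha*(1-rho)/2 * ||w||^2_2
    n_samples = X.shape[0]
    residual = y - X @ w
    return (residual @ residual / (2 * n_samples)
            + alpha * rho * np.abs(w).sum()
            + alpha * (1 - rho) / 2 * (w @ w))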
This wrapper accepts tabular feature data with rows as samples and columns as features, plus a numeric target supplied as a single row or single column. It returns the training R^2 together with fitted predictions, residuals, and learned coefficient arrays.
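The accepted target orientations all flatten to the same vector; a minimal sketch of the idea (the parse_target helper in the code below additionally validates types and length):
import numpy as np

row_target = [[1, 3, 4, 6, 8, 11]]            # single row, as in {1,3,4,6,8,11}
col_target = [[1], [3], [4], [6], [8], [11]]  # single column, as in {1;3;4;6;8;11}
flat_row = np.asarray(row_target, dtype=float).reshape(-1)
flat_col = np.asarray(col_target, dtype=float).reshape(-1)
assert (flat_row == flat_col).all()           # both become [1, 3, 4, 6, 8, 11]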
Excel Usage
=ELASTIC_NET(data, target, alpha, enet_ratio, fit_intercept, max_iter, tol, coord_selection, random_state)
- data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
- target (list[list], required): Numeric target values as a single row, single column, or scalar when only one sample is present.
- alpha (float, optional, default: 1): Overall regularization strength applied to the regression model.
- enet_ratio (float, optional, default: 0.5): Elastic net mixing ratio corresponding to sklearn's l1_ratio parameter.
- fit_intercept (bool, optional, default: true): Whether to include an intercept term in the linear model.
- max_iter (int, optional, default: 1000): Maximum number of coordinate-descent iterations.
- tol (float, optional, default: 0.0001): Optimization tolerance used for convergence checks.
- coord_selection (str, optional, default: "cyclic"): Coordinate update order used by the optimizer; "cyclic" or "random".
- random_state (int, optional, default: null): Integer seed used when random coordinate selection is enabled. Leave blank for the estimator default.
Returns (dict): Excel data type containing training R^2, predictions, residuals, and fitted coefficient arrays.
Example 1: Fit elastic net regression on a two-feature linear trend
Inputs:
| data | data | target | alpha | enet_ratio | fit_intercept | max_iter | tol | coord_selection | random_state |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0.01 | 0.5 | true | 5000 | 0.0001 | cyclic | 0 |
| 1 | 0 | 3 | |||||||
| 0 | 1 | 4 | |||||||
| 1 | 1 | 6 | |||||||
| 2 | 1 | 8 | |||||||
| 2 | 2 | 11 |
Excel formula:
=ELASTIC_NET({0,0;1,0;0,1;1,1;2,1;2,2}, {1;3;4;6;8;11}, 0.01, 0.5, TRUE, 5000, 0.0001, "cyclic", 0)
Expected output:
{"type":"Double","basicValue":0.999924,"properties":{"training_r2":{"type":"Double","basicValue":0.999924},"mean_squared_error":{"type":"Double","basicValue":0.000832226},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.03586}],[{"type":"Double","basicValue":3.03362}],[{"type":"Double","basicValue":3.99552}],[{"type":"Double","basicValue":5.99328}],[{"type":"Double","basicValue":7.99103}],[{"type":"Double","basicValue":10.9507}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.0358611}],[{"type":"Double","basicValue":-0.0336196}],[{"type":"Double","basicValue":0.00448235}],[{"type":"Double","basicValue":0.00672392}],[{"type":"Double","basicValue":0.00896548}],[{"type":"Double","basicValue":0.049309}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.99776},{"type":"Double","basicValue":2.95966}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1.03586}]]},"dual_gap":{"type":"Double","basicValue":0.0000930618},"iteration_count":{"type":"Double","basicValue":10}}}
Example 2: Flatten a single-row numeric target range for elastic net regression
Inputs:
| data | target | target | target | target | target | target | alpha | enet_ratio | fit_intercept | max_iter | tol | coord_selection | random_state |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 3 | 5 | 7 | 9 | 11 | 0.01 | 0.5 | true | 5000 | 0.0001 | cyclic | 0 |
| 1 | |||||||||||||
| 2 | |||||||||||||
| 3 | |||||||||||||
| 4 | |||||||||||||
| 5 |
Excel formula:
=ELASTIC_NET({0;1;2;3;4;5}, {1,3,5,7,9,11}, 0.01, 0.5, TRUE, 5000, 0.0001, "cyclic", 0)
Expected output:
{"type":"Double","basicValue":0.999993,"properties":{"training_r2":{"type":"Double","basicValue":0.999993},"mean_squared_error":{"type":"Double","basicValue":0.000076879},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.01284}],[{"type":"Double","basicValue":3.0077}],[{"type":"Double","basicValue":5.00257}],[{"type":"Double","basicValue":6.99743}],[{"type":"Double","basicValue":8.9923}],[{"type":"Double","basicValue":10.9872}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.0128351}],[{"type":"Double","basicValue":-0.00770108}],[{"type":"Double","basicValue":-0.00256703}],[{"type":"Double","basicValue":0.00256703}],[{"type":"Double","basicValue":0.00770108}],[{"type":"Double","basicValue":0.0128351}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.99487}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1.01284}]]},"dual_gap":{"type":"Double","basicValue":-1.91542e-18},"iteration_count":{"type":"Double","basicValue":2}}}
Example 3: Fit a no-intercept elastic net model on a two-feature plane
Inputs:
| data | data | target | alpha | enet_ratio | fit_intercept | max_iter | tol | coord_selection | random_state |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 2 | 0.01 | 0.7 | false | 5000 | 0.0001 | cyclic | 0 |
| 0 | 1 | 3 | |||||||
| 1 | 1 | 5 | |||||||
| 2 | 1 | 7 | |||||||
| 1 | 2 | 8 | |||||||
| 2 | 2 | 10 |
Excel formula:
=ELASTIC_NET({1,0;0,1;1,1;2,1;1,2;2,2}, {2;3;5;7;8;10}, 0.01, 0.7, FALSE, 5000, 0.0001, "cyclic", 0)
Expected output:
{"type":"Double","basicValue":0.999982,"properties":{"training_r2":{"type":"Double","basicValue":0.999982},"mean_squared_error":{"type":"Double","basicValue":0.000139382},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":2.00061}],[{"type":"Double","basicValue":2.99079}],[{"type":"Double","basicValue":4.9914}],[{"type":"Double","basicValue":6.99201}],[{"type":"Double","basicValue":7.98219}],[{"type":"Double","basicValue":9.9828}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.000612489}],[{"type":"Double","basicValue":0.00921332}],[{"type":"Double","basicValue":0.00860083}],[{"type":"Double","basicValue":0.00798835}],[{"type":"Double","basicValue":0.0178142}],[{"type":"Double","basicValue":0.0172017}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":2.00061},{"type":"Double","basicValue":2.99079}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":0}]]},"dual_gap":{"type":"Double","basicValue":0.000609683},"iteration_count":{"type":"Double","basicValue":22}}}
Example 4: Use random coordinate updates on correlated features with elastic net
Inputs:
| data | data | target | alpha | enet_ratio | fit_intercept | max_iter | tol | coord_selection | random_state |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.1 | 1 | 0.05 | 0.4 | true | 5000 | 0.0001 | random | 0 |
| 1 | 0.9 | 3.1 | |||||||
| 2 | 2.1 | 5.8 | |||||||
| 3 | 2.9 | 7.9 | |||||||
| 4 | 4.1 | 10.2 | |||||||
| 5 | 4.9 | 12.1 |
Excel formula:
=ELASTIC_NET({0,0.1;1,0.9;2,2.1;3,2.9;4,4.1;5,4.9}, {1;3.1;5.8;7.9;10.2;12.1}, 0.05, 0.4, TRUE, 5000, 0.0001, "random", 0)
Expected output:
{"type":"Double","basicValue":0.998847,"properties":{"training_r2":{"type":"Double","basicValue":0.998847},"mean_squared_error":{"type":"Double","basicValue":0.0171298},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.15865}],[{"type":"Double","basicValue":3.18584}],[{"type":"Double","basicValue":5.66974}],[{"type":"Double","basicValue":7.69693}],[{"type":"Double","basicValue":10.1808}],[{"type":"Double","basicValue":12.208}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.158646}],[{"type":"Double","basicValue":-0.0858364}],[{"type":"Double","basicValue":0.130262}],[{"type":"Double","basicValue":0.203072}],[{"type":"Double","basicValue":0.0191697}],[{"type":"Double","basicValue":-0.108021}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.11377},{"type":"Double","basicValue":1.14178}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1.04447}]]},"dual_gap":{"type":"Double","basicValue":0.00131021},"iteration_count":{"type":"Double","basicValue":353}}}
Python Code
import numpy as np
from sklearn.linear_model import ElasticNet as SklearnElasticNet
def elastic_net(data, target, alpha=1, enet_ratio=0.5, fit_intercept=True, max_iter=1000, tol=0.0001, coord_selection='cyclic', random_state=None):
"""
Fit an elastic net regression model and return training predictions.
See: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.ElasticNet.html
This example function is provided as-is without any representation of accuracy.
Args:
data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
target (list[list]): Numeric target values as a single row, single column, or scalar when only one sample is present.
alpha (float, optional): Overall regularization strength applied to the regression model. Default is 1.
enet_ratio (float, optional): Elastic net mixing ratio corresponding to sklearn's l1_ratio parameter. Default is 0.5.
fit_intercept (bool, optional): Whether to include an intercept term in the linear model. Default is True.
max_iter (int, optional): Maximum number of coordinate-descent iterations. Default is 1000.
tol (float, optional): Optimization tolerance used for convergence checks. Default is 0.0001.
coord_selection (str, optional): Coordinate update order used by the optimizer. Valid options: 'cyclic', 'random' (case-insensitive). Default is 'cyclic'.
random_state (int, optional): Integer seed used when random coordinate selection is enabled. Leave blank for the estimator default. Default is None.
Returns:
dict: Excel data type containing training $R^2$, predictions, residuals, and fitted coefficient arrays.
"""
def py(value):
return value.item() if isinstance(value, np.generic) else value
def cell(value):
value = py(value)
if isinstance(value, bool):
return {"type": "Boolean", "basicValue": bool(value)}
if isinstance(value, (int, float)) and not isinstance(value, bool):
return {"type": "Double", "basicValue": float(value)}
return {"type": "String", "basicValue": str(value)}
def col(values):
return [[cell(value)] for value in values]
def mat(values):
return [[cell(value) for value in row] for row in values]
def parse_data(value):
value = [[value]] if not isinstance(value, list) else value
if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
return None, "Error: data must be a non-empty 2D list"
if len({len(row) for row in value}) != 1:
return None, "Error: data must be a rectangular 2D list"
data_np = np.array(value, dtype=float)
if data_np.ndim != 2 or data_np.size == 0:
return None, "Error: data must be a non-empty 2D list"
if not np.isfinite(data_np).all():
return None, "Error: data must contain only finite numeric values"
return data_np, None
def parse_target(value, sample_count):
if not isinstance(value, list):
labels = [value]
elif not value:
return None, "Error: target must be non-empty"
elif all(not isinstance(item, list) for item in value):
labels = value
elif len(value) == 1:
labels = value[0]
elif all(isinstance(row, list) and len(row) == 1 for row in value):
labels = [row[0] for row in value]
else:
return None, "Error: target must be a single row or column"
if len(labels) != sample_count:
return None, "Error: target length must match sample count"
parsed = []
for item in labels:
item = py(item)
if isinstance(item, bool) or not isinstance(item, (int, float)):
return None, "Error: target values must be finite numeric scalars"
if not np.isfinite(float(item)):
return None, "Error: target values must be finite numeric scalars"
parsed.append(float(item))
return np.array(parsed, dtype=float), None
def flat_float_list(values):
return [float(py(item)) for item in np.asarray(values).reshape(-1).tolist()]
try:
data_np, error = parse_data(data)
if error:
return error
target_np, error = parse_target(target, data_np.shape[0])
if error:
return error
selection_value = str(coord_selection).strip().lower()
if selection_value not in {"cyclic", "random"}:
return "Error: coord_selection must be 'cyclic' or 'random'"
if float(alpha) < 0:
return "Error: alpha must be non-negative"
if float(enet_ratio) < 0 or float(enet_ratio) > 1:
return "Error: enet_ratio must be between 0 and 1"
if int(max_iter) < 1:
return "Error: max_iter must be at least 1"
if float(tol) <= 0:
return "Error: tol must be greater than 0"
fitted = SklearnElasticNet(
alpha=float(alpha),
l1_ratio=float(enet_ratio),
fit_intercept=bool(fit_intercept),
max_iter=int(max_iter),
tol=float(tol),
selection=selection_value,
random_state=None if random_state in (None, "") else int(random_state)
).fit(data_np, target_np)
prediction_array = np.asarray(fitted.predict(data_np), dtype=float)
residual_array = target_np - prediction_array
predictions = flat_float_list(prediction_array)
residuals = flat_float_list(residual_array)
training_r2 = float(fitted.score(data_np, target_np))
mse = float(np.mean(np.square(residual_array)))
dual_gap = float(np.atleast_1d(fitted.dual_gap_).reshape(-1)[0])
iteration_count = float(np.atleast_1d(fitted.n_iter_).reshape(-1)[0])
return {
"type": "Double",
"basicValue": training_r2,
"properties": {
"training_r2": {"type": "Double", "basicValue": training_r2},
"mean_squared_error": {"type": "Double", "basicValue": mse},
"sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
"feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
"predictions": {"type": "Array", "elements": col(predictions)},
"residuals": {"type": "Array", "elements": col(residuals)},
"coefficients": {"type": "Array", "elements": mat(np.atleast_2d(fitted.coef_).tolist())},
"intercepts": {"type": "Array", "elements": col(np.atleast_1d(fitted.intercept_).tolist())},
"dual_gap": {"type": "Double", "basicValue": dual_gap},
"iteration_count": {"type": "Double", "basicValue": iteration_count}
}
}
except Exception as e:
return f"Error: {str(e)}"Online Calculator
GB_REGRESS
Gradient boosting regression builds a sequence of shallow trees that iteratively correct earlier residual errors. It is a flexible nonlinear regressor for tabular data and exposes feature-importance estimates.
The ensemble builds an additive model in a forward stage-wise fashion:
F_m(x) = F_{m-1}(x) + \nu \sum_{j=1}^{J_m} \gamma_{jm} I(x \in R_{jm})
where \nu is the learning rate, and each stage m adds a tree to minimize the specified loss function.
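For squared-error loss the negative gradient is simply the residual, so each stage reduces to fitting a shallow tree to the current residuals and adding its shrunken predictions. A minimal sketch of that loop (illustrative only; sklearn's implementation additionally handles other losses, subsampling, and leaf-value optimization):
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boost_squared_error(X, y, n_stages=60, nu=0.1, max_depth=2):
    F = np.full(len(y), y.mean())  # F_0: constant initial model
    for _ in range(n_stages):
        # fit the next tree to the current residuals (negative gradient)
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, y - F)
        F = F + nu * tree.predict(X)  # F_m = F_{m-1} + nu * h_m
    return F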
This wrapper accepts tabular feature data with rows as samples and columns as features, plus a numeric target supplied as a single row or single column. It returns the training R^2 together with fitted predictions, residuals, and feature importances.
Excel Usage
=GB_REGRESS(data, target, n_estimators, learning_rate, max_depth, subsample, random_state)
- data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
- target (list[list], required): Numeric target values as a single row, single column, or scalar when only one sample is present.
- n_estimators (int, optional, default: 100): Number of boosting stages to fit.
- learning_rate (float, optional, default: 0.1): Shrinkage factor applied to each boosting stage.
- max_depth (int, optional, default: 3): Maximum depth of each individual regression tree.
- subsample (float, optional, default: 1): Fraction of samples used to fit each boosting stage.
- random_state (int, optional, default: null): Integer seed for reproducible boosting and tree construction. Leave blank for the estimator default.
Returns (dict): Excel data type containing training R^2, predictions, residuals, and fitted feature importances.
Example 1: Fit a gradient boosting regressor on a two-feature linear trend
Inputs:
| data | data | target | n_estimators | learning_rate | max_depth | subsample | random_state |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 60 | 0.1 | 2 | 1 | 0 |
| 1 | 0 | 3 | |||||
| 0 | 1 | 4 | |||||
| 1 | 1 | 6 | |||||
| 2 | 1 | 8 | |||||
| 2 | 2 | 11 |
Excel formula:
=GB_REGRESS({0,0;1,0;0,1;1,1;2,1;2,2}, {1;3;4;6;8;11}, 60, 0.1, 2, 1, 0)
Expected output:
{"type":"Double","basicValue":0.999957,"properties":{"training_r2":{"type":"Double","basicValue":0.999957},"mean_squared_error":{"type":"Double","basicValue":0.000465387},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.02009}],[{"type":"Double","basicValue":3.01012}],[{"type":"Double","basicValue":4.01197}],[{"type":"Double","basicValue":5.9665}],[{"type":"Double","basicValue":8.01783}],[{"type":"Double","basicValue":10.9735}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.0200879}],[{"type":"Double","basicValue":-0.0101206}],[{"type":"Double","basicValue":-0.0119703}],[{"type":"Double","basicValue":0.0334958}],[{"type":"Double","basicValue":-0.017833}],[{"type":"Double","basicValue":0.0265161}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":0.752868}],[{"type":"Double","basicValue":0.247132}]]},"estimator_count":{"type":"Double","basicValue":60}}}
Example 2: Flatten a single-row numeric target range for gradient boosting regression
Inputs:
| data | target | target | target | target | target | target | n_estimators | learning_rate | max_depth | subsample | random_state |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 4 | 9 | 16 | 25 | 80 | 0.1 | 2 | 1 | 0 |
| 1 | |||||||||||
| 2 | |||||||||||
| 3 | |||||||||||
| 4 | |||||||||||
| 5 |
Excel formula:
=GB_REGRESS({0;1;2;3;4;5}, {0,1,4,9,16,25}, 80, 0.1, 2, 1, 0)
Expected output:
{"type":"Double","basicValue":1,"properties":{"training_r2":{"type":"Double","basicValue":1},"mean_squared_error":{"type":"Double","basicValue":0.00000939573},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0.00469531}],[{"type":"Double","basicValue":1.00243}],[{"type":"Double","basicValue":4.00045}],[{"type":"Double","basicValue":8.99966}],[{"type":"Double","basicValue":15.9974}],[{"type":"Double","basicValue":24.9954}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.00469531}],[{"type":"Double","basicValue":-0.00242558}],[{"type":"Double","basicValue":-0.000453668}],[{"type":"Double","basicValue":0.000338078}],[{"type":"Double","basicValue":0.00263296}],[{"type":"Double","basicValue":0.00460352}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":1}]]},"estimator_count":{"type":"Double","basicValue":80}}}
Example 3: Fit gradient boosting regression with stochastic subsampling
Inputs:
| data | target | n_estimators | learning_rate | max_depth | subsample | random_state |
|---|---|---|---|---|---|---|
| 0 | 1 | 80 | 0.1 | 2 | 0.8 | 0 |
| 1 | 3 | |||||
| 2 | 5 | |||||
| 3 | 7 | |||||
| 4 | 9 | |||||
| 5 | 11 |
Excel formula:
=GB_REGRESS({0;1;2;3;4;5}, {1;3;5;7;9;11}, 80, 0.1, 2, 0.8, 0)
Expected output:
{"type":"Double","basicValue":0.999932,"properties":{"training_r2":{"type":"Double","basicValue":0.999932},"mean_squared_error":{"type":"Double","basicValue":0.000790625},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.0257}],[{"type":"Double","basicValue":2.96377}],[{"type":"Double","basicValue":5.04344}],[{"type":"Double","basicValue":6.97499}],[{"type":"Double","basicValue":9.00422}],[{"type":"Double","basicValue":10.9845}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.0256953}],[{"type":"Double","basicValue":0.0362316}],[{"type":"Double","basicValue":-0.0434387}],[{"type":"Double","basicValue":0.0250072}],[{"type":"Double","basicValue":-0.00422126}],[{"type":"Double","basicValue":0.0155135}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":1}]]},"estimator_count":{"type":"Double","basicValue":80}}}
Example 4: Fit a gradient boosting regressor on a piecewise two-feature target
Inputs:
| data | data | target | n_estimators | learning_rate | max_depth | subsample | random_state |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 80 | 0.1 | 2 | 1 | 0 |
| 0 | 1 | 1 | |||||
| 1 | 0 | 1 | |||||
| 1 | 1 | 3 | |||||
| 2 | 1 | 5 | |||||
| 2 | 2 | 8 |
Excel formula:
=GB_REGRESS({0,0;0,1;1,0;1,1;2,1;2,2}, {0;1;1;3;5;8}, 80, 0.1, 2, 1, 0)
Expected output:
{"type":"Double","basicValue":0.999835,"properties":{"training_r2":{"type":"Double","basicValue":0.999835},"mean_squared_error":{"type":"Double","basicValue":0.00126717},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0.00288363}],[{"type":"Double","basicValue":1.02073}],[{"type":"Double","basicValue":1.02255}],[{"type":"Double","basicValue":2.93344}],[{"type":"Double","basicValue":5.04196}],[{"type":"Double","basicValue":7.97844}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.00288363}],[{"type":"Double","basicValue":-0.0207312}],[{"type":"Double","basicValue":-0.0225512}],[{"type":"Double","basicValue":0.0665643}],[{"type":"Double","basicValue":-0.0419602}],[{"type":"Double","basicValue":0.021562}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":0.836253}],[{"type":"Double","basicValue":0.163747}]]},"estimator_count":{"type":"Double","basicValue":80}}}
Python Code
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor as SklearnGradientBoostingRegressor
def gb_regress(data, target, n_estimators=100, learning_rate=0.1, max_depth=3, subsample=1, random_state=None):
"""
Fit a gradient boosting regressor and return training predictions.
See: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html
This example function is provided as-is without any representation of accuracy.
Args:
data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
target (list[list]): Numeric target values as a single row, single column, or scalar when only one sample is present.
n_estimators (int, optional): Number of boosting stages to fit. Default is 100.
learning_rate (float, optional): Shrinkage factor applied to each boosting stage. Default is 0.1.
max_depth (int, optional): Maximum depth of each individual regression tree. Default is 3.
subsample (float, optional): Fraction of samples used to fit each boosting stage. Default is 1.
random_state (int, optional): Integer seed for reproducible boosting and tree construction. Leave blank for the estimator default. Default is None.
Returns:
dict: Excel data type containing training $R^2$, predictions, residuals, and fitted feature importances.
"""
def py(value):
return value.item() if isinstance(value, np.generic) else value
def cell(value):
value = py(value)
if isinstance(value, bool):
return {"type": "Boolean", "basicValue": bool(value)}
if isinstance(value, (int, float)) and not isinstance(value, bool):
return {"type": "Double", "basicValue": float(value)}
return {"type": "String", "basicValue": str(value)}
def col(values):
return [[cell(value)] for value in values]
def parse_data(value):
value = [[value]] if not isinstance(value, list) else value
if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
return None, "Error: data must be a non-empty 2D list"
if len({len(row) for row in value}) != 1:
return None, "Error: data must be a rectangular 2D list"
data_np = np.array(value, dtype=float)
if data_np.ndim != 2 or data_np.size == 0:
return None, "Error: data must be a non-empty 2D list"
if not np.isfinite(data_np).all():
return None, "Error: data must contain only finite numeric values"
return data_np, None
def parse_target(value, sample_count):
if not isinstance(value, list):
labels = [value]
elif not value:
return None, "Error: target must be non-empty"
elif all(not isinstance(item, list) for item in value):
labels = value
elif len(value) == 1:
labels = value[0]
elif all(isinstance(row, list) and len(row) == 1 for row in value):
labels = [row[0] for row in value]
else:
return None, "Error: target must be a single row or column"
if len(labels) != sample_count:
return None, "Error: target length must match sample count"
parsed = []
for item in labels:
item = py(item)
if isinstance(item, bool) or not isinstance(item, (int, float)):
return None, "Error: target values must be finite numeric scalars"
if not np.isfinite(float(item)):
return None, "Error: target values must be finite numeric scalars"
parsed.append(float(item))
return np.array(parsed, dtype=float), None
def flat_float_list(values):
return [float(py(item)) for item in np.asarray(values).reshape(-1).tolist()]
try:
data_np, error = parse_data(data)
if error:
return error
target_np, error = parse_target(target, data_np.shape[0])
if error:
return error
if int(n_estimators) < 1:
return "Error: n_estimators must be at least 1"
if float(learning_rate) <= 0:
return "Error: learning_rate must be greater than 0"
if int(max_depth) < 1:
return "Error: max_depth must be at least 1"
if float(subsample) <= 0 or float(subsample) > 1:
return "Error: subsample must be greater than 0 and at most 1"
fitted = SklearnGradientBoostingRegressor(
n_estimators=int(n_estimators),
learning_rate=float(learning_rate),
max_depth=int(max_depth),
subsample=float(subsample),
random_state=None if random_state in (None, "") else int(random_state)
).fit(data_np, target_np)
prediction_array = np.asarray(fitted.predict(data_np), dtype=float)
residual_array = target_np - prediction_array
predictions = flat_float_list(prediction_array)
residuals = flat_float_list(residual_array)
training_r2 = float(fitted.score(data_np, target_np))
mse = float(np.mean(np.square(residual_array)))
return {
"type": "Double",
"basicValue": training_r2,
"properties": {
"training_r2": {"type": "Double", "basicValue": training_r2},
"mean_squared_error": {"type": "Double", "basicValue": mse},
"sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
"feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
"predictions": {"type": "Array", "elements": col(predictions)},
"residuals": {"type": "Array", "elements": col(residuals)},
"feature_importances": {"type": "Array", "elements": col(fitted.feature_importances_.tolist())},
"estimator_count": {"type": "Double", "basicValue": float(fitted.n_estimators_)}
}
}
except Exception as e:
return f"Error: {str(e)}"Online Calculator
HUBER_REG
Huber regression blends squared loss near the fitted trend with absolute loss on larger errors, making it more resistant to outliers than ordinary least squares. It is a robust linear regressor for contaminated tabular datasets.
The algorithm minimizes the loss function:
\min_{w, c, \sigma} \sum_{i=1}^n \left( \sigma + H_{\epsilon}\left(\frac{y_i - X_i w - c}{\sigma}\right)\sigma \right) + \alpha ||w||_2^2
where H_{\epsilon} is the Huber loss function, \sigma scales the residuals, and \epsilon defines the threshold for outliers.
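H_epsilon itself is quadratic for small scaled residuals and linear beyond the threshold, which caps the influence of any single outlier. A minimal sketch following the definition in the sklearn documentation (the function name is illustrative):
import numpy as np

def huber_loss(z, epsilon=1.35):
    # z**2 when |z| < epsilon, 2*epsilon*|z| - epsilon**2 otherwise
    z = np.asarray(z, dtype=float)
    return np.where(np.abs(z) < epsilon, z ** 2, 2 * epsilon * np.abs(z) - epsilon ** 2)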
This wrapper accepts tabular feature data with rows as samples and columns as features, plus a numeric target supplied as a single row or single column. It returns the training R^2 together with fitted predictions, residuals, and robust-fit summary properties.
Excel Usage
=HUBER_REG(data, target, epsilon, max_iter, alpha, fit_intercept, tol)
- data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
- target (list[list], required): Numeric target values as a single row, single column, or scalar when only one sample is present.
- epsilon (float, optional, default: 1.35): Threshold that controls when observations are treated as outliers.
- max_iter (int, optional, default: 100): Maximum number of optimizer iterations.
- alpha (float, optional, default: 0.0001): Strength of the squared L2 regularization term.
- fit_intercept (bool, optional, default: true): Whether to include an intercept term in the linear model.
- tol (float, optional, default: 0.00001): Convergence tolerance for the optimizer.
Returns (dict): Excel data type containing training R^2, predictions, residuals, and robust-fit summary properties.
Example 1: Fit Huber regression on a lightly noisy two-feature trend
Inputs:
| data | data | target | epsilon | max_iter | alpha | fit_intercept | tol |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 1.35 | 200 | 0.0001 | true | 0.00001 |
| 1 | 0 | 3.1 | |||||
| 0 | 1 | 3.9 | |||||
| 1 | 1 | 6.2 | |||||
| 2 | 1 | 7.8 | |||||
| 2 | 2 | 11.1 |
Excel formula:
=HUBER_REG({0,0;1,0;0,1;1,1;2,1;2,2}, {1;3.1;3.9;6.2;7.8;11.1}, 1.35, 200, 0.0001, TRUE, 0.00001)
Expected output:
{"type":"Double","basicValue":0.997742,"properties":{"training_r2":{"type":"Double","basicValue":0.997742},"mean_squared_error":{"type":"Double","basicValue":0.024728},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0.988095}],[{"type":"Double","basicValue":3.09842}],[{"type":"Double","basicValue":3.92698}],[{"type":"Double","basicValue":6.0373}],[{"type":"Double","basicValue":8.14763}],[{"type":"Double","basicValue":11.0865}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":0.0119049}],[{"type":"Double","basicValue":0.00158119}],[{"type":"Double","basicValue":-0.0269777}],[{"type":"Double","basicValue":0.162699}],[{"type":"Double","basicValue":-0.347625}],[{"type":"Double","basicValue":0.0134922}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":2.11032},{"type":"Double","basicValue":2.93888}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":0.988095}]]},"scale":{"type":"Double","basicValue":0.021156},"outlier_count":{"type":"Double","basicValue":2},"outlier_mask":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":false}]]},"iteration_count":{"type":"Double","basicValue":19}}}
Example 2: Flatten a single-row numeric target range for Huber regression
Inputs:
| data | target | target | target | target | target | target | epsilon | max_iter | alpha | fit_intercept | tol |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.1 | 3 | 5.2 | 6.8 | 9.1 | 11 | 1.35 | 200 | 0.0001 | true | 0.00001 |
| 1 | |||||||||||
| 2 | |||||||||||
| 3 | |||||||||||
| 4 | |||||||||||
| 5 |
Excel formula:
=HUBER_REG({0;1;2;3;4;5}, {1.1,3,5.2,6.8,9.1,11}, 1.35, 200, 0.0001, TRUE, 0.00001)
Expected output:
{"type":"Double","basicValue":0.998686,"properties":{"training_r2":{"type":"Double","basicValue":0.998686},"mean_squared_error":{"type":"Double","basicValue":0.0150855},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.07755}],[{"type":"Double","basicValue":3.06653}],[{"type":"Double","basicValue":5.05551}],[{"type":"Double","basicValue":7.04449}],[{"type":"Double","basicValue":9.03347}],[{"type":"Double","basicValue":11.0224}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":0.0224459}],[{"type":"Double","basicValue":-0.0665325}],[{"type":"Double","basicValue":0.144489}],[{"type":"Double","basicValue":-0.244489}],[{"type":"Double","basicValue":0.0665325}],[{"type":"Double","basicValue":-0.0224459}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.98898}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1.07755}]]},"scale":{"type":"Double","basicValue":0.0647083},"outlier_count":{"type":"Double","basicValue":2},"outlier_mask":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}]]},"iteration_count":{"type":"Double","basicValue":14}}}
Example 3: Fit Huber regression with one strong target outlier
Inputs:
| data | target | epsilon | max_iter | alpha | fit_intercept | tol |
|---|---|---|---|---|---|---|
| 0 | 1 | 1.2 | 200 | 0.0001 | true | 0.00001 |
| 1 | 3 | |||||
| 2 | 5 | |||||
| 3 | 7 | |||||
| 4 | 9 | |||||
| 5 | 25 |
Excel formula:
=HUBER_REG({0;1;2;3;4;5}, {1;3;5;7;9;25}, 1.2, 200, 0.0001, TRUE, 0.00001)
Expected output:
{"type":"Double","basicValue":0.475,"properties":{"training_r2":{"type":"Double","basicValue":0.475},"mean_squared_error":{"type":"Double","basicValue":32.6667},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":5}],[{"type":"Double","basicValue":7}],[{"type":"Double","basicValue":9}],[{"type":"Double","basicValue":11}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":2.00533e-9}],[{"type":"Double","basicValue":-8.37614e-9}],[{"type":"Double","basicValue":-1.87576e-8}],[{"type":"Double","basicValue":-2.91391e-8}],[{"type":"Double","basicValue":-3.95206e-8}],[{"type":"Double","basicValue":14}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":2}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1}]]},"scale":{"type":"Double","basicValue":2.3514e-8},"outlier_count":{"type":"Double","basicValue":3},"outlier_mask":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}]]},"iteration_count":{"type":"Double","basicValue":57}}}
Example 4: Fit a no-intercept Huber model on a two-feature plane
Inputs:
| data | data | target | epsilon | max_iter | alpha | fit_intercept | tol |
|---|---|---|---|---|---|---|---|
| 1 | 0 | 2.1 | 1.35 | 200 | 0.0001 | false | 0.00001 |
| 0 | 1 | 2.9 | |||||
| 1 | 1 | 5 | |||||
| 2 | 1 | 7.1 | |||||
| 1 | 2 | 8 | |||||
| 2 | 2 | 10.2 |
Excel formula:
=HUBER_REG({1,0;0,1;1,1;2,1;1,2;2,2}, {2.1;2.9;5;7.1;8;10.2}, 1.35, 200, 0.0001, FALSE, 0.00001)
Expected output:
{"type":"Double","basicValue":0.999605,"properties":{"training_r2":{"type":"Double","basicValue":0.999605},"mean_squared_error":{"type":"Double","basicValue":0.00320129},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":2.0955}],[{"type":"Double","basicValue":2.9455}],[{"type":"Double","basicValue":5.04101}],[{"type":"Double","basicValue":7.13651}],[{"type":"Double","basicValue":7.98651}],[{"type":"Double","basicValue":10.082}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":0.00449516}],[{"type":"Double","basicValue":-0.0455033}],[{"type":"Double","basicValue":-0.0410081}],[{"type":"Double","basicValue":-0.0365129}],[{"type":"Double","basicValue":0.0134886}],[{"type":"Double","basicValue":0.117984}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":2.0955},{"type":"Double","basicValue":2.9455}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":0}]]},"scale":{"type":"Double","basicValue":0.0355769},"outlier_count":{"type":"Double","basicValue":1},"outlier_mask":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}]]},"iteration_count":{"type":"Double","basicValue":16}}}
Python Code
import numpy as np
from sklearn.linear_model import HuberRegressor as SklearnHuberRegressor
def huber_reg(data, target, epsilon=1.35, max_iter=100, alpha=0.0001, fit_intercept=True, tol=1e-05):
"""
Fit a Huber regression model and return training predictions.
See: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.HuberRegressor.html
This example function is provided as-is without any representation of accuracy.
Args:
data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
target (list[list]): Numeric target values as a single row, single column, or scalar when only one sample is present.
epsilon (float, optional): Threshold that controls when observations are treated as outliers. Default is 1.35.
max_iter (int, optional): Maximum number of optimizer iterations. Default is 100.
alpha (float, optional): Strength of the squared L2 regularization term. Default is 0.0001.
fit_intercept (bool, optional): Whether to include an intercept term in the linear model. Default is True.
tol (float, optional): Convergence tolerance for the optimizer. Default is 1e-05.
Returns:
dict: Excel data type containing training $R^2$, predictions, residuals, and robust-fit summary properties.
"""
def py(value):
return value.item() if isinstance(value, np.generic) else value
def cell(value):
value = py(value)
if isinstance(value, bool):
return {"type": "Boolean", "basicValue": bool(value)}
if isinstance(value, (int, float)) and not isinstance(value, bool):
return {"type": "Double", "basicValue": float(value)}
return {"type": "String", "basicValue": str(value)}
def col(values):
return [[cell(value)] for value in values]
def mat(values):
return [[cell(value) for value in row] for row in values]
def parse_data(value):
value = [[value]] if not isinstance(value, list) else value
if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
return None, "Error: data must be a non-empty 2D list"
if len({len(row) for row in value}) != 1:
return None, "Error: data must be a rectangular 2D list"
data_np = np.array(value, dtype=float)
if data_np.ndim != 2 or data_np.size == 0:
return None, "Error: data must be a non-empty 2D list"
if not np.isfinite(data_np).all():
return None, "Error: data must contain only finite numeric values"
return data_np, None
def parse_target(value, sample_count):
if not isinstance(value, list):
labels = [value]
elif not value:
return None, "Error: target must be non-empty"
elif all(not isinstance(item, list) for item in value):
labels = value
elif len(value) == 1:
labels = value[0]
elif all(isinstance(row, list) and len(row) == 1 for row in value):
labels = [row[0] for row in value]
else:
return None, "Error: target must be a single row or column"
if len(labels) != sample_count:
return None, "Error: target length must match sample count"
parsed = []
for item in labels:
item = py(item)
if isinstance(item, bool) or not isinstance(item, (int, float)):
return None, "Error: target values must be finite numeric scalars"
if not np.isfinite(float(item)):
return None, "Error: target values must be finite numeric scalars"
parsed.append(float(item))
return np.array(parsed, dtype=float), None
def flat_float_list(values):
return [float(py(item)) for item in np.asarray(values).reshape(-1).tolist()]
try:
data_np, error = parse_data(data)
if error:
return error
target_np, error = parse_target(target, data_np.shape[0])
if error:
return error
if float(epsilon) < 1:
return "Error: epsilon must be at least 1"
if int(max_iter) < 1:
return "Error: max_iter must be at least 1"
if float(alpha) < 0:
return "Error: alpha must be non-negative"
if float(tol) <= 0:
return "Error: tol must be greater than 0"
fitted = SklearnHuberRegressor(
epsilon=float(epsilon),
max_iter=int(max_iter),
alpha=float(alpha),
fit_intercept=bool(fit_intercept),
tol=float(tol)
).fit(data_np, target_np)
prediction_array = np.asarray(fitted.predict(data_np), dtype=float)
residual_array = target_np - prediction_array
predictions = flat_float_list(prediction_array)
residuals = flat_float_list(residual_array)
training_r2 = float(fitted.score(data_np, target_np))
mse = float(np.mean(np.square(residual_array)))
outlier_mask = [bool(value) for value in np.asarray(fitted.outliers_).reshape(-1).tolist()]
return {
"type": "Double",
"basicValue": training_r2,
"properties": {
"training_r2": {"type": "Double", "basicValue": training_r2},
"mean_squared_error": {"type": "Double", "basicValue": mse},
"sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
"feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
"predictions": {"type": "Array", "elements": col(predictions)},
"residuals": {"type": "Array", "elements": col(residuals)},
"coefficients": {"type": "Array", "elements": mat(np.atleast_2d(fitted.coef_).tolist())},
"intercepts": {"type": "Array", "elements": col(np.atleast_1d(fitted.intercept_).tolist())},
"scale": {"type": "Double", "basicValue": float(fitted.scale_)},
"outlier_count": {"type": "Double", "basicValue": float(sum(outlier_mask))},
"outlier_mask": {"type": "Array", "elements": col(outlier_mask)},
"iteration_count": {"type": "Double", "basicValue": float(fitted.n_iter_)}
}
}
except Exception as e:
return f"Error: {str(e)}"Online Calculator
LASSO_REG
Lasso regression adds an L1 penalty to squared-error fitting, which can shrink some coefficients exactly to zero and provide sparse linear models. It is useful when feature selection and prediction need to be combined in one estimator.
The model estimates coefficients by minimizing the objective:
\min_{w} \frac{1}{2 n_{\text{samples}}} ||y - Xw||^2_2 + \alpha ||w||_1
where \alpha is the L1 regularization strength.
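The exact zeros produced by lasso come from the soft-thresholding operator at the core of coordinate descent: a coefficient whose correlation with the partial residual falls below alpha is clipped exactly to zero. A minimal sketch of the operator (an illustrative helper, not part of the wrapper below):
import numpy as np

def soft_threshold(x, threshold):
    # S(x, t) = sign(x) * max(|x| - t, 0); small inputs map exactly to zero
    return np.sign(x) * np.maximum(np.abs(x) - threshold, 0.0)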
This wrapper accepts tabular feature data with rows as samples and columns as features, plus a numeric target supplied as a single row or single column. It returns the training R^2 together with fitted predictions, residuals, and learned coefficient arrays.
Excel Usage
=LASSO_REG(data, target, alpha, fit_intercept, max_iter, tol, coord_selection, random_state)
- data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
- target (list[list], required): Numeric target values as a single row, single column, or scalar when only one sample is present.
- alpha (float, optional, default: 1): L1 regularization strength applied to the regression model.
- fit_intercept (bool, optional, default: true): Whether to include an intercept term in the linear model.
- max_iter (int, optional, default: 1000): Maximum number of coordinate-descent iterations.
- tol (float, optional, default: 0.0001): Optimization tolerance used for convergence checks.
- coord_selection (str, optional, default: "cyclic"): Coordinate update order used by the optimizer; "cyclic" or "random".
- random_state (int, optional, default: null): Integer seed used when random coordinate selection is enabled. Leave blank for the estimator default.
Returns (dict): Excel data type containing training R^2, predictions, residuals, and fitted coefficient arrays.
Example 1: Fit lasso regression on a two-feature linear trend
Inputs:
| data | data | target | alpha | fit_intercept | max_iter | tol | coord_selection | random_state |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0.01 | true | 5000 | 0.0001 | cyclic | 0 |
| 1 | 0 | 3 | ||||||
| 0 | 1 | 4 | ||||||
| 1 | 1 | 6 | ||||||
| 2 | 1 | 8 | ||||||
| 2 | 2 | 11 |
Excel formula:
=LASSO_REG({0,0;1,0;0,1;1,1;2,1;2,2}, {1;3;4;6;8;11}, 0.01, TRUE, 5000, 0.0001, "cyclic", 0)
Expected output:
{"type":"Double","basicValue":0.999979,"properties":{"training_r2":{"type":"Double","basicValue":0.999979},"mean_squared_error":{"type":"Double","basicValue":0.000231079},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.0204}],[{"type":"Double","basicValue":3.01371}],[{"type":"Double","basicValue":4.00395}],[{"type":"Double","basicValue":5.99726}],[{"type":"Double","basicValue":7.99057}],[{"type":"Double","basicValue":10.9741}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.0204023}],[{"type":"Double","basicValue":-0.0137109}],[{"type":"Double","basicValue":-0.00394922}],[{"type":"Double","basicValue":0.00274219}],[{"type":"Double","basicValue":0.00943359}],[{"type":"Double","basicValue":0.0258867}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.99331},{"type":"Double","basicValue":2.98355}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1.0204}]]},"dual_gap":{"type":"Double","basicValue":0.000109009},"iteration_count":{"type":"Double","basicValue":10}}}
Example 2: Flatten a single-row numeric target range for lasso regression
Inputs:
| data | target | target | target | target | target | target | alpha | fit_intercept | max_iter | tol | coord_selection | random_state |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 3 | 5 | 7 | 9 | 11 | 0.01 | true | 5000 | 0.0001 | cyclic | 0 |
| 1 | ||||||||||||
| 2 | ||||||||||||
| 3 | ||||||||||||
| 4 | ||||||||||||
| 5 |
Excel formula:
=LASSO_REG({0;1;2;3;4;5}, {1,3,5,7,9,11}, 0.01, TRUE, 5000, 0.0001, "cyclic", 0)
Expected output:
{"type":"Double","basicValue":0.999997,"properties":{"training_r2":{"type":"Double","basicValue":0.999997},"mean_squared_error":{"type":"Double","basicValue":0.0000342857},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.00857}],[{"type":"Double","basicValue":3.00514}],[{"type":"Double","basicValue":5.00171}],[{"type":"Double","basicValue":6.99829}],[{"type":"Double","basicValue":8.99486}],[{"type":"Double","basicValue":10.9914}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.00857143}],[{"type":"Double","basicValue":-0.00514286}],[{"type":"Double","basicValue":-0.00171429}],[{"type":"Double","basicValue":0.00171429}],[{"type":"Double","basicValue":0.00514286}],[{"type":"Double","basicValue":0.00857143}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.99657}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1.00857}]]},"dual_gap":{"type":"Double","basicValue":-2.80086e-19},"iteration_count":{"type":"Double","basicValue":2}}}
Example 3: Fit a no-intercept lasso model on a two-feature plane
Inputs:
| data | data | target | alpha | fit_intercept | max_iter | tol | coord_selection | random_state |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 2 | 0.01 | false | 5000 | 0.0001 | cyclic | 0 |
| 0 | 1 | 3 | ||||||
| 1 | 1 | 5 | ||||||
| 2 | 1 | 7 | ||||||
| 1 | 2 | 8 | ||||||
| 2 | 2 | 10 |
Excel formula:
=LASSO_REG({1,0;0,1;1,1;2,1;1,2;2,2}, {2;3;5;7;8;10}, 0.01, FALSE, 5000, 0.0001, "cyclic", 0)
Expected output:
{"type":"Double","basicValue":0.999993,"properties":{"training_r2":{"type":"Double","basicValue":0.999993},"mean_squared_error":{"type":"Double","basicValue":0.0000582248},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.99754}],[{"type":"Double","basicValue":2.99656}],[{"type":"Double","basicValue":4.9941}],[{"type":"Double","basicValue":6.99163}],[{"type":"Double","basicValue":7.99066}],[{"type":"Double","basicValue":9.98819}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":0.00246392}],[{"type":"Double","basicValue":0.00343861}],[{"type":"Double","basicValue":0.00590253}],[{"type":"Double","basicValue":0.00836645}],[{"type":"Double","basicValue":0.00934114}],[{"type":"Double","basicValue":0.0118051}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.99754},{"type":"Double","basicValue":2.99656}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":0}]]},"dual_gap":{"type":"Double","basicValue":0.000648993},"iteration_count":{"type":"Double","basicValue":22}}}
Example 4: Use random coordinate updates on correlated features
Inputs:
| data | data | target | alpha | fit_intercept | max_iter | tol | coord_selection | random_state |
|---|---|---|---|---|---|---|---|---|
| 0 | 0.1 | 1 | 0.05 | true | 5000 | 0.0001 | random | 0 |
| 1 | 0.9 | 3.1 | ||||||
| 2 | 2.1 | 5.8 | ||||||
| 3 | 2.9 | 7.9 | ||||||
| 4 | 4.1 | 10.2 | ||||||
| 5 | 4.9 | 12.1 |
Excel formula:
=LASSO_REG({0,0.1;1,0.9;2,2.1;3,2.9;4,4.1;5,4.9}, {1;3.1;5.8;7.9;10.2;12.1}, 0.05, TRUE, 5000, 0.0001, "random", 0)
Expected output:
{"type":"Double","basicValue":0.998857,"properties":{"training_r2":{"type":"Double","basicValue":0.998857},"mean_squared_error":{"type":"Double","basicValue":0.0169803},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.16159}],[{"type":"Double","basicValue":3.17116}],[{"type":"Double","basicValue":5.67855}],[{"type":"Double","basicValue":7.68812}],[{"type":"Double","basicValue":10.1955}],[{"type":"Double","basicValue":12.2051}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.161595}],[{"type":"Double","basicValue":-0.0711582}],[{"type":"Double","basicValue":0.121448}],[{"type":"Double","basicValue":0.211885}],[{"type":"Double","basicValue":0.00449154}],[{"type":"Double","basicValue":-0.105072}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.0139},{"type":"Double","basicValue":1.24457}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1.03714}]]},"dual_gap":{"type":"Double","basicValue":0.00147917},"iteration_count":{"type":"Double","basicValue":1367}}}
Python Code
import numpy as np
from sklearn.linear_model import Lasso as SklearnLasso
def lasso_reg(data, target, alpha=1, fit_intercept=True, max_iter=1000, tol=0.0001, coord_selection='cyclic', random_state=None):
"""
Fit a lasso regression model and return training predictions.
See: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html
This example function is provided as-is without any representation of accuracy.
Args:
data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
target (list[list]): Numeric target values as a single row, single column, or scalar when only one sample is present.
alpha (float, optional): L1 regularization strength applied to the regression model. Default is 1.
fit_intercept (bool, optional): Whether to include an intercept term in the linear model. Default is True.
max_iter (int, optional): Maximum number of coordinate-descent iterations. Default is 1000.
tol (float, optional): Optimization tolerance used for convergence checks. Default is 0.0001.
coord_selection (str, optional): Coordinate update order used by the optimizer. Valid options: 'cyclic', 'random' (case-insensitive). Default is 'cyclic'.
random_state (int, optional): Integer seed used when random coordinate selection is enabled. Leave blank for the estimator default. Default is None.
Returns:
dict: Excel data type containing training $R^2$, predictions, residuals, and fitted coefficient arrays.
"""
def py(value):
return value.item() if isinstance(value, np.generic) else value
def cell(value):
value = py(value)
if isinstance(value, bool):
return {"type": "Boolean", "basicValue": bool(value)}
if isinstance(value, (int, float)) and not isinstance(value, bool):
return {"type": "Double", "basicValue": float(value)}
return {"type": "String", "basicValue": str(value)}
def col(values):
return [[cell(value)] for value in values]
def mat(values):
return [[cell(value) for value in row] for row in values]
def parse_data(value):
value = [[value]] if not isinstance(value, list) else value
if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
return None, "Error: data must be a non-empty 2D list"
if len({len(row) for row in value}) != 1:
return None, "Error: data must be a rectangular 2D list"
data_np = np.array(value, dtype=float)
if data_np.ndim != 2 or data_np.size == 0:
return None, "Error: data must be a non-empty 2D list"
if not np.isfinite(data_np).all():
return None, "Error: data must contain only finite numeric values"
return data_np, None
def parse_target(value, sample_count):
if not isinstance(value, list):
labels = [value]
elif not value:
return None, "Error: target must be non-empty"
elif all(not isinstance(item, list) for item in value):
labels = value
elif len(value) == 1:
labels = value[0]
elif all(isinstance(row, list) and len(row) == 1 for row in value):
labels = [row[0] for row in value]
else:
return None, "Error: target must be a single row or column"
if len(labels) != sample_count:
return None, "Error: target length must match sample count"
parsed = []
for item in labels:
item = py(item)
if isinstance(item, bool) or not isinstance(item, (int, float)):
return None, "Error: target values must be finite numeric scalars"
if not np.isfinite(float(item)):
return None, "Error: target values must be finite numeric scalars"
parsed.append(float(item))
return np.array(parsed, dtype=float), None
def flat_float_list(values):
return [float(py(item)) for item in np.asarray(values).reshape(-1).tolist()]
try:
data_np, error = parse_data(data)
if error:
return error
target_np, error = parse_target(target, data_np.shape[0])
if error:
return error
selection_value = str(coord_selection).strip().lower()
if selection_value not in {"cyclic", "random"}:
return "Error: coord_selection must be 'cyclic' or 'random'"
if float(alpha) < 0:
return "Error: alpha must be non-negative"
if int(max_iter) < 1:
return "Error: max_iter must be at least 1"
if float(tol) <= 0:
return "Error: tol must be greater than 0"
fitted = SklearnLasso(
alpha=float(alpha),
fit_intercept=bool(fit_intercept),
max_iter=int(max_iter),
tol=float(tol),
selection=selection_value,
random_state=None if random_state in (None, "") else int(random_state)
).fit(data_np, target_np)
prediction_array = np.asarray(fitted.predict(data_np), dtype=float)
residual_array = target_np - prediction_array
predictions = flat_float_list(prediction_array)
residuals = flat_float_list(residual_array)
training_r2 = float(fitted.score(data_np, target_np))
mse = float(np.mean(np.square(residual_array)))
dual_gap = float(np.atleast_1d(fitted.dual_gap_).reshape(-1)[0])
iteration_count = float(np.atleast_1d(fitted.n_iter_).reshape(-1)[0])
return {
"type": "Double",
"basicValue": training_r2,
"properties": {
"training_r2": {"type": "Double", "basicValue": training_r2},
"mean_squared_error": {"type": "Double", "basicValue": mse},
"sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
"feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
"predictions": {"type": "Array", "elements": col(predictions)},
"residuals": {"type": "Array", "elements": col(residuals)},
"coefficients": {"type": "Array", "elements": mat(np.atleast_2d(fitted.coef_).tolist())},
"intercepts": {"type": "Array", "elements": col(np.atleast_1d(fitted.intercept_).tolist())},
"dual_gap": {"type": "Double", "basicValue": dual_gap},
"iteration_count": {"type": "Double", "basicValue": iteration_count}
}
}
except Exception as e:
return f"Error: {str(e)}"Online Calculator
LINEAR_REG
Linear regression estimates coefficients that minimize squared residual error between observed targets and a linear combination of the input features. It is the standard baseline regressor for continuous outcomes.
The ordinary least squares algorithm finds coefficients w that solve the unconstrained minimization problem:
\min_{w} ||y - Xw||^2_2
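Because the problem is plain unconstrained least squares (when positive is false), the same coefficients can be reproduced with numpy's lstsq by appending an explicit intercept column; a quick sketch using the data from Example 1 below:
import numpy as np

X = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [2, 2]], dtype=float)
y = np.array([1, 3, 4, 6, 8, 11], dtype=float)
X1 = np.column_stack([np.ones(len(y)), X])     # intercept column first
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)  # [intercept, w1, w2]
print(beta)                                    # approximately [1, 2, 3]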
This wrapper accepts tabular feature data with rows as samples and columns as features, plus a numeric target supplied as a single row or single column. It returns the training R^2 together with fitted predictions, residuals, and learned coefficient arrays.
Excel Usage
=LINEAR_REG(data, target, fit_intercept, tol, positive)
- data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
- target (list[list], required): Numeric target values as a single row, single column, or scalar when only one sample is present.
- fit_intercept (bool, optional, default: true): Whether to include an intercept term in the linear model.
- tol (float, optional, default: 0.000001): Solver tolerance used when scipy's LSQR path is active.
- positive (bool, optional, default: false): Whether to constrain fitted coefficients to non-negative values.
Returns (dict): Excel data type containing training R^2, predictions, residuals, and fitted coefficient arrays.
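For reference, the training R^2 reported in basicValue is the standard coefficient of determination that sklearn's score method computes on the training data:
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
where \hat{y}_i are the fitted predictions and \bar{y} is the mean target, so an exact fit yields R^2 = 1.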
Example 1: Fit linear regression on a two-feature linear trend
Inputs:
| data | target | fit_intercept | tol | positive | |
|---|---|---|---|---|---|
| 0 | 0 | 1 | true | 0.000001 | false |
| 1 | 0 | 3 | |||
| 0 | 1 | 4 | |||
| 1 | 1 | 6 | |||
| 2 | 1 | 8 | |||
| 2 | 2 | 11 |
Excel formula:
=LINEAR_REG({0,0;1,0;0,1;1,1;2,1;2,2}, {1;3;4;6;8;11}, TRUE, 0.000001, FALSE)
Expected output:
{"type":"Double","basicValue":1,"properties":{"training_r2":{"type":"Double","basicValue":1},"mean_squared_error":{"type":"Double","basicValue":0},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":4}],[{"type":"Double","basicValue":6}],[{"type":"Double","basicValue":8}],[{"type":"Double","basicValue":11}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":2},{"type":"Double","basicValue":3}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1}]]},"matrix_rank":{"type":"Double","basicValue":2},"singular_values":{"type":"Array","elements":[[{"type":"Double","basicValue":2.34521}],[{"type":"Double","basicValue":1.1547}]]}}}
Example 2: Flatten a single-row numeric target range for linear regression
Inputs:
| data | target | fit_intercept | tol | positive | |||||
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 3 | 5 | 7 | 9 | 11 | true | 0.000001 | false |
| 1 | |||||||||
| 2 | |||||||||
| 3 | |||||||||
| 4 | |||||||||
| 5 |
Excel formula:
=LINEAR_REG({0;1;2;3;4;5}, {1,3,5,7,9,11}, TRUE, 0.000001, FALSE)
Expected output:
{"type":"Double","basicValue":1,"properties":{"training_r2":{"type":"Double","basicValue":1},"mean_squared_error":{"type":"Double","basicValue":2.95823e-31},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":5}],[{"type":"Double","basicValue":7}],[{"type":"Double","basicValue":9}],[{"type":"Double","basicValue":11}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-8.88178e-16}],[{"type":"Double","basicValue":-4.44089e-16}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":8.88178e-16}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":2}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1}]]},"matrix_rank":{"type":"Double","basicValue":1},"singular_values":{"type":"Array","elements":[[{"type":"Double","basicValue":4.1833}]]}}}
Example 3: Fit a no-intercept linear model on a two-feature plane
Inputs:
| data | target | fit_intercept | tol | positive | |
|---|---|---|---|---|---|
| 1 | 0 | 2 | false | 0.000001 | false |
| 0 | 1 | 3 | |||
| 1 | 1 | 5 | |||
| 2 | 1 | 7 | |||
| 1 | 2 | 8 | |||
| 2 | 2 | 10 |
Excel formula:
=LINEAR_REG({1,0;0,1;1,1;2,1;1,2;2,2}, {2;3;5;7;8;10}, FALSE, 0.000001, FALSE)
Expected output:
{"type":"Double","basicValue":1,"properties":{"training_r2":{"type":"Double","basicValue":1},"mean_squared_error":{"type":"Double","basicValue":1.35503e-29},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":2}],[{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":5}],[{"type":"Double","basicValue":7}],[{"type":"Double","basicValue":8}],[{"type":"Double","basicValue":10}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":1.9984e-15}],[{"type":"Double","basicValue":8.88178e-16}],[{"type":"Double","basicValue":2.66454e-15}],[{"type":"Double","basicValue":5.32907e-15}],[{"type":"Double","basicValue":3.55271e-15}],[{"type":"Double","basicValue":5.32907e-15}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":2},{"type":"Double","basicValue":3}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":0}]]},"matrix_rank":{"type":"Double","basicValue":2},"singular_values":{"type":"Array","elements":[[{"type":"Double","basicValue":4.47214}],[{"type":"Double","basicValue":1.41421}]]}}}
Example 4: Constrain coefficients to be non-negative during fitting
Inputs:
| data | target | fit_intercept | tol | positive | |
|---|---|---|---|---|---|
| 0 | 0 | 2 | true | 0.000001 | true |
| 1 | 0 | 4 | |||
| 0 | 1 | 5 | |||
| 1 | 1 | 7 | |||
| 2 | 1 | 9 | |||
| 2 | 2 | 12 |
Excel formula:
=LINEAR_REG({0,0;1,0;0,1;1,1;2,1;2,2}, {2;4;5;7;9;12}, TRUE, 0.000001, TRUE)
Expected output:
{"type":"Double","basicValue":1,"properties":{"training_r2":{"type":"Double","basicValue":1},"mean_squared_error":{"type":"Double","basicValue":0},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":2}],[{"type":"Double","basicValue":4}],[{"type":"Double","basicValue":5}],[{"type":"Double","basicValue":7}],[{"type":"Double","basicValue":9}],[{"type":"Double","basicValue":12}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":2},{"type":"Double","basicValue":3}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":2}]]}}}
Python Code
import numpy as np
from sklearn.linear_model import LinearRegression as SklearnLinearRegression
def linear_reg(data, target, fit_intercept=True, tol=1e-06, positive=False):
"""
Fit an ordinary least-squares linear regression model and return training predictions.
See: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
This example function is provided as-is without any representation of accuracy.
Args:
data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
target (list[list]): Numeric target values as a single row, single column, or scalar when only one sample is present.
fit_intercept (bool, optional): Whether to include an intercept term in the linear model. Default is True.
tol (float, optional): Solver tolerance used when scipy's LSQR path is active. Default is 1e-06.
positive (bool, optional): Whether to constrain fitted coefficients to non-negative values. Default is False.
Returns:
dict: Excel data type containing training $R^2$, predictions, residuals, and fitted coefficient arrays.
"""
def py(value):
return value.item() if isinstance(value, np.generic) else value
def cell(value):
value = py(value)
if isinstance(value, bool):
return {"type": "Boolean", "basicValue": bool(value)}
if isinstance(value, (int, float)) and not isinstance(value, bool):
return {"type": "Double", "basicValue": float(value)}
return {"type": "String", "basicValue": str(value)}
def col(values):
return [[cell(value)] for value in values]
def mat(values):
return [[cell(value) for value in row] for row in values]
def parse_data(value):
value = [[value]] if not isinstance(value, list) else value
if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
return None, "Error: data must be a non-empty 2D list"
if len({len(row) for row in value}) != 1:
return None, "Error: data must be a rectangular 2D list"
data_np = np.array(value, dtype=float)
if data_np.ndim != 2 or data_np.size == 0:
return None, "Error: data must be a non-empty 2D list"
if not np.isfinite(data_np).all():
return None, "Error: data must contain only finite numeric values"
return data_np, None
def parse_target(value, sample_count):
if not isinstance(value, list):
labels = [value]
elif not value:
return None, "Error: target must be non-empty"
elif all(not isinstance(item, list) for item in value):
labels = value
elif len(value) == 1:
labels = value[0]
elif all(isinstance(row, list) and len(row) == 1 for row in value):
labels = [row[0] for row in value]
else:
return None, "Error: target must be a single row or column"
if len(labels) != sample_count:
return None, "Error: target length must match sample count"
parsed = []
for item in labels:
item = py(item)
if isinstance(item, bool) or not isinstance(item, (int, float)):
return None, "Error: target values must be finite numeric scalars"
if not np.isfinite(float(item)):
return None, "Error: target values must be finite numeric scalars"
parsed.append(float(item))
return np.array(parsed, dtype=float), None
def flat_float_list(values):
return [float(py(item)) for item in np.asarray(values).reshape(-1).tolist()]
try:
data_np, error = parse_data(data)
if error:
return error
target_np, error = parse_target(target, data_np.shape[0])
if error:
return error
if float(tol) <= 0:
return "Error: tol must be greater than 0"
fitted = SklearnLinearRegression(
fit_intercept=bool(fit_intercept),
tol=float(tol),
positive=bool(positive)
).fit(data_np, target_np)
prediction_array = np.asarray(fitted.predict(data_np), dtype=float)
residual_array = target_np - prediction_array
predictions = flat_float_list(prediction_array)
residuals = flat_float_list(residual_array)
training_r2 = float(fitted.score(data_np, target_np))
mse = float(np.mean(np.square(residual_array)))
properties = {
"training_r2": {"type": "Double", "basicValue": training_r2},
"mean_squared_error": {"type": "Double", "basicValue": mse},
"sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
"feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
"predictions": {"type": "Array", "elements": col(predictions)},
"residuals": {"type": "Array", "elements": col(residuals)},
"coefficients": {"type": "Array", "elements": mat(np.atleast_2d(fitted.coef_).tolist())},
"intercepts": {"type": "Array", "elements": col(np.atleast_1d(fitted.intercept_).tolist())}
}
if hasattr(fitted, "rank_"):
properties["matrix_rank"] = {"type": "Double", "basicValue": float(fitted.rank_)}
if hasattr(fitted, "singular_"):
properties["singular_values"] = {"type": "Array", "elements": col(np.atleast_1d(fitted.singular_).tolist())}
return {
"type": "Double",
"basicValue": training_r2,
"properties": properties
}
except Exception as e:
return f"Error: {str(e)}"Online Calculator
RF_REGRESS
Random forest regression averages predictions from many decision trees trained on resampled data and feature subsets. It is a strong nonlinear default for tabular regression and exposes feature-importance estimates.
The ensemble averages the individual tree predictions to reduce variance:
F(x) = \frac{1}{B} \sum_{b=1}^{B} T_b(x)
where B is the number of trees and T_b represents each decision tree fitted on a bootstrap sample.
This wrapper accepts tabular feature data with rows as samples and columns as features, plus a numeric target supplied as a single row or single column. It returns the training R^2 together with fitted predictions, residuals, and the fitted feature importances.
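The averaging in the formula above is exactly what the fitted estimator does; a minimal sketch confirming that the forest prediction equals the mean of the per-tree predictions (standalone, using sklearn directly rather than the wrapper below):
import numpy as np
from sklearn.ensemble import RandomForestRegressor

X = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [2, 2]], dtype=float)
y = np.array([1, 3, 4, 6, 8, 11], dtype=float)
forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Average the B per-tree predictions by hand ...
per_tree = np.stack([tree.predict(X) for tree in forest.estimators_])
manual = per_tree.mean(axis=0)

# ... and confirm the ensemble's predict() returns the same average
assert np.allclose(manual, forest.predict(X))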
Excel Usage
=RF_REGRESS(data, target, n_estimators, rf_reg_criterion, max_depth, min_samples_leaf, random_state)
- data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
- target (list[list], required): Numeric target values as a single row, single column, or scalar when only one sample is present.
- n_estimators (int, optional, default: 100): Number of trees in the forest.
- rf_reg_criterion (str, optional, default: "squared_error"): Split quality measure used by each decision tree.
- max_depth (int, optional, default: null): Maximum depth of each tree. Leave blank for unconstrained depth.
- min_samples_leaf (int, optional, default: 1): Minimum number of samples required in each leaf.
- random_state (int, optional, default: null): Integer seed for reproducible tree sampling. Leave blank for the estimator default.
Returns (dict): Excel data type containing training R^2, predictions, residuals, and fitted feature importances.
Example 1: Fit a random forest regressor on a two-feature linear trend
Inputs:
| data | target | n_estimators | rf_reg_criterion | max_depth | min_samples_leaf | random_state | |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 50 | squared_error | 4 | 1 | 0 |
| 1 | 0 | 3 | |||||
| 0 | 1 | 4 | |||||
| 1 | 1 | 6 | |||||
| 2 | 1 | 8 | |||||
| 2 | 2 | 11 |
Excel formula:
=RF_REGRESS({0,0;1,0;0,1;1,1;2,1;2,2}, {1;3;4;6;8;11}, 50, "squared_error", 4, 1, 0)
Expected output:
{"type":"Double","basicValue":0.962113,"properties":{"training_r2":{"type":"Double","basicValue":0.962113},"mean_squared_error":{"type":"Double","basicValue":0.4136},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.88}],[{"type":"Double","basicValue":3.44}],[{"type":"Double","basicValue":4.6}],[{"type":"Double","basicValue":5.52}],[{"type":"Double","basicValue":7.96}],[{"type":"Double","basicValue":10.04}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.88}],[{"type":"Double","basicValue":-0.44}],[{"type":"Double","basicValue":-0.6}],[{"type":"Double","basicValue":0.48}],[{"type":"Double","basicValue":0.04}],[{"type":"Double","basicValue":0.96}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":0.515282}],[{"type":"Double","basicValue":0.484718}]]},"estimator_count":{"type":"Double","basicValue":50}}}
Example 2: Flatten a single-row numeric target range for random forest regression
Inputs:
| data | target | n_estimators | rf_reg_criterion | max_depth | min_samples_leaf | random_state | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 4 | 9 | 16 | 25 | 50 | squared_error | 5 | 1 | 0 |
| 1 | |||||||||||
| 2 | |||||||||||
| 3 | |||||||||||
| 4 | |||||||||||
| 5 |
Excel formula:
=RF_REGRESS({0;1;2;3;4;5}, {0,1,4,9,16,25}, 50, "squared_error", 5, 1, 0)
Expected output:
{"type":"Double","basicValue":0.974305,"properties":{"training_r2":{"type":"Double","basicValue":0.974305},"mean_squared_error":{"type":"Double","basicValue":2.03347},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0.88}],[{"type":"Double","basicValue":1.32}],[{"type":"Double","basicValue":4.04}],[{"type":"Double","basicValue":7.98}],[{"type":"Double","basicValue":14.72}],[{"type":"Double","basicValue":22.06}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.88}],[{"type":"Double","basicValue":-0.32}],[{"type":"Double","basicValue":-0.04}],[{"type":"Double","basicValue":1.02}],[{"type":"Double","basicValue":1.28}],[{"type":"Double","basicValue":2.94}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":1}]]},"estimator_count":{"type":"Double","basicValue":50}}}
Example 3: Use absolute-error splits for one-feature regression
Inputs:
| data | target | n_estimators | rf_reg_criterion | max_depth | min_samples_leaf | random_state |
|---|---|---|---|---|---|---|
| 0 | 1 | 50 | absolute_error | 4 | 1 | 0 |
| 1 | 3 | |||||
| 2 | 5 | |||||
| 3 | 7 | |||||
| 4 | 9 | |||||
| 5 | 11 |
Excel formula:
=RF_REGRESS({0;1;2;3;4;5}, {1;3;5;7;9;11}, 50, "absolute_error", 4, 1, 0)
Expected output:
{"type":"Double","basicValue":0.974446,"properties":{"training_r2":{"type":"Double","basicValue":0.974446},"mean_squared_error":{"type":"Double","basicValue":0.298133},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.96}],[{"type":"Double","basicValue":2.84}],[{"type":"Double","basicValue":4.84}],[{"type":"Double","basicValue":6.56}],[{"type":"Double","basicValue":8.6}],[{"type":"Double","basicValue":10.32}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.96}],[{"type":"Double","basicValue":0.16}],[{"type":"Double","basicValue":0.16}],[{"type":"Double","basicValue":0.44}],[{"type":"Double","basicValue":0.4}],[{"type":"Double","basicValue":0.68}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":1}]]},"estimator_count":{"type":"Double","basicValue":50}}}
Example 4: Fit a random forest regressor on a piecewise two-feature target
Inputs:
| data | target | n_estimators | rf_reg_criterion | max_depth | min_samples_leaf | random_state | |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 50 | friedman_mse | 4 | 1 | 0 |
| 0 | 1 | 1 | |||||
| 1 | 0 | 1 | |||||
| 1 | 1 | 3 | |||||
| 2 | 1 | 5 | |||||
| 2 | 2 | 8 |
Excel formula:
=RF_REGRESS({0,0;0,1;1,0;1,1;2,1;2,2}, {0;1;1;3;5;8}, 50, "friedman_mse", 4, 1, 0)
Expected output:
{"type":"Double","basicValue":0.960078,"properties":{"training_r2":{"type":"Double","basicValue":0.960078},"mean_squared_error":{"type":"Double","basicValue":0.306067},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0.4}],[{"type":"Double","basicValue":1.5}],[{"type":"Double","basicValue":1.38}],[{"type":"Double","basicValue":2.4}],[{"type":"Double","basicValue":4.98}],[{"type":"Double","basicValue":7.04}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.4}],[{"type":"Double","basicValue":-0.5}],[{"type":"Double","basicValue":-0.38}],[{"type":"Double","basicValue":0.6}],[{"type":"Double","basicValue":0.02}],[{"type":"Double","basicValue":0.96}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":0.681547}],[{"type":"Double","basicValue":0.318453}]]},"estimator_count":{"type":"Double","basicValue":50}}}
Python Code
import numpy as np
from sklearn.ensemble import RandomForestRegressor as SklearnRandomForestRegressor
def rf_regress(data, target, n_estimators=100, rf_reg_criterion='squared_error', max_depth=None, min_samples_leaf=1, random_state=None):
"""
Fit a random forest regressor and return training predictions.
See: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html
This example function is provided as-is without any representation of accuracy.
Args:
data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
target (list[list]): Numeric target values as a single row, single column, or scalar when only one sample is present.
n_estimators (int, optional): Number of trees in the forest. Default is 100.
        rf_reg_criterion (str, optional): Split quality measure used by each decision tree. Valid options: 'squared_error', 'absolute_error', 'friedman_mse'. Default is 'squared_error'.
max_depth (int, optional): Maximum depth of each tree. Leave blank for unconstrained depth. Default is None.
min_samples_leaf (int, optional): Minimum number of samples required in each leaf. Default is 1.
random_state (int, optional): Integer seed for reproducible tree sampling. Leave blank for the estimator default. Default is None.
Returns:
dict: Excel data type containing training $R^2$, predictions, residuals, and fitted feature importances.
"""
def py(value):
return value.item() if isinstance(value, np.generic) else value
def cell(value):
value = py(value)
if isinstance(value, bool):
return {"type": "Boolean", "basicValue": bool(value)}
if isinstance(value, (int, float)) and not isinstance(value, bool):
return {"type": "Double", "basicValue": float(value)}
return {"type": "String", "basicValue": str(value)}
def col(values):
return [[cell(value)] for value in values]
def parse_data(value):
value = [[value]] if not isinstance(value, list) else value
if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
return None, "Error: data must be a non-empty 2D list"
if len({len(row) for row in value}) != 1:
return None, "Error: data must be a rectangular 2D list"
data_np = np.array(value, dtype=float)
if data_np.ndim != 2 or data_np.size == 0:
return None, "Error: data must be a non-empty 2D list"
if not np.isfinite(data_np).all():
return None, "Error: data must contain only finite numeric values"
return data_np, None
def parse_target(value, sample_count):
if not isinstance(value, list):
labels = [value]
elif not value:
return None, "Error: target must be non-empty"
elif all(not isinstance(item, list) for item in value):
labels = value
elif len(value) == 1:
labels = value[0]
elif all(isinstance(row, list) and len(row) == 1 for row in value):
labels = [row[0] for row in value]
else:
return None, "Error: target must be a single row or column"
if len(labels) != sample_count:
return None, "Error: target length must match sample count"
parsed = []
for item in labels:
item = py(item)
if isinstance(item, bool) or not isinstance(item, (int, float)):
return None, "Error: target values must be finite numeric scalars"
if not np.isfinite(float(item)):
return None, "Error: target values must be finite numeric scalars"
parsed.append(float(item))
return np.array(parsed, dtype=float), None
def flat_float_list(values):
return [float(py(item)) for item in np.asarray(values).reshape(-1).tolist()]
try:
data_np, error = parse_data(data)
if error:
return error
target_np, error = parse_target(target, data_np.shape[0])
if error:
return error
if int(n_estimators) < 1:
return "Error: n_estimators must be at least 1"
criterion_value = str(rf_reg_criterion).strip().lower()
if criterion_value not in {"squared_error", "absolute_error", "friedman_mse"}:
return "Error: rf_reg_criterion must be 'squared_error', 'absolute_error', or 'friedman_mse'"
depth = None if max_depth in (None, "") else int(max_depth)
if depth is not None and depth < 1:
return "Error: max_depth must be at least 1 when provided"
if int(min_samples_leaf) < 1:
return "Error: min_samples_leaf must be at least 1"
fitted = SklearnRandomForestRegressor(
n_estimators=int(n_estimators),
criterion=criterion_value,
max_depth=depth,
min_samples_leaf=int(min_samples_leaf),
random_state=None if random_state in (None, "") else int(random_state)
).fit(data_np, target_np)
prediction_array = np.asarray(fitted.predict(data_np), dtype=float)
residual_array = target_np - prediction_array
predictions = flat_float_list(prediction_array)
residuals = flat_float_list(residual_array)
training_r2 = float(fitted.score(data_np, target_np))
mse = float(np.mean(np.square(residual_array)))
return {
"type": "Double",
"basicValue": training_r2,
"properties": {
"training_r2": {"type": "Double", "basicValue": training_r2},
"mean_squared_error": {"type": "Double", "basicValue": mse},
"sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
"feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
"predictions": {"type": "Array", "elements": col(predictions)},
"residuals": {"type": "Array", "elements": col(residuals)},
"feature_importances": {"type": "Array", "elements": col(fitted.feature_importances_.tolist())},
"estimator_count": {"type": "Double", "basicValue": float(len(fitted.estimators_))}
}
}
except Exception as e:
return f"Error: {str(e)}"Online Calculator
RIDGE_REG
Ridge regression adds an L2 penalty to ordinary least squares to stabilize coefficient estimates when features are noisy or correlated. It is a common regularized baseline for continuous prediction problems.
The model addresses multicollinearity by minimizing the penalized residual sum of squares:
\min_{w} ||y - Xw||^2_2 + \alpha ||w||^2_2
where \alpha \ge 0 is the complexity parameter controlling the amount of shrinkage.
This wrapper accepts tabular feature data with rows as samples and columns as features, plus a numeric target supplied as a single row or single column. It returns the training R^2 together with fitted predictions, residuals, and learned coefficient arrays.
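Because the penalized objective is quadratic, the no-intercept case has the closed-form solution w = (X^T X + \alpha I)^{-1} X^T y, which makes a handy cross-check against the fitted coefficients; a sketch on the Example 3 data (sklearn's solvers factorize or iterate rather than forming this inverse explicitly):
import numpy as np
from sklearn.linear_model import Ridge

X = np.array([[1, 0], [0, 1], [1, 1], [2, 1], [1, 2], [2, 2]], dtype=float)
y = np.array([2, 3, 5, 7, 8, 10], dtype=float)
alpha = 0.2

# Closed-form ridge solution without an intercept
w = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

model = Ridge(alpha=alpha, fit_intercept=False).fit(X, y)
assert np.allclose(w, model.coef_)  # both about [2.02, 2.93], matching Example 3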
Excel Usage
=RIDGE_REG(data, target, alpha, ridge_solver, fit_intercept, tol, random_state)
- data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
- target (list[list], required): Numeric target values as a single row, single column, or scalar when only one sample is present.
- alpha (float, optional, default: 1): L2 regularization strength applied to the regression model.
- ridge_solver (str, optional, default: "auto"): Linear algebra solver used to fit the ridge model.
- fit_intercept (bool, optional, default: true): Whether to include an intercept term in the linear model.
- tol (float, optional, default: 0.0001): Convergence tolerance for iterative solvers.
- random_state (int, optional, default: null): Integer seed for stochastic solvers. Leave blank for the estimator default.
Returns (dict): Excel data type containing training R^2, predictions, residuals, and fitted coefficient arrays.
Example 1: Fit ridge regression on a two-feature linear trend
Inputs:
| data | target | alpha | ridge_solver | fit_intercept | tol | random_state | |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 1 | auto | true | 0.0001 | 0 |
| 1 | 0 | 3 | |||||
| 0 | 1 | 4 | |||||
| 1 | 1 | 6 | |||||
| 2 | 1 | 8 | |||||
| 2 | 2 | 11 |
Excel formula:
=RIDGE_REG({0,0;1,0;0,1;1,1;2,1;2,2}, {1;3;4;6;8;11}, 1, "auto", TRUE, 0.0001, 0)
Expected output:
{"type":"Double","basicValue":0.971641,"properties":{"training_r2":{"type":"Double","basicValue":0.971641},"mean_squared_error":{"type":"Double","basicValue":0.309584},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.71429}],[{"type":"Double","basicValue":3.6044}],[{"type":"Double","basicValue":3.98901}],[{"type":"Double","basicValue":5.87912}],[{"type":"Double","basicValue":7.76923}],[{"type":"Double","basicValue":10.044}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.714286}],[{"type":"Double","basicValue":-0.604396}],[{"type":"Double","basicValue":0.010989}],[{"type":"Double","basicValue":0.120879}],[{"type":"Double","basicValue":0.230769}],[{"type":"Double","basicValue":0.956044}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.89011},{"type":"Double","basicValue":2.27473}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1.71429}]]}}}
Example 2: Flatten a single-row numeric target range for ridge regression
Inputs:
| data | target | alpha | ridge_solver | fit_intercept | tol | random_state | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 3 | 5 | 7 | 9 | 11 | 0.5 | auto | true | 0.0001 | 0 |
| 1 | |||||||||||
| 2 | |||||||||||
| 3 | |||||||||||
| 4 | |||||||||||
| 5 |
Excel formula:
=RIDGE_REG({0;1;2;3;4;5}, {1,3,5,7,9,11}, 0.5, "auto", TRUE, 0.0001, 0)
Expected output:
{"type":"Double","basicValue":0.999228,"properties":{"training_r2":{"type":"Double","basicValue":0.999228},"mean_squared_error":{"type":"Double","basicValue":0.00900206},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.13889}],[{"type":"Double","basicValue":3.08333}],[{"type":"Double","basicValue":5.02778}],[{"type":"Double","basicValue":6.97222}],[{"type":"Double","basicValue":8.91667}],[{"type":"Double","basicValue":10.8611}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.138889}],[{"type":"Double","basicValue":-0.0833333}],[{"type":"Double","basicValue":-0.0277778}],[{"type":"Double","basicValue":0.0277778}],[{"type":"Double","basicValue":0.0833333}],[{"type":"Double","basicValue":0.138889}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.94444}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1.13889}]]}}}
Example 3: Fit a no-intercept ridge model on a two-feature plane
Inputs:
| data | target | alpha | ridge_solver | fit_intercept | tol | random_state | |
|---|---|---|---|---|---|---|---|
| 1 | 0 | 2 | 0.2 | svd | false | 0.0001 | 0 |
| 0 | 1 | 3 | |||||
| 1 | 1 | 5 | |||||
| 2 | 1 | 7 | |||||
| 1 | 2 | 8 | |||||
| 2 | 2 | 10 |
Excel formula:
=RIDGE_REG({1,0;0,1;1,1;2,1;1,2;2,2}, {2;3;5;7;8;10}, 0.2, "svd", FALSE, 0.0001, 0)
Expected output:
{"type":"Double","basicValue":0.9993,"properties":{"training_r2":{"type":"Double","basicValue":0.9993},"mean_squared_error":{"type":"Double","basicValue":0.00546198},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":2.0207}],[{"type":"Double","basicValue":2.92979}],[{"type":"Double","basicValue":4.9505}],[{"type":"Double","basicValue":6.9712}],[{"type":"Double","basicValue":7.88029}],[{"type":"Double","basicValue":9.90099}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.0207021}],[{"type":"Double","basicValue":0.070207}],[{"type":"Double","basicValue":0.049505}],[{"type":"Double","basicValue":0.0288029}],[{"type":"Double","basicValue":0.119712}],[{"type":"Double","basicValue":0.0990099}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":2.0207},{"type":"Double","basicValue":2.92979}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":0}]]}}}
Example 4: Fit ridge regression on correlated features with the LSQR solver
Inputs:
| data | target | alpha | ridge_solver | fit_intercept | tol | random_state | |
|---|---|---|---|---|---|---|---|
| 0 | 0.1 | 1 | 0.3 | lsqr | true | 0.0001 | 0 |
| 1 | 0.9 | 3.1 | |||||
| 2 | 2.1 | 5.8 | |||||
| 3 | 2.9 | 7.9 | |||||
| 4 | 4.1 | 10.2 | |||||
| 5 | 4.9 | 12.1 |
Excel formula:
=RIDGE_REG({0,0.1;1,0.9;2,2.1;3,2.9;4,4.1;5,4.9}, {1;3.1;5.8;7.9;10.2;12.1}, 0.3, "lsqr", TRUE, 0.0001, 0)
Expected output:
{"type":"Double","basicValue":0.998837,"properties":{"training_r2":{"type":"Double","basicValue":0.998837},"mean_squared_error":{"type":"Double","basicValue":0.0172674},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.1603}],[{"type":"Double","basicValue":3.18951}],[{"type":"Double","basicValue":5.66873}],[{"type":"Double","basicValue":7.69794}],[{"type":"Double","basicValue":10.1772}],[{"type":"Double","basicValue":12.2064}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.1603}],[{"type":"Double","basicValue":-0.0895065}],[{"type":"Double","basicValue":0.13127}],[{"type":"Double","basicValue":0.202064}],[{"type":"Double","basicValue":0.0228398}],[{"type":"Double","basicValue":-0.106366}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.12917},{"type":"Double","basicValue":1.12504}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1.0478}]]}}}
Python Code
import numpy as np
from sklearn.linear_model import Ridge as SklearnRidge
def ridge_reg(data, target, alpha=1, ridge_solver='auto', fit_intercept=True, tol=0.0001, random_state=None):
"""
Fit a ridge regression model and return training predictions.
See: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html
This example function is provided as-is without any representation of accuracy.
Args:
data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
target (list[list]): Numeric target values as a single row, single column, or scalar when only one sample is present.
alpha (float, optional): L2 regularization strength applied to the regression model. Default is 1.
        ridge_solver (str, optional): Linear algebra solver used to fit the ridge model. Valid options: 'auto', 'svd', 'lsqr', 'sag', 'saga'. Default is 'auto'.
fit_intercept (bool, optional): Whether to include an intercept term in the linear model. Default is True.
tol (float, optional): Convergence tolerance for iterative solvers. Default is 0.0001.
random_state (int, optional): Integer seed for stochastic solvers. Leave blank for the estimator default. Default is None.
Returns:
dict: Excel data type containing training $R^2$, predictions, residuals, and fitted coefficient arrays.
"""
def py(value):
return value.item() if isinstance(value, np.generic) else value
def cell(value):
value = py(value)
if isinstance(value, bool):
return {"type": "Boolean", "basicValue": bool(value)}
if isinstance(value, (int, float)) and not isinstance(value, bool):
return {"type": "Double", "basicValue": float(value)}
return {"type": "String", "basicValue": str(value)}
def col(values):
return [[cell(value)] for value in values]
def mat(values):
return [[cell(value) for value in row] for row in values]
def parse_data(value):
value = [[value]] if not isinstance(value, list) else value
if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
return None, "Error: data must be a non-empty 2D list"
if len({len(row) for row in value}) != 1:
return None, "Error: data must be a rectangular 2D list"
data_np = np.array(value, dtype=float)
if data_np.ndim != 2 or data_np.size == 0:
return None, "Error: data must be a non-empty 2D list"
if not np.isfinite(data_np).all():
return None, "Error: data must contain only finite numeric values"
return data_np, None
def parse_target(value, sample_count):
if not isinstance(value, list):
labels = [value]
elif not value:
return None, "Error: target must be non-empty"
elif all(not isinstance(item, list) for item in value):
labels = value
elif len(value) == 1:
labels = value[0]
elif all(isinstance(row, list) and len(row) == 1 for row in value):
labels = [row[0] for row in value]
else:
return None, "Error: target must be a single row or column"
if len(labels) != sample_count:
return None, "Error: target length must match sample count"
parsed = []
for item in labels:
item = py(item)
if isinstance(item, bool) or not isinstance(item, (int, float)):
return None, "Error: target values must be finite numeric scalars"
if not np.isfinite(float(item)):
return None, "Error: target values must be finite numeric scalars"
parsed.append(float(item))
return np.array(parsed, dtype=float), None
def flat_float_list(values):
return [float(py(item)) for item in np.asarray(values).reshape(-1).tolist()]
try:
data_np, error = parse_data(data)
if error:
return error
target_np, error = parse_target(target, data_np.shape[0])
if error:
return error
solver_value = str(ridge_solver).strip().lower()
if solver_value not in {"auto", "svd", "lsqr", "sag", "saga"}:
return "Error: ridge_solver must be 'auto', 'svd', 'lsqr', 'sag', or 'saga'"
if float(alpha) < 0:
return "Error: alpha must be non-negative"
if float(tol) <= 0:
return "Error: tol must be greater than 0"
fitted = SklearnRidge(
alpha=float(alpha),
solver=solver_value,
fit_intercept=bool(fit_intercept),
tol=float(tol),
random_state=None if random_state in (None, "") else int(random_state)
).fit(data_np, target_np)
prediction_array = np.asarray(fitted.predict(data_np), dtype=float)
residual_array = target_np - prediction_array
predictions = flat_float_list(prediction_array)
residuals = flat_float_list(residual_array)
training_r2 = float(fitted.score(data_np, target_np))
mse = float(np.mean(np.square(residual_array)))
return {
"type": "Double",
"basicValue": training_r2,
"properties": {
"training_r2": {"type": "Double", "basicValue": training_r2},
"mean_squared_error": {"type": "Double", "basicValue": mse},
"sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
"feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
"predictions": {"type": "Array", "elements": col(predictions)},
"residuals": {"type": "Array", "elements": col(residuals)},
"coefficients": {"type": "Array", "elements": mat(np.atleast_2d(fitted.coef_).tolist())},
"intercepts": {"type": "Array", "elements": col(np.atleast_1d(fitted.intercept_).tolist())}
}
}
except Exception as e:
return f"Error: {str(e)}"Online Calculator
SVR_REGRESS
Support vector regression fits a margin-based model for continuous targets and can use linear or nonlinear kernels to capture curved relationships. It is a practical choice when a flexible predictor is needed on modestly sized tabular datasets.
The regressor constructs a tube of radius \epsilon around the prediction function and minimizes the regularized \epsilon-insensitive (primal) objective:
\min_{w} \frac{1}{2} ||w||^2_2 + C \sum_{i=1}^n \max(0, |y_i - f(x_i)| - \epsilon)
where C is the regularization parameter penalizing predictions that fall outside the \epsilon-tube.
This wrapper accepts tabular feature data with rows as samples and columns as features, plus a numeric target supplied as a single row or single column. It returns the training R^2 together with fitted predictions, residuals, and support-vector summary properties.
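The \epsilon-insensitive term in the objective only charges for points that leave the tube; a minimal numpy sketch of that penalty (the residual values here are hypothetical, chosen just to show the hinge):
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    # Per-sample penalty: zero inside the epsilon-tube, linear outside it
    return np.maximum(0.0, np.abs(np.asarray(y_true) - np.asarray(y_pred)) - epsilon)

# Residuals of 0.05 and 0.08 fall inside the tube and cost nothing;
# the 0.30 residual is charged only for the 0.20 that protrudes.
print(epsilon_insensitive_loss([1.0, 2.0, 3.0], [1.05, 1.92, 3.30]))  # [0.  0.  0.2]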
Excel Usage
=SVR_REGRESS(data, target, C, svr_kernel, degree, svr_gamma, epsilon, tol)
- data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
- target (list[list], required): Numeric target values as a single row, single column, or scalar when only one sample is present.
- C (float, optional, default: 1): Regularization parameter controlling the penalty for margin violations.
- svr_kernel (str, optional, default: "rbf"): Kernel function used to build the regression model.
- degree (int, optional, default: 3): Polynomial degree when the polynomial kernel is used.
- svr_gamma (str, optional, default: "scale"): Gamma scaling mode for non-linear kernels (see the note after this list).
- epsilon (float, optional, default: 0.1): Width of the epsilon-insensitive tube around the fitted prediction function.
- tol (float, optional, default: 0.001): Convergence tolerance for the optimizer.
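For reference, sklearn resolves the two svr_gamma modes to concrete kernel coefficients at fit time: "scale" uses \gamma = 1 / (n_{\text{features}} \cdot \mathrm{Var}(X)) computed on the training data, while "auto" uses \gamma = 1 / n_{\text{features}}.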
Returns (dict): Excel data type containing training R^2, predictions, residuals, and support-vector summary properties.
Example 1: Fit linear support vector regression on a one-feature trend
Inputs:
| data | target | C | svr_kernel | degree | svr_gamma | epsilon | tol |
|---|---|---|---|---|---|---|---|
| 0 | 1 | 10 | linear | 3 | scale | 0.01 | 0.001 |
| 1 | 3 | ||||||
| 2 | 5 | ||||||
| 3 | 7 | ||||||
| 4 | 9 | ||||||
| 5 | 11 |
Excel formula:
=SVR_REGRESS({0;1;2;3;4;5}, {1;3;5;7;9;11}, 10, "linear", 3, "scale", 0.01, 0.001)
Expected output:
{"type":"Double","basicValue":0.999996,"properties":{"training_r2":{"type":"Double","basicValue":0.999996},"mean_squared_error":{"type":"Double","basicValue":0.0000466667},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.01}],[{"type":"Double","basicValue":3.006}],[{"type":"Double","basicValue":5.002}],[{"type":"Double","basicValue":6.998}],[{"type":"Double","basicValue":8.994}],[{"type":"Double","basicValue":10.99}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.01}],[{"type":"Double","basicValue":-0.006}],[{"type":"Double","basicValue":-0.002}],[{"type":"Double","basicValue":0.002}],[{"type":"Double","basicValue":0.006}],[{"type":"Double","basicValue":0.01}]]},"support_vector_count":{"type":"Double","basicValue":2},"support_counts":{"type":"Array","elements":[[{"type":"Double","basicValue":2}]]},"support_indices":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":5}]]},"support_vectors":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":5}]]},"dual_coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.3992},{"type":"Double","basicValue":0.3992}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1.01}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.996}]]}}}
Example 2: Fit RBF support vector regression on a curved target pattern
Inputs:
| data | target | C | svr_kernel | degree | svr_gamma | epsilon | tol | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0.8 | 1.2 | 0.8 | 0.1 | -0.6 | 20 | rbf | 3 | scale | 0.01 | 0.001 |
| 0.5 | ||||||||||||
| 1 | ||||||||||||
| 1.5 | ||||||||||||
| 2 | ||||||||||||
| 2.5 |
Excel formula:
=SVR_REGRESS({0;0.5;1;1.5;2;2.5}, {0,0.8,1.2,0.8,0.1,-0.6}, 20, "rbf", 3, "scale", 0.01, 0.001)
Expected output:
{"type":"Double","basicValue":0.99972,"properties":{"training_r2":{"type":"Double","basicValue":0.99972},"mean_squared_error":{"type":"Double","basicValue":0.000103229},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0.0101623}],[{"type":"Double","basicValue":0.790349}],[{"type":"Double","basicValue":1.18936}],[{"type":"Double","basicValue":0.810162}],[{"type":"Double","basicValue":0.0898215}],[{"type":"Double","basicValue":-0.589856}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.0101623}],[{"type":"Double","basicValue":0.00965111}],[{"type":"Double","basicValue":0.0106389}],[{"type":"Double","basicValue":-0.0101623}],[{"type":"Double","basicValue":0.0101785}],[{"type":"Double","basicValue":-0.0101439}]]},"support_vector_count":{"type":"Double","basicValue":6},"support_counts":{"type":"Array","elements":[[{"type":"Double","basicValue":6}]]},"support_indices":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":2}],[{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":4}],[{"type":"Double","basicValue":5}]]},"support_vectors":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0.5}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1.5}],[{"type":"Double","basicValue":2}],[{"type":"Double","basicValue":2.5}]]},"dual_coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.71684},{"type":"Double","basicValue":0.486863},{"type":"Double","basicValue":1.14565},{"type":"Double","basicValue":-0.630659},{"type":"Double","basicValue":1.08314},{"type":"Double","basicValue":-1.36815}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":0.115342}]]}}}
Example 3: Fit linear support vector regression on a two-feature plane
Inputs:
| data | target | C | svr_kernel | degree | svr_gamma | epsilon | tol | |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 10 | linear | 3 | scale | 0.01 | 0.001 |
| 1 | 0 | 3 | ||||||
| 0 | 1 | 4 | ||||||
| 1 | 1 | 6 | ||||||
| 2 | 1 | 8 | ||||||
| 2 | 2 | 11 |
Excel formula:
=SVR_REGRESS({0,0;1,0;0,1;1,1;2,1;2,2}, {1;3;4;6;8;11}, 10, "linear", 3, "scale", 0.01, 0.001)
Expected output:
{"type":"Double","basicValue":0.999995,"properties":{"training_r2":{"type":"Double","basicValue":0.999995},"mean_squared_error":{"type":"Double","basicValue":0.0000507931},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1.01022}],[{"type":"Double","basicValue":3.00989}],[{"type":"Double","basicValue":4.00039}],[{"type":"Double","basicValue":6.00006}],[{"type":"Double","basicValue":7.99972}],[{"type":"Double","basicValue":10.9899}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.0102226}],[{"type":"Double","basicValue":-0.0098887}],[{"type":"Double","basicValue":-0.000389564}],[{"type":"Double","basicValue":-0.000055652}],[{"type":"Double","basicValue":0.00027826}],[{"type":"Double","basicValue":0.0101113}]]},"support_vector_count":{"type":"Double","basicValue":3},"support_counts":{"type":"Array","elements":[[{"type":"Double","basicValue":3}]]},"support_indices":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":5}]]},"support_vectors":{"type":"Array","elements":[[{"type":"Double","basicValue":0},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":2},{"type":"Double","basicValue":2}]]},"dual_coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.504583},{"type":"Double","basicValue":-0.990501},{"type":"Double","basicValue":1.49508}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1.01022}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.99967},{"type":"Double","basicValue":2.99017}]]}}}
Example 4: Fit polynomial support vector regression on a quadratic curve
Inputs:
| data | target | C | svr_kernel | degree | svr_gamma | epsilon | tol |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 50 | poly | 2 | auto | 0.01 | 0.001 |
| 1 | 1 | ||||||
| 2 | 4 | ||||||
| 3 | 9 | ||||||
| 4 | 16 | ||||||
| 5 | 25 |
Excel formula:
=SVR_REGRESS({0;1;2;3;4;5}, {0;1;4;9;16;25}, 50, "poly", 2, "auto", 0.01, 0.001)
Expected output:
{"type":"Double","basicValue":0.999999,"properties":{"training_r2":{"type":"Double","basicValue":0.999999},"mean_squared_error":{"type":"Double","basicValue":0.00005776},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0.01}],[{"type":"Double","basicValue":1.0092}],[{"type":"Double","basicValue":4.0068}],[{"type":"Double","basicValue":9.0028}],[{"type":"Double","basicValue":15.9972}],[{"type":"Double","basicValue":24.99}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.01}],[{"type":"Double","basicValue":-0.0092}],[{"type":"Double","basicValue":-0.0068}],[{"type":"Double","basicValue":-0.0028}],[{"type":"Double","basicValue":0.0028}],[{"type":"Double","basicValue":0.01}]]},"support_vector_count":{"type":"Double","basicValue":2},"support_counts":{"type":"Array","elements":[[{"type":"Double","basicValue":2}]]},"support_indices":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":5}]]},"support_vectors":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":5}]]},"dual_coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.039968},{"type":"Double","basicValue":0.039968}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":0.01}]]}}}
Python Code
import numpy as np
from sklearn.svm import SVR as SklearnSVR
def svr_regress(data, target, C=1, svr_kernel='rbf', degree=3, svr_gamma='scale', epsilon=0.1, tol=0.001):
"""
Fit a support vector regression model and return training predictions.
See: https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html
This example function is provided as-is without any representation of accuracy.
Args:
data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
target (list[list]): Numeric target values as a single row, single column, or scalar when only one sample is present.
C (float, optional): Regularization parameter controlling the penalty for margin violations. Default is 1.
        svr_kernel (str, optional): Kernel function used to build the regression model. Valid options: 'rbf', 'linear', 'poly', 'sigmoid'. Default is 'rbf'.
degree (int, optional): Polynomial degree when the polynomial kernel is used. Default is 3.
        svr_gamma (str, optional): Gamma scaling mode for non-linear kernels. Valid options: 'scale', 'auto'. Default is 'scale'.
epsilon (float, optional): Width of the epsilon-insensitive tube around the fitted prediction function. Default is 0.1.
tol (float, optional): Convergence tolerance for the optimizer. Default is 0.001.
Returns:
dict: Excel data type containing training $R^2$, predictions, residuals, and support-vector summary properties.
"""
def py(value):
return value.item() if isinstance(value, np.generic) else value
def cell(value):
value = py(value)
if isinstance(value, bool):
return {"type": "Boolean", "basicValue": bool(value)}
if isinstance(value, (int, float)) and not isinstance(value, bool):
return {"type": "Double", "basicValue": float(value)}
return {"type": "String", "basicValue": str(value)}
def col(values):
return [[cell(value)] for value in values]
def mat(values):
return [[cell(value) for value in row] for row in values]
def parse_data(value):
value = [[value]] if not isinstance(value, list) else value
if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
return None, "Error: data must be a non-empty 2D list"
if len({len(row) for row in value}) != 1:
return None, "Error: data must be a rectangular 2D list"
data_np = np.array(value, dtype=float)
if data_np.ndim != 2 or data_np.size == 0:
return None, "Error: data must be a non-empty 2D list"
if not np.isfinite(data_np).all():
return None, "Error: data must contain only finite numeric values"
return data_np, None
def parse_target(value, sample_count):
if not isinstance(value, list):
labels = [value]
elif not value:
return None, "Error: target must be non-empty"
elif all(not isinstance(item, list) for item in value):
labels = value
elif len(value) == 1:
labels = value[0]
elif all(isinstance(row, list) and len(row) == 1 for row in value):
labels = [row[0] for row in value]
else:
return None, "Error: target must be a single row or column"
if len(labels) != sample_count:
return None, "Error: target length must match sample count"
parsed = []
for item in labels:
item = py(item)
if isinstance(item, bool) or not isinstance(item, (int, float)):
return None, "Error: target values must be finite numeric scalars"
if not np.isfinite(float(item)):
return None, "Error: target values must be finite numeric scalars"
parsed.append(float(item))
return np.array(parsed, dtype=float), None
def flat_float_list(values):
return [float(py(item)) for item in np.asarray(values).reshape(-1).tolist()]
try:
data_np, error = parse_data(data)
if error:
return error
target_np, error = parse_target(target, data_np.shape[0])
if error:
return error
if float(C) <= 0:
return "Error: C must be greater than 0"
kernel_value = str(svr_kernel).strip().lower()
if kernel_value not in {"rbf", "linear", "poly", "sigmoid"}:
return "Error: svr_kernel must be 'rbf', 'linear', 'poly', or 'sigmoid'"
if int(degree) < 1:
return "Error: degree must be at least 1"
gamma_value = str(svr_gamma).strip().lower()
if gamma_value not in {"scale", "auto"}:
return "Error: svr_gamma must be 'scale' or 'auto'"
if float(epsilon) < 0:
return "Error: epsilon must be non-negative"
if float(tol) <= 0:
return "Error: tol must be greater than 0"
fitted = SklearnSVR(
C=float(C),
kernel=kernel_value,
degree=int(degree),
gamma=gamma_value,
epsilon=float(epsilon),
tol=float(tol)
).fit(data_np, target_np)
prediction_array = np.asarray(fitted.predict(data_np), dtype=float)
residual_array = target_np - prediction_array
predictions = flat_float_list(prediction_array)
residuals = flat_float_list(residual_array)
training_r2 = float(fitted.score(data_np, target_np))
mse = float(np.mean(np.square(residual_array)))
properties = {
"training_r2": {"type": "Double", "basicValue": training_r2},
"mean_squared_error": {"type": "Double", "basicValue": mse},
"sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
"feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
"predictions": {"type": "Array", "elements": col(predictions)},
"residuals": {"type": "Array", "elements": col(residuals)},
"support_vector_count": {"type": "Double", "basicValue": float(len(fitted.support_))},
"support_counts": {"type": "Array", "elements": col(np.atleast_1d(fitted.n_support_).tolist())},
"support_indices": {"type": "Array", "elements": col(fitted.support_.tolist())},
"support_vectors": {"type": "Array", "elements": mat(fitted.support_vectors_.tolist())},
"dual_coefficients": {"type": "Array", "elements": mat(np.atleast_2d(fitted.dual_coef_).tolist())},
"intercepts": {"type": "Array", "elements": col(np.atleast_1d(fitted.intercept_).tolist())}
}
if kernel_value == "linear":
properties["coefficients"] = {"type": "Array", "elements": mat(np.atleast_2d(fitted.coef_).tolist())}
return {
"type": "Double",
"basicValue": training_r2,
"properties": properties
}
except Exception as e:
return f"Error: {str(e)}"Online Calculator