Friedman-1 synthetic dataset

Generate a synthetic dataset with the Friedman-1 function implemented in StatSim.

In [1]:
from iframer import *

Friedman-1 model

This dataset was described in the paper "Multivariate Adaptive Regression Splines" by Jerome H. Friedman (1991). All features $\mathbf{x}$ are independent random variables uniformly distributed between 0 and 1. The output $y$ is calculated with the formula:

$$ y = 10 \sin(\pi x_1 x_2) + 20 (x_3 - 0.5)^2 + 10 x_4 + 5 x_5 + \mathcal{N}(0, \sigma) $$

By default, ten $\mathbf{x}$ variables are generated, of which only the first five are used in the formula. The remaining five have no effect on $y$, which makes this dataset useful for testing feature selection methods as well.
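
The formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not StatSim's implementation: the function name `friedman1` and its parameters (`n_samples`, `n_features`, `sigma`, `seed`) are hypothetical, and `sigma` is assumed to be the standard deviation of the Gaussian noise.

```python
import math
import random


def friedman1(n_samples=100, n_features=10, sigma=1.0, seed=None):
    """Return (X, y) drawn from the Friedman-1 model.

    X is a list of rows, each holding n_features uniform values in [0, 1).
    y is a list of outputs, one per row.
    Assumption: sigma is the standard deviation of the noise term.
    """
    if n_features < 5:
        raise ValueError("Friedman-1 needs at least 5 features")
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        row = [rng.random() for _ in range(n_features)]
        # Only the first five features enter the formula;
        # the rest are pure distractors.
        x1, x2, x3, x4, x5 = row[:5]
        signal = (10 * math.sin(math.pi * x1 * x2)
                  + 20 * (x3 - 0.5) ** 2
                  + 10 * x4
                  + 5 * x5)
        X.append(row)
        y.append(signal + rng.gauss(0.0, sigma))
    return X, y


# Noise-free sample: every output then lies between 0 and 30.
X, y = friedman1(n_samples=200, sigma=0.0, seed=42)
```

With `sigma=0.0` the outputs are the noise-free signal, which is convenient for checking a model against the known function. scikit-learn provides a comparable generator, `sklearn.datasets.make_friedman1`, if a NumPy-based version is preferred.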

After generating the data, click Download (CSV) to save the dataset as a CSV file.

By Anton Zemlyansky