RANDSUBSET Select a random subset of samples
SUB=RANDSUBSET(DATA,PAR)
[SUB,REST,IND1,IND2]=RANDSUBSET(DATA,PAR,options)
SUB=RANDSUBSET(DATA,PROP,PAR) % sample the property PROP
[SUB,REST]=RANDSUBSET(DATA) % bootstrapped sampling of DATA
INPUT
DATA Data set
PAR Selection parameter (scalar or vector)
Fraction or number of samples per class
PROP Name of property to sample (default: 'lab')
OUTPUT
SUB Selected subset
REST Remainder of the DATA
IND1,IND2 Indices of the samples in SUB and REST
OPTIONS
'all' Select the subset from the complete data set
'atmax' Take at most the specified number of samples
DESCRIPTION
RANDSUBSET selects a random subset of samples in DATA. The selection is
performed per class by default. If PAR is a vector, it specifies the
selection for each class separately (zero entries are supported). To
select a fraction or a number of samples from the complete data set,
add the 'all' option:
SUB=RANDSUBSET(DATA,PAR,'all')
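The difference between per-class selection and selection from the complete
data set can be sketched in base MATLAB. This is only an illustration of the
idea, not the RANDSUBSET implementation; the variables labels, counts, ind1,
ind2 and indAll are hypothetical names chosen for this example.

```matlab
% Illustrative sketch only; it does not call the toolbox.
rng(0);                                   % fix the seed for a repeatable run
labels = [ones(50,1); 2*ones(30,1)];      % hypothetical labels: 50 + 30 samples
counts = [10 5];                          % requested number of samples per class

% Per-class selection: draw without replacement inside each class.
ind1 = [];
for c = 1:numel(counts)
    members = find(labels == c);          % indices of samples in class c
    pick = members(randperm(numel(members), counts(c)));
    ind1 = [ind1; pick];                  % append the picks for this class
end
ind2 = setdiff((1:numel(labels))', ind1); % indices of the remaining samples

% Selection from the complete data set: the labels are ignored.
indAll = randperm(numel(labels), 15)';
```

Here ind1 and ind2 play the role of the IND1 and IND2 outputs: together they
cover every sample exactly once.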
If property name PROP is specified, RANDSUBSET works on this property.
For example, to select 100 random samples per patient, use:
SUB=RANDSUBSET(DATA,'patient',100)
Note that the PROP property must be indexed (SDLAB object).
To make sure the subset SUB does not contain more than the specified
number of samples, use the 'atmax' option.
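One plausible reading of 'atmax' is a cap on the request: a class with fewer
samples than requested contributes everything it has. The base MATLAB sketch
below illustrates that reading only. The capping rule, and the names labels,
M and ind1, are assumptions made for this example and are not taken from the
toolbox.

```matlab
% Illustrative sketch only; the capping rule is an assumed interpretation.
rng(1);                                   % fix the seed for a repeatable run
labels = [ones(50,1); 2*ones(3,1)];       % class 2 has only 3 samples
M = 10;                                   % requested samples per class

ind1 = [];
for c = 1:2
    members = find(labels == c);          % indices of samples in class c
    k = min(M, numel(members));           % never ask for more than exist
    ind1 = [ind1; members(randperm(numel(members), k))];
end
```

Under this reading, class 1 contributes 10 samples and class 2 contributes
its 3 available samples.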
If RANDSUBSET is given only DATA, it performs bootstrap sampling,
i.e. sampling with replacement. SUB then contains the same number of
samples as DATA, but some of them may appear multiple times. REST
contains the out-of-bootstrap samples, i.e. those not present in SUB.
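Bootstrap sampling as described above can be sketched on plain indices in
base MATLAB. This is a minimal illustration, not the toolbox code; n, ind1
and ind2 are hypothetical names chosen for this example.

```matlab
% Illustrative sketch only; it does not call the toolbox.
rng(2);                                   % fix the seed for a repeatable run
n = 100;                                  % hypothetical number of samples

ind1 = randi(n, n, 1);                    % n draws with replacement
ind2 = setdiff((1:n)', ind1);             % never drawn: out-of-bootstrap
```

For large n, the expected fraction of out-of-bootstrap samples approaches
1/e, or about 37%.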