What are holdout choice sets and holdout respondents?


To check the predictive accuracy and foster confidence in the results of conjoint analysis, Conjointly can incorporate two model quality control mechanisms into the estimation. Both work by keeping some data away from the model while it is being built, then checking how well the finished model predicts that “unseen” data.

  • Holdout choice sets test how well the model predicts choices that the same respondents actually made.
  • Holdout respondents test how well the model predicts the choices of people whose data played no part in building the model at all.

How to enable holdout validation

Under Advanced settings, in the Analytics options tab for your conjoint block, holdout validation is a single setting where you choose one of three options:

| Option | Description | Availability |
| --- | --- | --- |
| No holdout | The system uses all data for estimation. | Always (default setting) |
| Hold out some respondents for validation | The system sets aside (randomly assigned) 5% of respondents whose data is held out entirely from the Hierarchical Bayes (HB) estimation. These respondents are used exclusively for the final testing of the completed model. | Only if there are 50 or more respondents in the analysis and there are no covariates |
| Hold out some choice sets for validation | The system excludes some choice sets from the model estimation process to test its predictive power on “unseen” data from the same respondents. Specifically, one choice set is held out from (randomly assigned) half of the respondents, while the other half of respondents are unaffected. | Only if there are 50 or more respondents in the analysis |

Because it is a single selection, you can apply at most one of the two holdout methods to a study.

The holdout group is drawn at random each time the model is estimated. As a result, the specific respondents or choice sets that are held out, and therefore the exact validation metrics, can shift slightly if the report is recalculated.
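
The random assignment described above can be sketched as follows. This is only an illustration, not Conjointly's implementation: the function names, the `seed` parameter, the rounding rule, and the sample data are all assumptions made for the example.

```python
import random


def pick_holdout_respondents(respondent_ids, share=0.05, seed=None):
    """Randomly choose a share of respondents to exclude from estimation.

    Rounding to a whole number of respondents is an assumption of this sketch.
    """
    rng = random.Random(seed)
    ids = list(respondent_ids)
    n_holdout = max(1, round(len(ids) * share))
    return set(rng.sample(ids, n_holdout))


def pick_holdout_choice_sets(tasks_by_respondent, seed=None):
    """For a random half of respondents, hold out one of their choice sets.

    Returns {respondent_id: held_out_task}; respondents missing from the
    result keep all of their choice sets in the estimation.
    """
    rng = random.Random(seed)
    ids = list(tasks_by_respondent)
    affected = rng.sample(ids, len(ids) // 2)
    return {rid: rng.choice(tasks_by_respondent[rid]) for rid in affected}


# Hypothetical study: 100 respondents, 12 choice sets each.
respondents = [f"r{i}" for i in range(100)]
tasks = {rid: [f"{rid}_task{t}" for t in range(12)] for rid in respondents}

held_out_people = pick_holdout_respondents(respondents, seed=1)
held_out_tasks = pick_holdout_choice_sets(tasks, seed=1)
print(len(held_out_people), len(held_out_tasks))
```

Running the sketch with a different `seed` selects a different holdout group, which is why validation metrics can move slightly between recalculations.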

Validation from holdout choice sets

This validation is shown in your survey report only when the Hold out some choice sets for validation option is selected and there are 50 or more respondents in the analysis. It reports two metrics.

Hit rate (first-choice modelling)

For every held-out choice set, the respondent’s own coefficients are used to calculate the utility of each option. The option with the highest calculated utility is flagged as the “predicted choice”. The hit rate is then the share of held-out tasks where the predicted choice matched the option the respondent actually chose:

$$ \text{Hit Rate} = \left( \frac{\text{Number of correct predictions}}{\text{Total number of holdout tasks}} \right) \times 100\% $$

Reference bands for interpreting the hit rate depend on the number of options in the choice sets.
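
As a minimal sketch of the hit rate calculation, the example below assumes that an option's utility is the sum of the respondent's coefficients for the levels it contains, and that ties go to the first option. The data layout, function names, and numbers are hypothetical.

```python
def utility(option, coefficients):
    """Sum the respondent's coefficients for the levels in an option."""
    return sum(coefficients[level] for level in option)


def hit_rate(holdout_tasks, coefficients_by_respondent):
    """Percentage of held-out tasks where the top-utility option was chosen."""
    hits = 0
    for task in holdout_tasks:
        coefs = coefficients_by_respondent[task["respondent"]]
        utilities = [utility(opt, coefs) for opt in task["options"]]
        predicted = utilities.index(max(utilities))  # ties go to the first option
        hits += predicted == task["chosen"]
    return 100.0 * hits / len(holdout_tasks)


# Hypothetical individual coefficients for two respondents.
coefficients_by_respondent = {
    "r1": {"brand_a": 1.2, "brand_b": -1.2, "price_low": 0.8, "price_high": -0.8},
    "r2": {"brand_a": -0.5, "brand_b": 0.5, "price_low": 1.0, "price_high": -1.0},
}

# One held-out task per respondent; "chosen" is the index of the option picked.
options = [["brand_a", "price_high"], ["brand_b", "price_low"]]
holdout_tasks = [
    {"respondent": "r1", "options": options, "chosen": 0},
    {"respondent": "r2", "options": options, "chosen": 0},
]

print(hit_rate(holdout_tasks, coefficients_by_respondent))
```

Here the model predicts the first option for r1 (a hit) and the second option for r2 (a miss), so one of the two predictions is correct.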

Mean absolute error (share-of-preference modelling)

For each unique held-out choice set, Conjointly calculates the distribution of answers actually given, simulates the share of preference for each option from the individuals’ coefficients, and computes the mean absolute error across the options in that set:

$$ \text{MAE}_\text{set} = \frac{1}{n} \sum_{i=1}^{n} \left| \text{Observed}_i - \text{Simulated}_i \right| $$

where the sum runs over the n options (alternatives) in the choice set, and the observed and simulated terms are the observed and simulated shares of preference for option i. Shares are expressed as percentages, so the MAE is reported in percentage points.

The MAE is computed once per unique choice set, and the reported figure is a frequency-weighted average of those per-set values, each weighted by how often that set was held out. For example, if unique set A was held out twice and unique set B once, the overall value is

$$ \frac{ \text{MAE}_A \times 2 + \text{MAE}_B \times 1 }{ 3} $$

Reference bands for interpreting the MAE depend on the number of options in the choice sets, and are generally more forgiving than hit rate bands because the MAE is a more difficult metric to optimise.
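
The per-set MAE and its frequency-weighted average can be illustrated with a short sketch. The simulated shares are taken as given inputs here, because the simulation step itself is not specified above; the function names and all numbers are made up for the example.

```python
def mae_for_set(observed, simulated):
    """Mean absolute error across the options of one choice set (in points)."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must cover the same options")
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)


def overall_mae(unique_sets):
    """Average the per-set MAE, weighted by how often each set was held out."""
    total = sum(s["times_held_out"] for s in unique_sets)
    weighted = sum(
        mae_for_set(s["observed"], s["simulated"]) * s["times_held_out"]
        for s in unique_sets
    )
    return weighted / total


# Hypothetical shares in percent for two unique three-option choice sets.
set_a = {"observed": [50, 50, 0], "simulated": [45, 40, 15], "times_held_out": 2}
set_b = {"observed": [100, 0, 0], "simulated": [70, 20, 10], "times_held_out": 1}

print(mae_for_set(set_a["observed"], set_a["simulated"]))  # 10.0
print(mae_for_set(set_b["observed"], set_b["simulated"]))  # 20.0
print(round(overall_mae([set_a, set_b]), 2))               # 13.33
```

With set A held out twice and set B once, the overall value is (10 × 2 + 20 × 1) / 3, or about 13.33 percentage points.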

Validation from holdout respondents

Holdout respondents validation reports the same two metrics as holdout choice sets validation (with the same interpretation bands), with one key difference: because the held-out respondents took no part in building the model, there are no individual coefficients for them. Both the hit rate and the mean absolute error are therefore calculated using the average (population-level) coefficients rather than each respondent’s individual coefficients.

Because of that, holdout choice sets validation usually produces stronger results than holdout respondents validation. Lower scores on holdout respondents validation are common and do not necessarily indicate a problem, especially when respondents are heterogeneous.
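
The effect of heterogeneity can be seen in a small sketch. It assumes, purely for illustration, that the population-level coefficients are a simple mean of the individual coefficients of the estimation respondents and that utilities are additive; the names and numbers are hypothetical.

```python
from statistics import mean


def utility(option, coefficients):
    """Sum the coefficients for the levels in an option."""
    return sum(coefficients[level] for level in option)


def average_coefficients(coefficients_by_respondent):
    """Simple mean of each coefficient across estimation respondents (assumed)."""
    levels = next(iter(coefficients_by_respondent.values())).keys()
    return {
        level: mean(c[level] for c in coefficients_by_respondent.values())
        for level in levels
    }


def hit_rate_with_shared_coefficients(holdout_tasks, coefficients):
    """Hit rate when every held-out respondent shares one coefficient set."""
    hits = 0
    for task in holdout_tasks:
        utilities = [utility(opt, coefficients) for opt in task["options"]]
        predicted = utilities.index(max(utilities))
        hits += predicted == task["chosen"]
    return 100.0 * hits / len(holdout_tasks)


# Two estimation respondents with opposite brand preferences.
estimation_coefficients = {
    "r1": {"brand_a": 2.0, "brand_b": -2.0},
    "r2": {"brand_a": -1.0, "brand_b": 1.0},
}
population = average_coefficients(estimation_coefficients)

# Two held-out respondents who are just as divided as the estimation sample.
holdout_tasks = [
    {"options": [["brand_a"], ["brand_b"]], "chosen": 0},
    {"options": [["brand_a"], ["brand_b"]], "chosen": 1},
]
print(hit_rate_with_shared_coefficients(holdout_tasks, population))
```

The averaged coefficients favour brand A for everyone, so the respondent who prefers brand B is always mispredicted, even though individual coefficients would have captured both preferences.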

See also

Reviewing goodness of fit alongside these metrics gives a fuller picture of model quality.

For help interpreting holdout validation in your study, please contact Conjointly support.