Model card — default mode classifier

This card documents the experimental learned classifier shipped with VANE. It follows the spirit of a model card: what the model is, how it was built, where it may be used, and where it must not be trusted.

Warning

The default model is trained on synthetic priors, not measured turbine data. Its probabilities are indicative, not validated. Constructing the classifier emits an ExperimentalWarning. The rule-based physical label and the deterministic blade taxonomy are the primary, validated identification path.
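
As an illustration of the construction-time warning described above, here is a minimal sketch. `SketchClassifier` is a hypothetical class, and the `ExperimentalWarning` defined here is a local stand-in: the base class and message of VANE's own warning are assumptions.

```python
import warnings


class ExperimentalWarning(UserWarning):
    """Local stand-in for VANE's ExperimentalWarning (base class is an assumption)."""


class SketchClassifier:
    """Hypothetical wrapper, not VANE's real classifier class."""

    def __init__(self):
        # Warn once per construction so the caller sees the experimental status.
        warnings.warn(
            "Default model is trained on synthetic priors; "
            "probabilities are indicative, not validated.",
            ExperimentalWarning,
            stacklevel=2,
        )


# Record warnings so the caller can inspect (or escalate) them explicitly.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    SketchClassifier()

emitted = [w for w in caught if issubclass(w.category, ExperimentalWarning)]
```

A caller that wants to treat the experimental path as an error could use `warnings.simplefilter("error", ExperimentalWarning)` instead of recording.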

Model details

  • Type: Gaussian-process classifier (scikit-learn GaussianProcessClassifier with a constant × RBF kernel).

  • Task: map a mode’s feature vector to a physical degree-of-freedom category and report a per-class model probability, used as an indicative confidence that is not calibrated against measured data.

  • Inputs: the scale-independent features of vane.ai.features — natural frequency, damping ratio, and the per-category participation fractions.

  • Outputs: a predicted DofCategory and a full class probability vector.
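
The model type above can be sketched with scikit-learn directly. This is a minimal sketch under stated assumptions, not VANE's actual construction code: the three classes, the four-feature layout, the placeholder data, and the kernel hyperparameters are all illustrative.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

rng = np.random.default_rng(0)

# Placeholder data (assumption): 3 categories, 4 features, 20 samples each.
centers = np.array([
    [0.8, 0.1, 0.1, 0.0],
    [0.1, 0.8, 0.1, 0.0],
    [0.1, 0.1, 0.8, 0.0],
])
X = np.vstack([c + 0.05 * rng.standard_normal((20, 4)) for c in centers])
y = np.repeat(np.arange(3), 20)

# Constant x RBF kernel, as named in the model details.
kernel = ConstantKernel(1.0) * RBF(length_scale=1.0)
clf = GaussianProcessClassifier(kernel=kernel, random_state=0)
clf.fit(X, y)

# Full class probability vector plus the most probable class.
proba = clf.predict_proba(X[:1])
label = clf.classes_[np.argmax(proba, axis=1)]
```

With the default one-vs-rest multi-class strategy, `predict_proba` returns one normalized probability per class; these are the model's own probabilities and carry no calibration guarantee.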

Training data

  • Source: synthetic_training_set() generates labelled feature vectors from category prototypes with controlled noise. It is a bootstrap, not a measured corpus.

  • Determinism: the generator is seeded; the same seed yields the same training set and model.
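
A seeded prototype-plus-noise generator of the kind described above might look like the following sketch. The function name `sketch_training_set`, the prototype vectors, the category names, and the Gaussian noise model are all assumptions for illustration; the real `synthetic_training_set()` may differ in every one of these.

```python
import numpy as np

# Hypothetical category prototypes (one feature vector per category).
PROTOTYPES = {
    "category_a": np.array([0.8, 0.1, 0.1]),
    "category_b": np.array([0.1, 0.8, 0.1]),
    "category_c": np.array([0.1, 0.1, 0.8]),
}


def sketch_training_set(n_per_class=50, noise=0.05, seed=0):
    """Return (X, y): noisy copies of each prototype, reproducible per seed."""
    rng = np.random.default_rng(seed)  # a local generator keeps the draw deterministic
    features, labels = [], []
    for name, proto in PROTOTYPES.items():
        jitter = noise * rng.standard_normal((n_per_class, proto.size))
        features.append(proto + jitter)
        labels.extend([name] * n_per_class)
    return np.vstack(features), np.array(labels)
```

Calling the function twice with the same seed yields identical arrays, which is the property the determinism bullet relies on.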

Intended use

  • Exploratory, uncertainty-aware cross-checking of the rule-based labels.

  • Research and development of the identification pipeline.
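
One way to picture the cross-checking use is a helper that compares the two labels and never overrides the rule-based one. The function `cross_check`, its return strings, and the 0.6 threshold are hypothetical and are not part of VANE.

```python
def cross_check(rule_label, learned_label, learned_proba, min_proba=0.6):
    """Flag modes that deserve a second look; the rule-based label stays primary.

    min_proba is an arbitrary illustrative threshold, not a validated value.
    """
    if rule_label != learned_label:
        return "disagree"
    if learned_proba < min_proba:
        return "agree-low-confidence"
    return "agree"
```

The result is a review flag only; in this sketch the learned label is never returned as the identification.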

Out-of-scope use

  • Engineering or certification decisions based on the learned label alone.

  • Any claim of validated accuracy on measured turbines.

Metrics

No held-out accuracy or calibration metrics on measured data are published, because no measured labelled corpus is available yet. Reported confidences are the model’s own class probabilities on synthetic data and should be read as a relative ranking among categories, not as calibrated probabilities that a label is correct.

Path to graduation

The classifier graduates from experimental once it is trained and calibrated on a labelled, multi-turbine measured corpus, with published held-out accuracy and calibration curves and versioned model artefacts. Until then it remains gated behind the ExperimentalWarning.
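
For orientation, held-out accuracy and per-class calibration curves of the kind named above could be computed with scikit-learn as in the sketch below. The helper `held_out_report` is hypothetical, and the one-vs-rest treatment of each class is one possible choice, not a procedure defined by VANE. The arrays it expects would come from a measured, labelled held-out set, which does not yet exist.

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import accuracy_score


def held_out_report(y_true, proba, classes, n_bins=5):
    """Accuracy plus a one-vs-rest calibration curve per class.

    y_true:  (n,) true labels from a held-out set
    proba:   (n, k) predicted class probabilities
    classes: (k,) labels in the column order of proba
    """
    y_true = np.asarray(y_true)
    proba = np.asarray(proba)
    classes = np.asarray(classes)

    y_pred = classes[np.argmax(proba, axis=1)]
    report = {"accuracy": accuracy_score(y_true, y_pred), "calibration": {}}
    for k, cls in enumerate(classes):
        is_cls = (y_true == cls).astype(int)  # one-vs-rest target for this class
        frac_pos, mean_pred = calibration_curve(is_cls, proba[:, k], n_bins=n_bins)
        # Store (mean predicted probability, observed fraction) per non-empty bin.
        report["calibration"][cls.item()] = (mean_pred, frac_pos)
    return report
```

A well-calibrated model would show observed fractions close to the mean predicted probability in each bin.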