For the
pruning of neural networks, which is described in detail in section
2.8.8, separate neural networks were trained for both analytes
using the calibration data set. The networks were fully connected with 8 hidden
neurons and served as reference networks for the pruning algorithms. Then, the
two pruning algorithms Magnitude Based Pruning (MP) and Optimal Brain Surgeon (OBS) were
used to remove network links until the estimated increase of the error for the
calibration data reached 2%. After that, the networks were retrained. This procedure
was repeated 3 times in total. Finally, the calibration data and the external
validation data were predicted. For both pruning algorithms, 50 networks were
trained and optimized by this procedure using different initial random weights.
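As an illustration only, and not the implementation actually used, the following sketch outlines this prune-retrain loop. It assumes PyTorch with its built-in magnitude pruning as a stand-in for the algorithms of section 2.8.8, a hypothetical input dimension and random placeholder data instead of the calibration set; the 2% stopping criterion is evaluated on the actual calibration error here as a simple proxy for the estimated error increase used by the algorithms.

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

n_inputs = 40                                     # assumed number of time-resolved sensor inputs
model = nn.Sequential(nn.Linear(n_inputs, 8),     # fully connected reference network
                      nn.Sigmoid(),               # with 8 hidden neurons
                      nn.Linear(8, 1))            # and one output for the analyte

X = torch.randn(200, n_inputs)                    # placeholder for the calibration responses
y = torch.randn(200, 1)                           # placeholder for the analyte concentrations
loss_fn = nn.MSELoss()

def train(net, epochs=500):
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(net(X), y)
        loss.backward()
        opt.step()
    return loss.item()

layers = [(model[0], "weight"), (model[2], "weight")]

def remaining_links():
    # number of links whose mask entry is still 1 (all links before the first pruning call)
    total = 0
    for module, name in layers:
        mask = getattr(module, name + "_mask", None)
        total += int(mask.sum()) if mask is not None else getattr(module, name).numel()
    return total

base_error = train(model)
for _ in range(3):                                # pruning and retraining repeated 3 times in total
    while remaining_links() > 1:
        # remove the currently least important link (smallest magnitude here;
        # OBS would rank the links by its Hessian-based saliency instead)
        prune.global_unstructured(layers, pruning_method=prune.L1Unstructured, amount=1)
        with torch.no_grad():
            if loss_fn(model(X), y).item() > 1.02 * base_error:
                break                             # estimated error increase of 2% reached
    base_error = train(model)                     # retrain the pruned network

Cross-validation and the 50 random restarts are omitted from the sketch for brevity.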
Magnitude Based Pruning For R22, the network with the smallest cross-validated
calibration error consisted of 20 input neurons, 2 hidden neurons and 25 links.
This network predicted the calibration data with a rel. RMSE of 2.34% and the
validation data with a rel. RMSE of 2.48% (see table
2). For R134a, the network with the smallest cross-validated calibration error
consisted of 33 input neurons, 3 hidden neurons and 64 links. The predictions
by this network showed relative errors of 3.16% for the calibration data and
3.34% for the validation data. Compared with the fully connected neural networks,
the number of adjustable parameters (27 and 67, respectively) was dramatically reduced,
resulting in a smaller gap between the prediction errors of the calibration
data and those of the validation data. Yet, the predictions
of the validation data are worse than those of the fully connected
neural networks, rendering this approach useless for improving the generalization ability
of neural networks.
Optimal Brain Surgeon For R22, the network with the smallest cross-validated
calibration error consisted of 25 input neurons, 3 hidden neurons and 37 links.
This network predicted the calibration data with a rel. RMSE of 2.10% and the validation
data with a rel. RMSE of 2.12% (see table 2).
For R134a, the network with the smallest cross-validated calibration error consisted
of 17 input neurons, 4 hidden neurons and 24 links. The predictions by this
network showed relative errors of 3.22% for the calibration data and 3.32% for
the validation data. The low number of adjustable parameters (40 and 26, respectively)
successfully prevented overfitting, with practically no visible gap between
the prediction errors of the calibration and validation data. Compared with
the fully connected neural networks, the predictions of the validation data are
slightly better for R22 and slightly worse for R134a. This demonstrates that
the relationship between the concentrations of the analytes and the time-resolved
sensor responses can be modeled with far fewer adjustable parameters.
It is also evident that the sophisticated OBS algorithm performs better than
the simple MP approach.
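This difference reflects the saliency measures underlying the two algorithms. In their standard formulations (the variants used here are treated in detail in section 2.8.8), a link with weight w_q is ranked by

\[
s_q^{\mathrm{MP}} = |w_q|
\qquad\text{and}\qquad
s_q^{\mathrm{OBS}} = \frac{w_q^{2}}{2\,[\mathbf{H}^{-1}]_{qq}},
\]

where \mathbf{H} denotes the Hessian of the calibration error with respect to the weights. MP simply deletes the link with the smallest weight magnitude, whereas OBS deletes the link with the smallest saliency and additionally corrects the remaining weights by \(\delta\mathbf{w} = -\frac{w_q}{[\mathbf{H}^{-1}]_{qq}}\,\mathbf{H}^{-1}\mathbf{e}_q\), which is the usual explanation for its better performance at the price of estimating and inverting the Hessian.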
Summary The predictions of both pruning algorithms did not
show unmodeled nonlinearities, and the true-predicted plots were similar to
those of the fully connected networks (see figure
43). The most severe drawback of both pruning algorithms is their instability,
which results in a completely different network topology for each
run with different initial weights. For example, the 50 networks created by
OBS for R22 used 7 to 27 input neurons, 1 to 4 hidden neurons and 8 to 40 links
and showed prediction errors of the external validation data between 2.12% and
3.38%. The 50 networks for R134a used 8 to 36 input neurons, 2 to 6 hidden neurons
and 12 to 49 links with no repeated topology. The predictions of the validation
data varied between 3.32% and 5.48%. The variation of the networks created by
the MP algorithm was even larger. Although the pruning algorithms demonstrated
that significantly sparser network topologies are sufficient for modeling the relationships
between the time-resolved sensor responses and the concentrations of the analytes,
the high variation of the network topologies and of the prediction quality
renders the pruning approach unsuitable for an easily reproducible application.