8.5. Conclusions and Comparison of the Different Methods
In this chapter, a growing neural network algorithm for building non-uniform neural networks was applied to the refrigerant data set. The algorithm showed improved calibrations compared with the common non-optimized neural networks, yet the variable selection and the topology of the networks were only partly reproducible. Thus, similar to the genetic algorithms, a parallel framework and additionally a loop-based framework were introduced to improve the reproducibility and to further improve the calibration quality.
The loop-based framework showed the best generalization ability of all multivariate data analysis methods introduced and applied in this work, as it allows building non-uniform neural networks of arbitrary size and topology while exploiting a data set of limited size to the maximum extent. The predictions of the external validation data showed impressively low rel. RMSE values of 1.50% for R22 and 2.37% for R134a, which are only slightly higher than the standard deviations of the sensor signals of repeated measurements. However, the loop-based framework requires considerable computing power (about 2 weeks per analyte of the refrigerant data set on an up-to-date personal computer) and is hardly suited for parallel computing hardware, as it is a loop-based and not a parallel approach.
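For reference, the rel. RMSE figures quoted throughout this chapter can be computed as sketched below. This is a minimal sketch assuming the rel. RMSE is the root mean squared error of prediction normalized to the range of the reference concentrations and expressed in percent; if the normalization is defined differently in earlier chapters (e.g., relative to the mean), the `scale` argument should be adapted accordingly.

```python
import numpy as np

def rel_rmse(y_true, y_pred, scale=None):
    """Relative RMSE in percent.

    `scale` is the normalization constant; if omitted, the range of the
    reference values is used (an assumption -- substitute the definition
    from the earlier chapters if it differs).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    if scale is None:
        scale = y_true.max() - y_true.min()
    return 100.0 * rmse / scale
```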
If a reproducible variable selection is important, both the parallel growing neural network framework introduced in this chapter and the genetic algorithm framework introduced in the previous chapter are a good choice, whereby the latter scales better with an increasing number of variables but shows a slightly worse generalization ability. Both parallel frameworks showed improved calibrations compared with the common neural networks, and both are well suited for parallel computer hardware, rendering them ideally suited for computer pools.
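Because the individual network optimizations are independent of each other, such a parallel framework maps naturally onto a pool of workers. The sketch below illustrates only this structure, not the implementation used in this work: `run_growing_network` is a hypothetical stand-in (here it merely returns a random variable subset to emulate run-to-run variability), and the consensus rule, worker count, and threshold are assumptions.

```python
import random
from collections import Counter
from multiprocessing import Pool

N_VARIABLES = 100   # placeholder: number of candidate sensor variables
N_SELECTED = 10     # placeholder: variables picked per run

def run_growing_network(seed):
    """One independent growing-network run (hypothetical stand-in).

    In the real framework this would train a growing neural network from
    a random initialization and return the variables it selected; here a
    random subset merely emulates the run-to-run variability.
    """
    rng = random.Random(seed)
    return frozenset(rng.sample(range(N_VARIABLES), k=N_SELECTED))

def consensus_selection(n_runs=50, n_workers=8, threshold=0.5):
    """Run many independent networks in parallel and keep the variables
    selected in at least `threshold` of the runs."""
    with Pool(n_workers) as pool:
        selections = pool.map(run_growing_network, range(n_runs))
    votes = Counter(v for sel in selections for v in sel)
    return sorted(v for v, n in votes.items() if n >= threshold * n_runs)

if __name__ == "__main__":
    print(consensus_selection())
```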
The single-run growing neural network algorithm is a good choice for data sets with a moderate number of variables, as it finds an optimized non-uniform network topology without the danger of overfitting (about 3 hours of computing time per analyte of the refrigerant data set). All single runs of the growing neural network showed better calibrations than the non-optimized neural networks.
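To make the growing principle concrete, the sketch below shows a simplified analogue, assuming the network is grown greedily and a new element is accepted only while a monitored (cross-validated) error improves. It grows only the input layer of a small feed-forward network, whereas the actual algorithm of this chapter also grows hidden neurons and individual connections; all names and parameter values are illustrative.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score

def grow_inputs(X, y, max_inputs=10, tol=1e-3):
    """Greedy analogue of network growing, restricted to the input layer.

    Starting from no inputs, repeatedly add the variable that most
    improves the cross-validated error of a small feed-forward network,
    and stop as soon as no candidate improves the error by at least
    `tol`. The monitored error guards against overfitting.
    """
    selected, best_err = [], np.inf
    while len(selected) < max_inputs:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        errs = {}
        for j in candidates:
            cols = selected + [j]
            net = MLPRegressor(hidden_layer_sizes=(3,), max_iter=2000,
                               random_state=0)
            score = cross_val_score(net, X[:, cols], y, cv=5,
                                    scoring="neg_root_mean_squared_error")
            errs[j] = -score.mean()
        j_best = min(errs, key=errs.get)
        if best_err - errs[j_best] < tol:
            break  # no sufficient improvement -> stop growing
        selected.append(j_best)
        best_err = errs[j_best]
    return selected, best_err
```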
Although single runs of genetic algorithms are frequently reported in the literature, this method proved to be inferior to the different new algorithms and frameworks introduced in this work. Though it is the fastest method for a successful variable selection (about 1 hour for the refrigerant data set), the instability of the variable selection and the resulting variability of the calibration and prediction quality render single-run genetic algorithms rather useless for most applications.
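For completeness, a minimal single-run genetic algorithm for variable selection is sketched below, under standard assumptions (bit-string chromosomes, tournament selection, single-point crossover, bit-flip mutation); the operators and parameter values are generic textbook choices, not necessarily those of the previous chapter. Running it with different seeds on the same fitness function generally returns different chromosomes, which is exactly the instability criticized above.

```python
import random

def evolve(fitness, n_vars, pop_size=30, n_gen=40, p_mut=0.02, seed=0):
    """Single-run GA over bit-string chromosomes (1 = variable selected).

    `fitness` maps a tuple of 0/1 genes to a score to maximize -- in this
    work, the quality of a neural-network calibration built on the
    selected variables.
    """
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_vars))
           for _ in range(pop_size)]
    for _ in range(n_gen):
        def tournament():
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_vars)           # single-point crossover
            child = p1[:cut] + p2[cut:]
            child = tuple(g ^ (rng.random() < p_mut) for g in child)  # mutation
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)
```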
In summary, it may be said that the growing neural networks and all three frameworks introduced in this work performed better than the common non-optimized neural networks. Among these new methods, no general recommendation for a specific method can be given, as the method of choice for the optimization of neural networks depends on the needs of the user and on the data set.
Method                                         Adjustable Parameters   Calibration Data Set   Validation Data Set
                                                                       (rel. RMSE in %)       (rel. RMSE in %)
                                               R22       R134a         R22       R134a        R22       R134a
------------------------------------------------------------------------------------------------------------------
Non-optimized Neural Networks                  247       247           1.47      2.62         2.18      3.26
Growing Neural Networks (1st run)               30        31           1.84      2.73         1.99      2.63
Growing Neural Networks (2nd run)               31        24           2.14      2.73         2.12      2.87
Parallel Framework Growing Neural Networks      43        57           1.89      2.71         2.04      2.61
Loop-based Framework Growing Neural Networks    72        44           1.39      2.41         1.50      2.37
table 4: Comparison of the rel. RMSE of the calibration and validation data in % for the growing neural network approaches and the non-optimized neural networks. Additionally, the number of adjustable parameters used by the networks is listed.