10.3. Optimization of the Measurements (Dr. Frank Dieterle)

Among the many parameters that had to be chosen and adjusted, the scanning speed of the time-resolved sensor responses was a frequently discussed subject during the measurements for this work. Scanning the sensor responses slowly over time yields a low number of time points, which allows a calibration without a variable selection or at least speeds up the variable selection procedures significantly. On the other hand, slow scanning might miss the differences between the sensor responses of analytes that show very similar kinetics. To investigate this topic in more detail, fully connected neural networks were trained using the refrigerant data set, whereby the number of time points was systematically reduced by using only every 2^nd, every 3^rd, ... time point. The prediction errors shown in table 11 decrease with an increasing number of time points, which corresponds to an increasing scanning speed. The increase of the errors is moderate down to every 10^th time point but drastic when only every 20^th time point is used. The table also demonstrates that only a sophisticated variable selection procedure improves the performance of calibration and prediction (compare with table 3 and table 4).

Method	Calibration R22	Calibration R134a	Validation R22	Validation R134a
Each Time Point	1.5	2.6	2.2	3.3
Each 2^nd Time Point	2.0	3.0	2.4	3.3
Each 3^rd Time Point	2.2	3.1	2.7	3.4
Each 4^th Time Point	2.4	3.2	2.8	3.5
Each 5^th Time Point	2.9	3.5	3.2	3.8
Each 10^th Time Point	4.5	3.7	4.9	4.1
Each 20^th Time Point	21.9	55.2	21.6	52.1

table 11: Relative RMSE in % for the prediction of the refrigerant data set by fully connected neural networks that use only every n^th time point, simulating a slower scanning of the time-resolved sensor response.
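The reduction of time points described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used for this work: the function name `subsample_time_points`, the layout of the matrix (samples in rows, time points in columns) and the size of the random data are hypothetical.

```python
import numpy as np


def subsample_time_points(responses: np.ndarray, n: int) -> np.ndarray:
    """Keep every n-th time point (column) of a samples x time-points matrix."""
    if n < 1:
        raise ValueError("n must be >= 1")
    # Basic slicing with a step of n keeps columns 0, n, 2n, ...
    return responses[:, ::n]


# Hypothetical data: 5 samples with 40 time points each (not the real data set).
rng = np.random.default_rng(0)
responses = rng.normal(size=(5, 40))

for n in (1, 2, 3, 4, 5, 10, 20):
    reduced = subsample_time_points(responses, n)
    print(f"every {n}th time point -> {reduced.shape[1]} input variables")
```

Each reduced matrix would then serve as input for a separately trained network, so that the prediction errors can be compared across the simulated scanning speeds.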

The variable selection by the frameworks also gives an indication of an optimal scanning speed for the time-resolved sensor responses. In practically all variable selections by the frameworks of the previous chapters, many of the selected variables were adjacent in time. For example, figure 46 shows that 9 out of 12 time points within the time interval from 67 s to 93 s are selected, demonstrating that nearly all information of the selected interval is evaluated and that a further increase of the scanning speed might yield even more useful information.

The fact that variables are selected and used only within a few intervals is also known in PLS and has led to further developments of PLS such as Interval Partial Least Squares (IPLS) [266]. It has often been stated that the collinearity of a certain number of variables stabilizes the predictions [41], whereas too high a number of collinear variables negatively affects the predictions (see also section 2.8).

For practically all selections of the variables by the frameworks (for example in sections 8.4.1, 9.1.2, 9.2.3, 9.2.4 and 9.3.2), the selected variables are located directly after the beginning of exposure to analyte and directly after the end of exposure to analyte. This implies that the complete measurement time is not needed for the determination of the sample composition, but only a short interval of exposure followed by a short interval of analyte desorption. It also implies that the time of exposure to analyte can be reduced, which in turn results in a faster desorption of the analyte (a synergetic effect) and consequently reduces the time needed between measurements. For this work, the time used for exposure to analyte and the subsequent recovery had been determined by visually inspecting the sensor responses of single analytes (like figure 24) and then choosing the time interval in which the shapes of the sensor responses differ significantly. For routine analysis, the calibration should be repeated measuring only during the time intervals proposed by the frameworks, which will save time and money.
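Restricting a measurement to a few proposed time intervals can be illustrated with a boolean mask over the time axis. The following is a minimal sketch only: the helper `select_windows`, the 5 s time grid and the two windows are hypothetical values chosen for illustration and are not the intervals found by the frameworks.

```python
import numpy as np


def select_windows(times, windows):
    """Return a boolean mask that is True for time points inside any window."""
    times = np.asarray(times, dtype=float)
    mask = np.zeros(times.shape, dtype=bool)
    for start, stop in windows:
        # Both interval borders are treated as inclusive.
        mask |= (times >= start) & (times <= stop)
    return mask


# Hypothetical time grid: one reading every 5 s over 200 s.
times = np.arange(0.0, 200.0, 5.0)
# Hypothetical windows: shortly after the start and after the end of exposure.
windows = [(0.0, 30.0), (120.0, 150.0)]

mask = select_windows(times, windows)
print(f"{mask.sum()} of {times.size} time points would still be recorded")
```

Applied to the columns of a response matrix, such a mask keeps only the variables recorded inside the chosen intervals.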

The number of measurements that have to be performed for a calibration is also a significant point to be decided on when planning an experimental design. As the number of measurements for a full factorial design increases strongly with the number of analytes and the number of concentration levels (see equation (1)), the number of concentration levels for the calibration of ternary and quaternary mixtures was rather low compared with the binary mixtures of the refrigerants. The price to be paid for calibrating with a 4-level design (used for the calibration of the quaternary mixtures) instead of a 21-level design can be estimated by using only 16 calibration samples instead of 441 samples for the refrigerant data. The mean relative RMSE of the validation for the non-optimized neural networks thereby increases from 2.7% for the 21-level design to 6.7% for the 4-level design. Thus, it is expected that the calibrations of the ternary and especially of the quaternary mixtures can be significantly improved by measuring more calibration samples.
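The growth of a full factorial design can be made concrete with a short calculation. The sketch below assumes that equation (1) is the usual count of levels raised to the power of the number of analytes, which agrees with the 441 and 16 samples quoted above for binary mixtures; the function name and the relative concentration levels are hypothetical.

```python
from itertools import product


def full_factorial_size(levels: int, analytes: int) -> int:
    """Number of samples of a full factorial design (assumed: levels ** analytes)."""
    return levels ** analytes


print(full_factorial_size(21, 2))  # binary mixtures, 21 levels: 441
print(full_factorial_size(4, 2))   # binary mixtures, 4 levels: 16
print(full_factorial_size(4, 4))   # quaternary mixtures, 4 levels: 256

# Enumerating a small design explicitly, using hypothetical relative
# concentration levels between 0 and 1.
levels = [0.0, 1 / 3, 2 / 3, 1.0]
design = list(product(levels, repeat=2))
print(len(design))  # 16 concentration pairs
```

Under this assumption, a 21-level design for quaternary mixtures would already require 21^4 = 194481 samples, which illustrates why the number of levels had to be kept low for the mixtures with more components.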

The choice of the optimal thickness of the sensitive layer depends on several parameters, which are discussed partly in chapter 5 and in more detail in the results, and which are only summarized here. A thick layer means slow kinetics of the analytes, allowing the discrimination of very small and similar analytes. On the other hand, big analytes need a very long time until a noticeable sensor response can be observed, resulting in long measurement times. Thin layers, which allow fast measurements, can only be used in some setups due to a low signal-to-noise ratio, whereby a smoothing of the noisy signals can improve the calibration (in contrast to smoothing the nearly noise-free signals of thick layers). Among the different devices, the SPR setup is the most appropriate for time-resolved measurements using Makrolon, but it needs the most complex equipment (for example for keeping the temperature exactly constant). The 4λ setup is the smallest and cheapest device but is only fairly appropriate for Makrolon as sensitive layer, whereas the RIfS array setup lies between the other two setups in all respects.

Thus, for most parameters no general recommendation can be given, except for the highest possible scanning speed of the sensor responses in combination with a variable selection and the highest possible number of calibration samples. The optimal solution is determined by the analytes under investigation and by external conditions such as the allowed time for each measurement, the demanded robustness of the devices and much more.