time series. This implies that MLPs need fewer coefficients than RBFs to reach the
same accuracy, but it also means that it is much more difficult for an MLP to track change
(one slight evolution in the time series could require retraining all the weights of the
network, whereas only one RBF center would need to be moved). Moreover, we adapted only
the output layer online, which makes it even more difficult for the MLP to adapt. This could
explain why the MLP does not achieve much better prediction (i.e., a smaller criterion) than
the RBF, even though the two have the same number of weights and the MLP would be
expected to perform considerably better.
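The local-versus-global argument above can be illustrated with a toy sketch. It is not the code used in the experiments: the centers, widths, weights and probe points below are hypothetical values chosen only for illustration. The sketch perturbs a single output-layer weight in a small Gaussian RBF network and in a small tanh MLP, then measures how much the output changes far away from the perturbed unit.

```python
import math

def rbf_output(x, centers, weights, width=0.5):
    """Weighted sum of Gaussian basis functions (local representation)."""
    return sum(w * math.exp(-((x - c) ** 2) / (2 * width ** 2))
               for c, w in zip(centers, weights))

def mlp_output(x, in_weights, biases, out_weights):
    """One-hidden-layer tanh network (global representation)."""
    return sum(v * math.tanh(a * x + b)
               for a, b, v in zip(in_weights, biases, out_weights))

def max_change(f_before, f_after, points):
    """Largest absolute output change over a set of probe points."""
    return max(abs(f_after(x) - f_before(x)) for x in points)

# Probe points far from x = 0, where the perturbed RBF unit is centered.
# All numerical values here are hypothetical.
far_points = [5.0 + 0.5 * k for k in range(11)]  # 5.0, 5.5, ..., 10.0

centers = [0.0, 5.0, 10.0]
rbf_w = [1.0, -0.5, 0.8]
rbf_w_new = [1.3, -0.5, 0.8]   # only the unit centered at 0 is changed

in_w = [1.0, 0.5, -0.7]
biases = [0.0, -2.5, 3.0]
mlp_w = [1.0, -0.5, 0.8]
mlp_w_new = [1.3, -0.5, 0.8]   # same perturbation on one tanh unit

rbf_delta = max_change(lambda x: rbf_output(x, centers, rbf_w),
                       lambda x: rbf_output(x, centers, rbf_w_new),
                       far_points)
mlp_delta = max_change(lambda x: mlp_output(x, in_w, biases, mlp_w),
                       lambda x: mlp_output(x, in_w, biases, mlp_w_new),
                       far_points)

print("RBF change far from the perturbed unit:", rbf_delta)
print("MLP change far from the perturbed unit:", mlp_delta)
```

Under these assumptions the RBF output is essentially unchanged away from the perturbed unit, whereas the MLP output shifts over the whole probed range, so the remaining MLP weights would have to compensate for the change.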
Interestingly, all three types of experts segment the data well, each with
some particularities (for example, each RBF seems to predict one of the regimes better).
This emphasizes the importance of the choice of model type for segmentation; for
this data, the RBF network is the most satisfactory.
5.2.2 Practical Considerations
Non-linear dynamics modeling is still part art, part science, because there is no
universal technique that works for all data. The key to designing all these systems
is to find the best set of parameters for the application at hand: the number of
experts, the size of their hidden layers, the embedding size and lag, the memory
depth of the criterion, the learning rates of the adaptation algorithms, the
competition parameter of the gate and, most of all, the initial values of the networks. All
these parameters have to be tuned by hand, and external knowledge about the data
is extremely useful in doing so. One important missing piece of knowledge is the actual
switching history that would allow us to numerically assess the accuracy of segmentation
in terms of the delay in detecting a change and the length of each detected regime. There is no