completely unsupervised, we are tackling a particular problem with its own specificities, and these should be reflected in the design of the system. One such particularity is our goal of online segmentation, which means a single epoch through the data, with training carried out at the same time as segmentation. Choosing the MSE estimate helps prevent spurious switching during segmentation by adding memory to the performance criterion; its memory depth has to be set for each expert. Since we have no a priori reason to distinguish one expert from another in our case, and since the experts' return maps have close trajectories, as we show in the next chapter, γ_e is chosen to be the same for each expert; and because the real data is non-stationary, it will not be annealed. Therefore, after testing several values on the test set defined in the next chapter, γ_e = 0.01 is chosen, defining a memory depth of one hundred sample points, which seems sensible because it corresponds roughly to the length of a breath cycle. However, such a long memory depth also contributes to the "error buildup" issue raised in approach III. Moreover, the more memory of the experts' performance is available, the more information the gate has about the experts, but the less accurate the segmentation becomes, because the relative weight of a change point in the criterion decreases, so the switch may come a few samples late. Lastly, concerning approach IV, the same competition parameter is set for all models, giving a very soft competition at M = 2. This allows the non-winning experts to still adapt slightly to the data without losing their specialization (the learning rates are chosen so that the experts do not adapt too much).
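The two mechanisms discussed above can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the function names, the exponential form of the memory update, and the softmax form of the soft competition are assumptions; only the parameter values γ_e = 0.01 (memory depth ≈ 1/γ_e = 100 samples) and M = 2 come from the text.

```python
import numpy as np

def update_scores(scores, errors, gamma=0.01):
    """Exponentially weighted MSE per expert (hypothetical form).

    With forgetting factor gamma = 0.01, the effective memory depth is
    about 1/gamma = 100 samples, roughly one breath cycle as in the text.
    """
    return (1.0 - gamma) * scores + gamma * errors ** 2

def soft_competition(scores, M=2.0):
    """Gate weights from the experts' running scores (assumed softmax form).

    A small competition parameter M keeps the competition soft, so
    non-winning experts still receive some learning signal and can
    adapt a little without losing their specialization.
    """
    w = np.exp(-M * scores)
    return w / w.sum()

# Example: three experts, the first currently fitting best.
scores = np.array([0.1, 0.5, 1.0])
weights = soft_competition(scores)   # largest weight for the best expert
```

With M = 2 the weight ratio between experts stays moderate even for sizeable score gaps, which is what makes the competition "really soft" in the sense used above; a large M would approach hard winner-take-all gating.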