In my diploma thesis (DVI [874k], PDF [1.8MB], searchable PDF [9.8MB]; see also this directory of individual files or download it all at once [471k]), which is unfortunately available in German only, I implemented several algorithms from the literature for "supervised learning" of "backpropagation" neural networks and compared their performance in a benchmark (see the simulation program [118k], written in VAX Pascal and fortunately in English). To make a long story short, Schmidhuber's algorithm was by far the best. Instead of searching for local minima of the error function like the other algorithms, it searches for zeros of the error function by a kind of Newton approximation. It performed especially well when combined with random presentation of patterns, where each pattern is selected with a probability proportional to its error (i.e., the measure of deviation from the expected output). This selection scheme is an invention of mine, taken from a vocabulary training program I had written for myself when I was learning Spanish at the university. Another huge advantage of Schmidhuber's algorithm is that it is free of parameters.
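
The error-proportional pattern selection can be illustrated with a minimal sketch. It is written in Python rather than the VAX Pascal of the simulation program, and it is not taken from the thesis: the function name `pick_pattern`, the example error values, and the uniform fallback when all errors are zero are assumptions made for illustration only.

```python
import random


def pick_pattern(errors, rng=random):
    """Return the index of one training pattern.

    Each pattern is chosen with probability proportional to its current
    error, so badly learned patterns are presented more often.
    `errors` is assumed to be a list of non-negative numbers.
    """
    total = sum(errors)
    if total <= 0:
        # Assumption: if no pattern has any error left, choose uniformly.
        return rng.randrange(len(errors))
    # random.choices draws with replacement according to relative weights.
    return rng.choices(range(len(errors)), weights=errors, k=1)[0]


# Hypothetical per-pattern errors: the third pattern is learned worst.
errors = [0.1, 0.1, 0.8]
rng = random.Random(0)  # fixed seed so the demonstration is repeatable
counts = [0] * len(errors)
for _ in range(10_000):
    counts[pick_pattern(errors, rng)] += 1
print(counts)  # the third pattern should be picked most often
```

In a training loop, the errors would be recomputed after each weight update, so that a pattern is presented less often as the network learns it. This mirrors a vocabulary trainer that asks for poorly remembered words more frequently.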

