A few months ago I released the first version of MiniML, a small machine learning framework powered by JAX. The main idea behind it was to have a very lean toolset for building models that are as simple to train as a Scikit-learn one, while offering the same level of flexibility as PyTorch. I wrote all about it here.
A few versions later, I’ve expanded on that base with some much-needed basic modules for machine learning: radial basis function networks, multi-head self-attention, a testing system that uses gold-standard data from Backblaze, support for non-SciPy optimizers, and more. In this release, though, I want to focus on the addition of two features that support research into one particular phenomenon: “grokking”. Let’s see what it is!