MiniML 0.7.0: the Grokking Update

A few months ago I released the first version of MiniML, a small machine learning framework powered by JAX. The main idea behind it was to have a very lean toolset for building models that would be as simple to train as a scikit-learn one, while offering the same level of flexibility as PyTorch. I wrote all about it here.
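
To make that design goal concrete, here is a purely illustrative sketch of the kind of interface it implies: a JAX loss you can freely rewrite, wrapped in a scikit-learn-style fit/predict object. None of these names are MiniML's actual API.

```python
# Illustrative only -- NOT MiniML's real API. The point is the shape of the
# interface: JAX under the hood, scikit-learn-style fit/predict on top.
import jax
import jax.numpy as jnp


class LinearRegressor:
    def __init__(self, lr=0.1, steps=200):
        self.lr, self.steps = lr, steps

    def fit(self, X, y):
        w = jnp.zeros(X.shape[1])
        loss = lambda w: jnp.mean((X @ w - y) ** 2)  # plain JAX, easy to swap out
        grad = jax.jit(jax.grad(loss))
        for _ in range(self.steps):
            w = w - self.lr * grad(w)  # simple full-batch gradient descent
        self.w_ = w
        return self

    def predict(self, X):
        return X @ self.w_


X = jnp.array([[0.0], [1.0], [2.0], [3.0]])
y = jnp.array([0.0, 2.0, 4.0, 6.0])
print(LinearRegressor().fit(X, y).predict(X))  # approximately [0, 2, 4, 6]
```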

A few versions later, I've expanded on that base with some much-needed basic modules for machine learning: radial basis function networks, multi-head self-attention, a testing system that uses gold-standard data from Backblaze, support for non-SciPy optimizers, and more. In this release, though, I want to focus on two new features that support research into one particular phenomenon: "grokking". Let's see what it is!
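
For context, grokking is usually studied on small algorithmic tasks such as modular arithmetic, where a network first memorizes the training split and only much later generalizes (Power et al., 2022). Here is a minimal sketch of that classic testbed; this is generic NumPy code, not MiniML's own helpers.

```python
# The classic grokking testbed: learn (a + b) mod p from integer pairs,
# training on only a fraction of all p*p examples and holding out the rest.
import numpy as np

p = 97  # small prime modulus
pairs = np.array([(a, b) for a in range(p) for b in range(p)])
labels = (pairs[:, 0] + pairs[:, 1]) % p

# Delayed generalization shows up when the training fraction is small.
rng = np.random.default_rng(0)
idx = rng.permutation(len(pairs))
n_train = int(0.3 * len(pairs))
X_train, y_train = pairs[idx[:n_train]], labels[idx[:n_train]]
X_test, y_test = pairs[idx[n_train:]], labels[idx[n_train:]]
```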

The Big Learning Set for Big World Helpers

On November 12, 2012, Randall Munroe published Up Goer Five on his famous webcomic xkcd: a blueprint of the Saturn V rocket explained using only the 1000 most common words of the English language (as he estimated them). Then, on November 24, 2015, came Thing Explainer, an entire illustrated book of similar explanations for other objects and concepts. The "only the 1000 most common words" style of writing sounds sometimes stilted, sometimes a bit funny, but these texts certainly prove that it's enough to talk about virtually anything.

In the age of LLMs, would it be possible to build a training set using only the 1000 most common words of the English language?

Let’s try.
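
One way to make "only the 1000 most common words" operational is to keep just the sentences of a corpus whose every token falls inside the allowed vocabulary. A rough sketch, with a tiny placeholder word list standing in for the real top-1000 (a serious version would also need to handle inflections, as Munroe did):

```python
import re

# Placeholder vocabulary; in practice this would be the real top-1000 list.
allowed = {"the", "sky", "is", "blue", "light", "into", "turns"}


def uses_only_allowed(sentence: str) -> bool:
    """True if every word in the sentence is in the allowed vocabulary."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return bool(tokens) and all(t in allowed for t in tokens)


corpus = ["The sky is blue.", "Photosynthesis converts light into energy."]
print([s for s in corpus if uses_only_allowed(s)])  # ['The sky is blue.']
```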
