https://huggingface.co/blog/onekq/adam-optimizer
If you already know the Adam(W) optimizer, feel free to skip this, and sorry for the wait. Otherwise, it should be a useful read.
I checked Codex and Claude, and they both use their own models to compress context.
For OCR, as you stated, a lot of training would need to be done. It's an ecosystem problem.
Good stuff! I didn't consider token cost at all.
I'm thinking about an open source project for a context compressor (algorithmic, at most a small on-premise model) for agent builders. Does this make sense? If so, what should it look like? A rough sketch of the algorithmic end is below.
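To make the idea concrete, here is a minimal sketch of what the purely algorithmic end could look like: no model, just heuristics over the message list. Everything here (`Message`, `compress_context`, the 4-chars-per-token estimate, the truncation thresholds) is hypothetical illustration, not an existing API.

```python
from dataclasses import dataclass

@dataclass
class Message:
    role: str      # "system", "user", "assistant", or "tool"
    content: str

def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)

def compress_context(messages: list[Message], budget: int) -> list[Message]:
    """Fit a conversation into a token budget by (1) truncating long
    tool outputs and (2) evicting the oldest middle turns, while always
    keeping the system prompt and the most recent exchange."""
    # Step 1: truncate tool outputs, which tend to dominate agent contexts.
    trimmed = []
    for m in messages:
        if m.role == "tool" and estimate_tokens(m.content) > 200:
            m = Message(m.role, m.content[:800] + "\n...[truncated]")
        trimmed.append(m)

    # Step 2: drop the oldest non-system turns until the budget fits,
    # always preserving the last two turns.
    head = [m for m in trimmed if m.role == "system"]
    body = [m for m in trimmed if m.role != "system"]
    while body[:-2] and sum(estimate_tokens(m.content) for m in head + body) > budget:
        body.pop(0)
    return head + body
```

A small model could later replace step 1 with summarization instead of blunt truncation, but the interface (messages in, cheaper messages out) would stay the same.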
Do you know of any work that has studied how agents use context?
That is the case.
I'm a developer at heart. As a developer, you spend the majority of your time running things and hopping between environments, e.g. the IDE, the cloud, GitHub. These environments all happen to have full-featured bash support, a perfect sandbox for the CLI form factor.
The paradigm change AI brought to the developer world is nothing short of meteoric, but it is also an exception. Lots of efforts are trying to generalize this momentum to the next area(s). I won't bet on them.