
jballanc · 2026-05-10T21:39:56 1778449196

I've been working on RVW, my adaptation of the standard transformer model that is capable of online continual learning without catastrophic forgetting. I finally published the first pre-print of my early experiments: https://doi.org/10.5281/zenodo.20064617

Now I'm working on scaling the work up to more parameters and improving performance. I just finished an extremely harsh test of a Nemotron-flavored RVW: stretches of a random assortment of domains interspersed with long runs of single domains. Across all of it the model didn't forget (and actually improved on some of the more challenging domains). PPL on SmolTalk is still in the ~18 range, which I'd like to get lower, but this is all with only 4B params.
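For anyone trying to picture that kind of test, here is a rough sketch of a schedule that alternates mixed stretches with long single-domain runs, plus the usual perplexity calculation. This is not the harness used for RVW: the domain names, stretch lengths, and the `build_schedule` and `perplexity` helpers are all made up for illustration.

```python
import math
import random


def build_schedule(domains, n_phases, mixed_len, run_len, seed=0):
    """Alternate mixed stretches with long single-domain runs.

    Even phases draw a random domain at every step; odd phases repeat
    one randomly chosen domain for `run_len` steps.
    """
    rng = random.Random(seed)
    schedule = []
    for phase in range(n_phases):
        if phase % 2 == 0:
            # Mixed stretch: each step samples a domain uniformly.
            schedule.extend(rng.choice(domains) for _ in range(mixed_len))
        else:
            # Long run: a single domain, the case most likely to
            # expose forgetting of everything else.
            schedule.extend([rng.choice(domains)] * run_len)
    return schedule


def perplexity(token_nlls):
    """Perplexity = exp(mean per-token negative log-likelihood, in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


# Hypothetical domain names, for illustration only.
DOMAINS = ["chat", "code", "math", "legal"]
schedule = build_schedule(DOMAINS, n_phases=4, mixed_len=5, run_len=20)
```

In a setup like this you would re-evaluate perplexity on held-out data from every domain after each phase; forgetting shows up as a domain's PPL climbing during a long run on some other domain.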

Currently, I'm training a Llama 3.2-flavored RVW with only about 2B params to see how that turns out. Depending on results of that, I may take it to Gemma 4 next.

brianjlogan · 2026-05-11T15:19:18 1778512758

Super interesting. I'm also super into the idea of always online continual learning.

I'll check it out. Thanks for sharing.

jballanc · 2026-05-03T01:24:09 1777771449

IIRC, there was always a way to filter out certain messages (or that may be an alt.org customization, but it's been a part of my config file for a while now).

jballanc · 2026-05-03T01:22:06 1777771326

For real! Valkyrie is the perfect "just bash things while only half paying attention" class. Great for when I'm playing to unwind (as opposed to playing as a challenge to myself).

At least there's still Samurai.

chorizo · 2026-05-03T03:22:00 1777778520

That was my favorite class. Still remember the game where I mostly (t) threw my wakizashi (b) at enemies.

jballanc · 2026-04-30T19:23:05 1777576985

I think Douglas Adams had one of the best quotes regarding observing infinity:

"Infinity itself looks flat and uninteresting. Looking up into the night sky is looking into infinity – distance is incomprehensible and therefore meaningless."

jballanc · 2026-04-28T02:05:41 1777341941

It's been a while since I worked at Apple, but back in the day the entire OS X Server team made extensive use of kerberized NFS shares for moving around large files...

...the last version of Server shipped in 2021 (and the last real version shipped almost a decade before that).

saagarjha · 2026-04-28T08:07:35 1777363655

Apple was still using Kerberos when I was there not that long ago.

ninkendo · 2026-04-28T10:18:14 1777371494

Hmm, the more I think about it, I think you’re right: they likely still do use kerberized NFS, but I think the auth layer they use is… different. Without giving too much away, the internal SSO software ends up either wrapping or providing Kerberos tickets in some way, so I’m imagining that code path doesn’t panic.

In fact that’s probably the clue… everyone internally at Apple using krb5 auth with NFS is probably using the internal SSO software, and the code path for “vanilla” Kerberos (i.e. Ticket Viewer.app and so on) has zero testing. Maybe I’ll write that into the next crash tracer report I type up :-D

e28eta · 2026-04-28T15:01:33 1777388493

If you want a slightly different black hole to send your report to, you could use Feedback Assistant: https://developer.apple.com/feedback-assistant/

jballanc · 2026-04-27T19:17:17 1777317437

My first job after finishing my undergrad degree was performing quality analysis on corn starch. As a condition of employment, I had to sign a paper saying anything I invented related to corn was property of my employer.

jballanc · 2026-04-23T22:31:19 1776983479

It's been more than a few years since I worked at Apple, but they were always unique in the tech space in that their retail division dwarfed engineering headcount. If I recall correctly, all of OS X Lion was produced by around 3,000 engineers (and probably fewer, since I think that count included iLife and iWork).

bee_rider · 2026-04-24T04:52:12 1777006332

Aren’t they sort of unique in that they… have a retail division, as a real ongoing thing? (I’m sure MS tried an MS store, but I’ve never seen one.)

Well, unique other than Amazon I guess.

jballanc · 2026-04-13T00:24:20 1776039860

I've been working on an ML model capable of robust continuous learning, resistant to catastrophic forgetting without relying on replay, an external memory system, or unbounded parameter growth. Last week I confirmed the first non-toy, 580M-parameter version soundly beat LoRA, EWC, and full fine-tuning. This week I'm scaling up to 4.4B parameters...
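For readers who haven't run into the EWC baseline: in its standard formulation (elastic weight consolidation), training on a new task B adds a quadratic penalty that pulls each parameter back toward its value after the earlier task A, weighted by an estimate of how much that parameter mattered for A. The formula below is just that textbook baseline, not the technique described in this comment.

```latex
\mathcal{L}(\theta) = \mathcal{L}_B(\theta)
  + \sum_i \frac{\lambda}{2}\, F_i \,\bigl(\theta_i - \theta^{*}_{A,i}\bigr)^2
```

Here $\mathcal{L}_B$ is the loss on the new task, $\theta^{*}_{A}$ are the parameters learned on the old task, $F_i$ is the diagonal Fisher information for parameter $i$, and $\lambda$ sets how strongly old knowledge is protected.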

Findeton · 2026-04-13T15:36:43 1776094603

Do you have a public repo for that? I'm also trying to do that although I'm using "replay"/distillation and hopfield memory banks.

jballanc · 2026-04-22T00:45:48 1776818748

No public repo yet, but coming soon. Just filed for a patent on the technique and am preparing a paper. Posted the first figure I have for the paper here: https://dev.to/jballanc/what-would-you-do-with-an-ai-model-c...

jballanc · 2026-04-10T22:09:17 1775858957

We need benchmarks that can distinguish between continuous learning and long-context extrapolation.

vrighter · 2026-04-11T19:47:54 1775936874

oh that's easy: continuous learning is not something current architectures can do. So the benchmark for that can be done mentally

jballanc · 2026-04-06T17:42:55 1775497375

Based on what you've already mentioned, there's a good chance you're familiar, but on the off chance you're not: "Funkungfusion" (or, really, anything off the Ninja Tune label) might be right up your alley.