
WithinReason · 2026-06-15T11:31:56 1781523116

Could Anthropic relocate to a different country?

comboy · 2026-06-15T12:51:40 1781527900

They cannot do it. Apart from all the practical, technical and talent reasons, it would still be exporting forbidden stuff.

The signal is clear enough, though, for the next Anthropic.

chasil · 2026-06-15T11:44:01 1781523841

Individuals can leave, but the company cannot transfer restricted intellectual property.

Europe has extradition treaties, so the U.S. can have anyone in Europe who demonstrates inappropriate possession of this technology extradited back to the U.S. to face criminal indictment.

marcyb5st · 2026-06-15T12:30:44 1781526644

It would be very hard to demonstrate that they did that. If all the employees move to a country with a slow legal system and strong labor laws, and recreate the training data (which can be transferred), they can train another version in that country, which is perfectly legal.

Can you demonstrate beyond any reasonable doubt that the model weights have been transferred? No. Will the EU judges move to extradite said individuals (and many are EU citizens)? Also no, especially in the face of spurious accusations. And even if they were open to it, you can stonewall everything and you will probably outlast any US administration pursuing that.

khalic · 2026-06-15T11:57:02 1781524622

Well, force is a strong word… it’s still just accords, that the US doesn’t seem to be valuing lately… so if they say no, what’s the US going to do? Start a war over a company?

WithinReason · 2026-06-15T08:54:15 1781513655

This fixes that: https://github.com/valinet/ExplorerPatcher/

WithinReason · 2026-06-15T07:25:27 1781508327

I have Win11 Pro and have yet to encounter any issues

WithinReason · 2026-06-14T09:16:57 1781428617

Which tools? Even file reads and writes?

bob1029 · 2026-06-14T09:26:54 1781429214

Especially these things.

The only tools permissible to root in my scheme are call() and return().

WithinReason · 2026-06-14T09:42:49 1781430169

Is it in pi.dev? Don't thinking tokens still take up context?

WithinReason · 2026-06-13T08:06:09 1781337969

folding@home reached 2.43 exaflops by April 12, 2020, which would make it the largest supercomputer on the planet.

sho · 2026-06-13T09:25:15 1781342715

It's down 99% since that peak. But let's compare to it anyway.

It's pretty useless to compare raw FLOPS, but as a general hand-waving guesstimate, F@H is currently doing about 25 petaflops in a mix of FP16 and FP32. AI usually trains at FP8, but to keep things fair the H100 is quoted at 60 FP64 teraflops per unit, so that's 12 FP64 exaflops given its 200k count.

So F@H at its peak did 2.43 exaflops@FP16/32. Colossus 1 does 12@FP64. These numbers are very hand-wavy, but I think the point is made.

By the way, I'm not trying to crap on F@H - I think it's an outstanding project and I've run it in the past. But a volunteer group simply cannot compete with a well-funded, concentrated effort like what's going into AI.
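
For anyone who wants to check the back-of-the-envelope numbers above, here is a small sketch. Every figure is copied from this thread as quoted (not independently verified), and the constant and function names are made up for illustration.

```python
# All inputs are the figures quoted in the comment above, not verified values.
H100_FP64_TFLOPS = 60       # per-GPU FP64 figure as quoted
GPU_COUNT = 200_000         # GPU count as quoted for Colossus 1
FAH_PEAK_EXAFLOPS = 2.43    # F@H peak, mixed FP16/FP32
FAH_NOW_PETAFLOPS = 25      # F@H current estimate, mixed FP16/FP32

TERA, PETA, EXA = 10**12, 10**15, 10**18


def cluster_exaflops(tflops_per_gpu, gpu_count):
    """Aggregate peak throughput of a cluster, in exaflops."""
    return tflops_per_gpu * TERA * gpu_count / EXA


colossus = cluster_exaflops(H100_FP64_TFLOPS, GPU_COUNT)

# Ratios mix precisions (FP64 vs FP16/FP32), so treat them as rough.
ratio_vs_fah_peak = colossus / FAH_PEAK_EXAFLOPS
ratio_vs_fah_now = colossus * EXA / (FAH_NOW_PETAFLOPS * PETA)
fah_decline = 1 - (FAH_NOW_PETAFLOPS * PETA) / (FAH_PEAK_EXAFLOPS * EXA)

print(f"Colossus 1: {colossus:.1f} exaflops")
print(f"vs F@H peak: {ratio_vs_fah_peak:.1f}x, vs F@H now: {ratio_vs_fah_now:.0f}x")
print(f"F@H decline since peak: {fah_decline:.1%}")
```

With those inputs the cluster comes out at roughly 5x the F@H peak and a few hundred times its current throughput, which matches the "down 99%" remark.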

WithinReason · 2026-06-13T07:58:48 1781337528

Efficiency difference between training on GPUs and TPUs is 2x at best. You can get very efficient with tensorcores, converging to TPU efficiency. In the end math is math, you can't make a multiplication more efficient than it already is on GPU.

schobi · 2026-06-13T08:37:33 1781339853

I guess this was more related to syncing GPUs.

If you were to take 500 computers with older 1080 GPUs, you might have enough compute/ram equivalent to an H200 GPU for training such a model. Maybe take 10000.

But if those machines are spread over 10000 homes, wired with residential internet service, training a large model will not get anywhere.

You go from "data in the same HBM memory chip" at 4.8 TB/s or "data in adjacent GPU" with NVLink at 1.2 TB/s down to 25 Mbit/s upload speed. Accessing the next piece of data is going to be about a million times slower. At the same time you will heat a thousand times more, for a million times longer.
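
The "about a million times slower" estimate can be sanity-checked with a few lines. This is a rough sketch using only the bandwidth figures from the comment above; the 1 GB payload is a hypothetical example, and latency is ignored entirely.

```python
# Bandwidth figures as quoted in the comment above.
HBM_BYTES_PER_S = 4.8e12      # 4.8 TB/s
NVLINK_BYTES_PER_S = 1.2e12   # 1.2 TB/s
UPLINK_BITS_PER_S = 25e6      # 25 Mbit/s residential upload

# Convert the uplink from bits to bytes so the units match.
uplink_bytes_per_s = UPLINK_BITS_PER_S / 8


def transfer_seconds(num_bytes, bytes_per_s):
    """Time to move num_bytes at a given bandwidth (ignores latency)."""
    return num_bytes / bytes_per_s


hbm_slowdown = HBM_BYTES_PER_S / uplink_bytes_per_s
nvlink_slowdown = NVLINK_BYTES_PER_S / uplink_bytes_per_s

# Hypothetical payload: 1 GB of data exchanged between workers.
payload = 1e9
print(f"HBM vs uplink: {hbm_slowdown:,.0f}x")
print(f"NVLink vs uplink: {nvlink_slowdown:,.0f}x")
print(f"1 GB over uplink: {transfer_seconds(payload, uplink_bytes_per_s):.0f} s")
print(f"1 GB over NVLink: {transfer_seconds(payload, NVLINK_BYTES_PER_S) * 1e3:.2f} ms")
```

So the ratio is about 1.5 million against HBM and a few hundred thousand against NVLink, which is in line with the order of magnitude claimed.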

incrudible · 2026-06-13T09:21:32 1781342492

You need to train independently and merge rarely. The problem is the merge step. Weights are too entangled, you are not going to get an improvement commensurate to the effort. Otherwise, everyone would do it. It is an open research problem.
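
To make "train independently and merge rarely" concrete, here is a toy sketch that uses plain parameter averaging as the merge step. This is an assumption chosen for illustration, not a method anyone in this thread proposed, and the function names are made up. It works on the convex toy problem below; the entanglement problem described above is exactly why it does not carry over cleanly to large networks.

```python
def average_weights(replicas):
    """Naive merge: element-wise mean of every worker's parameters."""
    n = len(replicas)
    return [sum(ws) / n for ws in zip(*replicas)]


def train_and_merge(init, grad_fns, lr, local_steps, rounds):
    """Each worker runs local SGD steps alone, then all workers merge."""
    replicas = [list(init) for _ in grad_fns]
    for _ in range(rounds):
        for weights, grad_fn in zip(replicas, grad_fns):
            for _ in range(local_steps):      # no communication here
                grad = grad_fn(weights)
                for i in range(len(weights)):
                    weights[i] -= lr * grad[i]
        merged = average_weights(replicas)    # the rare, expensive step
        replicas = [list(merged) for _ in grad_fns]
    return replicas[0]


def quadratic_grad(target):
    """Gradient of (w - target)^2 for a single-parameter toy model."""
    return lambda w: [2.0 * (w[0] - target)]


# Two workers with different local data (targets 1 and 3).
merged = train_and_merge([0.0], [quadratic_grad(1.0), quadratic_grad(3.0)],
                         lr=0.1, local_steps=5, rounds=20)
print(merged)
```

Here the merged parameter settles near the midpoint of the two workers' targets, because the toy loss is convex and averaging is well behaved.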

filup · 2026-06-13T10:33:03 1781346783

That sounds like the way. Everyone trains their own small problems to maximally compressed weights and then merges.

zozbot234 · 2026-06-13T08:35:21 1781339721

The power-constrained part of compute is data movement, not the elementary arithmetic per se. Anyway, it's very possible to tweak the underlying design to increase throughput a lot for any given power budget at the cost of high latency. This seems especially useful for training workloads where we don't really care about latency as much.

GeoAtreides · 2026-06-13T12:54:53 1781355293

Math is math, but sadly math isn't physics nor engineering.

pvirgiliu · 2026-06-13T15:48:03 1781365683

math has physics.

WithinReason · 2026-06-13T07:56:08 1781337368

The gradient info can be compressed 10000x with the right tricks; I think it is achievable. Nous claims they did it already:

https://github.com/NousResearch/DisTrO

There are other gradient compression papers from the past reporting large compression rates
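
As a flavor of what the older gradient compression work looks like, here is a minimal sketch of top-k sparsification with error feedback. To be clear, this is not DisTrO's method (nothing here is taken from that repository); it is a generic technique used as an illustration, and all names are made up.

```python
def top_k_sparsify(grad, k):
    """Keep the k largest-magnitude entries; return (indices, values)."""
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    indices = sorted(order[:k])
    return indices, [grad[i] for i in indices]


def densify(indices, values, size):
    """Rebuild a dense vector from the sparse (indices, values) message."""
    dense = [0.0] * size
    for i, v in zip(indices, values):
        dense[i] = v
    return dense


def compress_with_feedback(grad, residual, k):
    """Add back what was dropped last time, send top-k, keep the rest."""
    corrected = [g + r for g, r in zip(grad, residual)]
    indices, values = top_k_sparsify(corrected, k)
    sent = densify(indices, values, len(grad))
    new_residual = [c - s for c, s in zip(corrected, sent)]
    return indices, values, new_residual


grad = [0.1, -5.0, 0.2, 3.0]
indices, values, residual = compress_with_feedback(grad, [0.0] * 4, k=2)
print(indices, values, residual)
```

Only k of the entries go over the wire each step, so the compression ratio is roughly the vector length divided by k, ignoring the cost of sending indices. The dropped entries accumulate in the residual and get sent once they grow large enough.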

WithinReason · 2026-06-12T10:22:44 1781259764

This likely says something about the harness Fable was trained in. It knows how to do this because it has done this millions of times during reinforcement learning.

WithinReason · 2026-06-10T06:38:21 1781073501

https://www.forbes.com/sites/anishasircar/2026/04/17/ai-solv...

WithinReason · 2026-06-09T19:51:38 1781034698

It's a meme, and HN loves upvoting memes. Just like Reddit!