Their API customers and downstream customers (e.g. Cursor users) would also need similar infra, and probably a decent number of users would just choose another model that doesn't require ID and an immigration status check.
And API is much more profitable (relatively) than subscribers for them.
I think it will take at least one month to set up a correct user authentication flow for both their subscribers and their API customers (Cursor etc.).
Others think such an export control ban is good for them because it shows the entire world that Anthropic's models are the best. I disagree. It will wreck the company, its IPO considerations etc.: the models may be the best, but there are no users to use them.
If such an export control ban stays in place for this particular model or future Anthropic models, revenue will be affected. My guess is that around 50% of subscribers are non-US citizens, which would mean a direct revenue loss of roughly 50%.
Altman/Musk also used to do the moral superiority thing: "Dangers of AI", lobbying to ban "non-safe" models etc. But they largely stopped doing it after 2025, while Anthropic/Dario increased the intensity of the moral authority and got hit back with exactly what they were asking for.
The Internet will find a way to route around censorship. I’ve already started downloading notable open weight models onto my NAS and will distribute them via e.g. BitTorrent if needed. I’m not in the US, and you can’t ban VPNs overnight.
Let LLMs help you with coding. Design the game and the mechanics yourself. I can see this being an incredibly empowering tool in the right game developer's hands; but if you come into it with a token-maxxing / AI-maxxing mentality, I doubt you'll make a fun game to play.
I feel really bad for Anthropic right now. This should never have happened and seems like another arbitrary use of government power, Friday after market closes.
Whatever you feel about Anthropic, good or bad, this is not fair, and this is not good for the industry.
Yes, but presumably the authors are suggesting broader application than just caching a system prompt.
The paper's approach should work well if (a) you can calculate KV(A || B) as a function of KV(A) and KV(B) independently, (b) you can identify which documents A1, A2, A3, ... are used commonly enough to be worth caching, and (c) it is cheaper to buy and sell KV(A) on a market than to compute KV(A) when it is needed. Given the size of KV(A) I am not sure that (c) will become true even if people solve the open research problem represented by (a) and accept the state-of-the-art trade-offs known for (b).
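To put a rough number on the size concern in (c), here is a back-of-the-envelope sketch. It assumes a standard transformer that stores one key and one value vector per token, per layer, per KV head. The function name `kv_cache_bytes` and the model dimensions (32 layers, 8 KV heads, head dimension 128, fp16) are illustrative assumptions, not taken from the paper or any specific model.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Approximate KV cache size for one sequence.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_elem=2 assumes fp16/bf16 storage.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem


# Hypothetical model dimensions, chosen only for illustration.
size = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=10_000)
print(f"{size / 1e9:.2f} GB for a 10,000-token document")
```

Under these assumptions the cache for a single 10,000-token document is on the order of a gigabyte, while the document's text is tens of kilobytes. That gap is why moving KV(A) over a network may not beat recomputing it.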
> Yes, but presumably the authors are suggesting broader application than just caching a system prompt
The authors of the OP paper "Can I Buy Your KV Cache?" explicitly disregard anything involving KV not rooted at 0:
>> We deliberately study the simplest, safe form: a document treated as a shared prefix, with continuations appended after it
So no, I really think it's just prefix caching. That's actually far from the weirdest thing about that paper: they go on to "prove" that decoding from cached prefill gets the same result as prefilling and decoding on the same content, which... yes. That is how computation works.
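To illustrate why that equivalence holds by construction, here is a toy sketch: a two-layer, single-head causal attention stack in NumPy with random weights. `ToyLayer` and `run` are made-up names for this example and this is not the paper's code. Because of the causal mask, the prefix's keys and values never depend on what comes after, so reusing them gives the same continuation outputs as recomputing everything.

```python
import numpy as np


def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


class ToyLayer:
    """Single-head causal self-attention with a residual connection."""

    def __init__(self, d, rng):
        self.Wq = rng.standard_normal((d, d)) / np.sqrt(d)
        self.Wk = rng.standard_normal((d, d)) / np.sqrt(d)
        self.Wv = rng.standard_normal((d, d)) / np.sqrt(d)

    def forward(self, x, past_k=None, past_v=None):
        q, k, v = x @ self.Wq, x @ self.Wk, x @ self.Wv
        if past_k is not None:
            # Reuse cached keys/values for the prefix positions.
            k = np.concatenate([past_k, k])
            v = np.concatenate([past_v, v])
        n_new, n_total = q.shape[0], k.shape[0]
        offset = n_total - n_new
        # New token i sits at absolute position offset + i and may only
        # attend to keys at positions <= offset + i.
        mask = np.arange(n_total)[None, :] > (offset + np.arange(n_new))[:, None]
        scores = np.where(mask, -np.inf, q @ k.T / np.sqrt(x.shape[1]))
        return x + softmax(scores) @ v, k, v


def run(layers, x, cache=None):
    """Run all layers; return outputs and the per-layer (K, V) cache."""
    new_cache = []
    for i, layer in enumerate(layers):
        pk, pv = cache[i] if cache else (None, None)
        x, k, v = layer.forward(x, pk, pv)
        new_cache.append((k, v))
    return x, new_cache


rng = np.random.default_rng(0)
d = 16
layers = [ToyLayer(d, rng) for _ in range(2)]
prefix = rng.standard_normal((5, d))   # the shared "document"
cont = rng.standard_normal((3, d))     # the continuation appended after it

full_out, _ = run(layers, np.concatenate([prefix, cont]))  # recompute everything
_, prefix_cache = run(layers, prefix)                      # prefill once, keep KV
cached_out, _ = run(layers, cont, prefix_cache)            # reuse the cached KV

same = np.allclose(full_out[5:], cached_out)
print(same)
```

The two paths agree up to floating-point tolerance. This only works because the cached segment starts at position 0; a cached block spliced into the middle of a different context would have been computed against the wrong preceding tokens.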
Also, the thing they describe already exists: you pay your provider for their cache implementation as part of your token ingress costs. What is that if not paying for cached KV?