> how HF can afford to be a CDN for such huge files
bandwidth and storage are practically free compared to the cost of GPU clusters. HF gets rewarded heavily by the capital markets for being in AI without actually doing much AI stuff; that is a huge win compared to what they are paying for bandwidth and storage.
> The best chinese models are deepseek (general purpose)
DeepSeek is developed by one of the largest Chinese quant hedge funds. The models they use to make money on the stock market are very profitable, and they've never released anything about those models.
Somehow you are claiming that this same group of people is going to totally change its very consistent long-term behaviour and start promoting openness once they are in the global leading position in AI?
They would have a golden opportunity to inflict damage on a geopolitical adversary. The US economy is being propped up by AI, and I'm not sure they'd miss the chance to pop that bubble if they could.
there is ongoing tough competition between China and America, that is for sure. however, and it is a bold however, it is not in China's interest to see America crash. as an export-oriented economy, China needs a stable and functioning America to maintain global order; that is how China got a free lunch for the last 3 decades.
just imagine a world without the US acting as the world police: you'd be seeing armed conflicts in the Middle East, Africa, South East Asia, even in Europe and North East Asia. that would make it extremely hard for China to extract a 1 trillion USD trade surplus a year, which is now required for China to maintain employment at home.
without the US, even in a relatively stable global environment, trade wouldn't be possible, as most countries are not capable of providing the goods and services China wants. Their currencies are literally junk (including the Japanese Yen and the Euro), and the Chinese are not going to take that junk in exchange for real goods. Trade is possible now because, one way or another, those countries have USD to pay with. those USD are backed by 300+ million highly productive Americans who have repeatedly proved that they can create value on the scale of tens of trillions a year.
the best part of this whole thing - America is singlehandedly footing the whole bill to provide such a trade-friendly environment for China for FREE. this is not cold war v2; back in the days of the cold war, the US didn't help the USSR to such an extreme extent.
It depends on the end goal. Free, good-enough models are a way to drastically devalue Anthropic and OpenAI. A well-timed release of a capable model that can run on obtainable hardware, so that a small or medium company can afford self-hosting, has the potential to destroy one or both of these companies. This would narrow down the frontier model oligopoly and give the Chinese government a lot more power beyond its borders.
It really depends on whether the Chinese government wants to make good money or "win" the current AI bubble.
> being open is not compatible with the Chinese culture.
Hardly, it's one of the least IP-law burdened places in the world. Ready access to media, yes, but also scientific papers, books, etc. No real restrictions on duping products, so execution often becomes the winning ticket. That's all pretty open and good for consumers.
You could argue they won't allow SOTA models to be exported but it doesn't really have anything to do with Chinese culture not being compatible with openness.
> Hardly, it's one of the least IP-law burdened places in the world.
that is one of the major reasons why companies there won't be open - they know full well that anything made publicly available would be cloned/copied within days.
it is not only part of the competition common to all countries; there are reasons unique to China - millions of graduate engineers join the workforce every single year, and there are not that many projects they can work on. starting out by copying & cloning some existing stuff, even at one's own cost, is a pretty effective way to get into the game.
> Ready access to media, yes, but also scientific papers, books, etc.
There is this old Chinese saying, "Teach your apprentice and your own ruin follows" (教会徒弟饿死师傅), which has been telling a completely different story for thousands of years. When they don't even want to hand over technical know-how to their own apprentices, why would anyone expect them to want to share it publicly?
You can find Chinese sayings for almost any position. It's orientalism to reduce modern Chinese society/culture/economy to proverbs and sayings.
You say that you're Chinese so there's no such stereotyping involved, but actually Chinese people commit this sin against themselves all the time.
己欲立而立人,己欲达而达人
"wishing to stand, one helps others stand; wishing to succeed, one helps others succeed"
> that is one of the major reasons why companies there won't be open
But the AI labs _are_ often being open. And cloning stuff more generally doesn't really require OSS anyway. Product features are easily cloned in most cases, without any secret knowledge.
> I think the Chinese government either already has, or will soon, grasp that if they train the models that people use they dictate what people believe (at least around the margins where that's malleable), and they will happily throw resources at that.
that doesn't require the model to be SOTA, it can be just a compact model capable of running on some inexpensive hardware. that is vastly different from SOTA models like Mythos which can potentially disrupt lots of things.
Of course it requires SOTA, people will always choose better models over some compact thing that is obviously more limited. You can't control the truth with models nobody wants to use.
People choose SOTA right now because of the heavily subsidised model subscriptions. People aren't going to pay 20x the price for a model that's maybe 10% better.
Because you communicate with it using natural language and real-world references and descriptions of what you want, you use emotion and emphasis (especially when re-prompting), you use examples and illustrative stories and common expressions. Understanding and interpreting all of that and replying in kind, to some degree, requires a large body of non-computational, cultural knowledge, or else the prompts are just meaningless words, and the replies will look like compiler output.
That sounds intuitively true, but I’m not convinced that it is actually the case. I don’t think we know enough about neural network training to say what training and how many parameters are necessary for what kind of performance on which tasks. To me it looks like we currently guess that more is better and try to throw as much compute and data at the problem as is economically feasible. There is little incentive for companies to invest in small model research since their moat is huge models that require special hardware to run.
no matter whether such a directive is necessary or not, it is a clear message to everyone that you and your business need an alternative way to access AI models that is not controlled by <insert whatever government you dislike here>.
show me from which country or company the Chinese copied their EV tech, batteries, drones, robotics etc. as for their military gear, show me how they copied their J-36 and J-50 fighter jets, Type-055 destroyers, YJ-21 missile and the most recent 09x sub.
time to stop living your socially isolated life and start reading stuff other than whatever is fed to you by your brainwashing media.
The VW ID.3 is designed by a Chinese team in China and built in Shanghai, backed entirely by the Chinese ecosystem.
You need to be totally blind to think VW is able to design & produce such a car on its own. let's be straight - the software in the car is not something Germany can build on its own. That 20-year gap won't be filled overnight.
This is false, the ID.3 was designed primarily in Germany but they sell a model customised to the Chinese market. They're two different cars from different production facilities.
All the control modules for the German model at least are manufactured by Bosch and Valeo in France and Germany - where the software is also produced.
As an aside, Bosch produces control modules and the underlying software for just about every western car manufacturer. 80%+ of cars sold worldwide use their traction control system, for example, so I'm not sure where this so-called gap is.
The kind of software needed in a car (minimal or ideally none at all except for DME and CarPlay display) is indeed not something that Germans have been able to build on their own in recent times.
Not according to what I've just read. It's sold in China via a joint venture with SAIC and seems to have been slightly redesigned for the Chinese market.
What the hell are you talking about? Why are you lying?
The ID.3 is an MEB car. Its production started in Zwickau (Germany) in November 2019, two years before it even arrived in China, and one year before the SOP of ANY MEB car (the ID.4 was the first one) in Anting (Shanghai).