DwarfStar and other end-user inference engines should also support batched/concurrent inference IMHO. Not so much for the overly naïve "serving multiple users" case (the local hardware cannot really compete with ordinary datacenter gear, much less with the big proprietary suppliers; the compute headroom is too small to begin with once the model is in RAM) but rather to improve SSD streamed decode in the unattended inference scenario, where the goal is to meaningfully raise aggregate tok/s under an overly tight constraint on disk bandwidth, while CPU/GPU compute has a lot of slack.
Of course this requires wide enough batches to have at least some reuse of fetched experts across a batch, but that seems feasible in the "unattended" case, where firing off multiple inferences to be processed together seems quite natural. (We may also have some benefit from better use of the resident experts cache and/or of SSD transfer bandwidth.)
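To make the expert-reuse point concrete, here is a rough back-of-the-envelope sketch. It assumes uniform, independent top-k routing (real routers are skewed, which changes the numbers), and the 256-expert / top-8 configuration is purely hypothetical, not taken from any particular model or engine.

```python
# Sketch: expected expert fetches per MoE layer when decoding a batch together.
# Assumption: each token picks top_k of num_experts uniformly and independently.

def expected_distinct_experts(num_experts: int, top_k: int, batch: int) -> float:
    """Expected number of distinct experts touched by `batch` tokens in one layer."""
    p_miss = 1.0 - top_k / num_experts  # chance one token skips a given expert
    return num_experts * (1.0 - p_miss ** batch)

def fetches_per_token(num_experts: int, top_k: int, batch: int) -> float:
    """Expert fetches amortized over the batch (lower means more reuse)."""
    return expected_distinct_experts(num_experts, top_k, batch) / batch

if __name__ == "__main__":
    # Hypothetical configuration: 256 experts per layer, 8 active per token.
    for b in (1, 4, 16, 64):
        print(b, round(fetches_per_token(256, 8, b), 2))
```

Under these assumptions the amortized fetch count per token only drops noticeably once the batch is wide enough for tokens to collide on the same experts, which is the "wide enough batches" condition mentioned above.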
https://github.com/antirez/ds4/issues/275 seems to provide intriguing rough results while https://github.com/antirez/ds4/issues/314 is a valuable contrast where one commonly suggested solution ("just run multiple instances of the engine in parallel") ran into real issues. Neither of these discusses the combined use of batching and SSD streaming yet, so there's room for experimentation.
Exactly. Which is somewhat helpful for cyber defense because it helps prioritize fixes for those bugs that are in fact involved in a viable exploit chain. But it makes sense that one would want to restrict the ability of building those until the vulnerable software has been comprehensively fixed.
There is some meaningful evidence that Fable is fine-tuned or steered away from helping on this very task, which is not something that can be feasibly circumvented by a basic jailbreak.
The article does not state at any point that the written test cases involved actual exploit code, and this is also very unlikely given what we know about Fable. Even if they did, it would not in any way be exposing the ability that originally raised concern wrt. Mythos Preview, viz. staging realistic cyber attacks that would be able to work around non-trivial defenses and chain vulnerabilities in a goal-directed way.
Opus can very much "fix the code". Quite possibly even Sonnet can. This is a big fat nothingburger and it's increasingly looking like the political restriction of Fable at least (not Mythos itself, of course) was arbitrary and based on the flimsiest pretext.
The first part of implementing an exploit is finding a vulnerability, and "fix the vulnerabilities" accomplishes that just as well as "find the vulnerabilities".
No, market manipulation is influencing public perceptions of something the regime has little total control over - e.g. why Iran gets bombed late in the week, and then by Monday there is often a "peace agreement" in the wings. This is direct subjugation ahead of Anthropic's IPO - both for the customary bribes, and also to assert "you will obey all of our diktats about how we want you to use your models, and you will not speak up against the regime". The US is really no longer a safe place for business.
People can't seem to agree on what "Opus class" even means (the latest Opus is apparently pretty weak) but DeepSeek Pro, Kimi and GLM all are quite capable.
Nothing compares to Opus when it comes to "taste" in web design in my experience. Nothing compares to Opus in very difficult HPC/model inference development. I worked on this with Opus: https://github.com/computerex/dlgo
OpenAI was offering 2x usage at one point and I still used Opus just because it's so much more effective.
Right. Local models haven't quite hit that level yet. The biggest open models, which you need tens of thousands of dollars of hardware to run at reasonable speed, have pretty much hit that level of capability, but most models you can reasonably run at home aren't quite there yet. But given the gap, if local models keep improving, you'd expect to maybe see that level by this November.
My understanding is that we could in fact run the largest models on "reasonable" home hardware by focusing on throughput rather than raw speed and having them do unattended inference in large batches. The big proprietary suppliers have no interest in this because their own incentive is to fill all the physical space available with top-performing hardware and doing huge amounts of inference as quickly as possible. A home user with limited hardware investment has very different constraints.
Academia has tenure which basically allows the most capable researchers to "coast" as long as they have previously proven their skills via their tenure-track work. It seems to work quite well.
> Try running the latest OS models on a normal Mac or PC.
It can be done through the magic of SSD offload. The worst case involves seconds-per-token speeds, but that's OK if you only care about low volumes of slow unattended inference, which maximizes utilization for the hardware.
(The real worst case, where you're streaming the whole model from the cheapest storage you could feasibly think of, involves multiple minutes per token for a single inference, or even hours per token batch if you're doing many inferences in bulk. That's a lot less helpful, so there's a space for smaller models at the edge, even for unattended workloads.)
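The gap between those two regimes follows from simple bandwidth arithmetic. The sketch below only models storage bandwidth (compute and caching are ignored), and every figure in it is a made-up placeholder chosen to illustrate the orders of magnitude, not a measurement of any real model or drive.

```python
# Bandwidth-only estimate of streamed decode latency.

def seconds_per_token(bytes_read_per_token: float, bandwidth_bytes_per_s: float) -> float:
    """Time to stream the weights one token needs, if storage is the only bottleneck."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return bytes_read_per_token / bandwidth_bytes_per_s

# Hypothetical figures (assumptions, not measurements):
ACTIVE_BYTES = 20e9        # weights touched per token when only active experts are streamed
WHOLE_MODEL_BYTES = 400e9  # every weight streamed for each token
FAST_SSD_BW = 5e9          # bytes/s, a fast NVMe drive
CHEAP_STORAGE_BW = 1e9     # bytes/s, much cheaper storage

if __name__ == "__main__":
    print(seconds_per_token(ACTIVE_BYTES, FAST_SSD_BW), "s/token (SSD offload)")
    print(seconds_per_token(WHOLE_MODEL_BYTES, CHEAP_STORAGE_BW), "s/token (whole model, cheap storage)")
```

With these placeholder numbers the first case lands in the seconds-per-token range and the second in the minutes-per-token range, matching the two scenarios described above.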
I struggle with the practicality of the whole thing.
The number of tokens required to properly distill a frontier model is so large that by the time you could consume that many tokens you would either be banned for extremely obvious abuse or a new model would be released, rendering your efforts less and less valuable over time. Intelligence is not a linear thing. Being behind just a little bit can have exponential consequences.
> Being behind just a little bit can have exponential consequences.
That seems to be the argument of Dario, Sam et al., but I'm not ready to believe it. Time will tell, but this can be a marathon and Anthropic and OpenAI are getting ready to sprint the last lap of the first mile.
I'm uneducated on how distillation works at more than a basic level so forgive me if this is a stupid question.
Isn't "distillation" of another provider's model exactly how these models got training data in the first place: massive amounts of the written word + Prompt -> Answer? Why wouldn't distillation produce similar "reasoning" in the new model? It's just inputs and outputs.
What you're describing is (pre-)training. Distillation requires richer labels, the probability distribution over tokens (it would be logits rather than probabilities but that's not important). From a chat transcript you can only understand the argmax/most likely token of that distribution (and only if the API allows you to set the temperature to 0). It's not impossible for an API to give you that but they won't if they don't want you distilling their models.
The intuition is that distillation exploits not only the "right" answer but the relationship between answers (what's the second most right answer? the third? etc).
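A toy illustration of that intuition, as a minimal sketch: two hypothetical students agree with the teacher on the top token, so a hard-label (argmax-only) loss cannot tell them apart, while a KL divergence over the full distribution can. The logits are invented for the example, and plain forward KL is just one common choice of soft-label objective, not necessarily what any given lab uses.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q): the soft-label loss, which sees the whole distribution."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def hard_label_loss(student_probs, teacher_probs):
    """Cross-entropy against the teacher's argmax only (what a transcript reveals)."""
    target = max(range(len(teacher_probs)), key=teacher_probs.__getitem__)
    return -math.log(student_probs[target])

# Invented logits over a 4-token vocabulary.
teacher = softmax([4.0, 3.5, 0.0, -2.0])
student_a = softmax([4.0, 3.5, 0.0, -2.0])  # matches the teacher's full ranking
student_b = softmax([4.0, -2.0, 0.0, 3.5])  # same top token, runners-up scrambled

if __name__ == "__main__":
    for name, s in (("a", student_a), ("b", student_b)):
        print(name, hard_label_loss(s, teacher), kl_divergence(teacher, s))
```

Both students get essentially the same hard-label loss, but only student_a has zero divergence from the teacher; the information about the second and third most likely tokens is exactly what the soft labels add.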
Among other things, because you simply can't get those "massive amounts" of text from a SOTA model at reasonable cost. And complex reasoning cannot possibly be trained in a pure one-shot fashion, real post-training takes massive resources. The whole story doesn't add up.
This is totally inaccurate, the APIs provide the reasoning logs. You ABSOLUTELY can distill from APIs, in fact, that's the primary way distillation is done currently.
If "everyone understands religion is not literal", why do so many people take it literally? You could just as sensibly flip the argument and argue that the garden-variety 'nerdy' atheist is talking literally about atheism but really doing negative theology ("your idea of God is totally wrong and does not exist, because the true God is necessarily inaccessible to human reason") but that would be silly and make you look like a dork too.
As for the why, I don't have an answer, but I thought I addressed it with this:
> I'll admit that there are also groups of Christians that take the bible very literally, as I'm sure there are for other religions as well. From what I can see, these don't make up the canon of religion, and I kind of believe they're mostly concentrated in North America, but that might be my skewed perspective.
There will always be people falling off on one side of the spectrum or the other. Personally, I haven't met anyone who takes the bible literally, and I know a _lot_ of Christians, including pastors and priests. Some people simply believe that there is something more, others have a feeling that you can sense that, some just need this belief to feel safe, etc. I guess it depends on where you're from; I believe biblicism is more common in North America, or at least more visible.
Additionally, the "everyone understands religion is not literal" was citing my parent. Usually, "everyone" is kind of understood not to mean "exactly 100%". It's a device to communicate intent.
> You could just as sensibly flip the argument and argue that the garden-variety 'nerdy' atheist is talking literally about atheism but really doing negative theology ("your idea of God is totally wrong and does not exist, because the true God is necessarily inaccessible to human reason") but that would be silly and make you look like a dork too.
Yeah, it'd make you look like a dork because it'd be obviously incorrect. The intentions of your garden-variety nerd talking about atheism are pretty clear, and it's not to make some greater theological point. When you talk to people who talk down on religion and believers, it's usually really easy to tell whether it's because only they themselves understand the True Intention Of God or whether they just think Christians are stupid and if you're smart you have to be an atheist. Said garden-variety nerd is the latter.
> There will always be people falling off on one side of the spectrum or the other. Personally, I haven't met anyone who takes the bible literally, and I know a _lot_ of Christians, including pastors and priests.
I grew up, and still live, in a conservative state and a conservative family. That hasn't been my experience at all: I know a lot of people for whom the bible is a literal truth.
That's fair. Experiences have a great influence on the opinions you form, and everyone's experience is different. You live in a different part of the world than I do and know different people. I even conceded that in North America, the number of people who take the bible literally might be higher or at least more visible than in other parts of the world, and I'm going to assume you are from the USA because you said "state".
According to one survey I found[0], around 20% of Americans (25% of Christian adults) say the bible is the literal word of god. Not exactly a huge number of people, but a very considerable one nonetheless. I didn't find any numbers for other regions, but maybe it would help to see the number of followers by denomination and try to derive some data from that. The official stance of the Catholic Church, for example, is that the bible should not be taken literally. Most Protestants in Europe also don't practice much fundamentalism, but there are an estimated 25 million Evangelicals in Europe, around 2.5% to 3% of the population. There are probably more people preaching biblicism than only fundamentalist evangelicals, but I just wanted to look up two examples real quick.
> The intentions of your garden-variety nerd talking about atheism are pretty clear, and it's not to make some greater theological point.
I agree about the underlying intentions, but I was talking about the typical, literal arguments for garden-variety 'rational' atheism. The point that these arguments tend to map quite cleanly to negative theology would usually be considered a pretty strong one as a matter of philosophy. Of course, this can only be said to further highlight the difference in intentions.
Yeah, but only if you take the arguments fully out of context and only if you view a subset of the arguments.
First, the arguments of your garden-variety nerd atheist don't always map so cleanly to negative theology. I've seen plenty of arguments in the realm of "The bible contradicts itself, so you're stupid if you believe in it. Checkmate", etc. You get the idea. Only some of the arguments map well to negative theology.
Secondly, the context is missing. In my original comment, I was talking about how being very literal is seen as poor social adaptation because subtext, inaccuracies etc are part of social communication. Pretending to not understand that, or not understanding that, does not make one a logical being, it just makes you look like a dork.
Your argument is applying a very literal take to the hypothetical garden-variety atheist we've brewed up. This is the same as taking the bible very literally and then calling people stupid when they believe in it. It's not arguing the main point, but picking out something that's easy to criticize and building your argument around it. My point is that taking something very literally is exactly a sign of poor social adaptation when there is a relatively big agreement in society on not taking it literally.
Now, your garden-variety nerd's arguments hold up very well against people who actually do take the bible literally, but I'm getting to a point where I want to get off the religion debate, because that's not really what I wanted to point out originally.
Circling back to my original point: Logic and reasoning are not against social norms. Being a dork who pretends to not understand or actually doesn't understand social norms just to make a point is. Being hurtful just to feel superior is against social norms. Pretending you're interested in "truth" and that's why you are not conforming to social norms is also a pretty stupid take, imo. Yeah, social norms aren't always great, and they certainly don't work for everyone, and a lot of people are left out being the "weird" ones; it sucks. But the reason these people are the "weird" ones is not because they're on a noble crusade for truth and logic.
I'm pointing this out even though you're not the original commenter I was responding to because we kind of got derailed into the details of this thing that was more meant as an example than really the main argument.
> there is a relatively big agreement on not taking it literally by society.
I think you're seeing a relatively big agreement where there really isn't one. It's not just the U.S. or biblical literalism, it's also about most people not really being familiar with the notion that even the strangest religious doctrines might be "true in a non-literal (but still worthwhile!) sense". From that POV, the garden variety atheist's argument is raising that very point. You don't have to take the atheism literally to understand that it's hard to believe everything about religion in a literal sense.
That’s also the money in the US, not everywhere, and not every veterinarian goes into it for the rural stuff. I know plenty of veterinarians who have zero interest in that, and being vegan don’t even approve of the industry.