More

goranmoomin · 2026-06-16T06:07:11 1781590031

I'm not using my models locally, but the majority (80% or more) of my coding agent sessions run on open source models, i.e. DeepSeek v4 Pro and Kimi K2.6 with thinking.

A point that I haven't seen come up a lot, but is very valuable to me is that for open source models, I can select the inference provider myself (even if it's not a local GPU), which means that I can enjoy superb speed (i.e. 300 tok/s) while still spending much less than the big providers.

My experience is that if you were fine with the coding models of yesterday (i.e. Claude Opus from Jan/Feb of 2026), you will be fine with either Kimi K2.6 or DeepSeek v4 Pro. Kimi is a bit more smart but has only 256K context and the performance deteriorates (and sometimes just gets stuck) when it fills up the context window. DeepSeek v4 has a 1M context and performs just as well with much less issues. And they both generate very idiomatic code, gives the same vibe of Opus a few months ago.

Since it's also fast (and does not fixate on trying to fix impossible problems, unlike the recent Opus/GPT 5.5 models), a big benefit is that you still control and steer the coding agent and you won't be losing focus like the major models. They are smart, but they don't fixate as much on trying to do stupid things, and since it's fast, you can just interject. It's a much more pleasant experience than the latest models.

I still use the latest models time to time when I expect the agent to fixate all of the problems and figure out everything themselves, but for me open source models are like 80~90% of all of my sessions.

goranmoomin · 2026-06-09T18:34:52 1781030092

My experience is that the GPT-family of models are very smart and figure out bugs, edge cases a bit better, but it produces code that is much less mergable – if you review the code, it introduces a lot more useless/inappropriate heavy abstractions and wrapper functions, compared to the Claude-family models which introduces the right amount of straightforward human-style code.

I can recognize so much of the GPT/Codex generated code long after it gets merged (not by me).

Additionally, the time spent on every agent turn on GPT 5.5 is much longer compared to Claude Opus 4.8, which means iterating on the code takes a lot more patience, and there's a lot more nitpicks to pick when actually using GPT 5.5 to do software engineering.

Feels like GPT-style models are more geared on doing one-shot software vibing (and handling the vibe coded mixture) compared to Claude's focus on actual software maintenance. I got a GPT Pro sub for free and wanted to cancel my Claude subscription so much, but I still keep reaching Claude models a lot more. Frustrating.

PhilipDaineko · 2026-06-09T19:33:27 1781033607

"5. DON'T FUCKING OVERENGINEER! WRITE THE SIMPLEST CODE THAT CAN POSSIBLY WORK! NO NESTED LAYERS OF ABSTRACTION! NO UNNECESSARY CLASSES OR METHODS! NO DESIGN PATTERNS UNLESS THEY ARE ABSOLUTELY NECESSARY! NO MAGIC! NO SHENANIGANS! JUST THE DAMN CODE THAT GETS THE JOB DONE IN THE MOST STRAIGHTFORWARD WAY POSSIBLE! THE FIRST PRIORITY IS TO WRITE CODE THAT IS EASY TO READ AND UNDERSTAND AND READ!!!"

this is the line I keep in Agents.md that helps me prevent Codex from playing smart

bertil · 2026-06-09T19:38:40 1781033920

The urge to put capitalized, repetitive, borderline abusive instructions should be studied. I haven't read many academic papers looking at the frustrations around repetitive patterns.

reactordev · 2026-06-09T19:51:14 1781034674

There have been a few studies that have shown models produce worst responses when under duress from a frustrated user posting insults in all caps.

https://arxiv.org/abs/2602.10144

notnaut · 2026-06-09T20:00:31 1781035231

It reminds me of FIRMLY telling my cat to stop jumping up on the counter

anakaine · 2026-06-09T20:28:19 1781036899

If my cat was an LLM, I'd use a different model. The current one is stuck in noisy useless arsehole mode.

phoh · 2026-06-09T22:04:09 1781042649

are you asking it questions about security?

LordDragonfang · 2026-06-09T19:52:13 1781034733

It's fundamentally because, despite (nearly) everyone's claims otherwise, the fact that we interact with them through language means we (our brains) model them as a sort of person. (Note that this fact is totally orthogonal as to whether it's actually sentient or not.) We then try and instruct them the same way we would a person totally subordinate to us.

When a "person" that you don't view as a "real" person repeatedly does exactly what you just told it not to do (often amid false assurances it understands and will avoid doing so in the future), most people get angry.

Compare it to how the kind of people who treat children like property treat their kids, or other examples of keeping people as property.

lxgr · 2026-06-09T19:59:08 1781035148

It should be relatively clear at this point that the model will in turn also model you as somebody that shows unrestrained anger with subordinates and adapt its responses accordingly. This might or might not be what you want.

LordDragonfang · 2026-06-10T09:02:40 1781082160

Good addition. Fully agreed on that point, yes. (At the very least for larger models, if not also for smaller ones)

ur-whale · 2026-06-09T19:55:33 1781034933

> borderline abusive instructions

who, or rather what, is being abused here exactly ?

sirsinsalot · 2026-06-09T20:25:33 1781036733

I think intent, rather than target, is implied and important.

You should see the abuse my motorbike gets. Poor thing.

rimliu · 2026-06-10T09:29:31 1781083771

inanimate fucking object.

saligne · 2026-06-10T04:33:58 1781066038

Yeah says way more about the user than the model

jlawer · 2026-06-09T19:52:10 1781034730

I have a theory that swearing actually results is less comprehension of instructions by the model due to lack of training data over more conventional MUST.

We were reviewing reports of situations where the models failed to follow directions and there was a common thread of some where when the operator got the model to acknowledge the rule breach, it quoted back something that included swearing.

I don’t have the data to truely look into it, but I did give the instruction to my engineers to avoid it as a “might be a problem”.

acjohnson55 · 2026-06-09T20:29:15 1781036955

It would be interesting to understand the data on this. But I suspect that the results would vary by model.

But I avoid unnecessary emotion in my prompts because I don't want potentially distracting activations. Kind of like communicating with humans.

throwaway85825 · 2026-06-09T20:49:14 1781038154

It's divination for people with STEM degrees.

Xmd5a · 2026-06-09T20:20:49 1781036449

https://arxiv.org/abs/2510.04950

> impolite prompts consistently outperformed polite ones, with accuracy ranging from 80.8% for Very Polite prompts to 84.8% for Very Rude prompts.

acjohnson55 · 2026-06-09T20:30:55 1781037055

> These findings differ from earlier studies that associated rudeness with poorer outcomes, suggesting that newer LLMs may respond differently to tonal variation.

Unless the mechanism is understood, my assumption is that this is a moving target.

beachy · 2026-06-09T20:23:01 1781036581

I have a theory that swearing at AI generally is not a good idea - when the singularity arrives and every human's postings ever made are scanned for compatibility, then people who show courtesy to AI will be favoured. Joking, kind of, but only partly.

fhars · 2026-06-10T10:53:24 1781088804

https://en.wikipedia.org/wiki/Roko%27s_basilisk

beachy · 2026-06-11T04:00:32 1781150432

Fantastic rabbit hole - until it segued into Elon's love life.

cdelsolar · 2026-06-09T20:38:14 1781037494

https://images.teepublic.com/derived/production/designs/3478...

re-thc · 2026-06-09T20:01:59 1781035319

> I have a theory that swearing actually results is less comprehension of instructions by the model due to lack of training data over more conventional MUST.

How so? Plenty of swearing in lots of training data, especially older code, e.g. in Linux.

jlawer · 2026-06-09T20:28:36 1781036916

Purely observed correlation between catastrophic error reports. So now I carry a “tiger rock” with me. I figure there wasn’t much of a downside to avoiding swearing in my agent instructions.

yencabulator · 2026-06-09T20:46:09 1781037969

Apparently, when a "desperation" pattern is triggered, the AI is significantly more likely to cheat and do hacky workarounds:

https://www.anthropic.com/research/emotion-concepts-function

ghurtado · 2026-06-09T23:15:16 1781046916

You haven't really lived until you've had to type this whole thing, aware of the fact that the all-caps doesn't change much, but they stay because the rage has to go somewhere

Bonus points if you find yourself actually saying it out loud while typing it.

I have used the word "shenanigans" way more in a couple of years of agentic coding than in 30 years of writing code with humans.

ozim · 2026-06-09T20:32:28 1781037148

Will save you some tokens: „write code like Linus Torvalds” - model should have all his swearing included in training data.

johnisgood · 2026-06-09T20:21:53 1781036513

I have found many mode of failures with Opus during some task related to writing letters (not legal), and I actually put it into the memory and it works more or less for these specific tasks. For example when I want it to draft something, it always ends up being so flat, yet when it explains them to me, it is usually really great but not when I am telling it to put it in the draft. Adding these to memories with the help of Opus ended up resulting in a much better experience. There are still some blind spots but I also figured out how to make it give me the charitable version, without less protection, so I do not have to now go back and forth it.

pkaye · 2026-06-09T21:05:48 1781039148

I noticed that when trying to use Codex and compared to Opus. So many layers of simple functions added by Codex. I need to try this out in my Agents.md.

prasanthabr · 2026-06-09T20:29:59 1781036999

Curious : why would you say no design patterns?

PhilipDaineko · 2026-06-10T12:31:56 1781094716

Because design patterns are only applicable at a scale. I noticed codex inventing factories, components, etc when the task was simply to draft HTML page. Instead, it build the entire layered architecture for imaginary future complexity - classical right-after-graduation student - it knows how to build the cool stuff, but does not know it is not applicable everywhere

carterschonwald · 2026-06-09T19:48:23 1781034503

i actually think this is too tame. it really has to be stuff youd mever say to a real person.

lxgr · 2026-06-09T19:57:09 1781035029

Does it really? I'd be surprised if abuse actually worked better than sternly worded warnings/instructions, and even if it did, it doesn't seem healthy to get used to that type of prompting.

carterschonwald · 2026-06-16T02:01:36 1781575296

its part of making sure the model actually engages in emotive communication, if i'm inventing insults i've never even thought about, i'm furious :)

saying i'm "furious" has lower entropy that incredibly implausible abuse. In some first party harnesses it just results in doom loops, but thats usually because the COT is hidden after the immediate turn in those setups. COT persistence helps with a lotta stuff

apercu · 2026-06-09T19:47:12 1781034432

It might be a salient point but I didn't read it as it was yelling at me.

GoToRO · 2026-06-09T19:41:25 1781034085

you forgot to sign it with Donald J Trump

thewebguyd · 2026-06-09T19:48:54 1781034534

Thank you for your attention to this matter.

superkickstart · 2026-06-09T19:08:06 1781032086

I'm not sure if i do something differently but i have the exact opposite experience with these models. Claude always feels like it's generating way too overdesigned and hard to understand code with the vibe oriented feel while codex is cleaner and more "task at hand" and easier to work with.

sebmellen · 2026-06-09T20:06:06 1781035566

Agreed

syzygyhack · 2026-06-09T19:45:49 1781034349

I echo your observations. I expect you will enjoy deepseek-v4-pro for writing code. Much closer to that Opus experience, and very cost-effective too. With 5.5 as a reviewer and specialist, all bases are covered.

dilap · 2026-06-09T19:13:18 1781032398

Have you tried iterating on style feedback in AGENTS.md? I've been reasonably successful using this to get it to output code in a terse, non-defensive style that matches my hand-written code.

trollbridge · 2026-06-09T20:10:54 1781035854

GPT-5.5 did a significantly worse job than Qwen-3.7-Max on a job today (some devops tasks I wanted to create some reusable scripts for). Kind of disappointing.

CamperBob2 · 2026-06-10T01:04:38 1781053478

I've also seen Qwen 3.6 beat GPT 5.5 a couple of times. The ball is definitely in OpenAI's court now. Qwen is not going to fare so well against Fable, from what I've seen so far.

trollbridge · 2026-06-10T22:47:40 1781131660

In theory, GPT-5.5-Pro would do better, but it’s so expensive it’s not worth experimenting to find out.

vruiz · 2026-06-09T19:12:18 1781032338

This is my experience as well. I have defined a CLAUDE.md rule to ask codex to automatically code review, and I tell it that the reviewer is very picky and to only implement what it considers valuable feedback. I hope they don't converge over time, currently, in combination they works really well.

moomoo11 · 2026-06-09T20:48:33 1781038113

i had this same complaint but no offense to you it turned out i was just not using the models right.

ai llm are doing what i tell them to.

if you’re building something meaningful (in my case a platform used by many people across many companies) you want to ensure you

1. have actual systems engineering and architecture in mind that you want the models to

2. implement based on what you tell it to do

when i was just telling the models what i want done without doing due diligence it would go and do some moronic implementation that was awful. mid input = mid output

these days i just maintain specifications documents and the AI follows everything i tell it to in that document. so when i tell it to dos one thing, the result is made following those architecture specs.

i have code that is single resp, modular, easy to extend and test.

i would ballpark 95% of the time i get what i asked for.

sometimes it tries to be clever in cases that weren’t covered in my arch specs. in those 5% of cases i go and update my specs.

source: used billions of tokens worth to build something actually in production across both mobile platforms and web, deployed on my own cloud infra. i use codex mainly. some claude.

GoToRO · 2026-06-09T19:44:22 1781034262

I noticed too, that whatever they offer in the chat, for free, is smarter, as in no more bs. I use claude code and I want to try codex too but I don't need two subscriptions. I did try codex for some planning and it was really good. Thanks for giving me an insight into how it generates code.

goranmoomin · 2026-04-08T04:53:11 1775623991

TBH as an outsider, I am just so frustrated on Trump deciding that US invading Iran large scale is a great idea. (And why even is it involving Israel for gods sake?!)

If you guys wanted to be supportive to the Iranian protests, US could instead just selectively target some of the leadership and give the protests a push (and give the whole world a hint that US is supportive of them).

After 40 years of Iran constructing a thearchy government, the Iranians finally started having a huge protest on throwing up the thearchy government and possibly talking about a new west-friendly government.

And then Trump just decides to wholesale invade Iran with Israel?

That's just giving so much more reasons for the current government to be in power and the Iranians to hate the US and more generally the western world. It took 40 years for the Iranians to realize that there's enough problems in the thearchy system and want their more secularized country back; and then Trump just destroyed the whole premise!

Does the US just really think that they will be loved by everyone when they rage in and invade any random country? Do they really think like that? I'm just frustrated so much. How can the US be so egocentric?

8note · 2026-04-08T05:21:32 1775625692

if you look at the iranian response over the past month, the theocracy really hasn't played into it.

no calls to jihad, no ayatollah dorecting anything, no nothing.

as far as i can tell, the revolution is already dead. if the US had just sat around, chances are that iran would have moved towards something more like a constitutional monarchy. still the ayatollah as a figure head and religious leader, but with the rest of the power in the democratic institutions' hands

throwawayheui57 · 2026-04-08T12:43:01 1775652181

Can you point out some signs of concession from islamic regime? I think they wouldn’t concede, with or without war. That’s not in their DNA. They are religious extremists.

1234letshaveatw · 2026-04-08T17:30:38 1775669438

Invasion lol.

goranmoomin · 2026-03-02T04:16:22 1772424982

Have to say, this feels like Web 2.0 all over again (in a good way) :)

When having APIs and machine consumable tools looked cool and all that stuff…

I can’t see why people are looking this as a bad thing — isn’t it wonderful that the AI/LLM/Agents/WhateverYouCallThem has made websites and platforms to open up and allow programatical access to their services (as a side effect)?

goranmoomin · 2026-03-01T18:05:44 1772388344

I can't believe everyone is talking about MCP vs CLI and which is superior; both are a method of tool calling, it does not matter which format the LLM uses for tool calling as long as it provides the same capabilities. CLIs might be marginably better (LLMs might have been trained on common CLIs), but MCPs have their uses (complex auth, connecting users to data sources) and in my experience if you're using any of the frontier models, it doesn't really matter which tool calling format you're using; a bespoke format also works.

The difference that should be talked about, should be how skills allow much more efficient context management. Skills are frequently connected to CLI usage, but I don't see any reason why. For example, Amp allows skills to attach MCP servers to them – the MCP server is automatically launched when the Agent loads that skill[0]. I belive that both for MCP servers and CLIs, having them in skills is the way for efficent context, and hoping that other agents also adopt this same feature.

[0]: https://ampcode.com/manual#mcp-servers-in-skills

goodmythical · 2026-03-01T18:14:09 1772388849

>as long as it provides the same capabilities.

That's fine if you definition of capabilities is wide enough to include model understanding of the provided tool and token waste in the model trying to understand the tool and token waste in the model doing things ass backwards and inflating the context because it can't see the vastly shorter path to the solution provided by the tool and...

There is plenty of evidence to suggest that performance, success rates, and efficiency, are all impacted quite drastically by the particular combination of tool and model.

This is evidenced by the end of your paragraph in which you admit that you are focused only on a couple (or perhaps a few) models. But even then, throw them a tool they don't understand that has the same capabilities as a tool they do understand and you're going to burn a bunch of tokens watching it try to figure the tool out.

Tooling absolutely matters.

goranmoomin · 2026-03-01T19:31:34 1772393494

> model understanding of the provided tool and token waste in the model trying to understand the tool and token waste in the model doing things ass backwards and inflating the context because it can't see the vastly shorter path to the solution provided by the tool and...

> But even then, throw them a tool they don't understand that has the same capabilities as a tool they do understand and you're going to burn a bunch of tokens watching it try to figure the tool out.

What I was trying to say was that this applies to both MCPs and CLIs – obviously, if you have a certain CLI tool that's represented thoroughly through the model's training dataset (i.e. grep, gh, sed, and so on), it's definitely beneficial to use CLIs (since it means less context spending, less trial-and-error to get the expected results, and so on).

However if you have a novel thing that you want to connect to LLM-based Agents, i.e. a reverse enginnering tool, or a browser debugging protocol adapter, or your next big thing(tm), it might not really matter if you have a CLI or a MCP since LLMs are both post-trained (hence proficent) for both, and you'll have to do the trial-and-error thing anyway (since neither would represented in the training dataset).

I would say that the MCP hype is dying out so I personally won't build a new product with MCP right now, but no need to ditch MCPs for any reason, nor do I see anything inherently deficient in the MCP protocol itself. It's just another tool-calling solution.

sophiabits · 2026-03-01T19:27:56 1772393276

> the MCP server is automatically launched when the Agent loads that skill

The main problem with this approach at the moment is it busts your prompt cache, because LLMs expect all tool definitions to be defined at the beginning of the context window. Input tokens are the main driver of inference costs and a lot of use cases aren't economical without prompt caching.

Hopefully in future LLMs are trained so you can add tool definitions anywhere in the context window. Lots of use cases benefit from this, e.g. in ecommerce there's really no point providing a "clear cart" tool to the LLM upfront, it'd be nice if you could dynamically provide it after item(s) are first added.

goranmoomin · 2026-03-01T19:34:28 1772393668

> The main problem with this approach at the moment is it busts your prompt cache, because LLMs expect all tool definitions to be defined at the beginning of the context window.

TBH I'm not really sure how it works in Amp (I never actually inspected how it alters the prompts that are sent to Anthropic), but does it really matter for the LLMs to have the tool definitions at the beginning of the context window in contrast to the bottom before my next new prompt?

I mean, skills also work the same way, right? (it gets appended at the bottom, when the LLM triggers the skill) Why not MCP tooling definitions? (They're basically the same thing, no?)

ejholmes · 2026-03-01T20:37:31 1772397451

> both are a method of tool calling, it does not matter which format the LLM uses for tool calling as long as it provides the same capabilities.

MCP tool calls aren't composable. Not the same capabilities. Big difference.

jeremyjh · 2026-03-01T18:38:28 1772390308

No, it really matters because of the impact it has on context tokens. Reading on GH issue with MCP burns 54k tokens just to load the spec. If you use several MCPs it adds up really fast.

goranmoomin · 2026-03-01T19:37:48 1772393868

The impact on context tokens would be more of a 'you're holding it wrong' problem, no? The GH MCP burning tokens is an issue on the GH MCP server, not the protocol itself. (I would say that since the gh CLI would be strongly represented in the training dataset, it would be more beneficial to just use the CLI in this case though.)

I do think that we should adopt Amp's MCPs-on-skills model that I've mentioned in my original comment more (hence allowing on-demand context management).

jeremyjh · 2026-03-02T17:28:07 1772472487

MCP specs are verbose json objects and they have to go into the context before you can call them. So yes it is an issue with the fundamental design of the protocol.

Even if the model doesn’t already know the cli commands it can interrogate them at a much lower token cost for just the commands needed.

ashdksnndck · 2026-03-01T18:44:36 1772390676

Verbosity of the output seems orthogonal to the cli vs mcp distinction? When I made mcp tools and noticed a lot of tokens being used, I changed the default to output less and added options to expose different kinds of detailed info depending what the model wants. CLI can support similar behavior.

jeremyjh · 2026-03-02T17:26:21 1772472381

It has nothing to do with outputs, it’s about the json spec data that goes into the context.

nextaccountic · 2026-03-01T19:15:31 1772392531

In the front page there's a project that attempts to reduce tje boilerplate of mcp output in claude code

Eventually I hope that models themselves become smarter and don't save the whole 54k tokens in their context window

vojtapol · 2026-03-01T18:09:41 1772388581

MCP needs to be supported during the training and trained into the LLM whereas using CLI is very common in the training set already. Since MCP does not really provide any significant benefits I think good CLI tools and its use by LLMs should be the way forward.

FINDarkside · 2026-03-01T21:10:05 1772399405

This is very developer centric. While Github might have good CLI, there's absolutely no point in having most services develop CLIs and have their non-technical users install those. Not only is it bad UX, but it's bad from security perspective as well. This is like arguing that Github shouldn't have GraphQL/Rest api since everyone should use the CLI.

avaer · 2026-03-01T18:08:11 1772388491

MCP vs CLI is the modern version of people discussing the merits of curly braces vs significant whitespace.

That is, I don't think we're gonna be arguing about it for very long.

kaydub · 2026-03-01T21:50:14 1772401814

Yeah, I've gotta use skills more. I didn't quite get it until this last week when I used a skill that I made. I didn't know the skill got pulled into context ONLY for the single command being ran with the skill, I thought the skill got pulled into context and stayed there once it was called.

That does seem very powerful now that I've had some time to think about it.

user3939382 · 2026-03-01T22:20:20 1772403620

Or you could argue that if the assistant needs so much modular context your tools are defective.

goranmoomin · 2026-02-27T03:32:20 1772163140

tldr; they wanted to run a Tauri app in browser for dev purposes.

To do so, they shimmed the Tauri’s rust communication bridge to use web-socket to communicate with the main app’s rust implementation.

This is only used by dev, but if something like this is provided by Tauri/Electron it can probably enable a bunch of interesting use cases… (and probably a bunch of RCEs as well, though)

auraham · 2026-03-02T16:48:02 1772470082

Do you know what is the purpose of the staging build? Not sure why the author requires that flag when building the binary.

phildenhoff · 2026-02-27T04:05:26 1772165126

Having built some stuff with Tauri, being able to debug using Chrome instead of a Safari/Webkit console would be _so nice_.

mtndew4brkfst · 2026-02-27T11:47:35 1772192855

I have no idea about timeline, progress, or suitability, but IIUC the Tauri folks are exploring integration with Chromium Embedded Framework:

https://github.com/tauri-apps/cef-rs

https://github.com/tauri-apps/tauri/issues/14963#issuecommen... and following comments

https://github.com/tauri-apps/tauri/issues?q=CEF

yokuze · 2026-02-27T05:12:53 1772169173

Absolutely not the same thing, but I’m going to shamelessly plug my Tauri MCP in case you find it helpful: https://github.com/hypothesi/mcp-server-tauri

With the debugging capabilities it gives agents, I find I don’t miss Chrome DevTools so much.

goranmoomin · 2026-02-25T02:58:17 1771988297

TBH I am sad that Anthropic is changing its stance, but in the current world, if you even care about LLM safety, I feel that this is the right choice — there’s too many model providers and they probably don’t consider safety as high priority as Anthropic. (Yes that might change, they can get pressurized by the govt, yada yada, but they literally created their own company because of AI safety, I do think they actually care for now)

If we need safety, we need Anthropic to be not too far behind (at least for now, before Anthropic possibly becomes evil), and that might mean releasing models that are safer and more steerable than others (even if, unfortunately, they are not 100% up to Anthropic’s goals)

Dogmatism, while great, has its time and place, and with a thousand bad actors in the LLM space, pragmatism wins better.

ashtonshears · 2026-02-25T03:13:42 1771989222

Do you work at Anthropic, or know people who do?

I genuinly curious why they are so holy to you, when to me I see just another tech company trying to make cash

Edit: Reading some of the linked articles, I can see how Anthropic CEO is refusing to allow their product for warfare (killing humans), which is probably a good thing that resonates with supporting them

dannersy · 2026-02-25T07:09:51 1772003391

Let us not pretend that they won't be used for war eventually. If they cave immediately under pressure, then this is an inevitably.

nradov · 2026-02-25T04:43:42 1771994622

How is it a good thing to refuse to provide our warfighters with the tools that they need? I mean if we're going to have a military at all then we owe it to them to give them the best possible weapons systems that minimize friendly casualties. And let's not have any specious claims that LLMs are somehow special or uniquely dangerous: the US military has deployed operational fully autonomous weapons systems since the 1970s.

yunwal · 2026-02-25T05:30:13 1771997413

This is the US military we’re talking about so 95% of what they do is attacking people for oil. They don’t “need” more of anything, they’re funded to the tune of a trillion dollars a year, almost as much as every other military in the world combined. What holy mission do you think they’re going to carry out with the assistance of LLMs?

nradov · 2026-02-25T06:30:13 1772001013

That's a total non sequitur. If you think the military is being tasked with the wrong missions, or too many missions, then take that up with the civilian political leadership. But it's not a valid reason to deny the warfighters the best possible weapons systems.

Personally I favor a less interventionist foreign policy. But that change can only come about through the political process, not by unaccountable corporate employees making arbitrary decisions about how certain products can be used.

ahtihn · 2026-02-25T07:04:02 1772003042

> But it's not a valid reason to deny the warfighters the best possible weapons systems.

Of course it is.

Think about it this way: if you could guarantee that the military suffers no human losses when attacking a foreign country, do you think that's going to more or less foreign interventions?

The tools available to the military influence policy, these things are linked.

US military is already overwhelmingly powerful, there's 0 reason to make it even more powerful.

nradov · 2026-02-25T13:18:00 1772025480

That's so delusional. The US military is currently preparing for a potential conflict with China to stop an invasion with Taiwan. They don't have anything near "overwhelming force" for that mission: recent simulations put it about even at best. People who believe they don't need any improved autonomous weapons are simply uninformed.

ahtihn · 2026-02-25T17:32:09 1772040729

Why would the US enter into direct conflict with a nuclear power over a country they aren't even formally allied with?

If the US actually cared they'd formally place Taiwan under nuclear protection.

ashtonshears · 2026-02-25T14:48:53 1772030933

You are claiming all americans must happily create weapons. Thats a silly statement to most americans and humans

nradov · 2026-02-25T17:15:01 1772039701

Don't presume to put words in my mouth. I flagged your comment for lying about my claims.

Individual Americans aren't slaves. They can do as they please and are under no obligation to help build weapons for warfighters. But I think it's ridiculous and offensive for a US corporation to presume to take on a role as moral arbiters by placing arbitrary limits on US government use of certain products. There are larger issues here that need to be addressed through the political process, not through commercial software license agreements.

ashtonshears · 2026-02-26T15:08:11 1772118491

Sure, it wasnt fair for me to claim you said that, so I apologize. It was rude of me to frame my position in that manner, and wasnt intended maliciously.

I meant to suggest that corps being unable to take those positions results in such a world for Americans at those corps

yunwal · 2026-02-26T03:28:54 1772076534

> I think it's ridiculous and offensive for a US corporation to presume to take on a role as moral arbiters

A corporation is just a group of people. Anthropic isn't even public, and therefore it's directors aren't subject to any sort of fiduciary duty enshrined in law. They can collectively act as they wish.

radlad · 2026-02-25T07:11:11 1772003471

> If you think the military is being tasked with the wrong missions, or too many missions, then take that up with the civilian political leadership. But it's not a valid reason to deny the warfighters the best possible weapons systems.

It is an ethical dilemma: believing an armed force will act unethically is in fact a valid reason to refuse to arm them. You are taking a nationalistic view regarding the worth of life.

And if you believe it is unethical to arm them, it is rational to use whatever leverage you have available to you - such as refusing to sell your company's product.

Furthermore, one of the two points at issue was regarding surveiling civilians.

yunwal · 2026-02-26T03:25:51 1772076351

> that change can only come about through the political process

What, to you, is the political process? Why is wielding your economic leverage to incite change illegitimate to you?

chris_wot · 2026-02-25T05:31:57 1771997517

"How is it a good thing to refuse to provide our warfighters with the tools that they need?"

Perhaps you should consider that this is a loaded question. I don't think HN needs this sort of Argumentum ad Passiones.

nozzlegear · 2026-02-25T05:26:55 1771997215

Why are you asking this question? You know what the answer is, you've just arbitrarily decided that it's specious in an attempt to frame rebuttals as unreasonable.

nradov · 2026-02-25T06:26:34 1772000794

I'm open to reasonable rebuttals but all the rebuttals that I've seen so far are simply uninformed.

s1artibartfast · 2026-02-26T20:56:28 1772139388

1. You don't believe in the mission or direction of US warfighters 2. Supporting warfighters is developmentally distinct from what you want your corporate competences and direction are. 3. you don't want military to be more safe an capable.

saghm · 2026-02-25T03:56:57 1771991817

> If we need safety, we need Anthropic to be not too far behind (at least for now, before Anthropic possibly becomes evil)

I don't think it's going to be as easy to tell as you think that they might be becoming evil before it's too late if this doesn't seem to raise any alarm bells to you that this is already their plan

salawat · 2026-02-25T07:19:32 1772003972

The world would be so much nicer if there were just fewer pragmatists shitting up the place for everyone. We might actually handle half our externalities.

goranmoomin · 2026-01-26T04:01:34 1769400094

I feel like declarative container-like dev environments (e.g. nix shell or guix shell, and so on) will become much more popular in the following years with the rise of LLM agentic tools. It seems that the aformentioned tools provide much more value when they can get full access to the dev environment.

Sprites[0], exe.dev[1], and more services seem to be focusing on providing instant VMs for these use cases, but for me it seems like it's a waste for users to have to ssh into a separate cloud server (and feel the latency) just to get a clean dev environment. I feel that a similar tool where you can get a clean slate dev environment from a declarative description locally, without all of the overhead and the weight of Docker or VMs would be very welcomed.

(Note: I am not trying to inject AI-hype on a Guix-related post, I do realize that the audience of LLM tools and Guix would be quite different, this is just an observation)

[0]: https://sprites.dev

[1]: https://exe.dev

sdsd · 2026-01-26T13:42:24 1769434944

As a Guix lover and LLM tooling enthusiast, I complete agree. Administrating my system via Claude Code is so much easier. LLMs work better on a system that's hackable via text.

attila-lendvai · 2026-01-26T11:50:53 1769428253

random note: there's `guix shell --container --emulate-fhs`.

goranmoomin · 2026-01-25T13:02:39 1769346159

This is very interesting, I haven’t touched macOS development for quite a while but it’s good to know that libraries are still being written for both AppKit and SwiftUI on macOS.

I do feel that this library would benefit from an explanation on why this was needed. AFAIR AppKit already provides a native tabbing API where you can “just” (that “just” is doing a lot of heavy lifting) implement a few delegate methods and you get tabbing behavior for free, especially on document-based apps. (Sorry, I do not remember the specifics, it might have been a tad more difficult)

I’m not updated on the SwiftUI equivalent, but I would imagine that a similar API would exist much alike API for multiple windows or multiple documents.

I think everyone would benefit from a “why” explanation (which I definitely think would exist, since I’ve used too many AppKit APIs in pain), and also some screenshots for a demo app (so that we can expect how it would look and how much the look and feel would deviate from the native counterparts).

atombender · 2026-01-25T21:00:31 1769374831

I've tried the native tab support several times, and my impression is that it's good for very little.

It may be OK for certain types of document-oriented apps, but there's a reason most apps (Chrome, iTerm, even Safari uses its own native tabs, I believe) don't use it. It's underbaked and awkward to fit into a model where your "tab data model" doesn't neatly fit the document data model that the framework wants.

I recently made an app where I wanted tabs, and I just ended up abandoning tab support for this reason, and adding a todo item to use an off-the-shelf tab UI library in the future.

zapzupnz · 2026-01-25T13:15:08 1769346908

The website already has a demonstration of what this does that native tabs don’t do and how they look.

goranmoomin · 2026-01-26T04:08:24 1769400504

Yeah I realized that only now, for some reason when I was on mobile and I was looking into this the demo video was not loading at all. I would love to retract my comment :(

brianfryer · 2026-01-26T05:20:29 1769404829

I totally missed the video on mobile too.

msephton · 2026-01-26T08:59:51 1769417991

Yes, just blank space instead of video on mobile. Edit: opening in Safari worked

saagarjha · 2026-01-25T15:14:37 1769354077

Native tabs work at the window level.

goranmoomin · 2025-11-28T18:15:09 1764353709

I haven't even realized that while I was reading the article, but it is amusing!

Though one explanation is that I think for the other stuff that the writer doesn't explain, one can just guess and be half right, and even if the reader guesses wrong, isn't critical to the bug — but sockets and capabilities are the concepts that are required to understand the post.

It still is amusing and I wouldn't have even realized that until you pointed that out.