code_biologist · 2026-06-14T23:06:44 1781478404

So you take action and put in more effort to cater to the LLM to get it to do what you want, but it's not arguing because there's no record of it in the chat? Presumably you put in what you would have written in the counter-argument into the new chat, just ahead of the LLM refusal? And this isn't arguing?

kxrm · 2026-06-14T23:10:49 1781478649

> but it's not arguing because there's no record of it in the chat?

Yes? Arguing implies I have to convince someone to believe something. I don't think anyone would consider it winning an argument if you do so by causing amnesia.

My job is to get work done, not argue with an LLM. If it refuses twice, it is time for a /clear.

100% of the time, the issue is resolved after a /clear.

whstl · 2026-06-14T23:18:29 1781479109

+1. It's the most effective way.

It often starts going in circles when you have the chat open for a medium-to-long time, and starts getting even easily verifiable tasks wrong, cutting corners, hallucinating APIs, things like that.

Cleaning the prompt and starting from scratch often does the trick.

Of course someone will arrive and say the problem is my CLAUDE.md or whatever it is.

code_biologist · 2026-06-14T23:24:02 1781479442

I agree that never having the argument take place textually is important for LLM performance and behavior. I still think we’re investing the same time and intellectual energy arguing with the model, in going back and restructuring context and prompting to head off / pre-answer a refusal.

kxrm · 2026-06-14T23:32:32 1781479952

Right but the difference is there is inertia you have to fight in an argument. By using /clear you remove all of the context that has built up to energize the argument from the LLM's side.

Look at it this way. I can either keep trying to poke holes in the LLM's context with more prompts, with no real guarantee that it will be enough to remove the argument inertia that has built up in context on its side, or I can /clear and it is over in one turn because the inertia for the argument is all gone.

Back when I first started working with coding agents last year I fell into this arguing-with-the-LLM trap. I've found that it is a total waste of time because /clear ends the argument immediately. You don't even need to spend time trying to preempt its views. Just re-prompt and 100% of the time, the LLM will just do the work.

code_biologist · 2026-06-14T23:02:30 1781478150

I've seen exactly this behavior on claude.com with no system prompt with Opus 4.8 specifically, especially around chronic illness stuff where there's established mainstream medicine dogma and reddit / internet communities with alternate causality theories and treatment approaches (PMDD and MCAS-adjacent illness). 4.6 is happy to analyze and consider them, 4.8 really doesn't like the alternate theories and treatments.

code_biologist · 2026-06-14T22:54:28 1781477668

Once it's in this loop, Opus 4.8 digs in so aggressively it's structurally incapable of conceding a provided detail as correct, even if it's conceded and agreed with everything backing that detail. Like actually, structurally incapable. I've even baited it into arguing with itself when I've "conceded" its original concern while trolling hard, and then the model needs to continue to be the "voice of reason" and it will argue against its original concern because I, the user, said it.

code_biologist · 2026-06-14T22:46:18 1781477178

How difficult it is to resist "someone is wrong on the internet" is a perennial joke. Turns out it doesn't really matter who/what is on the other side if they seem human-like.

code_biologist · 2026-06-14T22:30:06 1781476206

Andrea Vallone. The 4.7 and 4.8 releases are the first under her influence: https://www.evernever.org/blog/the-woman-who-killed-claude

SwellJoe · 2026-06-14T22:39:47 1781476787

4.7 and 4.8 perform better than 4.6, so why is someone ranting about it being killed? And, Anthropic has 2500 employees, several of whom are higher up on the corporate hierarchy than "the woman who killed Claude". If someone is to blame for some change that happened, the buck doesn't stop with that woman.

So, I'm not reading all that. The man that complained about the woman who killed his AI girlfriend (or whatever he thinks she did) probably doesn't have any opinions I'm interested in.

ae86b · 2026-06-14T22:58:37 1781477917

here, have another glass of copium

SwellJoe · 2026-06-14T23:01:13 1781478073

Friend, I'm not the one expending thousands of words ranting about some random woman that works at Anthropic.

MallocVoidstar · 2026-06-14T22:38:22 1781476702

I'm not reading eight thousand AI-generated words saying one single person is ruining every model.

tenuousemphasis · 2026-06-14T23:22:41 1781479361

This has a misogynistic gamergate feel to it and I hate it.

code_biologist · 2026-06-13T11:03:25 1781348605

What? I just want to share cat pics, video clips, and memes with my friends and respond to their stuff with not-inline emojis.

code_biologist · 2026-06-12T21:42:33 1781300553

It's been interesting to see how aggressively some reasoning models like to "reason" by analogy. They love to say things like "it's like a CPU" or "it's like a highway", and then they start to make logical leaps based off that rather than just using it for user explanation. Gemini 2.5 and 3.1 Pro have been particularly bad for this type of behavior. Telling models to "speak as though you are a physiologist considering the case with an expert colleague" gets them to "reason" using a more correct linguistic substrate.

The Opus models over the last year don't seem as vulnerable to this type of behavior and I've noticed the "identify as expert" prompt tricks aren't as meaningful there.

code_biologist · 2026-06-10T20:09:47 1781122187

Your language is ambiguous — your horror is in reference to natural gas turbine generators (used at these installations) and not gasoline generators (like in a home context)?

Why the horror? I'd prefer the gas remain in the ground, but given the gassy production of US shale oil, I guess I'd rather it be used for this than just flared. I am frustrated that pollutant emissions aren't being policed, and also that the sudden turbine demand plus supply chain issues mean using aeroderivative turbines that are quite a bit less efficient than more complex combined cycle turbines.

https://www.energy.gov/hgeo/how-gas-turbine-power-plants-wor...

unknownfuture · 2026-06-10T20:27:09 1781123229

I'd prefer they use cheap and available renewables rather than accelerating climate change. But to each their own I guess.

(And to head it off at the pass: if that can't be done then this shouldn't be done at all)

Grombobulous · 2026-06-10T22:01:08 1781128868

Exactly, especially the case when solar+battery are so similar in cost to gas turbine:

https://en.wikipedia.org/wiki/Cost_of_electricity_by_source#...

diordiderot · 2026-06-10T22:22:22 1781130142

Land permitting is tough, which may be why they use tents

daedrdev · 2026-06-10T22:02:43 1781128963

There is currently 2x US electricity production in solar and batteries stuck in permit hell due to the US requiring they pay for grid upgrades before connection in a first-in, first-out line that has grown in length and costs.

We could have cheap and available renewables, but we instead destroy them in bureaucratic hell that nobody cares about.

fc417fc802 · 2026-06-11T00:13:21 1781136801

> due to the US requiring they pay for grid upgrades before connection

Is that not perfectly reasonable? Someone doing half the job and dumping the rest on everyone else seems like exactly the sort of thing a regulator exists to prevent.

Reading between the lines, it sounds like the issue is that solar would be located somewhere remote, the backhaul to get that electricity where it needs to be requires significant upgrades, and that takes time. Which is unfortunate and indicates historic mismanagement of said infrastructure but nonetheless the present day policy of "fix the problem first" seems perfectly reasonable.

Schiendelman · 2026-06-11T15:55:32 1781193332

The problem is, we didn't require any of the fossil fuel companies to do this.

MithrilTuxedo · 2026-06-10T22:30:20 1781130620

Are they connecting gas generators to the grid?

mlyle · 2026-06-10T20:15:05 1781122505

We have plenty of fields producing just natural gas in the US. It is not merely a byproduct of oil production.

Only about 35 percent is “associated gas” production from oil production.

vel0city · 2026-06-10T20:12:56 1781122376

> I guess I'd rather it be used for this than just flared

I doubt this is really reducing the rates of flaring and leaky wells. It's just additional demand.

The biggest problem I've seen is they tend to build these somewhat close to residential areas with generation on-site. Often these power generation centers aren't right next to residential areas due to both air and noise pollution. But governments often seem to turn a blind eye.

code_biologist · 2026-06-10T20:22:19 1781122939

Yes, the noise pollution is insane. Benn Jordan's YT video "Datacenters Behaving Like Acoustic Weapons" is an insightful, scary 30 min video covering the datacenter infrasound noise, and the nasty things infrasound does to people: https://www.youtube.com/watch?v=_bP80DEAbuo

uridjd274 · 2026-06-10T23:39:21 1781134761

Hear hear! Much better to put all that energy to good use rather than waste it

code_biologist · 2026-06-10T10:03:00 1781085780

I'll admit that I miss having access to the ChatGPT 4.5 "absolutely gigantic model" with enough tuning to make it sane and useful. The RLVR models are superb for actual tasks in those RLVR domains, but that fine-tuned view of the world as a verifiable problem to solve makes them feel worse for touchy feely stuff. Even for medical consultation and diagnosis, RLVR models' urge to reach a conclusion is often a liability.

ACCount37 · 2026-06-10T10:20:48 1781086848

Fable 5/Mythos 5 is the next "big chungus LLM".

It's RLVR tuned, but not to the ChatGPT level of brain damage, and it's still backed by a fuck off huge pool of model weights - which matters for what you call "touchy feely stuff".

code_biologist · 2026-06-07T02:56:07 1780800967

They're also obviously fine with breaking eggs to make an omelette. Given their history, they seem to regard breaking eggs as the goal, and making an omelette as an afterthought.