Hacker News | verdverm's comments

I'm using SearXNG, EXA, Tavily, and soon (tm) Cloudflare

They all give slightly different results; you can dedup and fuse them with heuristics or with another agent
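
One possible heuristic for the dedup / fusion step, as a minimal sketch: normalize URLs so near-identical links collapse, then merge the ranked lists with reciprocal rank fusion (RRF). RRF is an assumption here, swapped in as a common choice, not necessarily what the comment has in mind. The provider names are just dictionary keys, the sample URLs are made up, and `normalize_url` and `fuse` are hypothetical helpers.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Crude canonical form: drops scheme, 'www.', query, fragment, trailing slash.

    This is a heuristic and can merge pages that differ only by query string.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower().removeprefix("www.")
    path = parts.path.rstrip("/")
    return f"{host}{path}"


def fuse(result_lists: dict[str, list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Merge ranked URL lists from several providers with reciprocal rank fusion."""
    scores: dict[str, float] = defaultdict(float)
    for urls in result_lists.values():
        seen: set[str] = set()
        for rank, url in enumerate(urls, start=1):
            key = normalize_url(url)
            if key in seen:  # count each provider at most once per URL
                continue
            seen.add(key)
            scores[key] += 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


results = {
    "searxng": ["https://example.com/a", "https://example.org/b"],
    "exa": ["http://www.example.com/a/", "https://example.net/c"],
    "tavily": ["https://example.org/b", "https://example.com/a"],
}
for url, score in fuse(results):
    print(f"{score:.4f}  {url}")
```

URLs returned by several providers float to the top, which is usually what you want before handing a short list to an agent for a final pass.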


There is a lower bar (that gets lower over time), but ime, the config you are describing is still too low.

qwen/gemma models in the 27/35B range @fp8 are better than gemini-2.5 but worse than gemini-3.1. You can run DS4-flash @fp8 on two DGX Sparks, and things keep getting better. DiffusionGemma came out recently with 4x token gen speeds.

tl;dr - the models you appear to be trying with are too small or too quant'd
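
For a rough sense of why people end up with small or heavily quantized models, here is back-of-the-envelope arithmetic for weight memory only. It assumes dense weights at a uniform bit width and ignores KV cache, activations, and runtime overhead, so real requirements are higher. It also says nothing about output quality. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


for size in (27, 35):
    for name, bits in (("fp8", 8), ("4-bit", 4)):
        gb = weight_memory_gb(size, bits)
        print(f"{size}B @ {name}: ~{gb:.1f} GB of weights")
```

At 8 bits per weight the figure in GB is roughly the parameter count in billions, and halving the bit width halves it.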


slow down and put more effort into vetting, trust your gut, make them share all screens during (coding) interviews

They explicitly do not want them. They will pay companies to abandon renewable energy projects that have already been planned or started.

It's not just climate data: they are also changing how census data is aggregated so they can reverse-individualize it for political goals. This was discussed 3 days ago.

https://news.ycombinator.com/item?id=48517377

https://www.npr.org/2026/06/12/nx-s1-5855734/census-bureau-d...


51° FOV, not much improvement here, but at least it is a much smaller form factor

Second this

related: Electrostates, Petrostates, and the New Cold War

https://www.youtube.com/watch?v=gLnxzkiB-GI


There is a bug in llama-cpp for qwen/gemma models; use vLLM instead

what bug, and what does it affect?

it's a prompt cache invalidation bug that causes all input to be reprocessed instead of getting preloaded
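
To illustrate why that hurts, here is a toy model of prefix caching. It is not llama-cpp's or vLLM's actual implementation; `tokens_to_process` and the token IDs are made up for the sketch. With a valid cache only the tokens after the shared prefix need processing, while an invalidated cache means the whole prompt is reprocessed every turn.

```python
def tokens_to_process(cached: list[int], prompt: list[int], cache_valid: bool = True) -> int:
    """Count prompt tokens that must be recomputed given a prefix cache."""
    if not cache_valid:
        return len(prompt)  # cache thrown away: everything is reprocessed
    shared = 0
    for old, new in zip(cached, prompt):
        if old != new:
            break
        shared += 1
    return len(prompt) - shared


history = list(range(1000))                    # tokens from earlier turns
prompt = history + list(range(5000, 5050))     # same history plus 50 new tokens

print(tokens_to_process(history, prompt))                     # cache reused
print(tokens_to_process(history, prompt, cache_valid=False))  # cache invalidated
```

In an agent loop the history grows every turn, so the gap between the two numbers keeps widening.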

There are other reasons to prefer vLLM over llama-cpp as well


You can do this in opencode and pi (which I haven't used) by defining your own agents or overriding the built-in ones. In your primary agent you can disable all tools and give it good instructions for how to delegate

I imagine most harnesses should have a way to do this today; if they don't, get a new one. OpenCode, e.g., is highly customizable; Claude and VS Code both support a ton as well, including custom agents (though it's unclear whether you can create custom top-level agents in claude-code)

https://opencode.ai/docs/agents/

https://code.claude.com/docs/en/sub-agents

https://code.visualstudio.com/docs/agent-customization/custo...


Thanks, those don't deterministically prevent the main loop from using tools though; unless I'm wrong, that's just prompting the main agent on when to use specialized sub agents

you can configure tools, thinking, permissions et al on a per-agent basis in the frontmatter or via config (which they use in the examples). Either location is valid; I'm not sure about the merging order

the main agent would be very different, basically an orchestrator. You are "loop engineering" it and turning off everything for this main agent except the ability to run subagents

for opencode:

https://opencode.ai/docs/agents/#permissions (what tools, mcp, etc...)

https://opencode.ai/docs/agents/#task-permissions (what subagents it can call)

https://opencode.ai/docs/agents/#additional (thinking effort)
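
To make the "deterministic vs. just prompting" distinction concrete, here is a harness-agnostic sketch. It is not opencode's API: `Agent`, `Harness`, `ToolDenied`, and the tool names are all hypothetical, and the real config keys are in the docs linked above. The point is only that the allowlist is checked by the harness, so a denied tool cannot run regardless of what the model asks for.

```python
from dataclasses import dataclass, field
from typing import Callable


class ToolDenied(Exception):
    """Raised when an agent requests a tool outside its allowlist."""


@dataclass
class Agent:
    name: str
    allowed_tools: frozenset[str]


@dataclass
class Harness:
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)

    def call_tool(self, agent: Agent, tool: str, arg: str) -> str:
        # Enforcement lives here, not in the prompt.
        if tool not in agent.allowed_tools:
            raise ToolDenied(f"{agent.name} may not use {tool!r}")
        return self.tools[tool](arg)


harness = Harness(tools={
    "bash": lambda cmd: f"ran {cmd}",
    "task": lambda prompt: f"delegated: {prompt}",
})
# The orchestrator can only delegate; the worker can only run commands.
orchestrator = Agent("orchestrator", frozenset({"task"}))
worker = Agent("worker", frozenset({"bash"}))

print(harness.call_tool(orchestrator, "task", "run the tests"))
try:
    harness.call_tool(orchestrator, "bash", "rm -rf build")
except ToolDenied as err:
    print(err)
```

With this shape, the orchestrator's prompt only has to describe how to delegate well; whether it can touch other tools is no longer up to the model.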

