BoorishBears's comments

Reasoning models can be coaxed to reason the way they do in dedicated reasoning blocks outside of those blocks, in the normal parts of the response.

But Anthropic at least has openly admitted they try to detect that and interfere.


Woah now, I'm for headless browsers but let's not start comparing any of this to Rosa Parks lol.

The reality is that a lot of interesting things, ranging from trivially harmful to harmless, are illegal, and we still do them anyway.


Can you imagine how cringe it would be setting up that hero image in office?

I’m sure it wasn’t the intent but the halo really makes him look like a saint

Feels very much intended.

I guess their thought process is, both alias and non-alias accounts use @icloud.com

You were always able to reserve a normal iCloud email address just like you would a Gmail account, so banning all iCloud email addresses would mean banning non-alias Apple customers.

That being said, I'm not convinced anyone who wanted to ban aliases couldn't have already. The alias emails look weird enough I'm guessing you could ban them with few false positives.


> The alias emails look weird enough I'm guessing you could ban them with few false positives.

While this is true, not all of them are weird. Some can be just word + number + word without dots or underscores.

Also blanket banning whole domains is just much easier and already done for temporary emails. No false positives.


The point of the article is that, previously, banning Apple's temp domain would have created many false positives (all the normal Apple-registered emails that chose @icloud.com during setup).

The problem the article is about is that suddenly even those of us who refuse to argue with a machine are being dragged into it.

I've had simple prompt engineering tasks that cause 4.8 to clamp down. In the past "browbeating" it (read: a sentence telling it not to read the task in bad faith) was enough.

Now it digs in and starts ranting about why it won't capitulate, I'm actually wrong, etc.

Extremely frustrating, and it became a problem with Opus 4.7 because they're trying to make up for the downgrade in parameter count with more RL, but RL does relatively poorly with things that aren't trivially verifiable, like nuance in instructions.


I'm staying in a hotel right now and the TV is locked in hospitality mode and was blocking me from just installing Plex. It (Opus 4.8) gave me this whole jeremiad about how I need to be careful and it probably won't work and I should just watch on my laptop, but it did give me the service menu code. But man, it was such a downer.

Gemini gave it, clearly explained how best to get in, and then troubleshot a few other weird issues that cropped up, without the moralizing.


This could be a good guardrailing technique. Keep people away from your hard limit refusals by ring fencing them with frustrating pedantry.

How are you connecting to various data sources?

We're offering secure connections to sources like SQL DBs, warehouses, file stores, and MCP/API sources like PostHog or Salesforce. Customers can choose to set up credentials in our key store. We also support directly dropping data into BitBoard (where we sync it to object storage).

You're laying on enough qualifiers that even a recent robbery of a Waymo is precluded, because (if we really want to victim blame) their window was down which is asking for it.

But overall, I'm not sure why these replies have the tone they do: the Venn diagram of "wants to rob people" and "cares that Google's AV will record it" doesn't have as much overlap as you're implying.

A Waymo has even been used as a getaway vehicle a few times now, once even successfully


Except they openly reject many many other classes of prompts, including extremely high stakes CBRN.

It's only the direction that has direct potential business impact they've decided to sabotage instead of reject.


Opus 4.7 was smaller and people still paid 4.6 prices.

gpt-5.5 isn't larger than gpt-5.4 but costs double.


Similar for GCP if anyone's wondering, and in fact a bit further in some ways: https://cloud.google.com/terms/advanced-ai-safety-addendum

60 days.

