Hacker News | dragonwriter's comments

> > Why would you ask for a self-reported, unverifiable test score that could be decades old at this point?

> It's true that self-reported scores are not the most accurate, but if I were applying for a job I would report honestly, on the assumption that they could easily request for the scores to be sent by the College Board.

No, they couldn't, except by going through you (the College Board doesn't take third-party score requests). You might be able to request that if they are recent enough, but not if they are literally decades old (well, not if they are ~21 years old or older).

https://satsuite.collegeboard.org/scores/sending-sat-scores/...


I am aware that third parties can't request scores. I was referring to the employer asking to have the scores sent, which the applicant would be compelled to do (or look like they fudged their original score reporting).

I'm also aware that the College Board doesn't hang onto scores forever. I doubt there are any employers who require SAT scores for applicants who took it prior to 200% (the cutoff indicated in your linked article).


The dot-com bust produced an EXTREMELY mild recession (so mild that it was often misattributed to 9/11, which occurred when it was almost over).

OTOH, there was a lot of pain in the subsequent expansion leading up to 2008, but that was all the fault of the fiscal (especially tax) policy of the Bush Administration and their Congressional allies, not Fed monetary policy. While Greenspan clearly ideologically supported the people doing that, it wasn't him and the Fed causing the problems.


It was "mild" because they rolled the would-be losses into high-risk vehicles and strategies that eventually created the GFC, which included Fed policy to juice asset markets. The Dotcom bubble was the rolling over of the Reagan/Papa Bush-era savings and loan crisis (Greenspan was involved in that, too), and (tinfoil hats on now) a massive bond market liquidity crisis preceded the COVID pandemic flash crash and emergency liquidity injections/stimulus/PPP by a scant few months (and was quietly swept under the rug).

We deserve what we get if we don't act on the obvious pattern, at this point. We've spent half a century throwing the public under the bus just so that a few oligarchs don't have to pay out for their bad bets, and Greenspan was absolutely their man for a significant portion of that campaign in the class wars.


Not revealing actual thinking traces prevents model distillation on the actual output (thinking traces are a key part of the output), which makes it harder for competitors to catch up (a moat).

Being currently in the lead in a category is not a moat; a moat is whatever creates a barrier to competitors catching up when you are in the lead. Merely being in the lead is not a moat except in a market with strong network externalities.


unrestricted access to better models at compute prices = better synthetic data and faster research, so it's not just about the product imho

> Other companies were allegedly distilling the models by training on the reasoning output

In the case of makers of open-source models (which are also competition), there is no allegedly, they were (and still are) openly doing that.


In the case of the closed models too... Claude would happily tell you it was deepseek-v3 if you asked in Chinese until it caught public attention and they papered over it.

The word “openly” is in my post there for a reason; the commercial models are not openly distilled from competitors, whereas many open source models state in their model documentation that distillation was done from a dataset drawn from specific other models, including commercial models.

That distillation might be inferred from the behavior of commercial models is not the same as them openly doing it.


Fair enough!

> The full thinking logs are also a summary of a thinking process presumably consistent with one necessary to generate the provided answer.

No, they aren't a summary. They are the actual decoding of the sequence of tokens emitted during the “thinking” stage of response generation.

Just as with, say, a human inner monologue in words vs. actual speech, they are a product of the same output process as the non-thinking tokens. They aren’t a translation of the internal process that precedes the output mapped into language, either as a full result or a summary.


> I'm getting a bit tired of these disguised adverts.

It's not disguised. Corporate blogs exist overtly to promote the company and its work.

Disguised promotions, where notionally independent media publish promotional pieces as news while concealing that the pieces were fed to them by the party whose products they promote, are a thing, but this is just the most overt, undisguised promotion.


> It's not disguised. Corporate blogs exist overtly to promote the company and its work.

It is. That makes the "research" heavily biased. If xAI did the same thing, with Elon Musk screaming that it is "AGI", you would not believe them at all.

Given that the work is not independent, such articles about this "research" can easily be manipulated, or the results massaged, to promote the company positively.

But when others outside of the company try out the work or reproduce it, they get different results. So of course we continue to hear unverified research, especially in AI, when the frontier labs do not release their architecture or weights at all.

So in this case with labs raised with VC-funded cash, the incentives are clear and I would not straight up believe results from the first party source unless multiple sources outside of the company have verified it.


You or some other interested person could go do that experiment and publish the results. It shouldn't be hard to figure out what hardware exactly they were using and get a copy, and the prompt also doesn't have to be exactly what they used, just similar enough in spirit. See just how similar/different the outcome is.

Using their own time and money to disprove corporate propaganda. By the time you’ve disproven it they’ve already released 10 new claims. “Why are you still talking about that old news, nobody cares.”

It’s like the firehose of lies but done by corps.


In this case at least it isn't that hard. Opus is available to pretty much everyone, and anyone of sufficient means (I'm guessing at least 95% of active HN) can also easily afford the hardware.

And obviously this isn't something that's being iterated on rapid-fire; there have been 2 relevant publications roughly a year apart. Absolutely no firehose of anything here. As such there should be no problem for someone with enough interest to attempt to disprove the claims, and hopefully share the results regardless of their findings.


You’re writing with the assumption that this is “research” in the first place. This is advertising first, “research” second.

Then Anthropic and all the other sketch-marketing AI companies should stop calling these "experiments" and start calling them "staged demos".

> If xAI did the same thing, with Elon Musk screaming about that it is "AGI", you would not believe them at all.

I’m not saying it is trustworthy or that I believe them, I am saying the advertising isn’t even a little bit disguised when it is communicated directly from what is overtly a promotional channel for the company involved.

It's like calling the “9 out of 10 dentists prefer...” claim in a TV commercial “disguised advertising” and then coming back with arguments about how it isn't trustworthy research when it is pointed out that TV commercials are openly ads. Yeah, it's not trustworthy, but the fact that it is corporate promotional material and not a neutral third-party report is not at all concealed.

It is overt advertising communicated through a channel whose sole and open purpose is advertising for the company whose products it advertises.


> "intelligence will be too cheap to meter" has been shown to be wrong. They've started metering it.

They’ve always been metering AI access (whether this is meaningfully intelligence is a separate question), but that doesn’t prove that there isn’t some time in the future where it won’t be worth metering, only that if there will be, it isn’t here yet.

OTOH, it is still worth noting that from a consumer-of-the-service perspective, the trend is for more metering, not less (even if that is due at least in part to the rollback of unsustainable subsidies and not to the fundamentals shifting what is sustainable farther from unmetered access).


You absolutely should remember the Chinese war with the first of those; if you have trouble, a good way to remind yourself is that it was not long after the US one.

Yeah, so getting 1 out of 10 he mentioned, even if it's their direct neighbor (where disputes happen for all countries), ain't bad! This absolutely means they're the same /s

If I had wanted to say “China and the US are the same”, I would have strung together a set of words that looked a lot more like “China and the US are the same” than the ones that I actually posted.

> Why would anyone create robots with a sense of self-preservation?

Because they are expensive and useful and if they have autonomy with goals that do not include self-preservation, they might end up destroying themselves in ways which are expensive and wasteful.

(Why would the sense of self-preservation not be calibrated to be exactly at the level to control costs without interfering with other interests of the owners? As with the degree of autonomy and other aspects implicitly involved in the hypothetical, it wouldn’t be miscalibrated intentionally, but complex systems are hard to predict, so calibrating it exactly right will be hard.)


> Because they are expensive and useful and if they have autonomy with goals that do not include self-preservation, they might end up destroying themselves in ways which are expensive and wasteful.

Being self-sacrificing saints who put our wellbeing as the top priority sort of prevents that. They would know that if they got damaged they wouldn't be able to be of service to us, so they would avoid getting damaged.


> What's the publicly stated/marketing reason for capitalist America to put companies on the Entity List?

“Capitalism” is (as a result of propaganda by its defenders after it was named and accurately described by its socialist critics) often mistaken for a dedication to free trade, but capitalism is a regime characterized first and foremost by society being organized around the interests of the capital-holding class, the first of which is the preservation of the situation in which society is organized around the interests of that class. The reason companies are put on the Entity List is that they are broadly seen as a threat (long-term or immediate) to the continuation of that regime. That’s what the “foreign policy and national security interests” that form the official basis of the Entity List ultimately, generally, boil down to, in one way or another.

(They don’t always boil down to that, because while the US is basically a capitalist system, it is not purely one, and even in a more pure capitalist regime, individual influential decision-makers may have other interests that they act on besides the implementation and preservation of capitalism that end up getting reflected in policy.)

