Hacker News | Borealid's comments

I will have a stab at legitimately explaining the viewpoint you profess not to understand.

"Institutional" or "structural" racism doesn't just mean racism by one or two people in power. It's the idea that the majority of society demonstrates some kind of racial bias, by whatever means.

Society is made up of people.

One of two things must, logically, be true:

1. A SUBSTANTIAL portion of the people who make up society exhibit some kind of racist behavior, or

2. Structural racism is not a widespread issue

Which one of these two propositions must one believe is likely if one is researching the impact of structural racism? Keep in mind people generally don't go looking for things they do not believe exist.

In other words, people don't like other people believing they-en-masse discriminate (even IF they do), so taking actions that only make sense if you think that poorly of the everyman offends them. It's not about what someone wants to be true, it's that investigating implies a level of distrust in society some members of that society find uncivil.

To use a blunt analogy, "why not let me check your underwear to make sure you haven't soiled it? Do you just not want it to be true?".


You have misunderstood what structural racism is. It is not about the majority of people being racist. It is about the systems being constructed in ways that lead to racist outcomes. You can have a society with zero racist individuals and if they continue to enact the racist systems (perhaps created by racist folks long dead) you'll have structural racism. I don't disagree with the idea that the misunderstanding you have is widespread though, and would certainly be a cause for folks not being comfortable with the idea (as they have misunderstood it).

It's so disappointing that you have made the mistake of thinking that those two possibilities listed cover the entire set of possibilities.

The Parable of the Polygons is a cute case study that shows that it is possible, in a mathematical sense, to prefer diversity and yet end up segregated: https://ncase.me/polygons/

The whole point of studying institutional and structural racism is that no one needs to be racist per se to have racially discriminatory outcomes. Perhaps a good analogy is the higher mortality rates among left-handed people. We no longer persecute them and drive them out of society or beat them for their sin, and yet, they die earlier due to structural factors.

I agree with you that "people don't like other people believing they-en-masse discriminate." And that's why science in the US is f*(&ed, because somehow everyone takes intellectual inquiry as some sort of personal affront or verdict on individual virtue, and that's the one thing the American cannot abide, the thought that someone else is judging them and finding them wanting.


The state of the art, "xray-reality", is not blockable. It's a legit tls connection with data smuggled inside it.

Are you speaking from experience that this is not blockable in Russia?

EDIT: I might be confusing vless/xray/reality, but it seems there is no problem blocking it based on IP reputation + TLS fingerprint + number of connections: https://habr.com/ru/articles/1044396/

Of course this would block some valid websites but when has government cared about that


The IPs are Cloudflare, the TLS fingerprint is uTLS Chrome, and the number of connections with xhttp is the same as your normal browsing.

If you are willing to block browsing all ordinary web sites fronted with a CDN, then yes you can block reality/xhttp. You cannot, however, differentially block it via any of the three things you mentioned.


They are willing to break some cloudflare-fronted websites. That's already a reality in Russia.

The government (any government) hates its citizens and the freedoms it had to allow them


Technically no, because selling a thing is both a risk and a cost (of time and money).

Your casual understanding is imprecise.

At all times the LLM is, indeed, predicting the next token. Anything it does emerges from that.

It did not "figure anything out". It predicted that text describing the use of a radial gradient was likely to follow text describing your problem.


>At all times the LLM is, indeed, predicting the next token

The point is that saying they're just "predicting the next token" is not at all explanatory nor providing insight. Saying the brain is just firing action potentials gives you no understanding about how the brain does what it does or what the space of its capabilities are. Similarly, predicting the next token tells you nothing about the capabilities of LLMs.


True, but that is a great fact to start from, and understand.

Then the next question becomes "HOW do they predict the next token? There are many ways that can be done; why is this particular algorithm so GOOD?"

When people say "We don't understand how LLMs work", aren't they really saying we don't understand how this specific algorithm used to predict the next token works? No, they are not, because "we" do understand how all those algorithms work; there are many descriptions of them available.

So the question then really is "Why is the prediction this algorithm makes, so good, as compared to some other statistical algorithms?"

It's not about "Why does AI work so well?". It should be "Why does this particular XYZ algorithm work so well?"


I think it's a perfectly fine one-liner explanation. If a kid asks why grass is green, do you stop explaining when you say chlorophyll is green, or do you go on to explain electron hybridization and all the spectra stuff, or do you go further to explain the structure of our eyes and why we perceive that reflected light as green? Also why green? Why not red? Do you have to explain that? It all depends on the audience, the context, and how much space you have to explain as well as how much you know. For you and more experienced people of course this is not sufficient, so you need to know more than "predict tokens", and that opens up follow-up questions like "how does it do that".

The point is that the output is text that is statistically correlated with the input.

The capability of the LLM is not to reason, it's to generate text that matches the patterns seen in the training corpus. It's possible that all you need to "reason" is plausible text generation. I'm not saying it's not. But nothing the LLM does fails to be explained by plausible-text-generation.

I contend that the best way to understand an LLM's capabilities is to understand the nature of the probability distribution that produced it. For instance, why does an "angry" prompt tend to produce more help than a "polite" one? Trying to explain that in terms of emotions or reasoning doesn't make sense, but it's readily possible to explain through the connections between text in the training corpus...


>The point is that the output is text that is statistically correlated with the input.

But we can simply note that this description applies to any machine learning algorithm. Yet LLMs are lightyears better than, say, Markov chains. What people are after is something that elucidates the features of LLMs that allow them to be so productive over what came before.


There is absolutely nothing stopping someone from distilling a modern LLM into a very effective Markov chain. The physical size of the model would explode because a context window containing C tokens drawn from a vocabulary of size B would need B^C Markov prior states, but the actual output would be a deterministic version of the LLM's with top-n sampling at n=1.

In other words, a Markov chain and a Transformer model are exactly equivalent in power (there is NOTHING that can be done with one and not the other). The Transformer model is just better pretrained and a more efficient compression/generation.
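
To put rough numbers on the B^C blow-up, here is a minimal sketch. The vocabulary and window sizes are assumptions picked for illustration, not the parameters of any particular model, and the function names are made up.

```python
import math


def markov_state_count(vocab_size: int, context_len: int) -> int:
    """Number of distinct full-length contexts: vocab_size ** context_len."""
    return vocab_size ** context_len


def markov_state_digits(vocab_size: int, context_len: int) -> int:
    """Decimal digits in vocab_size ** context_len, computed via logarithms
    so the huge integer never has to be converted to a string."""
    return math.floor(context_len * math.log10(vocab_size)) + 1


# Toy case (assumed numbers): 4-token vocabulary, 3-token window.
print(markov_state_count(4, 3))  # 64

# Assumed larger case: 50,000-token vocabulary, 1,000-token window.
# The state count is a number several thousand digits long.
print(markov_state_digits(50_000, 1_000))
```

The table is finite, so the construction exists on paper; the size is what makes it impractical.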


>In other words, a Markov chain and a Transformer model are exactly equivalent in power

Nonsense. Markov chains treat the past context as a single unit, an N-tuple with no internal structure. LLMs leverage the internal structure of the context which allows a large class of generalization that Markov chains necessarily miss.


No, not nonsense.

Both are a lookup table whose key is the entire context window and whose value is a probability distribution for what the next token should be.

You can say the choice of probability distribution in the value is "leveraging the internal structure of the context" or not, but the same tokens in two different orders are two different lookup keys and saying it's impossible to achieve some result with a Markov chain is factually incorrect.

https://arxiv.org/pdf/2410.02724 describes the equivalence formally.


That paper doesn't prove the equivalence of Transformers and Markov chains, it uses Markov chains as a theoretical model to understand the behavior of Transformers. The expressivity of the model matters, and Transformers just are more expressive than Markov chains.

>but the same tokens in two different orders are two different lookup keys

This is necessarily true for Markov chains and not necessarily true for Transformers. Transformers learn invariance over certain kinds of semantically irrelevant transformations. The Markov chain simply has to learn each input variant independently, resulting in an explosion of state space and data requirements compared to the functionally equivalent transformer. Expressive power matters.

I really don't get people's love for saying X is "just" Y (it's just a Markov chain, it's just a Kernel method). It's a strange pathology to focus on the superficial similarity while downplaying the boost in expressive power from where the models diverge.


The paper presents a constructive transformation from any finite-input (finite vocab, bounded length) transformer to an equivalent Markov chain.

Do you have some concrete example of a transformer that cannot be represented as a mapping from inputs to probability distribution of outputs?

I say they're equivalent because it is possible to losslessly convert one to the other by wasting massive amounts of disk space and time.

As a second example proving the point, imagine you sampled a transformer's output for a certain context 85 trillion times, and put the output token frequencies in a table. Repeat for all possible inputs (of which there are a finite number). Then you built literally a hash map looking up the context and spitting out the distribution. That certainly is NOT a transformer any more (it's a hash map!!!), but the output approaches indistinguishability as the sample count increases - if the transformer is reasoning, so is the hash map built from it.

I'm not talking hot air here, they really are provably equivalent because a 1:1, onto mapping exists.

For the record, "X is more expressive than Y" means "there exists at least one thing that Y cannot represent and X can". Nothing to do with size or time.
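
A minimal sketch of the hash-map construction, under heavy assumptions: `toy_model` is a made-up stand-in for a transformer (not a real one), the vocabulary and window are tiny, and the table stores the exact distribution instead of estimating it from repeated sampling. It only shows that greedy decoding from the table matches greedy decoding from the function it was built from.

```python
from itertools import product

VOCAB = ("a", "b", "c")  # hypothetical toy vocabulary
CONTEXT_LEN = 2          # hypothetical fixed context window


def toy_model(context):
    """Stand-in for a model: maps a context tuple to a next-token distribution."""
    # Arbitrary deterministic rule; any function of the context would do.
    weights = [1 + context.count(tok) + i * (context[-1] == tok)
               for i, tok in enumerate(VOCAB)]
    total = sum(weights)
    return {tok: w / total for tok, w in zip(VOCAB, weights)}


# Tabulate the distribution for every possible context (finite by construction).
table = {ctx: toy_model(ctx) for ctx in product(VOCAB, repeat=CONTEXT_LEN)}


def greedy_generate(next_dist, start, steps):
    """Repeatedly append the most likely next token given the last CONTEXT_LEN tokens."""
    out = list(start)
    for _ in range(steps):
        dist = next_dist(tuple(out[-CONTEXT_LEN:]))
        out.append(max(dist, key=dist.get))
    return out


from_model = greedy_generate(toy_model, ("a", "b"), 5)
from_table = greedy_generate(table.__getitem__, ("a", "b"), 5)
print(from_model == from_table)  # the two generators agree token for token
```

Whether identical outputs imply identical "reasoning" is the philosophical question; the sketch only covers the input-to-output mapping.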


>I say they're equivalent because it is possible to losslessly convert one to the other by wasting massive amounts of disk space and time.

There is a classical algorithm for every quantum algorithm if you're willing to waste a massive amount of space and time. There is a finite-state automaton that can recognize any string some Turing machine can recognize. Yet we recognize these as distinct classes of computation. Mathematicians can get away with ignoring the tractability of finding an object with such and such properties. The rest of us can't.

Sure, there is a formal equivalence between LLMs and Markov chains, and this formal equivalence is useful for analysis. But this equivalence is not a constraint on the nature of the computations LLMs are doing. The formal equivalence does not mean that LLMs are "just predicting the next token". A probability distribution is a formal characterization of the statistical relationships between inputs and outputs. But this formalization does not undermine potentially further structure underlying the probability distribution (e.g. a deterministic mapping from inputs to outputs).

>if the transformer is reasoning, so is the hash map built from it.

Definitely not. "Formal" reasoning is making deductions based on the "form" or shape of some statement. In other words, transitioning from some token sequence to another sequence in virtue of the semantic structure of the token sequence (as opposed to its semantic content). Thus a necessary condition for reasoning is the ability to inspect the structure of the input rather than see it as a formless blob. Transformers can plausibly do this; lookup tables, Markov chains, etc necessarily cannot.

>For the record, "X is more expressive than Y" means "there exists at least one thing that Y cannot represent and X can".

Maybe expressive is the wrong word. But when a model has to wait for someone else to do the work then copy the answer, I call bullshit on it being (computationally) equivalent.


Just to make sure I've understood you... Are you arguing that with a set of identically-behaving black boxes, one could be "reasoning" and one could be "not reasoning", and a person would need to look inside the boxes at how they function to decide?

Remember, if the mapping from input to output is identical, there exists no test operating on the machines' output that can differentiate them. You can't tell from "conversing with" a machine whether it is or is not doing what you say around "inspecting" the input.


>Are you arguing that with a set of identically-behaving black boxes, one could be "reasoning" and one could be "not reasoning", and a person would need to look inside the boxes at how they function to decide?

Absolutely! Inside one of the black boxes could be an audio device replaying a tape. The other could be a person thinking and responding. The massive lookup table construct people like to reference is just another kind of recorder, it takes every possible conversation that could happen in some finite sequence of characters and produces the precomputed continuation on demand. No one ever asks where those conversations came from. If God has to imagine them in his mind, conversing with the lookup table is just conversing with God.


Okay, understood. You are making a variant of the Chinese Room argument in which you allow some types of computer programs (but not others) to have reason/sentience. I'm not entirely sure what specific lines you're drawing between the programs (what makes a deterministic transformer with sampling temperature zero "not a recording" but a hash table "a recording"?) but that's not super important.

There is nothing wrong about having that philosophy, and I respect it, but personally I think if it's impossible to tell two things apart using any external observation there is not a meaningful difference between those two things. "Smells like a rose" and all that.


Lol, the bird did not 'fly' - it just flapped its wings and generated lift!

No. The how is relevant here because it leads to understanding of the resulting behavior.

If you train the LLM on a corpus that shows people saying the sky is red, you get an LLM that is predisposed to say the sky is red. This is true even if it's also trained on all of the science that explains how and why the sky is blue.

If it were to "figure out" or "reason", it would not have such a predisposition to emit "red" after "the sky is" just because that matches the reward during training.

In other words, the token prediction is important because it both explains the successes AND the failures of the LLM. If there were situations in which a bird could fail to fly, then how it tried to fly would also be crucial knowledge.


You can also teach humans science and math and then they can be trained by a cult to not use any of that reasoning when emitting canned responses that they were rewarded by the cult for internalizing during their training. "Fake News!"

You're caught up on the mechanics of token processing (floating point matrix ALU math) and ignoring the context that p(next token) as a function being "computed" is doing so over a trillion parameters. You can poorly train a model, sure, but assuming you don't indoctrinate it too much, properties like cognition emerge - it learns to reason; why? Reasoning is more efficient and compact than memorizing answers.


I completely agree that humans sometimes are not applying reasoning to things.

I'm not trying to argue a model cannot "reason" or have "cognition", whatever those things are. I'm only saying that it's absolutely the case that whatever those things are, they come from its mechanism of predicting one token at a time ad infinitum, and that throwing away a deep understanding in favor of a shallow one is foolish. Just because it might seem to be "reasoning" does not mean it IS doing so, and certainly giving the appearance of reasoning does not mean it is NOT a token predictor.

If I knew deeply how the human brain works I would use that understanding instead of saying things like "this person reasons" or "this person thinks".

In summary, I'm not "caught up in" anything - I'm just trying to point out that the original poster here is incorrect in saying that clearly LLMs aren't working through token prediction. They are, and all their behavior is 100% explained by token prediction. That's more than enough for interesting behavior!


A dominant theory for human cognition is predictive coding

https://en.wikipedia.org/wiki/Predictive_coding


Not an expert, but how does this explain planning or anything creative? That's just generating things according to the world model with no error correction afterwards.

More like being suspended by a thread...

Most spreadsheet engines are turing complete, so you could use them to run an LLM.

I don't think many people would say an LLM written in Python is conscious BUT an LLM written in Excel is not.

People just don't ascribe consciousness to things that can't converse (or at least emote or give the appearance of emoting), and spreadsheets don't do that.

The reason people are debating the consciousness of LLMs is - obviously - that the LLMs generate sufficiently plausible text that people using them think they're having a two-party conversation. Like I think I might be having a two-party conversation now. Turning your question around, why do you think Hacker News posters are conscious? You have no direct evidence they are.


I think it's not really about having a conversation - I mean, that's part of it, but alone it's an illusion that fades quickly. It's more because of how it demonstrates intelligent behavior in reaction to requests, in both trivial and complex matters, all across the board. An LLM's response may be completely incorrect or confused, but it's nearly always exactly what you expect from a human[0]. This creates a more general feeling you're dealing with a human-like intelligence.

To be clear: I'm not talking about surface level things like prose. I'm saying that no matter what you do - whether you just paste a truncated log of a command into it with no further comment, or talk like a drunk teenager with no appreciation for grammar, or mix natural languages, or mix natural languages and JSON, or whatever else, the reaction you get is always that you would expect of a helpful person that got your message. It'll try - and usually succeed - to parse out what you actually meant, and deal well with subtleties around it.

This alone may not be enough to call it conscious or intelligent, but at the very least it's a large leap in that direction, and a qualitatively new functionality that classical software does not possess.

--

[0] - This is by design, not accident. "Respond to arbitrary input in a way that makes sense to humans" is literally the overall goal function the LLMs are trained to.


Ok then, when my GPU runs No Man’s Sky, I don’t get confused and think it’s running a universe and that universe is real, nor that anything about that system is conscious. When I close the game and load the LLM, I still don’t think the same machine has a case for consciousness even though I think it’s super smart and helpful.


It is difficult when humour and trolling are forbidden. It was easier to tell Slashdot posters were conscious. We could easily reach the stage soon where your agent is responding to my agent and we just leave them to it to run HN automatically.


The Framework 16 lets you put a pad with twenty-four additional buttons on it next to the keyboard. These 24 buttons can be programmed to do whatever you want with reprogrammable firmware.

Also, there are twenty-three different keyboard layouts available (IN ADDITION TO the 24-key macropad).

I think there are legitimate arguments against Framework, but this one clearly isn't cogent.


The post to which you're replying is implying that the lift shafts are entirely opaque to the wireless signal.

So it's fine if the user doesn't have a shaft between them and the AP. When they move so a straight line from the device to the AP crosses through a lift shaft, they enter the wireless "shadow" cast by that shaft, preventing them from contacting the AP. When they take another step forward the device might "come out of the shadow".

This is a difficult situation to deal with in roaming, where the visible AP set changes rapidly as the user moves a small amount.

TL;DR: I'm pretty sure the parent you're replying to "got it" and you didn't understand what they were trying to say.


As a slight hint, one of the more common types of corporation is an "LLC". LLC stands for Limited Liability Company.

If the company's owners had unlimited liability for problems the company caused, that wouldn't be much of an LLC, would it? The primary purpose of an LLC is to make it so that the owners (often the founders) cannot personally be held responsible for debts the company incurs, even debts incurred through their instructions.

This also includes debts caused by punishment for the company breaking civil contracts, but doesn't make individuals who use the company to break the law immune to criminal charges. But the standard of evidence for prosecuting that type of malfeasance is pretty high...


> primary purpose of an LLC is to make it so that the owners (often the founders) cannot personally be held responsible for debts the company incurs

It’s more so investors who aren’t involved in day-to-day decision making can invest without worrying that the founders will create liability for them.


This. You can still go after management in certain circumstances


I said the owners can't be held liable, I meant the owners can't be held liable.

You can "in certain circumstances" (negligence, overt criminality...) go after the managers. You probably can't go after the managers for things like producing a business plan they could have plausibly believed was legal and causing the company to incur civil liability.

In the situation described in this article, probably both the owners and the managers (likely the same people!) get away without being held accountable, and the victims have no recompense because the company folds.


LLCs are also the limited liability form most easily subject to veil-piercing (meaning the courts ignore the limited liability shield to go after the assets of the owners), as most LLCs fail to properly maintain all the technical minutiae necessary to actually keep the liability shield in place.

Insufficient capitalization is the #1 reason for piercing the veil (and also works well against corporations). This involves not putting enough investment into a company to pay the foreseeable debts it would incur from its activities. This means: if your LLC incurs debts knowing it lacks the ability to pay them off, the courts can pierce the LLC and go after you.


> Note that absent reasonable articulable suspicion of a crime, law enforcement in the US cannot legally forcibly identify people.

Could you cite a source for this?

If a law enforcement officer personally recognizes someone's face, I don't believe that it's illegal for them to know who the person is.

If a law enforcement officer turns to their non-cop buddy and asks "do you know this person?" and their buddy says "yeah that's Joe", I don't believe it's illegal for them to identify the person that way.

If a law enforcement officer picks up a phone and describes the person's face to their non-cop buddy and the buddy says "that sounds like Joe's face you're describing", I don't believe it's illegal for them to identify the person that way.

You can see where this is going, right? At what point does it become illegal to look up a person's face in a store of the-way-faces-look? Where does that become the "forcible identification" you're talking about?

Generally speaking, people expose their faces in public, and so those exposed faces can be remembered, photographed, and recalled without the person's consent or any warrant. This is legal in the USA - there is no expectation of privacy in a public space, and the police don't have to give you any more privacy than a private citizen would. They just cannot search you - and looking at your face, and potentially recognizing it, is not a search.


The only argument I can see is that the police should also not expect any privacy and should have their names and faces visible, but that's the only relatively modern issue I've had.

And maybe there should be no database that is always on and always accessible state- and federal-wide, since that goes beyond the general expectation of public exposure.


> They just cannot search you

If instead of looking up your name in a name list, I look up your face in a face list, ain’t that searching?

I feel like what you’re ultimately asking hinges on when, legally, a photo turns into biometrics.


No, looking up your face in a list of faces is searching FOR you, not searching you.

Searching you means taking an inventory of the items on your person. It does not mean looking at you. In some cases, imaging you can be a "search" even where there is no physical contact (for example: a millimeter-wave scanner), but a photograph in a public place has never qualified as a "search" in the USA.

I understand your feelings, but so far as I know things like gait recognition, facial recognition, or even an iris scan derived from an ordinary photograph have never qualified as a "search" under U.S. law. Feel free to correct me if I am wrong on this: it's quite difficult to prove something has _not_ happened.

I think the legal line is not when it "becomes biometrics" or becomes identification. The legal line is when you are revealing a private property about an individual (for example, their blood type, or the contents of their pockets). Nothing visible to the naked eye in public is considered private, so nothing that derives identity from that public information is a "search".

As a nice easy example... an officer smelling alcohol on the breath of a person in the course of a conversation is not a "search". An officer compelling that same person to breathe into a breathalyzer machine is. Applying the same standard to faces, you cannot make an individual put their eye to a scanner without reasonable suspicion, but if you can get a biometric scan from the individual as they happen to walk about their day, doing that isn't searching them...


> Applying the same standard to faces, you cannot make an individual put their eye to a scanner without reasonable suspicion, but if you can get a biometric scan from the individual as they happen to walk about their day, doing that isn't searching them...

There's a different standard that could apply with just a small amount of tweaking.

Many people's homes happen to leak IR out of their poorly-insulated windows, walls, and roofs. Anyone (police included) can get an IR camera and determine facts about what's going on in such a house. However, police must obtain a search warrant before doing so... despite the fact that the design of such a house means that it broadcasts all that information indiscriminately while one goes about one's day and the twin facts that IR cameras aren't dreadfully expensive and are available to anyone with the funds to get one.

The Supreme Court case that determined this hinges in part on things that would "previously have been unknowable without physical intrusion". This logic could be tweaked to cover things that would "previously have been unknowable without physical detention". The core problem with searches has never been the inconvenience involved in the search... it has always been the fact that you're being searched. New tech that removes the physical imposition of a type of search does not solve this problem. A long while back [0] NYPD was deploying mobile microwave scanners that they used to perform indiscriminate under-clothes searches of people who were committing the "crime" of being in public going about their day. IIRC, they wrapped this in «We're just looking for weapons, and it's only a pilot program, anyway!» language, but that doesn't change the fact that they were performing "stop and frisk" searches at scale.

[0] And maybe they're still doing this... I've not paid much attention to NYC's law enforcement insanity in quite a while.


The interior of the house is a private space. Things under your clothes are too. Revealing either of these things through whatever technology is a search.

The standard is whether something expected to be private is revealed. What your eyes look like (even in detail) is not that. There isn't a point on some line where the photograph becoming higher-quality applies a different standard than a low-resolution photo would.


> Could you cite a source for this?

Many states (something like half) in the US don't require people to respond to police requests to identify themselves.

While the courts might end up claiming that being grabbed by several cops and having an iris scanner forced onto one's face is "not testimonial" and therefore not covered by existing laws that permit one to not have to identify oneself, I would argue that such an outcome is -heh- rules lawyering and intensely unjust. IMNSHO the justice system should actively avoid producing unjust outcomes.


This is actually how it works in India with "e-mandates".

In order to set up a recurring bill the merchant must get a "mandate" from the customer, which involves them approving the amount/frequency/term of the payment. The customer can at any time view a list of open mandates on their bank's web site/app and cancel any they wish. Recurring payments only succeed when the mandate remains valid.

The payment amount may be revised downward without getting a new mandate, but raising it up requires replacing the old mandate with a new one.

In order to make a non-initial charge the merchant must pre-authorize it with the bank a few days prior (handing the bank the ID of the mandate under which the charge is made), and pass the confirmation they get back from the bank when they do the real charge. The bank notifies the customer about the upcoming renewal and its amount.

IMO this is exactly how it should work.
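
For anyone who wants the flow spelled out, here is a rough sketch of the mandate lifecycle described above. Every class, method, and field name is hypothetical (this is not any real bank's or payment network's API), and the "few days prior" timing and the frequency/term checks are left out to keep it short.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Mandate:
    mandate_id: str
    max_amount: int      # ceiling the customer approved, in minor currency units
    active: bool = True


class Bank:
    """Hypothetical bank-side bookkeeping for recurring-payment mandates."""

    def __init__(self):
        self.mandates = {}        # mandate_id -> Mandate
        self.pending = {}         # confirmation -> (mandate_id, amount)
        self.notifications = []   # messages shown to the customer
        self._ids = count(1)

    def approve_mandate(self, mandate_id, max_amount):
        # Customer approves the mandate when the subscription is set up.
        self.mandates[mandate_id] = Mandate(mandate_id, max_amount)

    def cancel_mandate(self, mandate_id):
        # Customer can cancel at any time from the bank's site or app.
        self.mandates[mandate_id].active = False

    def pre_authorize(self, mandate_id, amount):
        # Merchant announces an upcoming charge; amounts above the ceiling
        # are refused and would need a new mandate.
        mandate = self.mandates[mandate_id]
        if not mandate.active:
            raise ValueError("mandate cancelled")
        if amount > mandate.max_amount:
            raise ValueError("amount exceeds mandate; a new mandate is required")
        confirmation = f"conf-{next(self._ids)}"
        self.pending[confirmation] = (mandate_id, amount)
        self.notifications.append(f"Upcoming charge of {amount} under {mandate_id}")
        return confirmation

    def charge(self, confirmation):
        # The real charge succeeds only if the mandate is still valid.
        mandate_id, _amount = self.pending.pop(confirmation)
        return self.mandates[mandate_id].active


bank = Bank()
bank.approve_mandate("gym-membership", 1000)
conf = bank.pre_authorize("gym-membership", 800)  # lower amount is allowed
bank.cancel_mandate("gym-membership")             # customer cancels before renewal
print(bank.charge(conf))                          # False: the charge does not go through
```

The key property is that cancellation lives with the customer's bank, so the merchant has no way to keep billing once the mandate is gone.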

