Hacker News | danielbln's comments

Yeah, 60k is ludicrous. I've barely seeded the context at that point, and I don't see context-related degradation until well into the 600-700k range.

In this thread: People tossing coins independently and fighting over the result they got.

No it's not.

It seems that people have different workflows or repos, or memories or prompts or expectations.


For what it’s worth, as a third party I read your and qsera’s comments as saying the same thing.

Maybe I misread the comment then.

I read it as saying that a model's performance is random and that the observed differences in opinion are the result of overinterpreting random outcomes.

I think, however, that some people always seem to be lucky, which indicates that it is not random but rather down to fixed differences between people and their environments.


A model's adherence to some configuration is a matter of probability. There might be some underlying pattern, but as far as I understand it is not documented, and it may even be impossible to document. So people are just trying stuff and sharing what appears to work. There is no causal link anywhere in these recommendations; they are just based on spurious correlations.

> I've barely seeded the context at that point

I think that's the issue, rather than 60K being small.

Most of the actual edits/changes I request from codex are solved within 100-150K tokens. Beyond 200K I'd definitely try to restart the session as soon as I could, as all models are horrible once you get past ~20% of the total context size. And this is while working on million+ LOC codebases.

The problem, I guess, is that there is no solid, concrete evidence of this degradation (obvious to me, and seemingly to others). It should be easy to prove, yet no one has time to sit down and show it :)

But the likelihood of a model getting minor details wrong seems to skyrocket once you're above some magical threshold between 15% and 20%, and I've hit that issue enough times that my workflow is now built around preventing it.
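To make the rule of thumb concrete, here is a minimal sketch of the arithmetic. The function name, the 1,000,000-token window, and the 20% default are assumptions for illustration only; the threshold is anecdotal, not a measured property of any model.

```python
def should_restart(tokens_used: int, context_window: int, threshold_percent: int = 20) -> bool:
    """Return True once the session has used at least threshold_percent of the window.

    Integer arithmetic is used so the boundary case is exact.
    """
    return tokens_used * 100 >= threshold_percent * context_window


# Hypothetical 1M-token window: 150K is still under 20%, 200K is at the cutoff.
WINDOW = 1_000_000
print(should_restart(150_000, WINDOW))
print(should_restart(200_000, WINDOW))
```

With a smaller window the same 200K tokens would sit far past the cutoff, so the absolute number only means something relative to the window size.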


What are y'all doing to hit that? Do you just not give it any pointers and let it churn away? What kind of context are you handing off?

I routinely get claude to do things pretty decently and finish up easily in the 4-5 digit range of tokens. It seems to be doing the right kind of thing to not waste its time looking at 1000 files.


There were no outages before AI, also no bad code and every PR was a work of art.

The joke was that the outage of the brain slop HQ was a good thing and the person was a hero.

This is me. I found AI to be an incredible provider of structure, focus and productivity; it's an externalized executive function provider. No longer do I forget what last week's meetings were about, no longer am I paralyzed by seemingly insurmountable tasks. It all just flows, and I get to rubber duck against an endlessly patient system. I love it, and I'm somewhat bewildered by some of the takes in this thread. Different strokes for different folks, I suppose.


Berlin "boutique" tech consultancy here: we are seeing a noticeable increase in Israeli and US engineers in our hiring pipeline. The brain drain from the autocratic countries is real.


Please no, I need my vacuum to work reliably in every room.


I can (and do!) pay the power company a pittance to run a surprisingly strong local model on a little box next to my keyboard, and it does a fine job. Maybe not as fine as the billionaires' thinking machine, but good enough and often better. Given that fact, I consider reliance on LLMs as much of an "issue" as reliance on a computer.


I would have killed for access to an LLM during school. Not to do my homework (though that too, homework is an antipattern imo) but to fill my gaps at my own pace and level of patience. Just endlessly pestering the AI "ok, but why?" until I grokked it.


You handle a large codebase by enforcing best practices that should have always been enforced: proper up-to-date documentation, strict adherence to conventions and coding guidelines, cross review of deliverables, TDD, and so on. Just whispering "make me a dashboard" into the machine's ear is not how you drive agents to create maintainable and understandable code.


So you are saying AI is a smart, fast autocomplete and the actual intelligence is driven by humans? That aside, it took me a lot of time just to get the version I wanted. What I find increasingly frustrating is that the solution chosen by the AI is not optimal and not production grade, which requires even more tokens and wastes more time. It's good for management, but for us it is difficult to maintain.


Don't be so dismissive. Every person is different, and your struggling with multitasking doesn't mean everyone does.


From [1]

The scientific study of multitasking over the past few decades has revealed important principles about the operations, and processing limitations, of our minds and brains. One critical finding to emerge is that we inflate our perceived ability to multitask: there is little correlation with our actual ability. In fact, multitasking is almost always a misnomer, as the human mind and brain lack the architecture to perform two or more tasks simultaneously. By architecture, we mean the cognitive and neural building blocks and systems that give rise to mental functioning. We have a hard time multitasking because of the ways that our building blocks of attention and executive control inherently work. To this end, when we attempt to multitask, we are usually switching between one task and another. The human brain has evolved to single task.

[1] https://pmc.ncbi.nlm.nih.gov/articles/PMC7075496/


Fair enough, so it's a misnomer. Let's call it task switching then, since we don't actually do tasks at the same time, but switch from one to the other. A Claude Code session helpfully prints a small TL;DR summary of the ongoing session, so that one can quickly onboard again to the task at hand. I do not find that draining, personally.


Just today I installed the CLI version of antigravity (agy) and have been using it as a headless subagent from within Claude, so uh, this works today?


And how do you get this to work exactly? I keep getting variations of "Missing required parameter: redirect_uri" in the OAuth flow.

The solutions proposed by Gemini and Google's AI summaries all hallucinate agy subcommands that don't exist, hilariously.

Edit: after bouncing around several GitHub threads, I realized that the agy TUI framework is wrapping the URL in a way that causes spaces to be inserted where the URL wraps. That's hilarious.
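For anyone hitting the same wrapping problem, a minimal sketch of a workaround is to strip the whitespace out of the copied URL before opening it. The helper name and the example URL below are made up for illustration and are not part of agy; the approach assumes the original URL had no literal spaces, which holds for a properly encoded URL.

```python
import re


def unwrap_url(wrapped: str) -> str:
    """Remove spaces and newlines that were inserted where a long URL wrapped."""
    return re.sub(r"\s+", "", wrapped)


# Hypothetical URL copied from a terminal that wrapped it mid-parameter.
copied = "https://example.com/oauth?client_id=abc&redirect_ \n uri=http%3A%2F%2Flocalhost"
print(unwrap_url(copied))
```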


Right above the CLI download link on that page, there's a warning icon with "Authenticate with Antigravity or Antigravity IDE before using the CLI."


Ok, I guess this is outdated then because I did what I said I did.


Ah!

You're right, I just re-tested on my server and was able to get it to work now. Thank you! It does appear to just be stale documentation.

