Hacker Newsnew | past | comments | ask | show | jobs | submit | pixlmint's commentslogin

I’m working on my developer portfolio that deeply incorporates the forgejo API, which is where I host my code. It basically gives a personalized dashboard for my personal projects.

Quit my Claude pro subscription last week and purchased credits for an API inference provider. I think I might even end up saving money, since I really don’t use AI that much, and I actually found that gemma4:31b is fine for most of my non-coding inquiries.

Gemma is amazing with tools for anything that is not crazy complex. I think a lot of people have a wrong perception of it because Google's new prompt format broke implementations like llama.cpp and it took quite a while to get everything sorted. But even the tiny variants running on edge devices are surprisingly capable when used right.

The frontier will probably keep moving for a while, but it will be increasingly disconnected from normal human use. In the future, if you're not trying to solve a research level math problem, you'll probably do it locally and fully privately. Which also means the payday when they will fundamentally no longer be able to reach a billion users with frontier models will come soon for the labs. Even if they do get their IPO out, it will probably crash and burn at current valuations.


Do you guys actually work with these models?

I have to use GPT 5.4 Mini at work. It benchmarks higher than that Gemma 4 model.

In my experience it's next to useless. It cannot even move 20 existing lines of code from A to B without breaking them half of the time.

If you tell it to look something up in your dependencies, it's 50/50 on whether the answer is correct, incorrect, or it simply didn't perform the search at all.

I find it next to useless, and I'm mostly better off doing the work manually.

It's a night and day difference to even Sonnet, not to mention the SOTA.


Counter: I use 5.4 mini all time for coding. No trouble letting it implement features. Entire new screens, APIs and various components.

It ain’t the best for sure, but if you have trouble letting it move 20 lines I don’t know what’s the cause but that’s not my experience at all. I do make pretty extensive use of guardrails and proper instructions in my AGENTS.md.

I also value super boring code bases with an as much as possible uniform shape. I guess that’s also helping out.


>It benchmarks higher than that Gemma 4 model.

Depends on what you look at. Gemma 4 31B without reasoning benchmarks significantly higher than GPT-5.4 without reasoning on artificial analysis. Even the new Gemma 4 12B beats it. And while GPT-5.4 with xhigh reasoning beats the reasoning version of Gemma 4 31B, the question is why you would throw such a complicated task that needs so much reasoning at such a small model to begin with. So if you do coding, you'll probably not have much success with either model. But for actual simple tasks that these models were made for, they are extremely capable. E.g. hook it up to the Atlassian MCP and have it do all the stuff that is supplemental to coding in big enterprises.


Like I said in my original comment, it’s fine for non-coding tasks, meaning I primarily use it to answer questions

The MoE variant was perfect for speedily generating hundreds of vocabulary mnemonic flash cards for my daughter to study for the SAT. "Ant bait abates our ant problem" and "A droid adroitly fixes things around the house," for example.

We also used z-image to generate accompanying illustrations.


“Moving lines of code” is a very peculiar eval tbh. I’ve never used Gemma for agentic tasks, but did have it write code, including multi-turn, and I was very positively surprised how well it performed.

It wasn't so much an eval, I really just wanted a small change moved out to another branch.

GPT 5.4 mini couldn't do it. Not even on the second attempt, where it went from obviously wrong to a subtly wrong copy.

In the end I had to manually copy and paste the 10-20 lines over.

If it can't even do that job, I seriously doubt it's going to be adequate for implementing a plan, like people often seem to suggest it could do, in order to save output tokens of a better model.


Like I said, I never really used it for agentic work. I had previously evaluated locally runnable models with opencode (such as qwen3-coder), but found that it wasn't really feasible.

Since then I've adopted a different philosophy, and I actually prefer it this way.

I still very much enjoy doing most coding myself, but when I tried using tools like Claude Code, it felt very difficult to return to the codebase after letting Claude make some changes. Maybe that's just because of poor AI-use discipline, I don't know. But with smaller models, that's not even an issue. I can't just let it do all the coding and thinking for me, however if I can describe a function I want to great detail in plain english, then Gemma can write it for me, and it will most likely work. It's perfect for boilerplate.

I also recently worked with a web framework I'd never worked before, though I'm deeply familiar with other ones. So I asked it "I know how to do this in Y framework, what's the best-practice approach to doing it in Z framework?" and it was incredibly helpful, even pushing back on some of my 'bad' attempts at solving a problem.

I think GPT5.4 mini might fall into a similar category, in that it probably performs best when not overwhelmed with too many tools/ skills/ mcps, instead being given clearly defined tasks by an orchestrator model. I call those my token burners, as they're super cheap to run and have high tokens/second.


Cursor 2.5 is essentially kimi and I find it eminently usable.

i use for tasks like object recognition in my family photos and cooking videos . seems to be fine

Got a link to that API inference provider?

Just look up OpenRouter, OpenCode Go/Zen, Together, Fireworks, Cerebras, etc.

DeepSeek Platform API is worth checking out too, due to their insanely good caching and token costs.


I use DeepSeek via OpenRouter, the caching seems to work there too, you just need to force it to use DeepSeek as a provider otherwise it picks a random one every time. (You can pass a provider option in the call, or better, create a preset in your account.)

I'm Ollama Cloud which has a coding plan style model but without restrictions on the harness or direct API calls from your code.

I use novita ai

Yup, I always phrased this as “if you can’t be arsed to write it, I won’t read it”

Holy typo… does the one world not have proofreaders?

Nope! They were sent off along with other useful professionals such as telephone cleaners on a big space ship!

I absolutely love the design! Thanks for sharing

the idea that llm's can't help if you're missing domain knowledge is crazy.


I mean if you ask the AI to make something to manage the inventory in a warehouse without any detail about how the warehouses operate then you're going to get a worse result than a domain expert talking to the AI.

The problem is that more and more people are getting convinced by the AI's that they're domain experts when they're really not.


As a software engineer I am so deeply ashamed of how quick so many in the field have done a complete 180 on "productivity cannot be measured by lines of code" to wearing lines of code like a badge of honour.


I wonder what will happen once these guidelines end up in the LLM training datasets


if an LLM says "I can't open a PR automatically until you solicit a review from a maintainer", i think that's good actually. likewise for proactively following the rest of the rules.


It's not the submitter who solicits, but the reviewer. They can't give code, AND THEN get approval, they need to be asked specifically for an llm created PR.



Content not viewable in I and I's region


it says “that’s right” in bold red cursive font. i know what the policy says, i wrote it, you nincompoop.


Such a cool project! Now it's just missing search and a request for donations


It's also missing the defrag tool. Without it, it's going to be very slow as the disk fills up.

Should put a shortcut to it on the desktop as well, so that users who experience significant lag can defrag at will.


GitHub centralizes 2 things: Authentication, as well as Repository Hosting.

Does the code really need to be hosted in a central location like this? (Clearly not, which is why people are leaving GitHub in the first place)

But the one part GitHub provides that's genuinely valuable is the social aspect, and when you get a PR from a user named torvalds you can trust that this is in fact Linus. This isn't the case with more distributed systems.

That's why I'd really like to see some entity handle just the auth/identity providing. Forgejo/ Gitea/ Gitlab instances can then choose to use that. Then, for example if you want to take on another contributor and they have their own forgejo instances, you can invite them through this provider, when they fork your repo it ends up in their own forgejo, and they can easily create PR's into your repo.


Tangled is working on something like that. I believe they are federating on the @protocol.

https://tangled.org/


I am very active on bsky and I also use some other ATProto applications like tangled. I think this is the first time I have seen anyone refer to ATProto with an '@'


It's less used but the @ is the atproto logo. I default to saying aye-tee instead of at though. It just sounds better.


GitHub also centralises abuse detection. I'm not thinking about sophisticated attacks here so much as dealing with plain old spam. That's fairly easy to deal with on a tiny scale, and possible on a huge scale, but it's a great pain at a medium scale.


Do they though? Slow PRs have been a consistent pain point for GitHub hosted projects lately.


I would argue GitHub does a lot more centralization than just those two. It's an entire developer platform centered around Git. It does hundreds of other things that some developers use, and some don't.


GitHub really doesn't have hundreds of additional working features beyond git.


Collaboration, issue tracking, Actions (CI/CD), Codespaces, Security, AI, Identity, social, hosting. Those are like broad categories I can think of off top of my head too you could fit probably 10-15 "features" into each of those.


Lots of those aren't centralisation/decentralisation issues though - codespaces, actions, issue tracking, multiple users, all that is repo/org level.


GitHub is centralizing all those features. If GitHub was just for repo hosting, then you would need to link your repo to another platform to do CI, another platform for issue tracking relative to the code, etc.


They have those features but those aren’t decentralisation issues, as far as I’d understand the term, but those already can be done elsewhere right now. They’re purely tied to one repo really, it’s only the user accounts that I can see being more of a cross repo concern.

And global search but I don’t feel like that even really works.


> That's why I'd really like to see some entity handle just the auth/identity providing. Forgejo/ Gitea/ Gitlab instances can then choose to use that. Then, for example if you want to take on another contributor and they have their own forgejo instances, you can invite them through this provider, when they fork your repo it ends up in their own forgejo, and they can easily create PR's into your repo.

Agree, I feel like a true alternative should focus on this missing piece to bridge the gap.


The "missing" piece is just everyone implementing OAuth Dynamic Client Registration. Then kernel.org could be its own OAuth provider, and Linus could log into someone's Forgejo with his kernel.org login.

Just like "log in with Google", you should be able to do "log in with OAuth", you type your email or domain (or your browser fills it), and it triggers a redirect flow for login. Then people can use GitHub or Google or Apple or their own provider, just like email. Every email provider could also be an OAuth provider.


GitHub is to git like Reddit was to forums. Centralized usernames and such were very nice, but it also has downsides that we’re now living with.

GitHub is still really, really nice in that it’s five seconds to throw up a repo that’s accessible worldwide (98% of the time lol) and everyone’s on there. Whatever replaces it (just like whatever replaces twitter) may be better in many ways, but it will be “worse” in others, even if just in splintering.


Signed commits could solve this in a more decentralized way if people post their public keys on their own domains.


Own domains is the real deal. My preffered model is tarball releases with checksums, or better yet, with signatures (like remind[0] or msmtp[1]). Such pages are trivial to host properly and loads quickly.

[0]: https://dianne.skoll.ca/projects/remind/

[1]: https://marlam.de/msmtp/download/


I was confused for a bit what those two projects have to do with signatures but I guess you are just using them as examples of having (PGP) signatures for downloads?


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: