
I've repeated the argument over and over since the GPT-2 days, when I derived it theoretically by inspecting the architecture of the model. I am now fatigued, and enough other people have taken up similar arguments – some developed halfway to a mathematical proof – that I no longer feel the obligation to keep repeating myself.


You could post a link.



