This is a very long document that says nothing about chunking at first skim. If chunking is actually wrong, then just explain why, here. Wasting space is not actually a problem if it’s optimized for other purposes instead.
When it comes to large assets, wasting large chunks of space is a problem. If your chunks are 64 KiB on average (per the Lore document) but changes average only 1 KiB (which could be a high estimate), then you will still run out of space 64 times faster and need to read 64 times more data off the disk for certain operations.
It also makes diffing hard, as well as diff viewing.
Seems like if Lore wants to reduce space usage, they could apply something like Git's delta compression (as used in packfiles) to the chunks.
Suppose you make a 1 kB change in a 50 MB file. That causes a 64 kB chunk to be created and stored. Disk space is wasted.
But since the 50 MB file was already stored as a sequence of 64 kB chunks, there is an existing 64 kB chunk that is very similar to your new 64 kB chunk. You can store your new chunk as a delta to that, so only ~1 kB of disk space is used.
Admittedly, it's complicated and inelegant. But it allows both deduplication between files (one of the reasons Lore chose chunks, apparently) and efficient space usage for small changes.
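To make that concrete, here is a minimal sketch of storing a new chunk as a delta against a similar existing chunk. It assumes an in-place edit, so the old and new chunks share a long prefix and suffix. It is not Git's packfile delta format and not anything Lore is known to do; `make_delta` and `apply_delta` are hypothetical names made up for illustration.

```python
CHUNK_SIZE = 64 * 1024  # average chunk size quoted from the Lore doc


def make_delta(base: bytes, new: bytes) -> tuple[int, int, bytes]:
    """Encode `new` as (shared prefix length, shared suffix length, differing middle)."""
    limit = min(len(base), len(new))
    prefix = 0
    while prefix < limit and base[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    # The suffix must not overlap the prefix already matched.
    while suffix < limit - prefix and base[-1 - suffix] == new[-1 - suffix]:
        suffix += 1
    return prefix, suffix, new[prefix:len(new) - suffix]


def apply_delta(base: bytes, delta: tuple[int, int, bytes]) -> bytes:
    """Rebuild the new chunk from the base chunk plus the stored delta."""
    prefix, suffix, middle = delta
    return base[:prefix] + middle + base[len(base) - suffix:]


# Hypothetical data: a 64 KiB chunk with a 1000-byte in-place edit.
old_chunk = bytes(CHUNK_SIZE)
new_chunk = old_chunk[:20000] + b"\x01" * 1000 + old_chunk[21000:]
delta = make_delta(old_chunk, new_chunk)
```

Only the differing middle plus two integers needs to be stored for the new chunk. The cost is that reading the new chunk now requires reading the base chunk as well, which is part of why this is inelegant.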
I tried to give that section of the doc a fair read.
Looks like operational transforms to me.
The doc claims it's the first with this technique. A 30 second search reminded me of Darcs, and taught me about Pijul, and Weave. And yes, Google Docs storage works the same way - there are probably papers documenting how efficient Google Docs storage is, but it's not wrapped up in a full VCS that folks can use.
The example in the doc uses text, and unfortunately I think it's for a reason. I think with large, binary game assets, the most common operation is going to be strings of "replace A with B", and depending on your chunk size relative to the distribution of changes you make on your assets, I see it as pretty close to a wash for efficiency. That's especially true considering that content-addressable blocks also solve de-duplication, which for a multi-game studio is probably going to be significant, especially if they're managing multiple releases, patches, development branches, etc.
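For anyone unfamiliar with why content-addressable blocks give you de-duplication for free, here is a rough sketch. It assumes fixed-size chunks and SHA-256 keys purely for simplicity; I don't know what chunking or hashing Lore actually uses, and `ChunkStore` is a hypothetical name.

```python
import hashlib


class ChunkStore:
    """Toy content-addressable store: each chunk is keyed by its own hash."""

    def __init__(self, chunk_size: int = 64 * 1024):
        self.chunk_size = chunk_size
        self.chunks: dict[str, bytes] = {}

    def put(self, data: bytes) -> list[str]:
        """Store `data` and return its manifest (the ordered list of chunk keys)."""
        manifest = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            key = hashlib.sha256(chunk).hexdigest()
            # An identical chunk from any file or version is stored only once.
            self.chunks.setdefault(key, chunk)
            manifest.append(key)
        return manifest

    def get(self, manifest: list[str]) -> bytes:
        """Reassemble the original bytes from a manifest."""
        return b"".join(self.chunks[key] for key in manifest)


# Tiny chunk size so the effect is visible: v2 replaces one chunk of v1.
store = ChunkStore(chunk_size=4)
v1_manifest = store.put(b"AAAABBBBCCCC")
v2_manifest = store.put(b"AAAAXXXXCCCC")
```

After both versions are stored, the store holds four distinct chunks rather than six, because the unchanged first and last chunks are shared. The same sharing happens across branches, releases, and different assets that happen to contain identical chunks.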
The idea that progress is “slow” in the AI space is absurd. These are some of the fastest growing products and companies of all time. The models are still improving a surprising amount.
It's not that absolute progress is slow; it's that it's extremely slow compared to the predictions. It might be fast in absolute terms, but the "50% of coders will be obsolete by 2023" prediction has been renewed every six months, and it's becoming increasingly clear that there's a real chance it might never happen.
"Coders being obsolete" is not a measure of AI capabilities. I see coders being busier than ever before. I see people without coding knowledge falling further behind. The gap is widening, not shrinking.
I see people without coding ability catching up to learned coders. AI is a huge force multiplier for people who don't have hands-on, detailed technical knowledge. AI can increasingly handle that and just needs a human to steer it more broadly.
It's a huge force multiplier for people who have hands-on, detailed technical knowledge as well.
3 * N < 42 * N
42 - 3 < N * (42 - 3)
It helps to know the layer you're working on.
People seem to make the mistake of judging how good LLMs are on tasks they are familiar with and extrapolating that to the whole population.
It's a good mental exercise to think about how little you can do compared to an expert on tasks you never thought of working on.
I.e., if you're a programmer or know something about finance, don't think about how much it enables you to do better coding or investing; think instead about how much it doesn't enable you to work on something you don't know, like maybe molecular biology or visual special effects. It's all there, but it's a much better multiplier for people who do know their shit.
Knowing the layer you're working on helps a lot; it gets multiplied.
Knowing programming is becoming a more fundamental skill than ever before, as it lies at the foundation of almost everything else.
I suspected you felt that way even though it hasn’t been my personal experience.
I’ve heard people say older models can’t do X, when I used them that way, etc. I suspect people are folding their own learning curve into their assessment of progress: you get better at writing prompts and it feels like the model improved.
Which is why I’m saying we need some objective metrics to judge predictions of actual capacity.
I mean you wanted something objective, and they are. I don’t know why you’re being dismissive of them, they’re a huge element of what drives model development forward.
These companies aren’t just making stuff up, they really do want to improve the models, and the models really are improving.
I don't know what you're reading, but he's talking about the fact that her startup is growing, and has happy end users, who are purchasing her product, and telling their friends.
That would also be inaccurate. They made $4B last year in profit. They also had the second best selling car in the world. I don't think you're using a normal definition of "crash."
A Wall St analyst was asking questions about it on Tesla earnings calls in 2016 and later wrote a research note about the idea. Been widely discussed since.
I mean, maybe I’m missing the point, but it seems like it’s intended to be more of a tool for debugging race conditions than a runtime you actually ship with. For that purpose I think using an LLM is fine.
Do you think it would be a good thing if all music (for example) was made with AI and not by humans? What a lot of people in tech right now don’t seem to get is that art is about how it transforms the artist, not just the final product.