Git on IPFS - Links and References

So perhaps the naive way isn’t so bad in relative terms? To elaborate on my reply in “Handle blob objects larger than MessageSizeMax” (Issue #18, ipfs/go-ipld-git).

The attack is

But for any Merkleized data, we have:

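hash := targetHash
for {
    // Forge a chunk that matches the current target and embeds a further SHA-1 CID.
    chunkWithHash := solveForChunk(hash)
    send(chunkWithHash)
    // Assign (not redeclare) so the next round targets the embedded CID.
    hash = extractSha1Cid(chunkWithHash)
}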

i.e. the fraudulent chunk can utilize “plausible CIDs” which keep the attack alive for future rounds.

Also, thanks to the Merkle–Damgård construction, SHA-1 puts the message length in the padding at the end, so an attacker still needs to commit to a length up front. This means no attack can waste resources indefinitely. Furthermore, users are free to set policies that cap the largest blob they’ll try to receive as some function of how much they trust the peer, to further mitigate spam.
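
To make the length commitment concrete, here is a rough sketch in Go. sha1Padding follows the standard SHA-1 padding rule (a 0x80 byte, zeros, then the 64-bit big-endian bit length). Everything else is an assumption for illustration only: maxBlobBytes, its trust score in [0, 1], and the 1 MiB / 1 GiB bounds are hypothetical and not part of go-ipld-git or any IPFS API.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// sha1Padding returns the bytes SHA-1 appends to a message of msgLen bytes:
// a 0x80 byte, zeros until the total is 56 mod 64, then the bit length
// as a 64-bit big-endian integer.
func sha1Padding(msgLen uint64) []byte {
	zeros := (64 + 56 - (msgLen+1)%64) % 64
	pad := make([]byte, 1+zeros+8)
	pad[0] = 0x80
	binary.BigEndian.PutUint64(pad[1+zeros:], msgLen*8)
	return pad
}

// declaredLength reads the committed byte length back out of the
// trailing 8 bytes of a padded final block.
func declaredLength(finalBlock []byte) uint64 {
	return binary.BigEndian.Uint64(finalBlock[len(finalBlock)-8:]) / 8
}

// maxBlobBytes is a hypothetical policy: scale the largest acceptable
// blob linearly with a trust score clamped to [0, 1].
func maxBlobBytes(trust float64) uint64 {
	const floor = 1 << 20   // assumed cap for an untrusted peer (1 MiB)
	const ceiling = 1 << 30 // assumed cap for a fully trusted peer (1 GiB)
	if trust < 0 {
		trust = 0
	}
	if trust > 1 {
		trust = 1
	}
	return floor + uint64(trust*float64(ceiling-floor))
}

func main() {
	pad := sha1Padding(1000)
	n := declaredLength(pad)
	// Accept the transfer only if the committed length fits the policy.
	fmt.Println(len(pad), n, n <= maxBlobBytes(0))
}
```

A receiver could read the declared length once, compare it against its policy for that peer, and refuse anything larger before spending further bandwidth.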

Now it could well be that solveForChunk is substantially harder than solveForHashChunkPair, but given that SHA-1 is kinda “hosed anyways”, I’d consider cautiously banning this for most CIDs, but making an exception for git-raw + sha1 given its ubiquity.

Ultimately, I think it’s in IPFS’s best interest to lobby hard for git + sha256 to chunk blobs, but IPFS will have more clout if there’s a stop-gap solution for git + sha1 to attract git users.