How to understand "IPFS removes duplications across the network"

When I add a new file to node-A and then add (without pinning) the same file to node-B, what happens in IPFS? If I run the ipfs get or ipfs pin command on node-C, will it fetch the file chunks from both node-A and node-B simultaneously? If so, how should I understand “IPFS removes duplications across the network”?

I'm not sure I can give you the correct answer, but here is my understanding. When you add a file to a node, the node announces to the network that it has this file. So in your case, both node-A and node-B announce that they have it. When you get the file on node-C, you will usually receive chunks from both node-A and node-B simultaneously, although that depends on routing. Within a single IPFS node, there’s only one copy of a given piece of content, even if you add the same file under different file names.
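To illustrate why both nodes end up announcing the same thing, here is a minimal sketch of content addressing. It assumes a plain SHA-256 hex digest as a stand-in for a real IPFS CID (real CIDs are built differently), and content_id is a hypothetical helper, not part of any IPFS API.

```python
import hashlib


def content_id(data: bytes) -> str:
    # Assumption: a plain SHA-256 hex digest stands in for an IPFS CID.
    return hashlib.sha256(data).hexdigest()


data = b"hello ipfs"

# "node-A" and "node-B" each hash the same bytes independently.
# The file name is never part of the input, so it cannot change the result.
id_on_node_a = content_id(data)
id_on_node_b = content_id(data)

same_id = id_on_node_a == id_on_node_b
different_id = content_id(b"hello ipfs!") != id_on_node_a

print(same_id, different_id)
```

Because the identifier is derived only from the bytes, node-C can ask for one identifier and accept the data from whichever node answers.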

Thanks for your answer. I understand that there’s only one copy of the same file in one node, but from the view of the IPFS network, there are many copies of the same file, right? So those are duplicates of the same file, and I’m confused about how the “many copies across the network” fact fits with the “removes duplications across the network” part.

I copied a description from the IPFS introduction (quoted below) that may help you see what deduplication means.
The same block can exist on different nodes so that you can fetch it as fast as possible; you can think of it as a CDN. Removing duplications across the network means that the same block never appears under two different hashes in the network.

  1. Deduplication: all objects that hold the exact same
    content are equal, and only stored once. This is
    particularly useful with index objects, such as git trees
    and commits, or common portions of data.
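As a rough sketch of "only stored once" within a single node, the example below keeps blocks in a dictionary keyed by their hash. BlockStore, CHUNK_SIZE, and the fixed-size chunking are all assumptions made for illustration; they are not how IPFS is actually implemented.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical and deliberately tiny so the sharing is visible


class BlockStore:
    """Toy block store: each block is stored under the hash of its bytes."""

    def __init__(self):
        self.blocks = {}

    def add(self, data: bytes) -> list:
        ids = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            key = hashlib.sha256(chunk).hexdigest()
            # A block that is already present is not stored a second time.
            self.blocks.setdefault(key, chunk)
            ids.append(key)
        return ids

    def get(self, ids) -> bytes:
        # Rebuild a file from its ordered list of block ids.
        return b"".join(self.blocks[key] for key in ids)


store = BlockStore()
file_one = store.add(b"AAAABBBBCCCC")
file_two = store.add(b"AAAABBBBDDDD")

# Six chunks were added in total, but the two files share "AAAA" and "BBBB".
stored_block_count = len(store.blocks)
print(stored_block_count)
```

The two files reference six chunks between them, yet the store holds only the distinct ones, which is the "common portions of data" case from the quote.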

Thank you. I think I understand now.