Not sure I understand everything about chunking

Hi,

I have read a lot of documentation, but I haven’t found a clear answer to my two questions:

1 - If I set up a node and add a file to it, is the file duplicated and cut into multiple chunks? So does everything take up twice the space on the owner’s HDD?

Is there a way to avoid that?

2 - If another node downloads this file, does it automatically become a seeder of this file? Or, like the first node, does it have to keep two copies of the file, one usable by the system (a .jpg, for example) and another one chunked so it can seed? So does it use twice the space on the HDD too?

Thanks :slight_smile:

Yes

Yes, see https://github.com/ipfs/go-ipfs/blob/master/docs/experimental-features.md#ipfs-filestore
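As a rough sketch of what enabling this looks like, the two commands below are wrapped in Python so they can be scripted. The `ipfs config --json Experimental.FilestoreEnabled true` and `ipfs add --nocopy` invocations come from the linked document; the helper names `filestore_commands` and `run_commands` are made up for illustration, and a daemon that is already running may need a restart before the config change takes effect.

```python
import subprocess


def filestore_commands(path):
    """Build the CLI calls that enable filestore and add `path` without copying it.

    Hypothetical helper: only the ipfs arguments come from the go-ipfs docs.
    """
    return [
        # One-time switch for the experimental filestore feature.
        ["ipfs", "config", "--json", "Experimental.FilestoreEnabled", "true"],
        # Add the file by reference instead of copying its chunks into the datastore.
        ["ipfs", "add", "--nocopy", path],
    ]


def run_commands(commands):
    """Run each command in order; raises CalledProcessError if one fails."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling `run_commands(filestore_commands("/data/big-file.bin"))` would then add that (example) path by reference. The original file has to stay where it is and unmodified, since the node reads from it when serving chunks.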

Yes. In this case the node would only have one copy, the one stored inside the “ipfs datastore” as chunks (blocks).

Well, interesting, I didn’t know about this feature.

How does it work exactly? Does it chunk the file on the fly? Does the node that gets the file receive the whole file or all the chunks? If this other node has this feature enabled, does it need to keep the chunked version too (if it only downloads the chunks that recreate the original file, does this node keep the chunks)?

Thanks

Rather than copying each chunk into the datastore, filestore just keeps a reference to the original file and the chunk’s position in it.

It still sends chunks over the network, and the node downloading the file receives chunks and stores chunks.
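To make the "reference instead of copy" idea concrete, here is a minimal conceptual sketch. It is not how go-ipfs is implemented: it assumes fixed-size chunks of 256 KiB, uses a plain SHA-256 hex digest where IPFS would use a CID, and the names `ChunkRef`, `chunk_refs` and `read_chunk` are invented for the example.

```python
import hashlib
from typing import List, NamedTuple

# Assumption: fixed-size chunking at 256 KiB; the real chunker is configurable.
CHUNK_SIZE = 256 * 1024


class ChunkRef(NamedTuple):
    digest: str   # stand-in for a CID (plain SHA-256 hex, not a real CID)
    path: str     # original file on disk
    offset: int   # where the chunk starts in that file
    length: int   # how many bytes the chunk covers


def chunk_refs(path: str, chunk_size: int = CHUNK_SIZE) -> List[ChunkRef]:
    """Walk the file once and record a reference per chunk, storing no chunk data."""
    refs = []
    offset = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            refs.append(ChunkRef(hashlib.sha256(data).hexdigest(), path, offset, len(data)))
            offset += len(data)
    return refs


def read_chunk(ref: ChunkRef) -> bytes:
    """Serve a chunk by reading it back from the original file."""
    with open(ref.path, "rb") as f:
        f.seek(ref.offset)
        data = f.read(ref.length)
    # If the file was edited after being added, the stored digest no longer matches.
    if hashlib.sha256(data).hexdigest() != ref.digest:
        raise ValueError("original file changed since it was added")
    return data
```

The only extra disk space used here is the small list of references, which is why the node that adds the file does not need a second copy. A node that downloads the file has no original to point at, so it stores the chunks themselves.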

OK, I understand. Does that mean that if I don’t want each node to have double-sized files, then after the file has been downloaded I have to erase all the chunks and re-add the file to get the same CID and be a filestore seeder?

No. Nodes will not have double sized files. You don’t need to erase or re-download anything.

Sorry, I didn’t explain what I have in mind: I’m looking for a solution where every user who downloads something also becomes a seeder, so everyone has to run a node to be a seeder (or did I miss something?).

I work for a small company that handles pretty large files, and in my country internet connections aren’t very good. For now we are COVID-free, but everyone thinks the virus will reach us sooner or later, so my boss asked me to find a way to make teleworking as easy as possible, with only one server and an old ADSL connection to send these files to users when they need them. To avoid saturating the server’s connection, I’m looking for a solution that lets everyone get the files they need with the maximum bandwidth possible (if that weren’t a problem, I would just set up an FTP server and that would be it).

But it’s not possible to have double-sized files on every user’s computer, and with the financial crisis it’s not possible to rent a server somewhere, so I’m trying to find a solution that meets the needs and constraints. Currently I use SyncThing with a custom interface, but every update breaks something and I have to modify it and send a patch to everyone… IPFS looks simpler to manage, since I just have to provide the CIDs of the files, but these double-sized files will be a problem.