Does "ipfs add" duplicate content?

Whenever I add a new file to IPFS, are files duplicated? In other words, if I want to share a 1GB file, would I have to host 2GB of data (my original file + a copy for IPFS)?

Currently, yes, all files added to IPFS are also added to the IPFS blockstore.
This is being worked on: the goal is to let blocks in the blockstore reference the original files on disk, so that data is not duplicated.
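To put the duplication in numbers, here is a rough sketch of the disk space needed with and without a blockstore copy. The function name disk_needed and the mode names "copy" and "nocopy" are made up for this illustration, and the arithmetic ignores chunking and metadata overhead, so treat the results as approximations.

```shell
# Rough estimate of the disk space needed to host one file.
# $1 = size of the original file in bytes
# $2 = "copy"   (file is duplicated into the blockstore) or
#      "nocopy" (blockstore only references the original file)
# Both the function name and the mode names are hypothetical.
disk_needed() {
    case "$2" in
        copy)   echo $(( $1 * 2 )) ;;   # original + blockstore copy
        nocopy) echo "$1" ;;            # original only (overhead ignored)
        *)      echo "unknown mode: $2" >&2; return 1 ;;
    esac
}
```

For the 1GB file from the question, disk_needed 1000000000 copy gives roughly the 2GB mentioned there.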

This is the PR that aims to implement this:

I’m not 100% sure what the progress on that is, but it’s a massive change, so it will definitely take a while to complete.

The Filestore feature (aka “add without duplicating”) will be getting some extra attention this week during the 300 TB Challenge Sprint.

This feature is especially important for people who have large amounts of data. For example, if the Internet Archive wanted to put a petabyte of data on IPFS right now, they would have to spin up an extra petabyte of storage. That’s a major blocker. We’re working hard to unblock it, but we have to be careful to do it right.

if the Internet Archive wanted to put a petabyte of data on IPFS

Is this just a hypothetical dream, or have they expressed any serious interest in using IPFS once this feature is available?

I’m speaking hypothetically, but Brewster Kahle and the IA have been some of the most vocal advocates of the decentralized web, pointing out the systemic benefits of these technologies as well as the ways in which tools like IPFS will allow IA to fulfill their mission in a more robust way. I get the impression that they’re waiting for someone to finish “drawing the rest of the owl” so they can use it. The two remaining things preventing us from saying “Hey IA, take it for a spin” are 1) finishing the filestore implementation and 2) load testing the system. Both of those will get attention this quarter.

This seems to be possible now:

The filestore code has been merged and will be shipped in 0.4.7.

For some notes and usage instructions, see this comment on #3397:

How to enable

Modify your ipfs config:

ipfs config --json Experimental.FilestoreEnabled true

Then pass the --nocopy flag when running ipfs add.
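Put together, the two steps might look like the sketch below. The function names and the IPFS_CMD variable are hypothetical helpers, not part of ipfs; IPFS_CMD only exists so the commands can be dry-run (for example with IPFS_CMD="echo ipfs"). The file path in the commented example is a placeholder.

```shell
# IPFS_CMD is a hypothetical override for dry runs; it defaults to the real binary.
IPFS_CMD="${IPFS_CMD:-ipfs}"

# Step 1: turn on the experimental filestore in the repo config.
enable_filestore() {
    $IPFS_CMD config --json Experimental.FilestoreEnabled true
}

# Step 2: add a file by reference instead of copying it into the blockstore.
# $1 = path to the file to add
add_nocopy() {
    $IPFS_CMD add --nocopy "$1"
}

# Example (commented out so nothing runs on paste; the path is a placeholder):
# enable_filestore && add_nocopy /path/to/big-file.iso
```

Since blocks added this way point at the original file rather than holding a copy, moving or editing that file afterwards would presumably make the content unavailable.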

Where is the blockstore?

I use Windows 10. I added files from the folder C:\Users\Admin\site.
After adding them, I deleted the originals, so the folder "site" is now empty, but the files are still available in IPFS.
If I unpin those files, where are they? Are they deleted, or do I have to delete them manually somewhere?

By default, the blockstore is located within C:\Users\<Your username>\.ipfs, but you can change this location with the IPFS_PATH environment variable.
If you unpin files you added, they remain in the blockstore until the next garbage collection is run, either manually with ipfs repo gc, or automatically if you started your daemon with ipfs daemon --enable-gc.
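As a sketch of the above for a POSIX-style shell (on Windows this assumes something like Git Bash, where $HOME maps to the user folder): the first helper prints the repo location, honoring IPFS_PATH, and the second unpins a hash and runs a manual garbage collection. The function names, the IPFS_CMD dry-run variable, and the hash in the commented example are all hypothetical.

```shell
# Print the repo directory: IPFS_PATH if set, otherwise ~/.ipfs.
ipfs_repo_dir() {
    printf '%s\n' "${IPFS_PATH:-$HOME/.ipfs}"
}

# IPFS_CMD is a hypothetical override for dry runs; it defaults to the real binary.
IPFS_CMD="${IPFS_CMD:-ipfs}"

# Unpin a hash, then run a manual garbage collection.
# $1 = hash of the content to unpin
unpin_and_gc() {
    $IPFS_CMD pin rm "$1" && $IPFS_CMD repo gc
}

# Example (commented out; QmExample is a placeholder hash):
# ipfs_repo_dir
# unpin_and_gc QmExample
```

Note that ipfs repo gc removes every unpinned block in the repo, not only the blocks of the file that was just unpinned.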

Am I able to apply --nocopy or a similar feature after running ipfs add?

Adding a file I knew existed in the block store using the --nocopy flag increased the size of the block store by a small amount instead of reducing it by the size of the file I added, so it seems not.

My guess is that you need to unpin the old version and remove it from the blockstore (via a forced GC) before adding it again with --nocopy.
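If that guess is right, the sequence could be sketched as follows. This is untested against a real repo and only mirrors the steps suggested above; the function name readd_nocopy, the IPFS_CMD dry-run variable, and the example hash and path are hypothetical. It also assumes the filestore has already been enabled in the config.

```shell
# IPFS_CMD is a hypothetical override for dry runs; it defaults to the real binary.
IPFS_CMD="${IPFS_CMD:-ipfs}"

# Replace a copied file with a --nocopy reference.
# $1 = hash of the existing (copied) content
# $2 = path to the original file on disk
readd_nocopy() {
    # Drop the pin so garbage collection is allowed to remove the blocks.
    $IPFS_CMD pin rm "$1" || return 1
    # Remove all unpinned blocks from the blockstore.
    $IPFS_CMD repo gc || return 1
    # Add the file again, this time by reference.
    $IPFS_CMD add --nocopy "$2"
}

# Example (commented out; hash and path are placeholders):
# readd_nocopy QmExample /path/to/big-file.iso
```

Running it with IPFS_CMD="echo ipfs" prints the three commands without touching the repo, which is a cheap way to check the order first.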
