Why is there some increase in disk usage even after a file is deleted? Is the file's metadata left behind?

Setup: a 2-node private IPFS network.
Whenever a file needs to be deleted, it is unpinned and garbage collection is then run on both nodes, so that the file is no longer accessible in the network. After garbage collection, the space occupied by IPFS is still larger than in the initial state (before uploading the file). Does IPFS keep metadata of the file even after garbage collection?
Please refer to this table for disk stats.

Which datastore are you using? The default (flatfs) or badger?

I believe I am using the default one.
I am using the jbenet/go-ipfs Docker image for IPFS.

I just ran a test with flatfs too and see the same behavior. It looks like it’s leaving something behind in both the datastore and blocks directories.
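For anyone who wants to repeat this comparison, here is a minimal sketch that reports how much disk space each repo subdirectory occupies, so the numbers can be recorded before adding a file and again after garbage collection. The function names are hypothetical, and the `blocks` and `datastore` subdirectory names are taken from the observation above; the repo path you pass in is an assumption about your own setup.

```python
import os


def _allocated(st):
    # Prefer allocated blocks (Unix reports them in 512-byte units);
    # fall back to the apparent size where st_blocks is unavailable.
    blocks = getattr(st, "st_blocks", None)
    return blocks * 512 if blocks is not None else st.st_size


def tree_usage(root):
    """Sum the on-disk bytes of root and everything below it,
    counting directories themselves as well as files."""
    total = _allocated(os.lstat(root))
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                total += _allocated(os.lstat(os.path.join(dirpath, name)))
            except FileNotFoundError:
                # The entry vanished while we were walking (e.g. GC is running).
                pass
    return total


def repo_usage(repo, subdirs=("blocks", "datastore")):
    """Return {subdirectory name: bytes used} for the subdirectories that exist."""
    usage = {}
    for name in subdirs:
        path = os.path.join(repo, name)
        if os.path.isdir(path):
            usage[name] = tree_usage(path)
    return usage
```

Calling `repo_usage` on the repo directory before the upload and again after garbage collection, then subtracting the two results, should show which subdirectory retains the extra space.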

@leerspace
Could it be data synced from another node, such as the node list or routing tables?
I just need to make sure none of the file's data remains on the node.

That would be my guess. A developer would need to weigh in on whether this is a bug, there’s some kind of metadata getting stored in the repo, or if maybe not being able to shrink all of the way back to its original size is a limitation of how the datastore works.

That’s due to leftover empty directories (each will be at least 4KiB in size). Basically, the current default datastore stores each file “chunk” in a separate file. To avoid having a single massive directory (filesystems tend to behave poorly with regard to large directories) we shard these files into different directories. Given the number of chunks you’ll likely have, I’d expect each one to be stored in a separate directory and I’d expect you to have about 416 directories. 416 * 4KiB ≈ 1.6MiB. The rest is probably due to other random small bits of data.
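The estimate above can be checked with a short sketch like the following, which counts the empty directories left under a blocks directory and multiplies by the per-directory size. The function names are hypothetical, and the 4 KiB figure is the minimum quoted above; the real per-directory size depends on the filesystem.

```python
import os

# Assumed size of one empty directory: the 4 KiB minimum quoted above.
DIR_SIZE_BYTES = 4 * 1024


def count_empty_dirs(root):
    """Count directories below root that contain no files and no subdirectories."""
    empty = 0
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath != root and not dirnames and not filenames:
            empty += 1
    return empty


def estimated_overhead(num_dirs, dir_size=DIR_SIZE_BYTES):
    """Bytes occupied by num_dirs empty directories of dir_size bytes each."""
    return num_dirs * dir_size


def to_mib(num_bytes):
    """Convert a byte count to MiB."""
    return num_bytes / (1024 * 1024)
```

With the figures from the post, `to_mib(estimated_overhead(416))` gives 1.625, which matches the roughly 1.6MiB mentioned. Running `count_empty_dirs` on your own blocks directory after garbage collection shows how many leftover shard directories there actually are.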

Why doesn’t IPFS delete these directories during garbage collection?

And if you are referring to file data, is it possible to reverse engineer these small bits of data to recover small pieces of the original content?

In practice, each directory will store many files, so there’s no point in repeatedly creating and deleting them.

If you are referring to file data, is it possible to reverse engineer these small bits of data to recover small pieces of the original content?

Sorry, I meant small bits of other data (e.g., data from the DHT). Note: IPFS hasn’t been designed to completely wipe out all traces of a file having been added, so don’t rely on it for that. We don’t intentionally leave any pieces of the file behind, but we likely leave behind evidence that the file was there.

Thanks for the useful information.

Is there any way I can clear this evidence without modifying other files’ content?

Not with any amount of certainty (we may log CIDs, etc.). FYI, your operating system won’t even provide any guarantees around things like this (and will often simply mark data as deleted so it can come and clean it up later when it needs the space).
