Garbage collector removes all the data?

Hi, I would like to clarify how the garbage collector works.
I have this in my config file:

```json
"Datastore": {
  "StorageMax": "600GB",
  "StorageGCWatermark": 90,
  "GCPeriod": "1h",
  ...
}
```

What I would expect:
When the reported size is above 600 GB, the garbage collector starts to remove data until the reported size is 600 * 0.9 = 540 GB.

What I actually see:
The garbage collector starts to remove data but stops at ~20 GB of reportedly used data. I have no idea why it keeps so little: it removed ~580 GB, about 97% of all the data.

Why does this happen? It does a similar thing when I set the repo size to 30 GB (it stops at ~2 GB or so). How can I configure it so that it does not remove so much data? My version is the newest, 0.9.0, but the same thing happened on 0.7.0 as well.

Another issue is that the garbage collector is very disk-intensive and IPFS starts to time out, so for several hours the daemon is almost unusable. That makes this pointless deleting even more painful. What works for me is to rename the data folder, delete it slowly in the background, and let IPFS start again with a fresh repo. But I have plenty of disk space; I would rather keep participating than delete everything and download some of it again.
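For the record, that workaround looks roughly like this as a shell sketch (the default ~/.ipfs repo path and the systemd unit name are assumptions; adjust for your setup):

```sh
# Swap in a fresh repo and delete the old one slowly,
# so the daemon stays responsive.
systemctl stop ipfs
mv ~/.ipfs ~/.ipfs-old
ipfs init                      # create a fresh, empty repo
systemctl start ipfs
# Delete the old repo at idle I/O priority so IPFS is not starved:
ionice -c3 rm -rf ~/.ipfs-old
```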

What should I do? Thanks.

No: when the reported size goes above 90% of StorageMax, the GC kicks in and deletes everything that is not pinned or in MFS.
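In other words, GC only spares content you have explicitly protected. A quick way to keep specific content is to pin it or reference it from MFS, for example (`<cid>` is a placeholder and the MFS path is arbitrary):

```sh
# Pin a CID recursively so GC will never remove it:
ipfs pin add <cid>

# Or reference it from MFS (the Files API), which also protects it:
ipfs files cp /ipfs/<cid> /kept-content

# List what is currently pinned:
ipfs pin ls --type=recursive
```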

I’m pretty sure there are issues about this in the go-ipfs repo. Reworking the GC system is something we’ve wanted to do for a long time (but there were other blockers and whatnot).


Ouch, you are right, I totally overlooked this! I don’t know where I got that information from. Thanks.

Thanks. Is there some other way to say “delete some files, but not all of them”?
Ideally I would like to express “use the whole disk but keep 100 GB of free space”, and have it automatically remove data only when needed.

I’m afraid not. When using Badger as the datastore, the GC process has two steps: first it deletes the blocks, then it calls the datastore’s own GC (otherwise no space is actually freed). There is no way to know in advance how much space Badger will be able to reclaim even after blocks have been “deleted”, so I guess that makes your approach impractical across the different datastore options.
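The closest approximation I can think of (a sketch, not anything built into go-ipfs) is to run the daemon without `--enable-gc` and trigger `ipfs repo gc` yourself, e.g. from cron, only when free space actually runs low. Each run still deletes all unpinned blocks, but at least it only fires when needed (the 100 GB threshold and the ~/.ipfs path are assumptions):

```sh
#!/bin/sh
# Sketch: run GC only when free space on the repo's filesystem
# drops below 100 GB. Assumes the daemon runs WITHOUT --enable-gc
# and that the repo lives at ~/.ipfs; adjust both as needed.
THRESHOLD_KB=$((100 * 1024 * 1024))   # 100 GB in KiB
AVAIL_KB=$(df --output=avail "$HOME/.ipfs" | tail -n1)
if [ "$AVAIL_KB" -lt "$THRESHOLD_KB" ]; then
    ipfs repo gc
fi
```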
