IPFS Cluster REST API '--no-status' flag

Hi all,

I wanted to ask whether the --no-status flag from ipfs-cluster-ctl pin add can also be used with the REST API, because we’re currently seeing that every pin add operation on the ipfs-cluster takes around 1.5-2 seconds.

Would this be possible?

Regards
Sebastian

If you are calling the API directly, then there is no status-fetching involved. The CLI calls PinPath (POST /pins) followed by Status (GET /pins/cid), and skips the second call if --no-status is set.
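
For illustration, a rough sketch of the two calls with curl, assuming the REST API on its default 127.0.0.1:9094 address and a placeholder CID:

# Pin a CID directly: a single call, no status fetch involved
curl -X POST "http://127.0.0.1:9094/pins/<cid>"

# What ipfs-cluster-ctl does additionally (and skips with --no-status)
curl "http://127.0.0.1:9094/pins/<cid>"

So if you POST to /pins yourself there is no second round trip to skip; it only exists on the ipfs-cluster-ctl side.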

So, in order to figure out why the POST /pins is taking long for you, I’d need more info on your setup: how big is the cluster in terms of peers? Raft or CRDT? etc.

Thanks for the response. We currently have 3 peers using CRDT, with around 300k pins.

Hmm, I’m afraid I have to ask you to run ipfs-cluster-service with --loglevel debug and send or study the logs from the moment you send the pin to the moment the call returns. I would look to identify which component is taking the longest.
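
If it helps, the invocation should look roughly like this (I believe the flag goes before the daemon subcommand, but check ipfs-cluster-service --help for your version):

# Start the cluster peer with debug logging enabled
ipfs-cluster-service --loglevel debug daemon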

2s seems very high to me for a CRDT pin call. Has it been degrading as the pinset got bigger? Double-check that you are using the POST /pins endpoint and just that, not /add'ing?
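
To make the distinction concrete (same default port as above; the exact multipart form for /add is from memory, so treat it as a sketch): /pins only records a pin for a CID that already exists somewhere, while /add uploads the content itself and is much heavier.

# Light: tell the cluster to pin an existing CID
curl -X POST "http://127.0.0.1:9094/pins/<cid>"

# Heavy: upload the file and add+pin it through the cluster
curl -X POST -F file=@myfile.bin "http://127.0.0.1:9094/add"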

If you don’t want to post logs here, you can send them to hector@ipfs.io.

Yeah, I will try to catch the logs for you. I just started another instance with a fresh IPFS node and responses from this server are around 100-200ms… I think I will also dig a little bit deeper.

I don’t think cluster needs to contact IPFS when you post a new pin though, as this happens asynchronously, so it should not affect how long the call takes. Do you mean that it is way faster with a clean IPFS node but a cluster peer that has 300k pins, or with both clean?

Do you, by any chance, re-pin the same CID over and over?

Hi Hector,

Nope, I’m not pinning the same hash. I’ve monitored the systems and think this is related to the CPU usage of the system… I did an import of around 50,000 hashes, and during this import the pinning was slow, but the CPU was around 75% (2 vCPUs).

But did it become slower and slower? I’ll test myself when I have some time. Anything non-default in the config?

Nope, now we are around 300-350ms for a pin add request.

So an original peer with 300k pins is slow to pin. And a new peer with 50k is faster to pin, but it has only a fraction of those 300k, so it may well become slower once it has 300k items, right?

Also, with 300k pins it now takes around 300ms… The longer duration only occurred when importing 50,000 pin requests in parallel with 20 threads.

The next thing I’m trying to achieve is lowering the memory spikes when the “pin_recover_interval” kicks in, because it takes around 1.5 GB of RAM then. I have already adjusted the Badger options, but it is not getting smaller; after the pin recover is done, the cluster application uses around 320 MB of RAM.

The next thing I’m trying to achieve is lowering the memory spikes

I don’t think there’s much you can do (other than disabling the recover interval). The spike is essentially an in-memory version of ipfs pin ls --type recursive (with some more info). The alternative would be to call ipfs pin ls $cid 300k times.
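
If spacing out or disabling the recover operation is acceptable for you, the interval is set in the cluster section of service.json. A sketch, e.g. running it once a day (I believe a value of "0s" disables it entirely, but double-check that for your version):

"cluster": {
  ...
  "pin_recover_interval": "24h"
}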

Recover checks that everything that should be pinned is actually pinned, and therefore needs a full pin ls. Maybe in the future we can stream all the way, but pin ls does not support streaming yet (it is PR’ed though).
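
To get a feel for how much the peer has to hold in memory during that check, you can count the recursive pins on the IPFS daemon behind the cluster peer:

# Number of recursive pins the recover operation has to walk through
ipfs pin ls --type recursive | wc -l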

Hi Hector,

But isn’t the streaming pin ls included in the 0.5.0 release?

Do you generally have any recommendations regarding the sizing of the machines? Or any graphs showing how the cluster performs in CPU/memory usage with 1k/10k/100k/1000k pins?

Regards

Ah, apparently yes. However, a lot of rewiring is needed to take advantage of it, but at least it is there now.

Do you generally have any recommendations regarding the sizing of the machines? Or any graphs showing how the cluster performs in CPU/memory usage with 1k/10k/100k/1000k pins?

I’m afraid I don’t have relevant graphs. Normally, the heavy part of the IPFS Cluster + IPFS pair is the IPFS daemon. The main problem would be memory spikes during certain operations (anything that triggers a pin ls in cluster or in IPFS is the main thing I can think of).

Now, if a 1.5 GB RAM spike with 300k pins is a problem and you only have 2 vCPUs, you are probably using something quite small. For our large storage cluster (86k pins) we have machines with 64 GB, which is overkill (around 8 GB used), and 12 vCPUs. You should be fine with 16 GB and 4 CPUs for standard configurations of cluster + IPFS. But then your disk speed, the IPFS datastore you choose, the DHT mode, the re-providing settings and the connection manager settings will all affect how much the IPFS daemon takes.
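
As an illustration of those last knobs (standard go-ipfs config commands; the values are just examples to show where the settings live, not recommendations):

# Run the DHT in client-only mode (less routing work)
ipfs config Routing.Type dhtclient
# Re-provide less often ("0" disables re-providing entirely)
ipfs config Reprovider.Interval 24h
# Lower the connection manager watermarks
ipfs config --json Swarm.ConnMgr.LowWater 200
ipfs config --json Swarm.ConnMgr.HighWater 500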

Setting this in the Badger options helps with low-memory environments (at the cost of speed, I guess):

"table_loading_mode": 0,
"value_log_loading_mode": 0,