Kubernetes best practices for IPFS-Cluster

Hello

TLDR:
Is nginx ingress the way to go, or is it possible to work around the GKE Ingress requirement that every endpoint answers its health check with HTTP status 200? If so, how?

I have been tasked with deploying an IPFS-Cluster on Kubernetes, the goal is to achieve the following:

https://domain.com/ipfs/{{hash}} -> :8080 - IPFS Gateway
https://domain.com:5001 ->          :9095 - IPFS Proxy-Api

With this setup, all data uploaded through port 5001 would be pinned on every IPFS node in the cluster. Does this make sense, or am I missing something obvious?

I have been following the official tutorial (Deployment on Kubernetes - Pinset orchestration for IPFS), but I have quite a few questions, since I had no exposure to Kubernetes until a week ago:

  • The pods cannot start the first time they are created, because the Volumes are still being created at that point. How can I wait until a Volume is ready before proceeding?
  • What is the recommended way of exposing the needed endpoints? It would be nice if we could benefit from the Google Cloud CDN by using the GKE L7 LoadBalancer. However I have failed to work around the “status 200 health-check” requirement on all the exposed endpoints.

There is an open issue regarding custom healthchecks:

Any help is highly appreciated.

Your understanding is correct. However, because the api/v0/add request adds the content to all peers in the cluster simultaneously (using blockPuts), it incurs significant overhead for large files. If you are working with small files (~5MB or so), you will not notice much. With larger files you will.

One way to speed this up is to use the REST API /add endpoint (:9094) with local=true. This adds the content only on the node receiving the request (not on all nodes at the same time) and cluster-pins it when finished. The rest of the nodes will then copy the content via pubsub, which is much faster.
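As a rough illustration of the request described above, here is a minimal Python sketch that only builds the URL for the REST API /add endpoint with local=true. The host name is a placeholder and the helper `build_add_url` is made up for this example; only the port (:9094), the /add path and the local=true parameter come from the post above.

```python
from urllib.parse import urlencode, urlunsplit


def build_add_url(host: str, port: int = 9094, local: bool = True) -> str:
    """Return the cluster REST API /add URL with the local flag set.

    Hypothetical helper: host and port are assumptions about your deployment.
    """
    query = urlencode({"local": "true" if local else "false"})
    return urlunsplit(("http", f"{host}:{port}", "/add", query, ""))


if __name__ == "__main__":
    # "cluster-peer.example" is a placeholder, not a real host.
    print(build_add_url("cluster-peer.example"))
```

The file itself would then be sent to that URL as a multipart form upload with whatever HTTP client you prefer. I have not tested this against a live cluster, so treat it as a starting point.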

I don’t know much about Kubernetes, but surely there is a way to specify dependencies among resources.

What is the problem with this health check? You can probably point it to api/v0/version or some dummy endpoint that returns 200 unless the node is down. Wouldn’t that work? Edit: from the linked issue it seems that Kubernetes is very stupid if it does not allow querying anything other than /. I can hardly believe that; there has to be a better way.
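To make the "returns 200 unless the node is down" idea concrete, here is a small Python sketch of what such a probe boils down to. The function name `is_healthy` and the host in the usage line are assumptions for illustration, and this is not how the GKE health checker is actually implemented. Depending on the IPFS version, the api/v0 endpoints may only accept POST, which is why the method is a parameter.

```python
import urllib.error
import urllib.request


def is_healthy(url: str, method: str = "GET", timeout: float = 2.0) -> bool:
    """Return True only if the endpoint answers with HTTP 200."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts and error statuses all count as unhealthy.
        return False


if __name__ == "__main__":
    # Placeholder host; 9095 is the proxy port mentioned in the question.
    print(is_healthy("http://cluster-peer.example:9095/api/v0/version", method="POST"))
```

If a load balancer can only probe a fixed path, a tiny endpoint that answers 200 on that path and runs a check like this behind it would be one possible workaround.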

Thank you very much for getting back to me so quickly. I will look into it.

I’m just thinking out loud here: would it not make sense to also enable this option on the IPFS Proxy-Api? The idea is for this IPFS-Cluster to be public, and people should be able to interact with the cluster as if it were a normal IPFS node. This way we would also not need to open up another port just for one operation.

What do you think?

I will keep digging in regards to Kubernetes best practices.

The local thing? Yeah, we should probably make it a configurable option for the proxy. Can you open an issue?

Will do. Thanks a lot!

Edit: Opened as ipfs/ipfs-cluster issue #1292, "Support `local=true` as a configurable option for the proxy endpoints (:9095)".