IPFS DHT problem with private swarm

We are running a private IPFS swarm. Many of the connected devices are developer machines that are not running constantly. There are also some Android devices that use --routing=dhtclient to reduce network bandwidth, so they act only as DHT clients and do not serve DHT records to other peers. One node, running on an EC2 instance, is permanently available (soon to be upgraded to an IPFS Cluster).
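For reference, the Android nodes start the daemon roughly like this (invocation shown for illustration):

```
# Run the daemon in DHT client mode: the node can query the DHT
# but does not store or serve provider records for other peers.
ipfs daemon --routing=dhtclient
```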

We are having problems with DHT resolution. When we add data on a machine a that is directly connected to a machine b, resolution works. But if nodes a and b are both behind NATs and can only talk via an intermediary node, hash resolution does not work.
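To check whether two nodes are directly connected, we compare the output of ipfs swarm peers on both sides; a sketch with placeholder addresses:

```
# On node a: list the peers we currently have open connections to.
ipfs swarm peers

# Optionally force a direct connection to b (address and peer ID
# are placeholders; substitute your own).
ipfs swarm connect /ip4/10.0.0.2/tcp/4001/ipfs/<peer-id-of-b>
```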

While troubleshooting this, I tried ipfs dht findprovs <the-hash> and never got an answer, even for hashes that can be resolved. So it seems that the DHT does not work at all in our private swarm for some reason, and the only reason things usually work is that nodes talk directly to each other.
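For what it's worth, this is roughly how I ran the query; adding the global --timeout option (if your go-ipfs version supports it) makes the missing answer visible instead of the command hanging forever:

```
# Ask the DHT for providers of the hash; when none are found,
# this prints nothing and eventually times out.
ipfs --timeout=30s dht findprovs <the-hash>
```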

Is there some minimum number of permanently available nodes that is required for the DHT to work? What can we do to fix this?

Update: I looked into this some more, and it turns out that the DHT is working fine after all. Node a adds something, and shortly afterwards ipfs dht findprovs <hash> on b produces the node ID of a. But even then I am still not able to get the content for that hash on machine b. If I fetch the content on a machine that is a peer of both, the resolution at b suddenly works.
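To make the sequence concrete, this is roughly what we observe (hello.txt and the hash are placeholders):

```
# On node a: add a file and note the hash it prints.
ipfs add hello.txt

# On node b: the provider record is found and a's node ID is printed...
ipfs dht findprovs <hash>

# ...but actually fetching the content hangs indefinitely.
ipfs cat <hash>
```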

So if both a and b are behind a NAT, they should still be able to communicate with each other, right?

Update 2: we have a hunch that the cause of the issue is one of our team members being behind a double NAT. It seems that this ticket applies: https://github.com/ipfs/go-ipfs/issues/2879

Is a solution for this in the works? Is there any info we could provide to help debug this?

To debug, you can enable verbose logging and see if IPFS is complaining about not being able to establish connections (see the ipfs log commands).
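For example, assuming a reasonably recent go-ipfs, you can raise the log level and follow the event stream:

```
# Raise the log level for all subsystems.
ipfs log level all debug

# Stream the daemon's logs and watch for dial/connection errors.
ipfs log tail

# List the available logging subsystems if you want to be selective.
ipfs log ls
```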

In terms of NATs, we’ve been working on a relay system to traverse NATs like this. Unfortunately, we don’t yet advertise these relay addresses, so IPFS won’t automatically try to use them (yet).
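If you want to experiment in the meantime, the circuit-relay transport can be wired up by hand; a sketch, assuming your go-ipfs build ships the experimental relay support (peer IDs are placeholders, and the exact multiaddr syntax may vary between versions):

```
# On the always-on EC2 node: allow it to relay traffic for other
# peers (experimental option; restart the daemon afterwards).
ipfs config --json Swarm.EnableRelayHop true

# On node a: dial b explicitly through the relay using a
# p2p-circuit multiaddr.
ipfs swarm connect /ipfs/<relay-peer-id>/p2p-circuit/ipfs/<peer-id-of-b>
```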

Have you considered looking into https://ngrok.com/ to punch through the double NAT?
This would be one way to use ngrok and set it up with Puppet: https://forge.puppet.com/gabe/ngrok/readme
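For completeness, a sketch of what that could look like (the port, hostname, and tunnel address below are hypothetical):

```
# Expose the local IPFS swarm port through an ngrok TCP tunnel.
ngrok tcp 4001

# Suppose ngrok hands out 0.tcp.ngrok.io:12345; announce that
# address so other peers can dial in through the tunnel
# (restart the daemon for the config change to take effect).
ipfs config --json Addresses.Announce \
  '["/dns4/0.tcp.ngrok.io/tcp/12345"]'
```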

Thanks for the link. However, we would prefer that IPFS take care of this for us. Part of the attraction of IPFS is that it sorts out these issues for you.

I totally agree with you, in theory. It was just a suggestion to think about having an emergency back-channel that always works.