Private network with 2 nodes, unable to broadcast new CID to the other node

IPFS v0.9.1

I have a two-node private network. I am running a test that adds content on one node and tries to fetch it from the other. The second node is unable to find the content. Based on the logs, the provider on the first node didn't broadcast the CID.

I checked that the IPFS IDs are correct and that both nodes are peers in the swarm (checked using ipfs swarm peers). Both peers can find each other using ipfs dht findpeer <>. One thing I observed is that in ipfs bitswap stat, node 2 doesn't have node 1 as a partner, while bitswap stat on node 1 shows that it has node 2 as a partner.

node 2’s bitswap stats

bitswap status
	provides buffer: 0 / 256
	blocks received: 0
	blocks sent: 0
	data received: 0
	data sent: 0
	dup blocks received: 0
	dup data received: 0
	wantlist [1 keys]
		QmcD4SAMXhr4Upko9eC5XKDuGYyP2Q4GRYindJEE3qmesH
	partners [0]

node 1’s bitswap stats

bitswap status
	provides buffer: 0 / 256
	blocks received: 0
	blocks sent: 0
	data received: 0
	data sent: 0
	dup blocks received: 0
	dup data received: 0
	wantlist [0 keys]
	partners [1]
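
The asymmetry in the two outputs above (node 2 wants a block but lists no partners) can also be checked mechanically. Below is a minimal sketch, assuming the text layout of ipfs bitswap stat is exactly as pasted above; the function names `parse_bitswap_stat` and `stuck_wants` are made up for illustration and are not part of IPFS.

```python
import re

def parse_bitswap_stat(text):
    """Extract wantlist keys and the partner count from `ipfs bitswap stat` text."""
    wantlist = []
    partners = None
    in_wantlist = False
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"partners \[(\d+)\]", line)
        if match:
            partners = int(match.group(1))
            in_wantlist = False  # anything after this is partner IDs, not wants
            continue
        if re.match(r"wantlist \[\d+ keys\]", line):
            in_wantlist = True
            continue
        if in_wantlist and line:
            wantlist.append(line)
    return {"wantlist": wantlist, "partners": partners}

def stuck_wants(stat):
    """True when a node wants blocks but has no bitswap partner to ask."""
    return bool(stat["wantlist"]) and stat["partners"] == 0

# Trimmed copies of the two outputs pasted above.
NODE2_STAT = """bitswap status
    blocks received: 0
    wantlist [1 keys]
        QmcD4SAMXhr4Upko9eC5XKDuGYyP2Q4GRYindJEE3qmesH
    partners [0]
"""
NODE1_STAT = """bitswap status
    blocks received: 0
    wantlist [0 keys]
    partners [1]
"""

if __name__ == "__main__":
    for name, text in (("node 2", NODE2_STAT), ("node 1", NODE1_STAT)):
        stat = parse_bitswap_stat(text)
        print(name, stat, "stuck:", stuck_wants(stat))
```

In a test harness, the same check could be run on the live output of both nodes before the fetch step, so the failure is reported as a missing partner rather than as a timeout.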

After I restarted node 1, the test worked: bitswap stat on both nodes listed the other as a partner, and the test succeeded.
Is there a reason why this could be happening? Is there a way to recover from such a situation without restarting the nodes?

I have a two node private networ
What do you mean? Are you using PNET, or are you just running a LAN without any broadcast node?

Can you dump your config, please? (Use ipfs config show to filter out keys.)

Using PNET.

Node 2's config

{
  "API": {
    "HTTPHeaders": {
      "Access-Control-Allow-Origin": [
        "*"
      ],
      "Server": [
        "go-ipfs/0.9.1"
      ]
    }
  },
  "Addresses": {
    "API": "/ip4/0.0.0.0/tcp/5001",
    "Announce": [],
    "Gateway": "/ip4/0.0.0.0/tcp/8080",
    "NoAnnounce": [],
    "Swarm": [
      "/ip4/0.0.0.0/tcp/4001",
      "/ip6/::/tcp/4001",
      "/ip4/0.0.0.0/udp/4001/quic",
      "/ip6/::/udp/4001/quic"
    ]
  },
  "AutoNAT": {},
  "Bootstrap": [
    "/ip4/<ip>/tcp/4001/p2p/QmXGpykMLCbNuXDjccuEu1Uc8Papbdt8Q5iggXrswqeWZx"
  ],
  "DNS": {
    "Resolvers": null
  },
  "Datastore": {
    "BloomFilterSize": 0,
    "GCPeriod": "1h",
    "HashOnRead": false,
    "Spec": {
      "mounts": [
        {
          "child": {
            "path": "blocks",
            "shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
            "sync": true,
            "type": "flatfs"
          },
          "mountpoint": "/blocks",
          "prefix": "flatfs.datastore",
          "type": "measure"
        },
        {
          "child": {
            "compression": "none",
            "path": "datastore",
            "type": "levelds"
          },
          "mountpoint": "/",
          "prefix": "leveldb.datastore",
          "type": "measure"
        }
      ],
      "type": "mount"
    },
    "StorageGCWatermark": 90,
    "StorageMax": "10GB"
  },
  "Discovery": {
    "MDNS": {
      "Enabled": false,
      "Interval": 10
    }
  },
  "Experimental": {
    "AcceleratedDHTClient": false,
    "FilestoreEnabled": false,
    "GraphsyncEnabled": false,
    "Libp2pStreamMounting": false,
    "P2pHttpProxy": false,
    "ShardingEnabled": false,
    "StrategicProviding": false,
    "UrlstoreEnabled": false
  },
  "Gateway": {
    "APICommands": null,
    "HTTPHeaders": {
      "Access-Control-Allow-Headers": [
        "X-Requested-With",
        "Range",
        "User-Agent"
      ],
      "Access-Control-Allow-Methods": [
        "GET"
      ],
      "Access-Control-Allow-Origin": [
        "*"
      ]
    },
    "NoDNSLink": false,
    "NoFetch": false,
    "PathPrefixes": [],
    "PublicGateways": null,
    "RootRedirect": "",
    "Writable": false
  },
  "Identity": {
    "PeerID": "QmPHhWeVNw4ACGfEwLNeA8JfLvjixQHm7WYs3i35zpDx3T"
  },
  "Ipns": {
    "RecordLifetime": "",
    "RepublishPeriod": "",
    "ResolveCacheSize": 128
  },
  "Migration": {
    "DownloadSources": null,
    "Keep": ""
  },
  "Mounts": {
    "FuseAllowOther": false,
    "IPFS": "/ipfs",
    "IPNS": "/ipns"
  },
  "Peering": {
    "Peers": null
  },
  "Pinning": {
    "RemoteServices": {}
  },
  "Plugins": {
    "Plugins": null
  },
  "Provider": {
    "Strategy": ""
  },
  "Pubsub": {
    "DisableSigning": false,
    "Router": ""
  },
  "Reprovider": {
    "Interval": "12h",
    "Strategy": "all"
  },
  "Routing": {
    "Type": "dht"
  },
  "Swarm": {
    "AddrFilters": null,
    "ConnMgr": {
      "GracePeriod": "20s",
      "HighWater": 900,
      "LowWater": 600,
      "Type": "basic"
    },
    "DisableBandwidthMetrics": false,
    "DisableNatPortMap": false,
    "EnableAutoRelay": false,
    "EnableRelayHop": false,
    "Transports": {
      "Multiplexers": {},
      "Network": {},
      "Security": {}
    }
  }
}

Node 1's config

{
  "API": {
    "HTTPHeaders": {
      "Access-Control-Allow-Origin": [
        "*"
      ],
      "Server": [
        "go-ipfs/0.9.1"
      ]
    }
  },
  "Addresses": {
    "API": "/ip4/0.0.0.0/tcp/5001",
    "Announce": [],
    "Gateway": "/ip4/0.0.0.0/tcp/8080",
    "NoAnnounce": [],
    "Swarm": [
      "/ip4/0.0.0.0/tcp/4001",
      "/ip6/::/tcp/4001",
      "/ip4/0.0.0.0/udp/4001/quic",
      "/ip6/::/udp/4001/quic"
    ]
  },
  "AutoNAT": {},
  "Bootstrap": [
    "/ip4/<ip>/tcp/4001/p2p/QmPHhWeVNw4ACGfEwLNeA8JfLvjixQHm7WYs3i35zpDx3T"
  ],
  "DNS": {
    "Resolvers": null
  },
  "Datastore": {
    "BloomFilterSize": 0,
    "GCPeriod": "1h",
    "HashOnRead": false,
    "Spec": {
      "mounts": [
        {
          "child": {
            "path": "blocks",
            "shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
            "sync": true,
            "type": "flatfs"
          },
          "mountpoint": "/blocks",
          "prefix": "flatfs.datastore",
          "type": "measure"
        },
        {
          "child": {
            "compression": "none",
            "path": "datastore",
            "type": "levelds"
          },
          "mountpoint": "/",
          "prefix": "leveldb.datastore",
          "type": "measure"
        }
      ],
      "type": "mount"
    },
    "StorageGCWatermark": 90,
    "StorageMax": "10GB"
  },
  "Discovery": {
    "MDNS": {
      "Enabled": false,
      "Interval": 10
    }
  },
  "Experimental": {
    "AcceleratedDHTClient": false,
    "FilestoreEnabled": false,
    "GraphsyncEnabled": false,
    "Libp2pStreamMounting": false,
    "P2pHttpProxy": false,
    "ShardingEnabled": false,
    "StrategicProviding": false,
    "UrlstoreEnabled": false
  },
  "Gateway": {
    "APICommands": null,
    "HTTPHeaders": {
      "Access-Control-Allow-Headers": [
        "X-Requested-With",
        "Range",
        "User-Agent"
      ],
      "Access-Control-Allow-Methods": [
        "GET"
      ],
      "Access-Control-Allow-Origin": [
        "*"
      ]
    },
    "NoDNSLink": false,
    "NoFetch": false,
    "PathPrefixes": [],
    "PublicGateways": null,
    "RootRedirect": "",
    "Writable": false
  },
  "Identity": {
    "PeerID": "QmXGpykMLCbNuXDjccuEu1Uc8Papbdt8Q5iggXrswqeWZx"
  },
  "Ipns": {
    "RecordLifetime": "",
    "RepublishPeriod": "",
    "ResolveCacheSize": 128
  },
  "Migration": {
    "DownloadSources": null,
    "Keep": ""
  },
  "Mounts": {
    "FuseAllowOther": false,
    "IPFS": "/ipfs",
    "IPNS": "/ipns"
  },
  "Peering": {
    "Peers": null
  },
  "Pinning": {
    "RemoteServices": {}
  },
  "Plugins": {
    "Plugins": null
  },
  "Provider": {
    "Strategy": ""
  },
  "Pubsub": {
    "DisableSigning": false,
    "Router": ""
  },
  "Reprovider": {
    "Interval": "12h",
    "Strategy": "all"
  },
  "Routing": {
    "Type": "dht"
  },
  "Swarm": {
    "AddrFilters": null,
    "ConnMgr": {
      "GracePeriod": "20s",
      "HighWater": 900,
      "LowWater": 600,
      "Type": "basic"
    },
    "DisableBandwidthMetrics": false,
    "DisableNatPortMap": false,
    "EnableAutoRelay": false,
    "EnableRelayHop": false,
    "Transports": {
      "Multiplexers": {},
      "Network": {},
      "Security": {}
    }
  }
}
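
In a two-node setup like this one, each node's Bootstrap list is expected to name the other node's Identity.PeerID, and that is easy to get wrong when copying configs around. Here is a small sketch of a cross-check, assuming the JSON comes from ipfs config show as pasted above; the helper names are hypothetical and the configs are trimmed to the two relevant keys.

```python
def bootstrap_peer_ids(config):
    """Return the peer ID found after /p2p/ (or legacy /ipfs/) in each Bootstrap address."""
    ids = []
    for addr in config.get("Bootstrap") or []:
        parts = addr.split("/")
        for proto in ("p2p", "ipfs"):
            # Look only before the last element so that index + 1 always exists.
            if proto in parts[:-1]:
                ids.append(parts[parts.index(proto) + 1])
                break
    return ids

def bootstraps_each_other(cfg_a, cfg_b):
    """True when each config lists the other node's PeerID as a bootstrap peer."""
    a_id = cfg_a["Identity"]["PeerID"]
    b_id = cfg_b["Identity"]["PeerID"]
    return b_id in bootstrap_peer_ids(cfg_a) and a_id in bootstrap_peer_ids(cfg_b)

# Trimmed versions of the two configs above ("<ip>" is elided in the originals).
NODE2_CFG = {
    "Bootstrap": ["/ip4/<ip>/tcp/4001/p2p/QmXGpykMLCbNuXDjccuEu1Uc8Papbdt8Q5iggXrswqeWZx"],
    "Identity": {"PeerID": "QmPHhWeVNw4ACGfEwLNeA8JfLvjixQHm7WYs3i35zpDx3T"},
}
NODE1_CFG = {
    "Bootstrap": ["/ip4/<ip>/tcp/4001/p2p/QmPHhWeVNw4ACGfEwLNeA8JfLvjixQHm7WYs3i35zpDx3T"],
    "Identity": {"PeerID": "QmXGpykMLCbNuXDjccuEu1Uc8Papbdt8Q5iggXrswqeWZx"},
}

if __name__ == "__main__":
    print("bootstrap entries match:", bootstraps_each_other(NODE1_CFG, NODE2_CFG))
```

For the two configs above the check passes, so a mismatched bootstrap entry does not appear to be the cause here.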

I also found that the two nodes seem to be updating DHT information. Here's the output of ipfs stats dht from another private network where we recreated the same issue:

## Node1

ipfs stats dht 

DHT wan (0 peers):
DHT lan (1 peers):
  Bucket  0 (0 peers) - refreshed 7s ago:
    Peer                                            last useful  last queried  Agent Version

  Bucket  1 (1 peers) - refreshed 2m40s ago:
    Peer                                            last useful  last queried  Agent Version
  @ QmWwaegHzNhBewhj97fbZdSrcT6YXuq2wEwAorrhnWi4T9  7s ago       7s ago        go-ipfs/0.9.1/dc2715a

## Node2

ipfs stats dht

DHT wan (0 peers):
DHT lan (1 peers):
  Bucket  0 (0 peers) - refreshed 3m3s ago:
    Peer                                            last useful  last queried  Agent Version

  Bucket  1 (1 peers) - refreshed 3m3s ago:
    Peer                                            last useful  last queried  Agent Version
  @ QmPaCd2ckD9ffRwTzmQJNrGQS8J6JoAHvyP1MsWSCdcaqn  1m10s ago    24s ago       go-ipfs/0.9.1/dc2715a

An attempt to ipfs cat a file hangs. The logs contain the following messages periodically (every 10 mins?):

2021-08-16T19:43:29.286Z        INFO dht/RtRefreshManager    rtrefresh/rt_refresh_manager.go:276     starting refreshing cpl 1 with key CIQAAAGIMQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA (routing table size was 1)
2021-08-16T19:43:29.287Z        INFO dht/RtRefreshManager    rtrefresh/rt_refresh_manager.go:276     starting refreshing cpl 0 with key CIQAAAWIAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA (routing table size was 0)
2021-08-16T19:43:29.287Z        WARN dht/RtRefreshManager    rtrefresh/rt_refresh_manager.go:196     failed when refreshing routing table    {"error": "2 errors occurred:\n\t* failed to query for self, err=failed to find any peer in table\n\t* failed to refresh cpl=0, err=failed to find any peer in table\n\n"}
2021-08-16T19:43:29.287Z        INFO dht/RtRefreshManager    rtrefresh/rt_refresh_manager.go:283     finished refreshing cpl 1, routing table size is now 1
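
To confirm how often the refresh actually fails, the warnings can be pulled out of a saved log file. The following is a rough sketch, assuming log lines shaped like the ones above (with or without terminal colour codes) and a JSON object at the end of each warning; `refresh_failures` is a hypothetical helper, not an IPFS tool.

```python
import json
import re

# Matches colour codes either as a real ESC byte or in caret notation ("^[").
ANSI = re.compile(r"(?:\x1b|\^\[)\[[0-9;]*m")

def refresh_failures(log_text):
    """Return (timestamp, [error messages]) for each failed routing-table refresh."""
    failures = []
    for line in log_text.splitlines():
        line = ANSI.sub("", line)
        if "failed when refreshing routing table" not in line:
            continue
        timestamp = line.split()[0]
        payload = json.loads(line[line.index("{"):])
        errors = [
            part.strip().lstrip("* ")
            for part in payload["error"].splitlines()
            if part.strip().startswith("*")
        ]
        failures.append((timestamp, errors))
    return failures

# The WARN line from the log above, kept as a raw string so "\n" stays escaped.
SAMPLE_LOG = r'''2021-08-16T19:43:29.287Z        ^[[33mWARN^[[0m dht/RtRefreshManager    rtrefresh/rt_refresh_manager.go:196     failed when refreshing routing table    {"error": "2 errors occurred:\n\t* failed to query for self, err=failed to find any peer in table\n\t* failed to refresh cpl=0, err=failed to find any peer in table\n\n"}
2021-08-16T19:43:29.287Z        INFO dht/RtRefreshManager    rtrefresh/rt_refresh_manager.go:283     finished refreshing cpl 1, routing table size is now 1'''

if __name__ == "__main__":
    for timestamp, errors in refresh_failures(SAMPLE_LOG):
        print(timestamp)
        for err in errors:
            print("  ", err)
```

Comparing the timestamps from a longer log would settle whether the interval really is 10 minutes.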