I am exploring the IPFS DHT implementation using the go-libp2p and go-libp2p-kad-dht libraries. According to the documentation (dht - GoDoc), GetClosestPeers is a Kademlia node lookup operation that returns the K (= 20) closest peers to a key (https://docs.ipfs.io/concepts/dht/#lookup-algorithm). Closeness is measured as the XOR of hash(peerID) and hash(key).

I fed the keys “0” to “1000” to the GetClosestPeers API in a network of 50 peers and observed that the K (= 20) closest peers were not uniformly distributed across the network, but skewed **[A]**. I expected every peer to be selected about equally often on average. To verify my understanding, I manually XORed the hashes of some dummy peer IDs with the hashes of the keys “0” to “1000”, picked the K (= 20) closest peers for each key, and observed a similar skew **[B]**.

Below are the results of my experiments:

**[A]**: Distribution of peers after running GetClosestPeers() on kad-dht for 1000 keys in a network of 50 peers.

[How to infer] Each result is a pair {NodeID, percentage of the 1000 GetClosestPeers() calls in which this node appeared}

Total peers involved in this experiment: 49

(Only the top 2 and bottom 2 are listed.)

QmWJKJMGaw2474QQizyERjspkTjd8amtUMgpjBdx6bbtkc , 47.4 %

QmXcpUgJ1mNfQ9SJkiLNGMsBHzXdwFAYbPaAxnEPYz2WeA , 47.1 %

…

…

…

QmWATi2voJbFW2QhJ1cjVgBCezEif5sTJVuk9oLJi8m2sU , 26.499998 %

QmTALJQqCS6nLMDyWREdBzUYMnKSHnS38imG9dAfeeLJfo , 25.9 %

Mean : 40.81632618 %

Median: 42.899998 %

SD : 6.502569226 %

Min : 25.9 %

Max : 47.4 %

Ideally, I would expect each node to appear in ~40.8% of the GetClosestPeers() calls, given K = 20 and a network of 50 peers (20/49 ≈ 40.8%, using the 49 peers counted above).

**[B]**: Distribution of peer IDs after manually taking XOR of hash of 1000 keys with hash of 50 peer IDs.

[How to infer] Each result is a pair {NodeID, percentage of the 1000 keys for which this node was among the K closest peers}

Mean : 40.0%

Median: 43.4%

SD : 10.412012293500234%

Min : 14.2 %

Max : 50.2 %

I would like to know whether this behaviour is expected, or whether I have misunderstood how the GetClosestPeers node lookup works.

Thanks