I am generating a highly concurrent load (~100 agents doing object-put, stat, pubsub, object-fetch, etc.) against one (or a small number of) IPFS node(s). The objects generated by the test are small (<100 bytes). I use object-put latency (HTTP round-trip time) as the measure of system performance, and I find that performance degrades after a few hundred thousand object creates.
Object-put latency starts out at around 200 ms. After a few hundred thousand object-puts, during which a few GB of data have been written to the datastore, round-trip times degrade to around 2 seconds. The degradation continues over time; I've seen it grow to ~15 seconds, at which point I've felt the need to reset my benchmark.
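For concreteness, the per-request timing is just a wall-clock timer around each HTTP round trip. A minimal sketch of the measurement loop (the put function here is a stub; in the actual harness it POSTs a small object to the node's HTTP API):

```python
import time

def measure_put_latencies(put_fn, n):
    """Time n calls to put_fn and return per-call latencies in seconds."""
    latencies = []
    for i in range(n):
        start = time.perf_counter()
        put_fn(i)  # in the real benchmark: one object-put HTTP round trip
        latencies.append(time.perf_counter() - start)
    return latencies

# Stub put function for illustration; swap in a real client call.
lats = measure_put_latencies(lambda i: None, 1000)
lats.sort()
print(f"p50={lats[len(lats) // 2] * 1000:.3f} ms  max={lats[-1] * 1000:.3f} ms")
```

The degradation I describe shows up as the p50 of these samples drifting from ~200 ms toward multiple seconds as the datastore grows.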
If a ‘degraded’ system is made to serve a light load, i.e. very little concurrency, then object-put latencies aren't too bad (around 400 ms).
Deleting the *.ldb files under /datastore restores performance. The .ldb extension suggests these are LevelDB files.
Can someone help explain this behaviour? What do the .ldb files contain, and is it safe to delete them from time to time?