S3 datastore makes too many requests, causing increasing AWS costs


I currently use go-ds-s3 (ipfs/go-ds-s3 on GitHub, an S3 datastore implementation) as the datastore for my IPFS node. However, it sends too many requests (HeadObject requests) to the S3 bucket, which significantly increases costs.

Looking at the go-ds-s3 code, the requests seem to come from the function GetSize (in s3.go on the master branch of ipfs/go-ds-s3).

Any advice on this? Can we configure the frequency of this routine?

In the worst case, we may need to stop using S3 as the datastore, so I am also looking for a tutorial on how to migrate the data from S3 to a local machine.

Many thanks!

Take a look at Filebase - S3-Compatible, Edge-Caching and at a fraction of the cost of the other pinning services out there.

5GB always free - one month 5TB trial with code “IPFS”

Sorry for the mention, @hector, but do you have any ideas on this? I'd like to know how to decrease the number of requests to S3, or how to migrate the data from S3 to local machines (flatfs datastore).

Here is my current datastore_spec


I want to change it into


Note: we also use IPFS Cluster alongside IPFS, so please advise if anything else needs to be done.

I think datastore.Has() is implemented via GetSize().

If I’m not mistaken, however, the responses to such requests should usually be cached. Increasing the size of the cache might be one way to reduce them:

(unfortunately, only the bloom filter size is configurable, via `Datastore.BloomFilterSize` in the IPFS config).

Otherwise, if the requests are for essentially random keys, there is not much to do other than stop using S3. If nodes are meant to provide content publicly, they need to check whether they have it each time it is requested.
