We’re running into an issue here at Pinata where our gateways are being hit pretty hard for content that we’re not actually storing on our host nodes.
Over the past couple of days this has gotten much worse, to the point where over 1TB of unique data is being cached per day (our gateway’s cache is capped at 1TB, otherwise I’m sure it would be more) and roughly 10TB of bandwidth is being consumed per day.
The simplest solution I can think of is to prevent our gateway from serving content that our users aren’t actually storing on our platform. To do this we would need some type of custom “filtering” functionality. I’d love to hear others’ opinions on the best / most efficient way to handle something like this.
The three best ideas I have are:
Option 1: Add something native to IPFS where the gateway simply asks an API endpoint (which could be running locally), “Can I serve this content?” before serving it. This would be pretty flexible, so gateway providers could implement whatever logic fits their needs for content blocking without IPFS having to get too opinionated.
Option 2: Implement some kind of special NGINX filtering that performs the same check as option 1 before the request even reaches IPFS. I was talking with @adin and he recommended tagging @olizilla and @lidel as two people who might have thoughts on how this could be solved as well.
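For option 2, NGINX’s stock `auth_request` module (`ngx_http_auth_request_module`) can already do this kind of gating without writing a custom module. A rough sketch, assuming a local filter service like the one from option 1 is listening on port 8081 and the IPFS gateway on 8080 (both ports and the URL shapes are assumptions):

```nginx
# NGINX fires a subrequest to /_can_serve before proxying; a 2xx from
# the filter service allows the request, 401/403 rejects it, so blocked
# content never touches the IPFS gateway at all.
location ~ ^/ipfs/(?<cid>[^/]+) {
    auth_request /_can_serve;
    proxy_pass http://127.0.0.1:8080;  # the IPFS gateway
}

location = /_can_serve {
    internal;
    # Forward only the CID, with no request body.
    proxy_pass http://127.0.0.1:8081/can-serve/$cid;
    proxy_pass_request_body off;
    proxy_set_header Content-Length "";
}
```

The nice part is that the filter decision happens per-request in NGINX, so the IPFS node itself stays completely unmodified.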
Option 3: Implement a NodeJS proxy that sits behind NGINX and performs the check described above. If the check passes, the request gets forwarded on to the IPFS gateway and the response is streamed back to the user.
@mburns feel free to chime in with any magic dev ops knowledge you may have as well.
I really appreciate any thoughts as this is starting to get fairly expensive for us to handle.