Best IPFS practice for event or time series data

I have a small Particle network with sensors periodically reporting time-stamped data (temperature, humidity, etc.). I want to store that data… IPFS seems like a perfect distributed way to do so. I also want to view the last, say, 48 hours on a web page.

I have a few go-ipfs and ipfs-cluster processes running, so I kind of understand that part.

I “think” the answer has these ingredients:

  1. ipfs gateway (for posting update)
  2. IPNS pubsub to be aware of newly posted data
  3. some kind of ipfs-cluster awareness of updates to ensure distributed storage.

What is the recommended way to go about this?

It is difficult to answer this without knowing more about your setup.

  • Are you running IPFS nodes at the places where the data is gathered?
  • What quantities of data are we talking about?

With those ingredients you should definitely be able to roll your own solution. However, if the amount of data is not that high, you might consider using something like https://github.com/orbitdb/orbit-db instead.

FYI I have added this to ipfs-cluster:

If you end up coding something along those lines, let us know!


The data is gathered by a sensor connected to an Arduino-ish device, e.g. a Particle. The Particle cloud service publishes the sensor data via HTTP to a data aggregation/collection golang service. The data rate is approx 1K per minute per sensor. I currently only have about 5 sensors… designing for 10^3

Data aggregation/collection options I see:

  1. Use a go-ipfs gateway… where does pubsub fit here?
  2. add IPFS client libs to my golang service… how do I know if/when data is stored?

User interface use cases:

  1. Latest measurement per sensor: Use IPFS JS client to subscribe via pubsub to new data. What's the high end for number of topics? Is there an advised message/topic ratio? e.g. given 10^3 sensors, create a topic per sensor?
  2. Spark line per sensor: Use IPFS JS client to fetch a “window” of data… not sure how to fetch a window. Encode time in file name?

Thank you in advance for any/all time spent thinking about this. If you happen to be in Portland, Oregon I’d be happy to buy you a beer/coffee or two for your time.

  • So 1K samples per minute per sensor? What are the latency requirements? When do you need to see new data in your web ui?

  • So the “data aggregation/collection golang service” is not really distributed, but a central cluster that can possibly run ipfs?

Regarding number of topics: I would try a single one and see how far that gets you. 1000 messages per second is not much in a data center. And if you don’t have enough network bandwidth, splitting the load over multiple topics is not going to help.
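
A single topic works if every message says which sensor it came from, so the subscriber can sort things out on its side. Below is a minimal sketch of that demultiplexing step only; it does not touch pubsub itself. The `Reading` envelope and its JSON field names are invented for this example, and `Apply` would be called from whatever pubsub subscription callback you end up using.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Reading is a hypothetical message envelope; the field names are illustrative.
type Reading struct {
	SensorID    string    `json:"sensor_id"`
	Time        time.Time `json:"time"`
	Temperature float64   `json:"temperature"`
	Humidity    float64   `json:"humidity"`
}

// Latest holds the newest reading seen so far for each sensor.
type Latest map[string]Reading

// Apply decodes one raw message from the shared topic and stores it
// only if it is newer than the reading already held for that sensor.
func (l Latest) Apply(raw []byte) error {
	var r Reading
	if err := json.Unmarshal(raw, &r); err != nil {
		return err
	}
	if r.SensorID == "" {
		return fmt.Errorf("message has no sensor_id")
	}
	if cur, ok := l[r.SensorID]; !ok || r.Time.After(cur.Time) {
		l[r.SensorID] = r
	}
	return nil
}

func main() {
	latest := Latest{}
	msgs := []string{
		`{"sensor_id":"a","time":"2018-05-01T12:01:00Z","temperature":21.5,"humidity":40}`,
		`{"sensor_id":"b","time":"2018-05-01T12:01:05Z","temperature":19.0,"humidity":55}`,
		`{"sensor_id":"a","time":"2018-05-01T12:00:00Z","temperature":20.0,"humidity":41}`,
	}
	for _, m := range msgs {
		if err := latest.Apply([]byte(m)); err != nil {
			fmt.Println("skipping message:", err)
		}
	}
	fmt.Println(latest["a"].Temperature, latest["b"].Temperature)
}
```

The out-of-order check matters because delivery order is not something the sketch assumes; a late message for sensor `a` does not overwrite a newer one.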