How long can we safely store IPFS files?

Assume IPFS is not using a cloud cluster subject to company closures, bankruptcies, and mergers. If the files are stored only on peers, what happens when those peers eventually die? Are the files disposed of? In the case of medical records or big data, the original owners may never be found in 30 years. Computers now seem to be replaced every 3 years on average; I burn ours out in 18 months before replacing them. Will we forever be backing up other peers' files?

This is what “pinning” is about. Files are not “stored” (i.e., not guaranteed to be available) on any given instance unless that instance “pins” them. When you pin a file, you're telling your gateway to keep that file forever (or until you “unpin” it). BTW: by “gateway” people normally mean your IPFS instance. That terminology confused me for a while, because the word “gateway” is widely used with different meanings.
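To make that concrete, here is a minimal sketch against a local kubo node's HTTP RPC API (the endpoint, port, and file contents are assumptions; adjust for your setup):

```python
import requests

API = "http://127.0.0.1:5001/api/v0"  # default kubo RPC endpoint (assumption)

# Add a file to the local node. By default "add" also pins it, which is
# what tells this node to keep the blocks even through garbage collection.
res = requests.post(f"{API}/add", files={"file": b"hello ipfs"})
cid = res.json()["Hash"]
print("CID:", cid)

# List recursive pins to confirm this node has promised to keep the file.
pins = requests.post(f"{API}/pin/ls", params={"type": "recursive"}).json()
print("pinned:", cid in pins["Keys"])
```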

To be clear then: if I co-locate a Linux box at my local data center and it becomes my pinning IPFS node, I could charge a customer for pinning and invest in a long-term maintenance and growth plan to handle their needs over time.
How is this any different from the Web 2.0 model?


The most significant difference is “content addressing”: files can be identified without involving any specific domain that “owns” them. That gives people the ability to relocate data and even change their pinning service without it affecting any links. No broken links. Data portability. You own your own data. It's also decentralized in IPFS, although not “strong” decentralization like a blockchain, because yes, if you let pins expire and you don't save the data yourself, you can lose it.
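One quick way to see why links can't break: the identifier is derived from the content itself, not from a host or path. A sketch against a local kubo node (endpoint is an assumption; “only-hash” computes the CID without storing anything):

```python
import requests

API = "http://127.0.0.1:5001/api/v0"  # local kubo daemon (assumption)

# Content addressing: the CID is a hash of the bytes themselves, so any
# node hashing the same content derives the same identifier.
data = b"the same bytes, anywhere"
a = requests.post(f"{API}/add", params={"only-hash": "true"},
                  files={"file": data}).json()["Hash"]
b = requests.post(f"{API}/add", params={"only-hash": "true"},
                  files={"file": data}).json()["Hash"]
assert a == b  # identical content => identical CID, no domain involved
print(a)
```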


I think they don't store files; that is what a peer-to-peer system means. I'm worried about storing my tennis students' session data. Are files safe here, or should I look at some other approach?

Yes, OK, but what happens as peer computers wear out every 4 years, or the peers drop out or die? I'm looking for 15-30 years of safe records, like a mortgage would need. Certainly for many AI uses I need lots of old data, maybe even generational. What can I use, short of creating a 1000-year foundation?

Thanks, I will need to rethink our AI use of IPFS; I missed the whole pinning thing.
What about using the DTN protocol and simply addressing the data to be delivered in the future? Then it could reside on a server until closer to the delivery date.

It seems like it can only be decentralized file storage. I understand the file path can be decentralized peer-to-peer for as long as peers remain active. If storing on peers is not possible, I could pay for the data to be stored in a minimum of 3 data centers and then stripe it like RAID 0 storage, which would be like a cloud cluster.

You can use a Filecoin contract or a pinning service to store your data long term. They can replicate your data to multiple data centers and increase storage safety. Without replication, data will be lost due to hardware failures.

IPFS gives you more control over your data than typical cloud drive storage. Storage providers like S3 can lose data too. Data in IPFS can be easily replicated (pinned), while traditional cloud storage has to be backed up manually. You can always set up a small local machine with cheap HDD storage and replicate your data from IPFS there.
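Replicating to such a machine is a one-call affair, since “pin add” on the backup node fetches the blocks from whichever peers have them and then pins them locally. A sketch (the address, port, and CID are hypothetical):

```python
import requests

# A second kubo node acting as a cheap backup box. The address and port
# are assumptions; adjust to your setup.
BACKUP = "http://192.168.1.50:5001/api/v0"  # hypothetical LAN machine

cid = "QmExampleCid..."  # hypothetical CID of data already on your main node

# "pin add" on the backup node pulls the blocks over the network from
# whoever has them and pins them locally; that one call is the replication.
requests.post(f"{BACKUP}/pin/add", params={"arg": cid}).raise_for_status()
print("replicated and pinned on backup")
```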


A persistence mechanism is built into the protocol itself, but an incentive to persist files is not a built-in feature of the IPFS protocol per se. That was supposed to be taken care of by Filecoin, which sadly is no longer the case…

P.S. Filecoin does leverage the IPFS stack, but it is no longer an incentive layer for pinning. It has developed into its own “storage contract” thing, which is worth a look.


By the way, I think it is easier, or more “natural”, to reason about IPFS by analogy to BitTorrent or eMule, as they are very similar in construction.

You may consider “pinning” analogous to “seeding” in BitTorrent or eMule. As long as there is at least one person (i.e., node) seeding all pieces of a particular file, you or others can retrieve that file in full. And once you have successfully retrieved the file and decide to continue seeding/pinning it, that file will have two “seeds” on the network from which others can retrieve its pieces.
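In code terms, “retrieve and keep seeding” is just fetch-then-pin. A sketch against a local kubo node (port assumed, CID hypothetical):

```python
import requests

API = "http://127.0.0.1:5001/api/v0"  # local kubo daemon (assumption)
cid = "QmExampleCid..."  # hypothetical CID someone else is currently seeding

# Fetch the full file from whichever peers currently have it...
content = requests.post(f"{API}/cat", params={"arg": cid}).content

# ...then pin it, so this node becomes a second "seed" others can pull from.
requests.post(f"{API}/pin/add", params={"arg": cid}).raise_for_status()
print(f"now seeding {len(content)} bytes")
```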


Thanks, I will try that this year, but I have to consider that Filecoin is one single entity for the long term; it could disappear even with 3 copies of the data.

Is it also necessary to pin data shared on a private IPFS network in order to persist the data?

Yes. If it's not pinned, then it is subject to GC (garbage collection) if and when GC is run.
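A small sketch that demonstrates this against a local kubo node (endpoint assumed; exact output formats can vary by kubo version): add a block without pinning it, trigger GC, and watch it disappear from the local block store.

```python
import json
import requests

API = "http://127.0.0.1:5001/api/v0"  # local kubo daemon (assumption)

# Add content WITHOUT pinning it, so it is eligible for garbage collection.
cid = requests.post(f"{API}/add", params={"pin": "false"},
                    files={"file": b"ephemeral data"}).json()["Hash"]

def have_locally(c):
    # "refs local" streams one JSON object per block in the local store.
    lines = requests.post(f"{API}/refs/local").text.splitlines()
    return any(json.loads(line)["Ref"] == c for line in lines if line)

print("before GC:", have_locally(cid))   # True: block is cached locally
requests.post(f"{API}/repo/gc")          # run garbage collection
print("after GC:", have_locally(cid))    # False: unpinned block was collected
```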


But will a file uploaded to a private network also be GC'ed if it has been downloaded by at least one of the nodes in the network? Shouldn't that guarantee persistence?

There are very few guarantees in life. The file will be GC'ed on any node that runs GC without having pinned it. Your file is more likely to be available the more nodes pin it, but there is no guarantee that those nodes won't go offline or unpin it. If they're nodes under your control, you can take steps to make sure that doesn't happen, or you can pay someone you trust to do that on your behalf.

Thanks. I am asking because we would like to set up a private IPFS network for our company, and we want to be sure that data uploaded to the network remains available and accessible. So if I use a pinning service like Pinata to pin data on my node after uploading it to the network, will that ensure the data is always available and accessible on the network, as long as the pinning service is up and running and I don't later unpin the file? I'm sorry if my questions sound too elementary.

Not too elementary at all; it's a good question. If you want to use a third-party pinning service with a private IPFS network, the only pinning service I know of that supports that is Temporal. (That may have changed, so I'd look around.)

I think you've got it. The only thing I'd add is that there are a couple of points people seem to bring up, probably coming from how they're used to working, that cause some confusion with IPFS.

First, you don't really upload content to IPFS; that suggests a client/server setup. Since IPFS is peer-to-peer, it's more like you make content available to the p2p network. If you do think of it as uploading, you're uploading to the one node that is connected to the p2p network, not to the entire network.

Second, when content is “on IPFS”, people get the idea that it is automatically replicated and highly available. It's very easy to achieve that, but you have to actively set it up, with either a pinning service, an IPFS Cluster, or even just a second node where you manually pin the data.

It's easier to understand this when you think of IPFS as a global system. You can't just upload data to the world, say, “hey world, store my stuff”, and have any expectation about what will actually happen. If you want guarantees, you need to take matters into your own hands.

One last thing: there has been a recent development in pinning, the specification of a vendor-agnostic pinning service API. I don't know of any open-source implementations yet, but I'm sure it's just a matter of time; once one is available, you will be able to run your own pinning service.
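For a flavor of what a client of that spec looks like, here is a minimal sketch; the base URL, token, CID, and name are all placeholders, and any compliant service exposes the same routes under its own base URL:

```python
import requests

# Hypothetical endpoint and token for a service implementing the
# vendor-agnostic IPFS Pinning Service API.
BASE = "https://pinning.example.com/api/v1"
TOKEN = "YOUR_ACCESS_TOKEN"

# Ask the service to fetch and pin a CID on your behalf.
r = requests.post(f"{BASE}/pins",
                  headers={"Authorization": f"Bearer {TOKEN}"},
                  json={"cid": "QmExampleCid...", "name": "backup-2021"})
r.raise_for_status()
status = r.json()
print(status["requestid"], status["status"])  # e.g. "queued", later "pinned"
```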

Thanks a lot Zachary; very informative reply.

My pleasure; I'm glad to hear you found it helpful. If you're running a private network, it sounds like what you're looking for is IPFS Cluster. Don't worry about the word “cluster”: it's not nearly as complicated to get running as something like Spark or Kubernetes. It's comparatively simple: a set of otherwise independent IPFS nodes with a small supervisory component that jacks into their brains and diddles with their pin sets, so you get what you're looking for, automated replication and high availability.
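For a flavor of what that looks like once the cluster peers are running: a pin submitted to any cluster peer gets allocated to several nodes automatically. A sketch against the cluster's REST API; the port and parameter names are assumptions from memory, so check the ipfs-cluster docs for your version:

```python
import requests

# ipfs-cluster REST API; the port and query parameters are assumptions,
# verify them against the ipfs-cluster documentation for your version.
CLUSTER = "http://127.0.0.1:9094"
cid = "QmExampleCid..."  # hypothetical

# Ask the cluster to keep this CID pinned on 2-3 of its member nodes;
# the supervisory layer picks the allocations and re-pins on failure.
r = requests.post(f"{CLUSTER}/pins/{cid}",
                  params={"replication-min": 2, "replication-max": 3})
r.raise_for_status()
print(r.json())  # allocation info: which peers were assigned the pin
```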
