History + Versioning of documents (IPFS/IPNS)

First, let me explain my use case:

Assume that I want to manage a set of documents (stored in a folder, for example).
I add my documents to IPFS, then publish the resulting hash to IPNS.
When I publish a new version, it becomes reachable through IPNS. The previous version still exists in IPFS, but I can no longer retrieve it using my IPNS PeerID.
So what I want is to keep track of all previous versions, so that I can return to any of them (as Git allows you to do).

My question is:
How can I create a history of my versioned documents and return to a given version?
Does IPFS already provide a system like the one described above?

If not, here is what I suggest:

  1. make a basic hash chain with added metadata such as a timestamp, version number…
  2. implement Git-style versioning (I am not really comfortable with this option because I don’t know how Git works internally, but the Git book should help)
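
The first option can be sketched in a few lines. This is only an illustration, not an existing IPFS feature: the in-memory `store` dictionary and the `put`, `commit`, `history` and `checkout` helpers are hypothetical stand-ins for adding and fetching objects in IPFS, and SHA-256 over canonical JSON stands in for a real IPFS hash.

```python
import hashlib
import json
import time

# Hypothetical in-memory stand-in for a content-addressed store such as IPFS.
store = {}


def put(obj):
    """Store obj under the SHA-256 of its canonical JSON and return that hash."""
    data = json.dumps(obj, sort_keys=True).encode("utf-8")
    key = hashlib.sha256(data).hexdigest()
    store[key] = obj
    return key


def commit(content_hash, parent, timestamp=None):
    """Create a version record that links back to the previous record."""
    version = 1 if parent is None else store[parent]["version"] + 1
    record = {
        "content": content_hash,   # hash of the document snapshot
        "parent": parent,          # hash of the previous record, or None
        "version": version,
        "timestamp": time.time() if timestamp is None else timestamp,
    }
    return put(record)


def history(head):
    """Yield (hash, record) pairs from the newest version to the oldest."""
    current = head
    while current is not None:
        record = store[current]
        yield current, record
        current = record["parent"]


def checkout(head, version):
    """Return the content hash recorded for the given version number."""
    for _, record in history(head):
        if record["version"] == version:
            return record["content"]
    raise KeyError(version)


v1 = commit(put({"doc": "first draft"}), None, timestamp=1.0)
v2 = commit(put({"doc": "second draft"}), v1, timestamp=2.0)
```

Under these assumptions, the hash of the newest record (`v2` here) is what would be published to IPNS, so every earlier version stays reachable by following the `parent` links from that single pointer.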

I already read the following, but I haven’t found what I want :


I just read in the IPFS paper in the Object Merkle DAG section :

A raw data field and a common link structure are the necessary components for constructing arbitrary data structures on top of IPFS. While it is easy to see how the Git object model fits on top of this DAG, consider these other potential data structures: (a) key-value stores (b) traditional relational databases (c) Linked Data triple stores (d) linked document publishing systems (e) linked communications platforms (f) cryptocurrency blockchains. These can all be modeled on top of the IPFS Merkle DAG, which allows any of these systems to use IPFS as a transport protocol for more complex applications.

Are there existing implementations of these data structures?

Found implementations :


@raucoule1u you’re on the right trail. We need someone to implement a version graph in IPLD and add the necessary minimal toolchain that lets you manipulate commits/versions in that graph. I’ve had a sketch for this lying on my laptop for months. Stuck it in a gist in case anyone wants to run with it: https://gist.github.com/flyingzumwalt/a6821e843366d606aeb1ba53525b8669


Good sketch.
Thank you, I will think about it.

This is a much needed feature for many other use cases. For example, with the implementation of history in IPNS we can redesign the model of IPWB to be fully implemented on IPFS, IPNS, and IPLD without the need of an external index. This new model is briefly discussed in the following ticket.

We had some conversations over email and some in-person discussions about it. We think implementing IPNS over a blockchain is the right way forward. A test implementation could be written on a private Ethereum blockchain run by the IPFS community. Once the history of IPNS changes is recorded in the blockchain, it will be very easy to implement the Memento protocol on top of it to allow datetime-based negotiation, which would open a big door for its use in the archiving community.
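
To make the datetime-based lookup concrete, here is a minimal sketch of the selection step only. It assumes the history of IPNS changes is already available as a time-sorted list of (publication time, hash) pairs; the list, the placeholder hashes and the `resolve_at` helper are hypothetical. Actual Memento negotiation happens over HTTP with the `Accept-Datetime` header, which is not shown here.

```python
from bisect import bisect_right
from datetime import datetime, timezone

# Hypothetical history of IPNS changes, sorted by publication time.
# The hash strings are placeholders, not real IPFS hashes.
ipns_history = [
    (datetime(2017, 1, 1, tzinfo=timezone.utc), "hash-v1"),
    (datetime(2017, 3, 1, tzinfo=timezone.utc), "hash-v2"),
    (datetime(2017, 6, 1, tzinfo=timezone.utc), "hash-v3"),
]


def resolve_at(entries, when):
    """Return the hash that was current at `when`, or None if nothing
    had been published yet."""
    times = [published for published, _ in entries]
    # Index of the first entry published strictly after `when`.
    position = bisect_right(times, when)
    if position == 0:
        return None
    return entries[position - 1][1]


requested = datetime(2017, 4, 15, tzinfo=timezone.utc)
current_hash = resolve_at(ipns_history, requested)
```

This picks the latest version published at or before the requested time, which is one simple reading of "closest version"; a real service might instead choose the nearest version in either direction.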


Could this kind of versioning include the possibility of “tagging” (v1.0.0, v1.1.0, etc.) each file in a hash chain?

One problem I have with Git right now is that, when versioning the history of a textual edition (rather than the history of a software program), I can only create version tags for an entire repo rather than for an individual file. Thus, if I want version numbers for an individual text file, I end up having to create a new repo for each file. But I have over 15,000 texts that I’m trying to version, and this would require 15,000 Git repos. This becomes very difficult to manage.

This isn’t really Git’s fault, since Git was designed for versioning software and not texts.

But if IPFS allowed creating hash chains for individual files rather than for directories, this would be a versioning system much better suited to text editions than Git currently is.

I understand your idea but if you want each file to be dynamic and track updates on each one, it means that you have to use 1 IPNS PeerID for each file, which can be very heavy and difficult to manage. That’s what I think and maybe someone can bring a different point of view.
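
One possible alternative, offered only as a sketch, is to keep a separate hash chain and tag set per file but collect all the chain heads in a single index object, so that only the hash of the index would need to be published under one IPNS name. The `store`, `put`, `add_version` and `read_tag` names are hypothetical, and the in-memory dictionary stands in for IPFS.

```python
import hashlib
import json

# Hypothetical in-memory stand-in for a content-addressed store such as IPFS.
store = {}


def put(obj):
    """Store obj under the SHA-256 of its canonical JSON and return that hash."""
    data = json.dumps(obj, sort_keys=True).encode("utf-8")
    key = hashlib.sha256(data).hexdigest()
    store[key] = obj
    return key


def add_version(index, path, text, tag=None):
    """Return a new index in which `path` has one more version in its own chain."""
    entry = index.get(path, {"head": None, "tags": {}})
    # Each record links only to the previous record of the same file.
    head = put({"content": put({"text": text}), "parent": entry["head"]})
    tags = dict(entry["tags"])
    if tag is not None:
        tags[tag] = head
    new_index = dict(index)
    new_index[path] = {"head": head, "tags": tags}
    return new_index


def read_tag(index, path, tag):
    """Return the text of `path` as it was when `tag` was created."""
    record = store[index[path]["tags"][tag]]
    return store[record["content"]]["text"]


index = {}
index = add_version(index, "a.txt", "A, first edition", tag="v1.0.0")
index = add_version(index, "b.txt", "B, first edition", tag="v1.0.0")
index = add_version(index, "a.txt", "A, revised", tag="v1.1.0")

# Only this one hash would have to be published under a single IPNS name.
root = put(index)
```

Here `a.txt` carries tags `v1.0.0` and `v1.1.0` while `b.txt` stays at `v1.0.0`, so each text has its own version numbers. The trade-off is that every update to any file produces a new index hash to publish.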