Hey, that's nice
A few comments:
Because Bob would also see another chain, his client would also provide a new version of the profile (F') where E and E' are merged - one of the problems that must be sorted out. But a rather trivial one in my opinion, as the clients only need to do some sort of leader election. And this election is temporary until a new node is published - so not really a complicated form of consensus-finding!
It's the kind of problem that can become messy real quick in a distributed system. Maybe you should have a look at CRDTs (https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type, https://medium.com/@istanbul_techie/a-look-at-conflict-free-replicated-data-types-crdt-221a5f629e7e). They are a really helpful kind of data structure that will eventually converge to the same state for everyone. It's what is used behind the scenes in OrbitDB, a distributed database built on top of IPFS.
.... and I see that you talk about CRDTs later. My point is that instead of rewriting the history, you might want to publish a 'revert' message that says that a piece of data has been modified or deleted. Kind of like how git revert adds a new commit instead of rewriting a branch.
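To make the 'revert' idea a bit more concrete, here is a minimal sketch of a two-phase set (2P-Set), one of the simplest CRDTs. Deleting never rewrites history; it only adds a tombstone, and merging is a plain set union, so replicas converge whatever the merge order. The class name, method names, and post IDs are made up for illustration; this is not how OrbitDB implements it.

```python
from dataclasses import dataclass, field


@dataclass
class TwoPhaseSet:
    """Toy 2P-Set CRDT: an add-set plus a tombstone set."""
    added: set = field(default_factory=set)
    removed: set = field(default_factory=set)  # tombstones, i.e. 'revert' messages

    def add(self, item):
        self.added.add(item)

    def remove(self, item):
        # Only something this replica has seen can be reverted.
        if item in self.added:
            self.removed.add(item)

    def merge(self, other):
        # Union of both sets: commutative, associative, and idempotent.
        return TwoPhaseSet(self.added | other.added, self.removed | other.removed)

    def value(self):
        # Visible items are the ones added and never tombstoned.
        return self.added - self.removed


# Two devices diverge, then merge in either order.
alice = TwoPhaseSet()
alice.add("post-1")
alice.add("post-2")

bob = TwoPhaseSet().merge(alice)   # Bob syncs Alice's state
bob.remove("post-2")               # Bob publishes a revert for post-2
alice.add("post-3")                # meanwhile Alice adds a new post

merged_ab = alice.merge(bob)
merged_ba = bob.merge(alice)
print(sorted(merged_ab.value()))
```

Both merge orders end up in the same state, so there is nothing to elect a leader for. The known limitation of this particular CRDT is that a removed item can never be re-added.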
Of course, D would now point to a node which does not exist. But that is not a problem. Indeed, it's a fundamental point of the idea - that content may be unavailable.
I guess you mean that the C node is dropped locally. It might still be available somewhere, so you can't assume that it's no longer reachable.
This is a hard problem. My first idea would be a PubSub channel where each client periodically announces their IPNS hashes. I’m not sure whether PubSub nodes must be connected directly. If this is the case, the problem just got harder. There’s also the problem that this channel would be rather high-volume as soon as the network grows. If each client announces their IPNS hash every 15 minutes, for example, we get 4 messages per client each hour. That’s already a lot of bandwidth if we speak about 1,000, 10,000 or even 100,000 clients! It is also an attack vector, since the system can be flooded. Not nice!
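To put rough numbers on that estimate, here is a back-of-the-envelope sketch. The 15-minute interval and the client counts come from the text above; the message size of 100 bytes is purely an assumption for illustration, not a measured value.

```python
ASSUMED_MESSAGE_BYTES = 100  # hypothetical size of one announcement


def announcements_per_hour(clients, interval_minutes=15):
    """Messages hitting a single global channel per hour."""
    return clients * (60 / interval_minutes)


for clients in (1_000, 10_000, 100_000):
    per_hour = announcements_per_hour(clients)
    per_second = per_hour / 3600
    megabytes = per_hour * ASSUMED_MESSAGE_BYTES / 1_000_000
    print(f"{clients:>7} clients: {per_hour:>9.0f} msg/h, "
          f"{per_second:6.1f} msg/s, ~{megabytes:.1f} MB/h per subscriber")
```

Every subscriber of a global channel receives all of this traffic, which is why the volume grows with the size of the whole network rather than with the number of people a user actually follows.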
I think that a social network is a special case that you can exploit.
- As a user, you don't care about everyone. You only care about your connections in the social graph. You only care about a very local part of everything that is happening.
- Your connections in this social graph are not strangers. They are people that you know and/or care about. And these connections are likely to be bidirectional. This means that when you design a system for this use case, you can exploit that to create a solidarity network. Users can hold data for each other and serve as relays. Instead of having a unique global feed for your network, you can simply ask your contacts for updates, and they will likely be able to answer for themselves, but also for your other contacts if those are not connected.
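The relay idea in the list above can be sketched as a small in-memory simulation. Everything here is hypothetical (the `Peer` class, the sequence numbers used to decide which update is newer, the placeholder head strings); no real IPFS call is involved.

```python
class Peer:
    """Toy peer that remembers the latest profile head of each contact."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.contacts = {}  # contact name -> Peer
        self.heads = {}     # profile owner -> (sequence number, profile head)

    def befriend(self, other):
        # Connections are assumed to be bidirectional.
        self.contacts[other.name] = other
        other.contacts[self.name] = self

    def publish(self, seq, head):
        self.heads[self.name] = (seq, head)

    def sync(self):
        # Ask every online contact what they know, about themselves
        # and about the contacts we have in common.
        for contact in self.contacts.values():
            if not contact.online:
                continue
            for owner, entry in contact.heads.items():
                if owner == self.name or owner not in self.contacts:
                    continue
                if entry > self.heads.get(owner, (0, "")):
                    self.heads[owner] = entry


alice, bob, carol = Peer("alice"), Peer("bob"), Peer("carol")
alice.befriend(bob)
alice.befriend(carol)
bob.befriend(carol)

carol.publish(1, "carol-head-1")
bob.sync()             # Bob learns Carol's update while she is online
carol.online = False   # Carol disconnects
alice.sync()           # Alice still gets it, relayed by Bob
print(alice.heads["carol"])
```

A real system would also need to verify that a relayed head was really signed by its owner; the sketch skips that entirely.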
Is it possible to talk to a specific node directly in IPFS? This would be helpful for discovering content by asking nodes what profiles they know. It would also be a nice way for finding consensus when multiple devices have to agree on which node publishes a merge.
You can already fake that with pubsub. Each node can listen on a channel named after its own identity. You can then publish in this channel to contact said node. Obviously, it's neither private nor optimized.
go-ipfs already has an API called p2p that allows sending direct messages between nodes, but it's highly experimental right now and sometimes incomplete (no bindings in js-ipfs-api). You also need to know the IPFS node ID, which might not be your identity ID.
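Here is a minimal sketch of the pubsub workaround, using an in-memory stand-in for the pubsub layer so it runs without a node. `InMemoryPubSub`, `inbox_topic`, and the `inbox/` naming scheme are assumptions for illustration, not part of any IPFS API.

```python
from collections import defaultdict


class InMemoryPubSub:
    """Stand-in for a real pubsub layer: topic name -> list of handlers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in list(self._subscribers[topic]):
            handler(message)


def inbox_topic(identity):
    # Hypothetical convention: one channel per identity.
    return f"inbox/{identity}"


bus = InMemoryPubSub()
received_by_bob = []
overheard = []

# Bob listens on the channel named after his own identity.
bus.subscribe(inbox_topic("bob-identity"), received_by_bob.append)
# Nothing stops a third party from subscribing to the same channel.
bus.subscribe(inbox_topic("bob-identity"), overheard.append)

bus.publish(inbox_topic("bob-identity"), "which profiles do you know?")
print(received_by_bob, overheard)
```

The second subscriber shows why this is not private: anyone who knows the identity can listen in, so real messages would have to be encrypted for the recipient.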
How fast is IPFS with small files? If I need to traverse a long chain of profile updates, I constantly request a small file, parse it and continue requesting the previous node in the chain. That should be fast. If it is not, we might need to introduce some “pack files” where a list of metadata-nodes is provided and traversing becomes unnecessary. But that makes deleting content rather complicated, TBH.
Your system would likely be implemented using IPLD data blocks instead of files. IPLD will come (or already comes?) with something called IPLD selectors. The idea is that you can say to IPFS things like 'I want every node of this graph up to a depth of 10'. These selectors can (or will?) be transmitted on the network, so your IPFS node can ask for the data a single time. Way less latency.
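To illustrate what such a request means for a chain of profile updates, here is a sketch of a depth-limited walk over a linked chain held in a plain dictionary. This is not IPLD selector syntax; the `select_chain` function, the `prev` link name, and the node IDs are made up, and the point is only that the remote side could run the whole walk and answer once.

```python
def select_chain(store, head, max_depth):
    """Follow 'prev' links from head, returning at most max_depth nodes."""
    result = []
    current = head
    while current is not None and len(result) < max_depth:
        node = store.get(current)
        if node is None:
            break  # the node is unavailable, so the walk stops here
        result.append(node)
        current = node.get("prev")
    return result


# A chain of 15 profile updates, newest last: node-14 -> node-13 -> ... -> node-0
store = {
    f"node-{i}": {"id": f"node-{i}", "prev": f"node-{i - 1}" if i > 0 else None}
    for i in range(15)
}

selected = select_chain(store, "node-14", max_depth=10)
print([node["id"] for node in selected])
```

Fetching the same ten nodes one by one would cost ten sequential round trips, because each request can only be sent after the previous node has been parsed.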
To finish, you might be interested in Arbore (http://arbo.re/), the project I'm working on. It's more about files than posts and comments, but it is a social network. It's somewhat close to what you and I described. I'm still looking for help.