Blueprint of a distributed social network on IPFS (2) [blog article]

I just published the second revision of my Blueprint of a distributed social network on IPFS - I would love to get some feedback on my thoughts.

Please be nice… I’ve been writing this article for two months now, and I really hope I got everything right here.

Hey, that’s nice :slight_smile:

A few comments:

Because Bob would also see another chain, his client would also provide a new version of the profile (F') where E and E' are merged - one of the problems which must be sorted out. But a rather trivial one in my opinion, as the clients only need to do some sort of leader election. And this election is temporary until a new node is published - so not really a complicated form of consensus-finding!

It’s the kind of problem that can become messy real quick in a distributed system. Maybe you should have a look at CRDTs (Conflict-free replicated data type - Wikipedia; A Look at Conflict-Free Replicated Data Types (CRDT) | by Nezih Yigitbasi | Medium). They are a really helpful kind of data structure that will eventually converge to the same state for everyone. It’s what is used behind the scenes in OrbitDB, a distributed database built on top of IPFS.

… and I see that you talk about CRDTs later. My point is that instead of rewriting the history, you might want to publish a ‘revert’ message that says that a piece of data has been modified or deleted. Kind of like how git does a revert instead of rewriting a branch.
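
To make the ‘revert’ idea a bit more concrete, here is a minimal sketch in Python. It assumes an append-only log where a revert is just another entry pointing at the one it retracts; the `Entry` fields and the function name are made up for illustration and are not taken from the article or from any IPFS API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    entry_id: str                   # hypothetical identifier, e.g. a content hash
    payload: str                    # the post or profile data
    reverts: Optional[str] = None   # id of the entry this one retracts, if any


def visible_payloads(log: List[Entry]) -> List[str]:
    """Return the payloads of entries that no entry in the log reverts."""
    reverted = {e.reverts for e in log if e.reverts is not None}
    return [e.payload for e in log
            if e.reverts is None and e.entry_id not in reverted]


log = [
    Entry("a", "first post"),
    Entry("b", "post I regret"),
    Entry("c", "", reverts="b"),   # appended later; "b" itself stays in the log
    Entry("d", "third post"),
]
print(visible_payloads(log))
```

Nothing is ever removed from the log, so every client that sees the same entries computes the same visible view, whatever order the entries arrived in.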

 A<---B     <---D<---E<---F
      \                 /
       -----------------

Of course, D would now point to a node which does not exist. But that is not a problem. Indeed, it’s a fundamental key point of the idea - that content may be unavailable.

I guess you mean that the C node is dropped locally. It might still be available somewhere, so you can’t assume that it’s not reachable anymore.

This is a hard problem. My first idea would be a PubSub channel where each client periodically announces their IPNS hashes. I’m not sure whether PubSub nodes must be connected directly. If this is the case, the problem just got harder. There’s also the problem that this channel would be rather high-volume as soon as the network grows. If each client announces their IPNS hash every 15 minutes, for example, we get 4 messages per client each hour. That’s already a lot of bandwidth if we speak about 1,000, 10,000 or even 100,000 clients! It is also an attack vector through which the system can be flooded. Not nice!
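
To put rough numbers on the announcement volume in the quote above, here is a small back-of-the-envelope calculation. It only counts messages; actual bandwidth depends on the message size and on how pubsub fans messages out, neither of which is specified in the thread. The function name is mine.

```python
def announcements_per_hour(clients: int, interval_minutes: int = 15) -> int:
    """Total announcements per hour if every client announces once per interval."""
    per_client = 60 // interval_minutes   # 4 for a 15-minute interval
    return clients * per_client


for clients in (1_000, 10_000, 100_000):
    total = announcements_per_hour(clients)
    print(f"{clients:>7} clients: {total:>7} messages/hour, "
          f"about {total / 3600:.1f} per second")
```

At 100,000 clients that is 400,000 announcements per hour hitting every subscriber of the channel, which is the scaling concern raised in the quote.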

I think that a social network is a special case that you can exploit.

  • As a user, you don’t care about everyone. You only care about your connections in the social graph, a very local part of everything that is happening.
  • Your connections in this social graph are not strangers. They are people that you know and/or care about, and these connections are likely to be bidirectional. This means that when you design a system for this use case, you can exploit that to create a solidarity network. Users can hold data for each other and serve as relays. Instead of having a unique global feed for your network, you can simply ask your contacts for updates, and they will likely be able to answer for themselves, but also for your other contacts if those are not connected.

Is it possible to talk to a specific node directly in IPFS? This would be helpful for discovering content by asking nodes what profiles they know. It would also be a nice way for finding consensus when multiple devices have to agree on which node publishes a merge.

You can already fake that with pubsub. Each node can listen to a channel named after its own identity. You can then publish in this channel to contact said node. Obviously, it’s neither private nor optimized.
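
Roughly what I mean, as a toy sketch in Python. `ToyPubSub` is an in-memory stand-in and not the IPFS pubsub API, and the `inbox/<identity>` topic naming is just an assumption for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Handler = Callable[[str, str], None]  # called with (sender_id, message)


class ToyPubSub:
    """In-memory stand-in for a pubsub layer; NOT the IPFS API."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, sender_id: str, message: str) -> None:
        # Every subscriber of the topic gets the message, which is why this
        # kind of "direct" messaging is not private.
        for handler in self._subscribers[topic]:
            handler(sender_id, message)


def inbox_topic(identity: str) -> str:
    # Hypothetical naming scheme: one topic per identity.
    return f"inbox/{identity}"


bus = ToyPubSub()
bob_inbox: List[Tuple[str, str]] = []
eavesdropper: List[Tuple[str, str]] = []

bus.subscribe(inbox_topic("bob"), lambda s, m: bob_inbox.append((s, m)))
bus.subscribe(inbox_topic("bob"), lambda s, m: eavesdropper.append((s, m)))

bus.publish(inbox_topic("bob"), "alice", "which profiles do you know?")
print(bob_inbox, eavesdropper)
```

Anyone who knows Bob’s identity can subscribe to the same topic, so the payload would have to be encrypted if privacy matters.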

go-ipfs already has an API called p2p that allows sending direct messages between nodes, but it’s highly experimental right now and sometimes incomplete (no bindings in js-ipfs-api). You also need to know the IPFS node ID, which might not be your identity ID.

How fast is IPFS with small files? If I need to traverse a long chain of profile updates, I constantly request a small file, parse it and continue requesting the previous node in the chain. That should be fast. If it is not, we might need to introduce some “pack files” where a list of metadata-nodes is provided and traversing becomes unnecessary. But that makes deleting content rather complicated, TBH.

Your system would likely be implemented using IPLD data blocks instead of files. IPLD will (or already does?) come with something called IPLD selectors. The idea is that you can say to IPFS things like ‘I want every node of this graph up to a depth of 10’. These selectors can (or will?) be transmitted over the network, so your IPFS node can ask for the data a single time. Way less latency.
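
To illustrate what ‘every node of this graph up to a depth of 10’ means, here is a toy depth-limited walk in Python. It runs on a plain in-memory dict and is not the IPLD selector syntax or API; the graph and the function name are made up for the example.

```python
from collections import deque
from typing import Dict, List

# Toy graph: node id -> ids of the nodes it links to
# (for example, previous versions of a profile).
GRAPH: Dict[str, List[str]] = {
    "F": ["E", "B"],
    "E": ["D"],
    "D": ["C"],
    "C": ["B"],
    "B": ["A"],
    "A": [],
}


def select_up_to_depth(graph: Dict[str, List[str]], root: str,
                       max_depth: int) -> List[str]:
    """Breadth-first walk returning the nodes reachable within max_depth links."""
    seen = {root}
    order = [root]
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        # A node with no entry in the graph contributes no links.
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append((child, depth + 1))
    return order


print(select_up_to_depth(GRAPH, "F", 1))
print(select_up_to_depth(GRAPH, "F", 10))
```

The point of a selector is that this walk would be described once and resolved by the remote side, instead of your node doing one round trip per link.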

To finish, you might be interested in Arbore (http://arbo.re/), the project I’m working on. It’s more about files than posts and comments, but it is a social network. It’s somewhat close to what you and I described. I’m still looking for help :wink:

First: Thanks for your reply!

We’re talking about two different things here: One is leader election, electing a node which does the “merge” while the other nodes have to follow.
The other thing is having content which is only shared between the nodes a user uses (rather “devices” in this context). That data should never reach the “internet” … CRDTs are definitely the way to go here, yes!

Absolutely right!

Very valid points. But finding new content gets harder if you don’t have a way to “explore” the things outside of your social network. Consider the following: you have nothing. You installed the client for that network, and now you want to find someone. Not necessarily Bob from your local football team, but anyone. Like “Hey, I’m here, who is there?”. In this case you need some way to find other profiles. And for finding other profiles, you need a way to find other hashes.

Yeah, I thought of that, but it is ugly and I don’t want to dig a hole in that direction, TBH.

That sounds interesting, thanks for pointing that out!

I know of IPLD, but I don’t know how far along it is right now. From what I know all this is “we want it, but we don’t have that right now” - please tell me that my information is outdated! :slight_smile:

Also: What is an IPLD data block?

This looks great! I’m not a web dev, though. In fact, I would implement this social network idea in Rust as a CLI and later with a GUI client - I even have some code on my machine… which is simply some types and no “working code”… but it’s a first step.

We’re talking about two different things here: One is leader election, electing a node which does the “merge” while the other nodes have to follow.
The other thing is having content which is only shared between the nodes a user uses (rather “devices” in this context). That data should never reach the “internet” … CRDTs are definitely the way to go here, yes!

My point is that if you use CRDTs, you don’t need to elect a leader.

Very valid points. But finding new content gets harder if you don’t have a way to “explore” the things outside of your social network. Consider the following: you have nothing. You installed the client for that network, and now you want to find someone. Not necessarily Bob from your local football team, but anyone. Like “Hey, I’m here, who is there?”. In this case you need some way to find other profiles. And for finding other profiles, you need a way to find other hashes.

In Arbore, you have to add your first contact using an “Arbore ID”, which is the public key of this contact. Once you have your first contact, contact lists are exchanged in the background and Arbore will start to suggest new contacts based on its view of the social graph, so you can easily build your list. It can probably be improved but it works :slight_smile:
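
A simplified sketch of the suggestion idea, in Python. This is not Arbore’s actual code or algorithm; it assumes each contact shares a plain set of IDs and ranks contacts-of-contacts by how many of your own contacts list them. All names are invented for the example.

```python
from collections import Counter
from typing import Dict, List, Set


def suggest_contacts(me: str, contact_lists: Dict[str, Set[str]],
                     limit: int = 3) -> List[str]:
    """Rank contacts-of-contacts by how many of my contacts list them."""
    mine = contact_lists.get(me, set())
    counts: Counter = Counter()
    for contact in mine:
        for candidate in contact_lists.get(contact, set()):
            if candidate != me and candidate not in mine:
                counts[candidate] += 1
    # Sort by descending count, then by name so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]


contact_lists = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave", "erin"},
    "carol": {"alice", "dave"},
}
print(suggest_contacts("alice", contact_lists))
```

Here dave is suggested first because both of alice’s contacts know him, and erin second.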

Yeah, I thought of that, but it is ugly and I don’t want to dig a hole in that direction, TBH.

I’m using that in Arbore for now; I’ll switch to the p2p thing in the future. At least I can iterate on everything else and the switch will be transparent later.

I know of IPLD, but I don’t know how far along it is right now. From what I know all this is “we want it, but we don’t have that right now” - please tell me that my information is outdated! :slight_smile:
Also: What is an IPLD data block?

IPLD is here and used by default in the latest go-ipfs. Probably the same in js-ipfs. Not sure about the selectors though.

IPLD is a way to encode data and links between blocks. When you store a file in IPFS, it’s broken down into pieces and stored using this format, with links to form a graph. When you pass some JSON to IPLD, it’s converted to the IPLD format to be stored as binary instead of text. You can reference other blocks in your JSON and use the IPLD resolver to query your data across these links. See https://www.youtube.com/watch?v=bi-4YGZXxwA for more fun things.

This looks great! I’m not a web dev, though. In fact, I would implement this social network idea in Rust as a CLI and later with a GUI client - I even have some code on my machine… which is simply some types and no “working code”… but it’s a first step.

I’m not either :wink: I chose (and learned!) JavaScript, React, Electron and all because it’s the only way to build a really nice UI with a one-man team. I know because I failed the other ways :wink: Arbore is for now a prototype where I can iterate somewhat fast.

If Arbore is successful it might be a good idea to drop Electron and use a more efficient language, but IMHO it’s better this way. Make the thing work efficiently after you have proved it works and people are interested.

If you want to look at the code, what you are interested in is in app/actions and app/models.

Why not? I have to decide which node does the merge! I don’t see how CRDTs help me with that decision!

And where do you get that from? How do you find that public key?

That’s basically the approach I would use, too.

Thanks for that pointer! I will look at this tomorrow, after I had some sleep! :slight_smile:

Why not? I have to decide which node does the merge! I don’t see how CRDTs help me with that decision!

With CRDTs each node can do the merge and they all will converge into the same state when they have all the messages.
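
A minimal illustration in Python, using the simplest CRDT I know of, a state-based grow-only set where merging is just set union. The device names and post IDs are made up; a real profile would need a richer CRDT than this.

```python
def merge(a: frozenset, b: frozenset) -> frozenset:
    """State-based grow-only set (G-Set): merging is set union."""
    return a | b


device_1 = frozenset({"post-1", "post-2"})
device_2 = frozenset({"post-1", "post-3"})
device_3 = frozenset({"post-4"})

# Each device merges what it receives, in a different order,
# and one of them even receives a duplicate.
state_a = merge(merge(device_1, device_2), device_3)
state_b = merge(device_3, merge(device_2, device_1))
state_c = merge(merge(device_2, device_3), merge(device_1, device_1))

print(state_a == state_b == state_c)
```

Because union is commutative, associative and idempotent, no device has to be chosen to do ‘the’ merge: every device can merge locally and they all end up with the same set.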

And where do you get that from? How do you find that public key?

Sadly, through another means of communication. But it’s not necessarily a big problem; we exchange email addresses sometimes as well. That said, this problem could be mitigated in several ways:

  • QR code for an easy scan
  • embed the ID in something easily exchangeable: a file, an image …
  • integrated or external public directory
  • when someone shares the software with someone else, package the client with the ID

Yeah, I know … but how does this help with the general idea of having a chain of profile updates? I would need to persist these CRDT objects after every action, and that results in an IPFS node. But if two devices do a merge of the last IPFS objects and create a new object for that, that new object might have different hashes on different devices… because of a different timestamp, for example… and that’d result in two different IPFS objects.
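
The hash problem is easy to reproduce with a toy example in Python. SHA-256 over canonical JSON stands in for an IPFS hash here, and the node layout (`parents`, `timestamp`) is an assumption for illustration, not the format from the article. The second half shows one conceivable way around it, building the merge node only from the sorted parent identifiers; whether that fits the design is a separate question.

```python
import hashlib
import json
from typing import List, Optional


def object_hash(obj: dict) -> str:
    """SHA-256 over a canonical JSON encoding; a stand-in for an IPFS hash."""
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def merge_node(parents: List[str], timestamp: Optional[int] = None) -> dict:
    # Sorting the parents makes the node independent of discovery order.
    node: dict = {"parents": sorted(parents)}
    if timestamp is not None:
        node["timestamp"] = timestamp
    return node


# Two devices merge the same heads E and E', one second apart.
with_ts_1 = object_hash(merge_node(["E", "E'"], timestamp=1_500_000_000))
with_ts_2 = object_hash(merge_node(["E'", "E"], timestamp=1_500_000_001))

# The same merge, built only from the parents' identifiers.
no_ts_1 = object_hash(merge_node(["E", "E'"]))
no_ts_2 = object_hash(merge_node(["E'", "E"]))

print(with_ts_1 == with_ts_2, no_ts_1 == no_ts_2)
```

With the timestamp the two devices produce two different objects; without it they produce the same one, so it would not matter which device publishes the merge.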

True.

Thanks for your input on IPLD, especially for the linked video. Definitely the way to go!

Not technical here. Just wondering if this could be useful for connecting people who want to collaborate to make finished news pieces? Basically IPFS-hosted content, ID’d by various categories or by keyword - to link, for example, someone who shot a raw video with an editor in NYC looking for that video based on keyword?

Sorry, I’m not entirely sure what you want to say. :sweat_smile:

hello,

I think what sfrady is trying to say is that your social network could be used for journalistic collaboration. For example, I am a video editor covering events in New York; could the New York Times, for example, collaborate with me to publish my videos, and in particular find them through keywords?

Anyway, that’s my interpretation.

Regards.

Revisiting this after finally getting some progress on a site, albeit non-IPFS and non-crypto integrated as per original vision… there is already a company doing what you described, called Storyful, but being restricted to that establishment media ecosystem, it doesn’t really level the playing field in favor of the many freelancers out there.

That said, I do think there is progress to be made both in the realm of user-friendliness (I share a bit of empathy for this: “If Users Struggle to Install, IPFS Will Fail”) and in privacy and encryption (protection of identity of people posting sensitive but newsworthy content to be edited into a finished html5 news piece for example).
