IPFS-based manufacturing execution system

This is a prototype of an IPFS-based worker assistance and manufacturing execution system, built by my employer Actyx AG. It has been exhibited at the AWS Pop-up Loft in Munich for the last four weeks. We will also have an exhibit at the German industrial fair Hannover Messe in April. The same technology is in limited production use right now.

Currently we are using IPFS to distribute large static assets such as assembly-instruction videos and CAD files, as well as web apps. We are also planning to use IPFS for communication between the devices.

The devices are industrial Android tablets (Zebra ET50, quad-core Atom processor). We are running go-ipfs on the devices as a data store. We are obviously using a private swarm.


This is really cool! How’s the connectivity? Would love to hear all the complaints.

Are there specific transport or networking protocols that would make this better?

How big would these networks get? how many devices? across NATs or no?


For this demo, we got a DSL connection that can be disconnected to illustrate the offline capability of the system. See the cable hanging out of the wall to the left of the left big screen. (A part of the swarm is in the cloud (EC2) and used to publish new app versions or work instructions)

At real customer installations, there is typically an SDSL connection to the main office and to the internet. These can be quite slow, since manufacturing sites are usually in the countryside rather than in a city center where you get decent speeds. IPFS provides a big benefit here, since we can get large assets to hundreds of devices without having to download them from the cloud hundreds of times.

That I can do. It will be quite a long list though. Don’t get me wrong, all in all we are quite happy…

High level:

We would love to have a dependable roadmap for when the current parts of IPFS (specifically IPNS and pubsub) will become production-ready. I took a big risk in recommending IPFS and a fully distributed approach instead of a more traditional centralized approach. I think going with a more distributed architecture will pay off, but it would be great to know that significant resources at Protocol Labs are dedicated to making the current features of IPFS rock-solid and stable. There is some concern that there are so many things to be worked out about Filecoin that work on IPFS will be neglected.

We have a set of backup plans, e.g. using centralized MQTT instead of IPFS pubsub if pubsub is not production-ready when we need it. But we would very much like to avoid using additional systems, to reduce complexity and single points of failure.

Technical:

Currently, we are using IPFS just for distributing the application and large static assets. The data-distribution functionality works reasonably well, with the caveat that the DHT seems to require a lot of bandwidth. The mDNS discovery also works well with Android devices.

However, we have had a lot of problems with IPNS. First of all, publishing to and resolving IPNS names is frequently very slow. We also had some so-far inexplicable issues where old/wrong hashes appeared for IPNS names. We will file a bug report once we have figured out what exactly is happening. It would be good to have a way to access low-level details such as the sequence number in order to troubleshoot this.

What we really want is the following:

  • notification of IPNS updates (currently we are polling ipfs name every few minutes)
  • recursive pinning of the hash pointed to by an IPNS name, with unpinning of the old hash on update (currently we use ipfs pin update when we get a new hash for a name)
  • a way to get notified whenever a new IPNS entry is fully pinned, so we can switch to e.g. a new version of an app only once it is fully available locally (in case the device goes offline the next second)
    • some way to resolve an IPNS name only to the latest fully pinned hash
  • ability to publish to a key/name while a device is offline (= has no peers)
    • an app should be able to store its state on a device by publishing to a name, even while offline
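To make the current polling workaround concrete, here is a minimal sketch (the function and names are hypothetical, not our production code): resolve the IPNS name periodically, and when the target hash changes, atomically swap the recursive pin with `ipfs pin update`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the polling workaround: resolve an IPNS name,
# and when the hash changes, swap the recursive pin with `ipfs pin update`
# (which unpins the old hash in the same step).

poll_ipns() {
  local name="$1" last="$2" current
  # `ipfs name resolve` prints a path like /ipfs/Qm...
  current=$(ipfs name resolve "/ipns/$name") || return 1
  if [ "$current" != "$last" ]; then
    if [ -n "$last" ]; then
      ipfs pin update "$last" "$current" >/dev/null
    else
      ipfs pin add --recursive "$current" >/dev/null
    fi
  fi
  echo "$current"   # the currently pinned path, for the caller to remember
}
```

In our setup this would run every few minutes from a loop or timer; native update notifications would make the whole function unnecessary.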

The GitHub issue “ipfs name follow” (ipfs/kubo#4435) contains some good ideas to solve some of these points.

For the communication between the devices, we would rely heavily on pubsub, and would urgently need a system that puts more effort into delivering pubsub messages. If I understand the current situation correctly, if you have a topology a <-> b <-> c, and only a and c are interested in topic X, they won’t be able to communicate. This will have to change, or else we will have to use MQTT. Obviously something more efficient than floodsub would be highly desirable, but with our relatively small swarms we might be able to live with floodsub initially.

What we don’t need or want are any delivery guarantees (order, at-least-once, at-most-once, exactly-once) beyond best effort. We would much prefer a quick, best-effort, UDP-style system to some sort of TCP-like approach with significant overhead. Our system will be layered on top of pubsub and will ensure “eventual exactly-once delivery” by publishing hashes of an event log, somewhat similar to what OrbitDB does. We also don’t need auth or encryption; if we need encryption, we will layer it on top of pubsub.
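As a rough illustration of that layering (the topic name and functions are made up for this sketch, and it is not the actual Actyx protocol): each node publishes the hash of its event-log head to a pubsub topic, and receivers process each head at most once, which gives eventual exactly-once semantics on top of a best-effort transport.

```shell
# Hypothetical sketch: deduplicating event-log heads received via pubsub.
# Requires bash 4+ for associative arrays.
declare -A seen_heads   # set of heads we have already processed

publish_head() {
  # announce our latest event-log head (an IPFS hash) to the swarm
  ipfs pubsub pub event-log-heads "$1"
}

handle_head() {
  # invoked for every received message; duplicates are silently dropped,
  # so each head is processed at most once
  local head="$1"
  if [ -z "${seen_heads[$head]}" ]; then
    seen_heads[$head]=1
    echo "processing head $head"   # here we would fetch and walk the log via IPFS
  fi
}
```

Since the log itself is content-addressed and fetched via IPFS, lost or duplicated pubsub messages only delay convergence, they never corrupt it.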

One more thing: it would be great to have a guarantee that some sort of private-swarm feature will eventually become non-experimental.

Not really. Support for Android P2P Wi-Fi would be very cool, but making the things that are currently there production-quality and stable is far more important in the medium term.

For a typical customer we are looking at 20-500 devices (industrial Android tablets as GUI terminals, plus industrial PCs for machine interfaces and for interfaces to other on-premise systems such as ERP). Larger customers usually have multiple sites, so that would be across NATs. For small customers there is just the usual NAT between the private network and the cloud.


This info is fantastic, thank you very much for writing it all up :slight_smile: It will influence things.


Great job, @rklaehn!

Could you share some details on how to run the private swarms successfully?
Are you pulling go-ipfs from master?
Is there something else that needs to be used?
How is the encryption working out?

Please take some time out and help the community see things in action…
Keep it up!
TIA

Sure. Setting up the private swarm was completely uneventful. We just followed the instructions.

We create a swarm key, which is downloaded when a node starts up for the first time, along with a list of bootstrap nodes for this swarm. Both are part of the initial configuration that is downloaded when the device is first set up, together with the IPFS version, settings and command-line args…
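For reference, the swarm key itself is a tiny text file; a minimal sketch of generating one in the standard go-ipfs pnet format (three lines: a version header, an encoding marker, and 32 random bytes as hex):

```shell
# Generate a swarm.key in the pnet format understood by go-ipfs.
gen_swarm_key() {
  printf '/key/swarm/psk/1.0.0/\n/base16/\n'
  od -vN 32 -An -tx1 /dev/urandom | tr -d ' \n'   # 32 random bytes -> 64 hex chars
  printf '\n'
}

gen_swarm_key > swarm.key   # the SAME file must be copied into every node's IPFS repo
```

The important part is that every node in the swarm uses the same key; a node with a different key simply cannot talk to the others.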

We set the LIBP2P_FORCE_PNET environment variable to make sure we don’t accidentally connect to the public IPFS network. E.g. once we accidentally had swarm.key stored as swarm_key, and this safeguard prevented IPFS from starting up and connecting to the public swarm. Just removing the public bootstrap nodes is not enough, since IPFS has many ways to discover peers… :slight_smile:

No. We are downloading a released version from IPFS Distributions. See this discussion for details. The apk chooses the right binary based on the architecture (ARM or x86) of the Android device.

The private swarm encryption uses the symmetric cipher Salsa20 in a pretty straightforward way. See this discussion for details.

You need a secure way to distribute the swarm key to each device.

We are currently using --routing=dhtclient for the ipfs nodes on the Android devices in order to reduce bandwidth usage, and because the Android devices frequently go offline and are therefore probably not good DHT nodes. We need to perform some more experiments with this, though.
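Put together, the daemon launch on a device looks roughly like this (the repo path is made up for illustration; only the variable and flags are real):

```shell
# Illustrative launch: refuse to start unless a private network is configured,
# and participate in the DHT as a client only, to save bandwidth on devices
# that frequently go offline.
export IPFS_PATH=/data/ipfs        # hypothetical repo dir containing swarm.key
LIBP2P_FORCE_PNET=1 ipfs daemon --routing=dhtclient
```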

Hope this helps!


@rklaehn, thank you so much for taking the time! Appreciate it!
Where did you set the LIBP2P_FORCE_PNET variable? In which file exactly? Please let us know :slight_smile:

It is an environment variable. You can set it in your start script for ipfs, or start ipfs using

LIBP2P_FORCE_PNET=1 ipfs daemon

Thanks again.
I initated IPFS again, generated swarm keys again and started the daemon…
The nodes are not yet syncing. Perhaps they need to be under the same firewall?

Maybe (?), but I suppose that depends on a lot of factors. Can you ipfs ping one of your nodes in the private swarm from another one?
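For what it’s worth, a rough debugging checklist (peer IDs and addresses below are placeholders):

```shell
ipfs swarm peers          # are we connected to anyone at all?
ipfs ping <peer-id>       # basic reachability check
ipfs swarm connect /ip4/10.0.0.2/tcp/4001/ipfs/<peer-id>   # dial a peer explicitly
ipfs bootstrap list       # should contain only your private bootstrap nodes
```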

I tried pinging via peer ID. Getting the ERROR:
Ping error: dial backoff

Also tried ipfs swarm connect but getting this ERROR:
Error: connect QmccgyDKmCRPAK5wRkmNCahnW7xEXkmYobLPJs76KEQE62 failure: dial attempt failed: context deadline exceeded

Should I share the swarm.key generated at ~/.ipfs with the other peer node? Or should it be stored only on the node on which it was generated? Please help me out here :slight_smile:

TIA

Ok, sorted out by copying the swarm.key from one node (the first bootstrapped node) to all nodes wishing to join the private network!

FYI, I had generated a separate swarm key for each node and saved it locally (Duh, :sweat_smile:)


Thanks again @rklaehn for helping us out! We kept generating new swarm keys for every node joining the network :sweat_smile: - that seemed to be the problem.

Also, the daemon successfully joined the private network even without LIBP2P_FORCE_PNET! Hopefully that’s because we removed all the public bootstrap nodes successfully (made sure of it, twice :sweat_smile:).

The --routing=dhtclient flag worked for the lightweight client we are trying out! :clap:

Setting that environment variable forces the usage of private networks. It’s a safeguard to prevent the daemon from starting if no private network is configured. So I’d expect that if you have a private network configured (which you do) then the variable won’t have any observable impact.

Even if you left the public bootstrap nodes in there after configuring a private network, I’d expect your node to simply refuse to connect to them since they wouldn’t have your shared key.

True dat. I observed the same. I couldn’t launch the daemon before configuring the keys.

Hi… I also tried to set up the private ipfs swarm, but I’m stuck with the same error which you had

Ping error: dial backoff

I generated the swarm.key on one of the nodes and then copied it to the other nodes. I also did ipfs bootstrap add <multiaddr>.
But I’m still getting that error. Any idea what I did wrong?

Hi @mithun, are you using the ipfs from the release distribution?

Hi @0zAND1z, don’t know how, but it started working. No more errors now.

Can somebody from moderation maybe move the private-swarm discussion into a separate topic? It seems to be a common question, so it would benefit from being easier to find.