How to speed up file retrieval from js-ipfs node

I have a file pinned on my Node.js IPFS server. When I ipfs.cat(cid) this file from the browser, it can often take a minute or more to arrive. I’m assuming this is because IPFS is trying to figure out where the file is located on the internet. So to speed this up, I added my server’s IPFS PeerID to the bootstrap list of my browser’s js-ipfs node.

However, this doesn’t seem to make the file retrieval any faster. What can I do to speed up this retrieval time? The file in question is just ~1MB and it isn’t my internet speed that is slowing the download down.

I think a lot of people mistake IPFS for a database. It’s not a database and shouldn’t be expected to perform like one; for example, it can’t do search. The way to think of IPFS is much more akin to BitTorrent than to MongoDB. My way of handling this is to load all the data my app pins or needs into my MongoDB instance (in addition to IPFS), where it’s searchable with lightning-fast performance. So my app itself should theoretically never need to access it through IPFS again!

People might say this is bad because it’s too redundant, but I consider the redundancy a feature, not a bug. Truthfully, I consider MongoDB the authoritative source of the data, though another way to look at MongoDB in this scenario is as a “big searchable cache” for IPFS. I consider it the primary source because I can move my MongoDB database to another instance, re-execute all the pins, and build up a new redundant source for the IPFS pins.

Another way to think of this approach is to call IPFS a “sharing technology” that enables decentralization and backups of information. No company in the world is going to switch their official database over to IPFS (instead of a real database) anyway, simply because they’d lose search, performance, and 90% of what databases do.

You have a swarm control problem.

IPFS can be seen as one huge P2P disk, aggregating every CID anyone puts into it.
So if you bootstrap to the default public nodes, your swarm peers can be anyone, and your node is thrown into a datastore far too big to focus on your data.

I found a way to take control of my swarm using Scuttlebutt relationships. That gives you quick access to your reduced IPFS file store.

I am now using https://gchange.fr identities as IPFS node identities, so each node “possesses” a Libre Money wallet…

I see. Well, the files I have pinned on my server are several MBs in size, so saving them to a database probably wouldn’t be a good idea.

Is it not possible to connect my two nodes with ipfs.swarm.connect()? Would that solve the retrieval speed problem?
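For what it’s worth, a direct connection from the browser node could look roughly like the sketch below. The address and PeerID are placeholders, not real values; note that browsers can’t open raw TCP connections, so the server node would need to expose a WebSockets (wss) listener for this to work.

```javascript
// Placeholder values — substitute your server's actual listen address and PeerID.
const serverAddr = '/dns4/your-server.example/tcp/4002/wss'
const serverPeerId = 'QmYourServerPeerId'

// Full multiaddr the browser node dials: <transport address>/p2p/<PeerID>
const target = `${serverAddr}/p2p/${serverPeerId}`
console.log(target)

// With a js-ipfs instance `ipfs`, you would then connect and fetch directly:
//   await ipfs.swarm.connect(target)
//   for await (const chunk of ipfs.cat(cid)) { /* ... */ }
```

Once the connection is open, Bitswap can fetch the blocks straight from the server instead of searching the DHT for a provider.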

So if you bootstrap to the default public nodes, your swarm peers can be anyone, and your node is thrown into a datastore far too big to focus on your data.

This makes sense. I don’t quite understand the scuttlebutt relationship part. Is this a way to limit who my node’s swarm peers are?

'ipfs swarm peers' lists the set of peers this node is connected to.

I am using the CLI commands:

ipfs swarm addrs - List known addresses. Useful for debugging.
ipfs swarm connect … - Open connection to a given address.
ipfs swarm disconnect … - Close connection to a given address.
ipfs swarm filters - Manipulate address filters.
ipfs swarm peers - List peers with open connections.
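One caveat with `ipfs swarm connect`: the connection doesn’t survive restarts and can be dropped by the connection manager. If the server side runs go-ipfs 0.5 or later, the `Peering` section of the config keeps a peer permanently connected. A sketch with placeholder ID and address:

```json
{
  "Peering": {
    "Peers": [
      {
        "ID": "QmYourServerPeerId",
        "Addrs": ["/ip4/203.0.113.1/tcp/4001"]
      }
    ]
  }
}
```

The daemon will then dial this peer on startup and redial it whenever the connection drops.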

If you control your swarm peers list, everything is fine.
To ease automatic swarm construction, Scuttlebutt is used to publish an “ipfstryme” message containing <ipfs_address>. The SSB relations are then read and used by swarm control scripts (1 & 2)
