Is it possible to share the MFS from one node to another?

Pin will endlessly attempt to fetch files that are not available locally unless you specify a timeout; once the timeout expires it will then throw an error

@phillmac Are you talking about a scenario where the MFS files are no longer available on the network (remote machines), and so the PIN fails due to not being able to get the data? …or are you saying the PIN fails even though the data being pinned WAS available on the network?

IMO pinning should always go get the data, if it isn’t local. I mean that’s the definition of what pinning even is. So if there’s a scenario where MFS pins would otherwise fail, there should be some kind of ‘force’ option for the PIN command that says “Hey IPFS, I really mean it. Get this data and pin it.”

By default pin will try indefinitely to search for a node with the data, without giving up.
Quite often I’ve tried to pin some data that’s already been deleted and is no longer available in the swarm at all.
In that case the pin job will be ‘stuck’ forever, waiting for the data to become available.
You can observe this by using the --progress option: the fetched count will freeze either at zero or wherever it encounters the unavailable data.
If the data becomes available at a later time then all’s well and good; the pin job will carry on its merry way until it’s complete.

Otherwise, and this has been quite a problem for me, you can end up with many stuck jobs using bandwidth to the point that everything else slows down.
So to get around that issue, you can specify --timeout=15m, for example; the job will then be terminated after 15 minutes if it’s not complete and throw an error.
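As a small sketch of that pattern (the wrapper function and its message are my own illustration, not a built-in; it assumes a running daemon):

```shell
# pin_with_deadline <cid>: try to pin, but give up after 15 minutes.
# --progress prints the fetched-block count as data arrives;
# --timeout makes the command exit non-zero once the deadline passes.
pin_with_deadline() {
  ipfs pin add --progress --timeout=15m "$1" \
    || echo "pin of $1 did not complete within 15m" >&2
}
```

The caller can then decide whether to retry later or drop the CID entirely.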

What you’re saying about timeouts and unreachable (un-findable) data sounds fine. That makes sense.

However the issue at hand here is that apparently if you pin a large MFS file structure root folder, the “pin” is guaranteed to always NOT pin the data unless all the files just happen to have already been copied to the local machine…even if the files ARE reachable on the remote machine. That seemed like what @hector was implying, unless I’m confused.

Also, my opinion is that if pinning MFS files is therefore unreliable (for the case I just gave), then pin should return an error code immediately in this scenario, because it can definitely tell whether something is already stored locally or not. And if it’s not local it’s not going to copy it automatically, and therefore cannot claim to have “pinned” it, right?

I did not imply anything. If you pin something, or ipfs files cp something, the operation will not finish until the whole tree is fully available locally.

Thanks @hector. So my original answer above, where I said this:

“I think you can just pin the root and you’re done. Your node you pin it on will go get the data and keep it I think.”

was actually correct then? We can use ‘PIN’ to get MFS files from foreign servers, even without using a copy command? It certainly seemed to me like you were saying copy should be used and that my suggestion to just PIN it was not correct. Thanks for clarifying.

This thread is a classic example of why you guys are having issues getting contributors.

I got lost, TBH. Couldn’t you (core devs) just answer it nicely in one post or video or something?

So WHAT is the solution?


The simple solution is this:
ipfs files stat --hash /
gives you a cid

if you want to populate the mfs of a remote node you then need to execute
ipfs files cp /ipfs/<cid> /
on that remote node using the cid you received from the stat command

if you want to do it in real time you need to either run it on a cron job or have a 3rd-party app that manages the mfs state and issues updates to any replication nodes

the mfs allows you to overwrite the root hash as much as you want so you can just keep issuing new cp commands.

The mfs has nothing to do with the pinning system, other than the fact that the garbage collector will treat any local files that are referenced by the mfs as if they are pinned, regardless of whether they actually are.

populating file hashes into the mfs via ipfs files cp doesn’t perform any data copy between nodes, so you then need to separately manage pinning or unpinning the relevant cids as well.
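A minimal sketch of that split, assuming a running daemon (the replicate_tree helper and its arguments are my own illustration, not a built-in command):

```shell
# replicate_tree <root-cid> <mfs-dest>: pin first, then graft into MFS.
replicate_tree() {
  ipfs pin add "/ipfs/$1" &&      # actually fetches and keeps every block
  ipfs files cp "/ipfs/$1" "$2"   # grafts the (now local) tree into MFS
}
```

Without the pin step, only the directory structure lands locally and the file data remains subject to whichever node actually holds it.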

My workflow looks something like this
cron job 1:

  • get mfs folder cid with ipfs files stat --hash /<folder name>
  • publish cid to an ipns key

cron job 2 on worker

  • monitor ipns key for changes
  • pin updated subfolders, as needed via ipfs pin add /ipns/<ipnskey>
  • update mfs via ipfs files rm -r /<folder name> && ipfs files cp /ipns/<ipns key> /<folder name>
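Cron job 2 could be scripted roughly like this (a sketch under my assumptions; the function name, state-file path, and folder are placeholders, and it needs a running daemon):

```shell
# sync_from_ipns <ipns-key-name> <mfs-folder>:
# pull the latest published root and mirror it into local MFS.
sync_from_ipns() {
  key="$1"; folder="$2"; state="/tmp/last_cid_${key}"

  # Resolve the IPNS name to its current /ipfs/<cid> path.
  current="$(ipfs name resolve "/ipns/$key")"

  # Nothing to do if the published root hasn't changed since last run.
  [ -f "$state" ] && [ "$(cat "$state")" = "$current" ] && return 0

  # Fetch the whole tree (bounded so dead CIDs can't wedge the job forever).
  ipfs pin add --timeout=15m "$current" || return 1

  # Replace the stale MFS copy with the fresh tree.
  ipfs files rm -r "$folder" 2>/dev/null || true
  ipfs files cp "$current" "$folder"
  echo "$current" > "$state"
}
```

The state file is just a cheap change detector so repeated cron runs are no-ops when the IPNS record hasn’t moved.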

The upshot here is that ipfs files cp only ever copies the directory structure, not the underlying file data. So if the directory structure is available, it’ll copy that only and then immediately return.
ipfs pin add <cid> also works with anything that resolves to a CID, such as paths that start with /ipns/ or /ipfs/
if you want to ensure that the mfs root cid is actually pinned, you need to do something like this:
ipfs pin add "$(ipfs files stat --hash /)"

I can state with near certainty that this phrase:

"ipfs files cp only ever copies the directory structure, not the underlying file data"

…is something that 99% of people are going to misunderstand unless it’s explicitly stated in the docs. I’m glad this information (assuming it’s correct) has finally been…extracted.

This is one of those cases where the API docs have exactly ONE SENTENCE (yes, one! lol), and even that sentence is technically incorrect, because it says:

Copy command will “Copy any IPFS files and directories into MFS (or copy within MFS).”

So people are assuming they can use the “cp” command to move data, when in reality they need to be using “PIN” to move data, exactly like I said above in my first “answer” (although I was wrong about only needing to pin the root). If a PIN command is what it takes to cause data to get copied, that needs to be clearly stated in the “cp” command documentation.

No wonder people are having so many problems with MFS, because it’s possible for MFS files to suddenly “vanish” during garbage collection because no one was aware they need to ALSO be PINNED.

To hopefully add more clarification, the following 6 steps should be the high-level algorithm to “synchronize” two arbitrary MFS directory structures:

To synchronize SRC folder structure to DST folder structure (making DST folder be a perfect match of SRC), where DST is on your local gateway where you’re running this and SRC can be anywhere.

  1. Recursively scan SRC and load all of its CIDs into a srcSet collection (like a Java Set, but in your own programming language).
  2. Recursively scan DST and load all of its CIDs into a dstSet collection (again, an in-memory collection).
  3. For all CIDs in dstSet that are not in srcSet, unpin them, and remove from dstSet.
  4. For all CIDs in srcSet that are not in dstSet, pin them.
  5. Run MFS command to delete the previous DST folder structure.
  6. Run MFS command to copy the SRC to the DST.

This could be in a ‘sync’ command of IPFS itself (I know it’s requested already) and as you can see, the above algo is trivial to implement.
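As a sketch of those six steps in shell (my own naive transcription, assuming a running daemon: `ipfs refs -r` does the recursive scans, `comm` does the set differences, and the mfs_sync name is made up):

```shell
# mfs_sync <src-root-cid> <dst-mfs-path>: make DST match SRC.
mfs_sync() {
  src_cid="$1"; dst="$2"
  tmp="$(mktemp -d)"

  # 1+2. Recursively collect every CID reachable from SRC and from DST.
  ipfs refs -r --unique "$src_cid" | sort > "$tmp/src"
  ipfs refs -r --unique "$(ipfs files stat --hash "$dst")" | sort > "$tmp/dst"

  # 3. Unpin CIDs only present in DST (ignore ones that weren't directly pinned).
  comm -13 "$tmp/src" "$tmp/dst" | xargs -r -n1 ipfs pin rm 2>/dev/null || true

  # 4. Pin CIDs only present in SRC.
  comm -23 "$tmp/src" "$tmp/dst" | xargs -r -n1 ipfs pin add

  # 5+6. Replace the DST tree with SRC.
  ipfs files rm -r "$dst"
  ipfs files cp "/ipfs/$src_cid" "$dst"
  rm -rf "$tmp"
}
```

The `comm` calls are the whole trick: on two sorted lists, `-13` keeps lines unique to the second file (DST-only, to unpin) and `-23` keeps lines unique to the first (SRC-only, to pin).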

I think there are a few things that make MFS confusing. First the name “Mutable File System”. I don’t know what I would have called it but it’s probably better than “Mutable File System Abstraction Over Immutable Storage” MFSAOIS?!

Then the CLI command for that. It takes a bit to figure out that the “files” subcommand is MFS. Then everything gets confusing because you talk about files in a generic way. “Add a file to IPFS”. Are you talking about “ipfs add” or adding it to MFS?

I still haven’t found a way to add a file to MFS and change the file name.

And then there’s Unixfs and dag-pb. They seem to be casually interchanged sometimes.

I get it all and it all makes sense, but when you’re new to it and before you get it all sorted out it can be a bit confusing. Then you throw in IPNS.


MFI - Mutable File Index

It’s definitely not a “File System”, and it definitely IS an index.

The easiest way by far is to use ipfs-desktop or the webui to manage the mfs.


Debatable. To an end user it certainly looks like a filesystem, and probably more so than IPFS itself. Enough so that MFS should probably be the primary interface and IPFS just the plumbing, where you can peek behind the curtain if you want to and grab a CID.

If a ‘-pin’ option was available on the “cp” command that would maintain backwards compatibility, and be a very good solution imo.

This is indeed true, and I was in the wrong above. I have corrected my posts above. Pinning the CID is necessary to copy the full tree before doing files cp.

And I must have written that myself when I worked on “improving” the MFS documentation. Too bad, I long thought it worked differently. I am owning the mistake and fixing it.


I was just looking at the 0.8.0 release announcement and noticed that under the section “Remote MFS pinning policy” it says,

Every service added via ipfs pin remote service add can be tasked to update a pin every time the MFS root changes

If you set up the remote that you’re looking to mirror MFS onto as a pinning service, would that cover half of what’s being discussed above? The only thing left would be updating the root MFS CID on the remote.
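If I’m reading the release notes right, the setup would look something like this (the service name, endpoint, and token are placeholders, and the config keys are my best understanding of the kubo docs, not verified here):

```shell
# Register a remote pinning service (name/endpoint/token are placeholders):
ipfs pin remote service add myservice https://example.com/api/v1 <access-token>

# Ask the node to re-pin the MFS root to that service whenever it changes:
ipfs config --json Pinning.RemoteServices.myservice.Policies.MFS.Enable true
ipfs config Pinning.RemoteServices.myservice.Policies.MFS.PinName my-mfs-root
ipfs config Pinning.RemoteServices.myservice.Policies.MFS.RepinInterval 5m
```

That would keep the data mirrored remotely; as noted, getting the new root CID into the remote node’s MFS would still need a separate step.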


@hector All the users of IPFS really appreciate this forum, and value the input from all you core developers! Thanks for the update on this particular issue itself, and for all the hard work!
