How urlstore works

Can someone clarify how urlstore works? The CLI documentation says, “Add URLs to ipfs without storing the data locally.” But there’s also an option “--pin bool - Pin this object when adding. Default: true.”

I’m a little confused because pinning the object would imply that it’s being stored locally. What I’m wondering is the following. Is the file stored locally if pinned? If it’s unpinned, is it still stored until a GC? If it’s GC’ed, is it fetched again at the next request? If that’s the case, how do you remove something added with urlstore? If it’s removed when it’s unpinned, then I don’t quite see the point of the urlstore, unless it’s never stored locally and pinning and unpinning is what adds/removes it.

I’m going to speak from memory.

Urlstore is similar to filestore, except that instead of the chunks of a file living on the local filesystem (and being referenced by location+offset), they live behind a URL.

Pinning is not aware of this. It thinks it has the “blocks”, but the stored “blocks” are actually just pointers to the remote URL.

This conversion to pointers happens while adding, here: go-unixfs/dagbuilder.go at master · ipfs/go-unixfs · GitHub (no longer speaking from memory, I guess :))

Instead of the actual content, you are going to be storing a magic FilestoreNode thing, which hopefully takes up less space than the thing in the original location. There is similar magic on read: when ipfs detects that it has read a FilestoreNode thing, it fetches the content from the location+offset specified.
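
If it helps, here is a rough toy model of that idea in Python. It is not the go-ipfs code, just a sketch of "store a pointer, resolve it on read". The names `UrlRef`, `RefStore`, and `fetch_range` are made up, the `REMOTE` dict stands in for real HTTP, and re-checking the hash on read is an assumption of this sketch.

```python
import hashlib
from dataclasses import dataclass

# Stand-in for the web: URL -> full file content (hypothetical, no real HTTP).
REMOTE = {"https://example.com/file.bin": b"hello world, this is remote content"}


@dataclass(frozen=True)
class UrlRef:
    """What the store keeps instead of the block bytes."""
    url: str
    offset: int
    size: int


def fetch_range(url, offset, size):
    """Pretend range request: return `size` bytes starting at `offset`."""
    return REMOTE[url][offset:offset + size]


class RefStore:
    """Toy blockstore that holds references, never the data itself."""

    def __init__(self):
        self.refs = {}  # block hash (hex) -> UrlRef

    def add_url(self, url, chunk_size=16):
        """Chunk the remote content once and record one reference per chunk."""
        data = REMOTE[url]
        keys = []
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            key = hashlib.sha256(chunk).hexdigest()
            self.refs[key] = UrlRef(url, offset, len(chunk))
            keys.append(key)
        return keys

    def get(self, key):
        """Resolve the reference on read and check it still matches its hash."""
        ref = self.refs[key]
        chunk = fetch_range(ref.url, ref.offset, ref.size)
        if hashlib.sha256(chunk).hexdigest() != key:
            raise ValueError("remote content changed; block no longer matches")
        return chunk


store = RefStore()
keys = store.add_url("https://example.com/file.bin")
content = b"".join(store.get(k) for k in keys)
```

Note that `store.refs` only ever holds small `UrlRef` records, so anything that asks "do I have this block?" gets a yes even though the bytes live elsewhere.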

Thanks. I’m trying to understand it and play around with a couple of ideas related to it. Does unpinning it remove it at GC? I can’t think of any other way it could ever be removed if that’s not the case.

So the blocks are never stored locally? I guess it’s relying on the requester to “cache” the blocks. It seems unfortunate that the requestee has to redownload the file for every request if the requestee doesn’t share the blocks. I wonder if it would make sense to use two nodes: one that just points to the URL via urlstore, sitting behind another that acts as a block cache. It would also be really nice if IPFS had an LRU GC policy.

GC will remove the filestore nodes (and the folders and anything that is not a content-block).
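
As a very simplified picture of why unpinning followed by a GC gets rid of them, here is a toy mark-and-sweep in Python. The real GC has more to it than this, and `gc`, `blocks`, `links`, and `pinned` are hypothetical names for the sketch.

```python
def gc(blocks, links, pinned_roots):
    """Mark everything reachable from pinned roots, then delete the rest."""
    reachable = set()
    stack = list(pinned_roots)
    while stack:
        key = stack.pop()
        if key in reachable:
            continue
        reachable.add(key)
        stack.extend(links.get(key, []))
    removed = [k for k in blocks if k not in reachable]
    for k in removed:
        del blocks[k]
    return removed


# A root node linking to two reference "blocks" (pointers, not data).
blocks = {"root": "file node", "ref1": "url pointer", "ref2": "url pointer"}
links = {"root": ["ref1", "ref2"]}
pinned = {"root"}

removed_while_pinned = gc(blocks, links, pinned)  # nothing is collected

pinned.clear()  # unpin
removed_after_unpin = gc(blocks, links, pinned)  # everything is collected
```

In this model the pointers are treated like any other block: kept while a pinned root reaches them, dropped once nothing does.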

There are a few web caching/proxying solutions that you could put in front, and then make the URL point to the cache URL instead.
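
For what such a cache does conceptually, here is a minimal read-through LRU sketch in Python. It is only an illustration of the caching idea, not something IPFS or any particular proxy provides. `LRUCache` and the `origin` dict are hypothetical stand-ins.

```python
from collections import OrderedDict


class LRUCache:
    """Read-through cache: serve hits locally, fetch misses from the origin."""

    def __init__(self, fetch, capacity):
        self.fetch = fetch          # callable: key -> value from the origin
        self.capacity = capacity    # max number of cached entries
        self.items = OrderedDict()  # oldest entry first
        self.misses = 0

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            return self.items[key]
        self.misses += 1
        value = self.fetch(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used
        return value


# Stand-in origin server: path -> content.
origin = {"a": b"1", "b": b"2", "c": b"3"}
cache = LRUCache(origin.__getitem__, capacity=2)

cache.get("a")
cache.get("b")
cache.get("a")  # hit, so "b" becomes the least recently used entry
cache.get("c")  # miss, evicts "b"
```

Only misses reach the origin, so repeated requests for popular content stop hitting the original URL.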