What exactly is happening underneath `ipfs get`?

Hi fam,

I’ve been extensively testing ipfs and thoroughly reading the ipfs code and I’m seeing the following issues and wondering why:

  1. When I run `ipfs get` and no blocks for the CID are stored locally, I’d expect bitswap to fetch them and the datastore service (or something similar) to cache the result, i.e. put the fetched blocks into the local datastore. However, I couldn’t locate any code where the datastore service is used. Weirdly, if I run `ipfs refs local` I can grep the CID, which I assumed came from the datastore. Clearly my assumption doesn’t hold, so what really happens after the blocks are resolved from the CID you provide? Other than seeing the blocks copied to an io.Reader or something similar, I find no evidence of where the blocks are temporarily stored. In memory?
  2. I assume whatever is fetched via `ipfs get` is not pinned by default. Meanwhile, I can’t find any evidence that the CID is provided (which I believe is the process that writes the peerID-CID pair to the DHT). However, if I run `ipfs dht findprovs` I can see myself (the node I SSHed into) in the output. Even weirder, if I run `ipfs repo gc`, confirm that the CID I wanted removed is gone, and run `ipfs dht findprovs` again, I can still see myself providing the CID that just got removed! Is this a bug? Or where in the code should I look for the implementation?

My current IPFS version is 0.7.0 and my IPFS codebase (at least the go-ipfs repo) is up to date. I don’t believe many changes have been added since the 0.7.0 release.

Please help me out, any advice and answer is appreciated!

Please upgrade to 0.8.0 or 0.9.0 when it comes out in a few days.

  1. Yes, blocks are cached and not pinned. It happens here:
  2. The CIDs are provided by default. Provider records last 24 hours, I think, before they expire. IPFS does not “unprovide” on GC.
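
To make the two answers above concrete, here is a minimal toy model in Go of the flow they describe: look in the local blockstore first, fall back to the network, cache what was fetched, and mark it as provided. It is only a sketch and not the go-ipfs code. `Node`, `Exchange`, `fakeNetwork` and the `provided` map are hypothetical names made up for illustration, and CIDs are plain strings here.

```go
package main

import (
	"errors"
	"fmt"
)

// Exchange is a hypothetical stand-in for bitswap: it fetches a block
// from the network when the block is not available locally.
type Exchange interface {
	Fetch(cid string) ([]byte, error)
}

// Node is a toy model of a peer, not the real go-ipfs types.
type Node struct {
	blockstore map[string][]byte // local cache, roughly what `ipfs refs local` lists
	provided   map[string]bool   // CIDs this node has announced as a provider
	exchange   Exchange
}

// NewNode returns a node with an empty local cache.
func NewNode(ex Exchange) *Node {
	return &Node{
		blockstore: make(map[string][]byte),
		provided:   make(map[string]bool),
		exchange:   ex,
	}
}

// Get returns the block for cid. A block fetched from the network is
// cached locally and marked as provided. Nothing is pinned.
func (n *Node) Get(cid string) ([]byte, error) {
	if data, ok := n.blockstore[cid]; ok {
		return data, nil // already local, no network round trip
	}
	data, err := n.exchange.Fetch(cid)
	if err != nil {
		return nil, err
	}
	n.blockstore[cid] = data // cache the fetched block
	n.provided[cid] = true   // assumption: stands in for announcing a provider record
	return data, nil
}

// fakeNetwork is a hypothetical in-memory Exchange used for the demo.
type fakeNetwork map[string][]byte

func (f fakeNetwork) Fetch(cid string) ([]byte, error) {
	data, ok := f[cid]
	if !ok {
		return nil, errors.New("block not found: " + cid)
	}
	return data, nil
}

func main() {
	n := NewNode(fakeNetwork{"QmExample": []byte("hello")})
	data, err := n.Get("QmExample")
	fmt.Println(string(data), err)
	_, cached := n.blockstore["QmExample"]
	fmt.Println("cached:", cached, "provided:", n.provided["QmExample"])
}
```

In this model a second `Get` for the same CID is served from the local map without touching the network, which matches the observation that the CID shows up in `ipfs refs local` after a fetch.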

The IPFS codebase spans many repositories and the dependency injection method does not make it easy to trace the code. I suggest you check hsanjuan/ipfs-lite on GitHub (an embeddable, lightweight IPFS-network peer for IPLD applications), which requires exactly the same libraries and ends up doing the same things, but where it is probably easier to see what is initialized with what and where things are coming from.

Thank you Hector. I later dug deeper into the codebase and found that blockservice is indeed used to cache the blocks. I also managed to locate the code that provides the blocks fetched via bitswap. This is indeed how bitswap was originally designed (as in the IPFS whitepaper) and I’m proud that the team implemented it so well. Great work!

I guess this thread exists for a good reason since I believe numerous people would wonder what was going on with ipfs get under the hood.

This part seems a little off though; i.e., if blocks are garbage collected how come the node is still able to provide them?

They are advertised in the DHT for the lifetime of the record. When a peer looking for them contacts the provider (using bitswap) the provider will say that it does not have the block and life will go on. The search for the block will continue by trying to find other providers.