How to get exactly the same CID for a file/directory as the IPFS daemon produces, using UnixFS locally

Hi there,
I’ve searched through the forum and Internet and it seems I can’t find any tutorial or documentation on the topic.
For a file or a directory, I could easily get the CID using IPFS Add CLI or API provided by the IPFS daemon. What I’d like to know is how do we get the same CID locally without running an IPFS daemon?

For encoded string content, I have no problem getting the CID using the official CID and multihash implementations (https://github.com/multiformats/cid#implementations). However, from what I understand so far, to get the CID of a file/directory, the files have to be chunked and built into a DAG, and the CID is then computed from that DAG. The DAG-building implementation, the chunking algorithm and the chunk size all affect the resulting CID.
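For reference, the "encoded string content" case can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not one of the official implementations: it assumes sha2-256, the raw codec (0x55) and base32 multibase encoding, and the helper names are made up for this example.

```python
import base64
import hashlib

RAW_CODEC = 0x55  # multicodec code for "raw" (fits in a one-byte varint)


def sha256_multihash(data: bytes) -> bytes:
    """Multihash = hash function code (0x12 = sha2-256) + digest length (0x20) + digest."""
    return bytes([0x12, 0x20]) + hashlib.sha256(data).digest()


def cid_v1(codec: int, multihash: bytes) -> str:
    """CIDv1 = version byte (0x01) + codec + multihash, shown as lowercase base32.

    The leading "b" is the multibase prefix for unpadded lowercase base32.
    Only codecs below 0x80 are handled here, so the codec is a single byte.
    """
    cid_bytes = bytes([0x01, codec]) + multihash
    return "b" + base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")


if __name__ == "__main__":
    print(cid_v1(RAW_CODEC, sha256_multihash(b"hello world")))
```

This only covers hashing a byte string directly. It does not chunk anything or build a DAG, which is exactly the part that differs for files and directories.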

It seems there are Go and JavaScript implementations of UnixFS that handle the DAG building. But after trying blindly for a while, I still can’t reproduce the CID we get when adding a file via the IPFS daemon with the default configuration.
Maybe there’s some official documentation or a community project on how to create exactly the same CID as the one created by the IPFS daemon. Any help or direction would be appreciated.

The IPFS add command works without a daemon running.

Not sure about the limitations of not having a daemon, though.

ipfs add does everything locally. Despite the name no file is being uploaded. It’s just made available to the p2p network via that command.

ipfs --offline add ...

or https://github.com/alanshaw/ipfs-only-hash/

are the ways.

Now, if you want to go the hard way and implement it yourself, you can take a look at the code and comments in go-unixfs/builder.go at master · ipfs/go-unixfs · GitHub

That will chunk a file. When doing folders you must further build the DAG that represents them. You can check ipfs-cluster/add.go at v0.14.1 · ipfs/ipfs-cluster · GitHub as a starting point.

Thanks for the reply. I understand there’s actually no such thing as uploading in IPFS. I guess I used the word uploading because I’m currently using a remote IPFS node so I used the term without thinking too much.

An interesting thing I found is that the v0 CID created by ipfs-only-hash is the same as the CID created by go-ipfs, but the v1 CID differs.

npx ipfs-only-hash --cid-version 0 cat.jpeg
QmddbSfH6k3DMcGAAApnSwCJc9gEcB2oyhpdGNQEWkMY83
ipfs --offline add cat.jpeg
added QmddbSfH6k3DMcGAAApnSwCJc9gEcB2oyhpdGNQEWkMY83 cat.jpeg
npx ipfs-only-hash --cid-version 1 cat.jpeg
bafybeic3545pjhl5bm4wzahg3y4dpaanbi624d2eil5rightij4rivwfhm
ipfs --offline add --cid-version 1 cat.jpeg
added bafybeifk6fyfmrgylejsujeido7kob7ysp6353wb5b7i5nhc64kzkxdvs4 cat.jpeg

I’m not sure if there’s anything I missed or if it is an issue of ipfs-only-hash. It seems like I’ll have to look into the source code you mentioned. Thanks for the information.

In go-ipfs --cid-version=1 also enables --raw-leaves. Try with --cid-version=1 --raw-leaves=false and see if the CID is what you expect.
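To make the effect of raw leaves concrete, here is a rough Python sketch for the simplest case only: a non-empty file small enough to fit in a single chunk (assumed here to be the default 262144-byte chunk size) and hashed with sha2-256. Without raw leaves the bytes are wrapped in a UnixFS file message inside a dag-pb node before hashing; with raw leaves the bytes are hashed as-is under the raw codec. The function names are hypothetical, the protobuf bytes are hand-encoded for illustration, and larger files need real chunking and DAG building, which this does not attempt.

```python
import base64
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
DEFAULT_CHUNK_SIZE = 262144  # assumed default chunk size (256 KiB)
DAG_PB, RAW = 0x70, 0x55     # multicodec codes


def varint(n: int) -> bytes:
    """Unsigned LEB128 varint, as used by protobuf."""
    out = bytearray()
    while True:
        byte, n = n & 0x7F, n >> 7
        out.append(byte | 0x80 if n else byte)
        if not n:
            return bytes(out)


def base58btc(data: bytes) -> str:
    """Base58 (Bitcoin alphabet), the text form of CIDv0."""
    n, text = int.from_bytes(data, "big"), ""
    while n:
        n, rem = divmod(n, 58)
        text = B58_ALPHABET[rem] + text
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + text


def multihash(block: bytes) -> bytes:
    """sha2-256 multihash: 0x12 (function), 0x20 (length), digest."""
    return bytes([0x12, 0x20]) + hashlib.sha256(block).digest()


def cid_v1(codec: int, block: bytes) -> str:
    """CIDv1 as base32 text: 0x01 + codec + multihash, multibase prefix "b"."""
    cid_bytes = bytes([0x01, codec]) + multihash(block)
    return "b" + base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")


def unixfs_file_block(content: bytes) -> bytes:
    """dag-pb node with no links whose Data is a UnixFS message of type File."""
    unixfs = (
        b"\x08\x02"                                   # field 1: Type = File (2)
        + b"\x12" + varint(len(content)) + content    # field 2: Data
        + b"\x18" + varint(len(content))              # field 3: filesize
    )
    return b"\x0a" + varint(len(unixfs)) + unixfs     # PBNode field 1: Data


def small_file_cids(content: bytes) -> dict:
    """CIDs for a non-empty single-chunk file under the three option combinations."""
    if not 0 < len(content) <= DEFAULT_CHUNK_SIZE:
        raise ValueError("sketch only covers non-empty single-chunk files")
    wrapped = unixfs_file_block(content)
    return {
        "v0": base58btc(multihash(wrapped)),             # CIDv0 is the bare multihash
        "v1_raw_leaves_false": cid_v1(DAG_PB, wrapped),  # same block, v1 envelope
        "v1_raw_leaves_true": cid_v1(RAW, content),      # file bytes hashed directly
    }


if __name__ == "__main__":
    for name, cid in small_file_cids(b"hello world\n").items():
        print(name, cid)
```

For files larger than one chunk the root is a dag-pb node in both cases, and the raw-leaves setting changes the leaf blocks it links to, which in turn changes the root CID.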

Thanks for the suggestion! After adding a raw-leaves option to the ipfs-only-hash package, it reproduces exactly the same CID.