Confusion between CID and "bare"/"naked"/"raw" Multihash

multiformats
multihash
#1

Hi.

Lurking in GitHub repos, I see that a huge effort is ongoing to upgrade the default CID from CIDv0 to CIDv1.
I’m trying to wrap my head around CIDv0 and CIDv1, and all the concepts of the Multiformats “stack”. What I understand from https://multiformats.io/ and https://github.com/multiformats/cid is that, if you have a block and want a CIDv1 out of it, you must:

Hash_function(block) = a hash (as a binary)
Concat(hash_function_code, digest_length, hash) = a multihash (as a binary)

(Here, if the hash is sha256, and the length is 32, it is almost a CIDv0. We just have to encode it in base58btc, right?)
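The two steps above, plus the base58btc encoding, can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names are made up for this example, 0x12 is the multihash code for sha2-256, and `block` is assumed to be the exact bytes of the stored block (for a file added to IPFS that is usually the encoded block, not the file's plain contents).

```python
import hashlib

# Bitcoin-style base58 alphabet (no 0, O, I or l).
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc_encode(data: bytes) -> str:
    """Encode bytes as base58btc (hypothetical helper for this sketch)."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is represented by a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def sha256_multihash(block: bytes) -> bytes:
    """Concat(hash_function_code, digest_length, hash) for sha2-256."""
    digest = hashlib.sha256(block).digest()
    # 0x12 = sha2-256, 0x20 = 32 bytes; both fit in a single varint byte.
    return bytes([0x12, len(digest)]) + digest


def cidv0_string(block: bytes) -> str:
    """A CIDv0 string is the base58btc encoding of the sha2-256 multihash."""
    return base58btc_encode(sha256_multihash(block))


cid = cidv0_string(b"hello")
print(cid[:2], len(cid))
```

Because every such multihash starts with the bytes 0x12 0x20, the resulting string always starts with "Qm" and is 46 characters long.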

Since we want a CIDv1:

Concat( multicodec_code_for_multihashes, multihash) = a multicodec (specifically a multihash's multicodec, as a binary)

(and the multicodec_code_for_multihashes = 0x31)

Concat ("0x01",multicodec) = something almost useful (as binary)

(and “0x01” is the version of the CID)

Encode(previous binary) = a string of characters
Concat(code_for_this_encoding, previous string) = a multibase, and more specifically a CIDv1 (as a string)
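As a rough sketch of the whole pipeline, the steps can be written in Python like this. Several things here are assumptions made for the example: the helper names are invented, the multibase is fixed to lowercase base32 without padding (prefix "b"), and the codec is passed in as a parameter. The default 0x55 (raw) is one of the codecs named in the reply below. In the CID specification the codec field identifies the format of the block's content.

```python
import base64
import hashlib


def varint(n: int) -> bytes:
    """Unsigned varint (LEB128 style), the integer encoding used by multiformats."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def cidv1_base32(block: bytes, codec: int = 0x55) -> str:
    """Build a base32 CIDv1 string for `block` (illustrative helper)."""
    digest = hashlib.sha256(block).digest()
    # multihash = <hash function code><digest length><digest>
    multihash = varint(0x12) + varint(len(digest)) + digest
    # binary CID = <cid version><content codec><multihash>
    binary_cid = varint(0x01) + varint(codec) + multihash
    # multibase: "b" marks lowercase base32 without padding
    encoded = base64.b32encode(binary_cid).decode("ascii").lower().rstrip("=")
    return "b" + encoded


print(cidv1_base32(b"hello"))        # raw codec, starts with "bafkrei"
print(cidv1_base32(b"hello", 0x70))  # dag-pb codec, starts with "bafybei"
```

The same block yields a different string for each codec, hash function, or base chosen, which is why one block can have many CIDv1s.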

So to sum up, for a particular block of data:

  • there is only one CIDv0
  • there are many possible CIDv1s
  • the different CIDv1s depend on the hash function, digest length, encoding type choice
  • Setting aside the “0x01” for the CID version, a CIDv0 is just a particular flavour of CIDv1: the one with the base58btc-encoded, untruncated sha256 hash (which is 32 bytes long)

Is everything above correct?

#2

You are correct, CIDv0 can be thought of as a CIDv1 with an implicit base, CID version and multicodec:

<cidv0> ::= <multihash>
<cidv1> ::= <multibase><cid-version><multicodec><multihash>
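To make the grammar concrete, here is a small Python sketch that splits a CIDv1 string into those four parts. It is only an illustration under stated assumptions: the function names are invented, and only the base32 multibase (prefix "b") is handled.

```python
import base64


def read_varint(buf: bytes, pos: int):
    """Read one unsigned varint from buf at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of this varint
            return value, pos
        shift += 7


def parse_cidv1(cid: str) -> dict:
    """Split a base32 CIDv1 string into its components (illustrative helper)."""
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles the base32 ('b') multibase")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding that multibase omits
    raw = base64.b32decode(body)

    version, pos = read_varint(raw, 0)
    codec, pos = read_varint(raw, pos)
    hash_code, pos = read_varint(raw, pos)
    digest_len, pos = read_varint(raw, pos)
    digest = raw[pos:pos + digest_len]
    if version != 1 or len(digest) != digest_len:
        raise ValueError("not a well-formed CIDv1")
    return {
        "multibase": "b",
        "version": version,
        "codec": codec,
        "hash_code": hash_code,
        "digest": digest,
    }
```

For a typical dag-pb CID with a sha2-256 hash, the result would have `version` 1, `codec` 0x70 and `hash_code` 0x12.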

Multicodec list: multiformats/multicodec/table.csv
Multibase list: multiformats/multibase/multibase.csv

In theory, the same multihash can be referred to from multiple CIDv1s with different multicodecs.
In practice, usually only one codec makes sense for a specific block of data. For example, unixfsv1 files are encoded as dag-pb (0x70) and raw (0x55).

I believe you will find https://cid.ipfs.io quite useful.
It lets you inspect CIDs, see the implicit or explicit hash and codec, and easily convert CIDv0 to CIDv1.
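For reference, the CIDv0 to CIDv1 conversion can also be sketched by hand in Python. This is a minimal sketch, not how the site above is implemented: the function names are invented, the output base is fixed to base32, and the implicit CIDv0 values (base58btc, sha2-256, dag-pb) are made explicit.

```python
import base64

# Bitcoin-style base58 alphabet (no 0, O, I or l).
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc_decode(text: str) -> bytes:
    """Decode a base58btc string (hypothetical helper for this sketch)."""
    n = 0
    for ch in text:
        n = n * 58 + B58_ALPHABET.index(ch)  # ValueError on an invalid character
    pad = len(text) - len(text.lstrip("1"))  # leading '1's are zero bytes
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return b"\x00" * pad + body


def cidv0_to_cidv1(cidv0: str) -> str:
    """Convert a CIDv0 string to a base32 CIDv1 string with the dag-pb codec."""
    multihash = base58btc_decode(cidv0)
    # A CIDv0 is always a sha2-256 multihash: 0x12, 0x20, then 32 digest bytes.
    if len(multihash) != 34 or multihash[:2] != b"\x12\x20":
        raise ValueError("not a CIDv0 (expected a 34-byte sha2-256 multihash)")
    # Make the implicit parts explicit: version 0x01 and codec 0x70 (dag-pb).
    binary_cid = bytes([0x01, 0x70]) + multihash
    return "b" + base64.b32encode(binary_cid).decode("ascii").lower().rstrip("=")
```

The multihash itself is carried over unchanged, so both strings identify the same block. Only the version, codec and base become explicit.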