Why do IPFS hashes start with "Qm"?

From @moreati on Sun Jul 26 2015 14:20:24 GMT+0000 (UTC)

Answer:

IPFS represents the hash of files and objects using Multihash format and Base58 encoding. The letters Qm happen to correspond with the algorithm (SHA-256) and length (32 bytes) used by IPFS.

TODO:

From @jbenet on Mon Jul 27 2015 06:43:12 GMT+0000 (UTC)

> Is the prefix always ‘Qm’?

No, if objects were hashed with other functions, the prefix would be different. Try out this binary: https://github.com/jbenet/go-multihash/tree/master/multihash

From @moreati on Mon Jul 27 2015 08:35:00 GMT+0000 (UTC)

Thanks, when does IPFS use another function? So far I’ve only seen Qm.

From @jbenet on Mon Jul 27 2015 08:49:26 GMT+0000 (UTC)

@moreati we use sha2-256, but it’s not hard for someone to re-compile ipfs to use another function as a default, or change the importer code to add a way to specify the multihash choice.

From @RichardLitt on Mon May 02 2016 19:12:23 GMT+0000 (UTC)

I see this question a lot; reopening it and labelling it as ‘answered’ to increase visibility.

From @RichardLitt on Thu Oct 06 2016 19:14:52 GMT+0000 (UTC)

How is Qm the string? I’m a bit confused on that, because I don’t see it in any of the tables on the Multihash repo.

From @Kubuxu on Thu Oct 06 2016 20:50:21 GMT+0000 (UTC)

It is the base58btc encoding of the two-byte prefix of that multihash.

From @JustinDrake on Tue Nov 22 2016 15:42:03 GMT+0000 (UTC)

I’m a bit confused by “The letters Qm happen to correspond with the algorithm (SHA-256) and length (32 bytes)”

Is it that Q corresponds to SHA-256, and m corresponds to 32?

From @hsanjuan on Tue Nov 22 2016 16:06:55 GMT+0000 (UTC)

@JustinDrake If I got it right, multihashes start with a byte (0x12) which indicates the hashing algorithm, followed by another byte for the length (0x20). The letters “Qm” are the result of those bytes being encoded in base58 together with the digest that follows them.

Source: https://github.com/multiformats/go-multihash/blob/master/multihash.go#L146
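To make the prefix explanation concrete, here is a minimal Python sketch. The helper names b58encode and sha256_multihash are made up for this illustration and are not part of any IPFS library. It builds a sha2-256 multihash (0x12, 0x20, then the 32-byte digest) and base58btc-encodes it. Note that the "Qm" only appears when the full 34 bytes are encoded as one number; encoding the two prefix bytes on their own gives a different, much shorter string.

```python
import hashlib

# Bitcoin-style base58 alphabet (no 0, O, I, or l).
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as base58btc by treating them as one big-endian integer."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    # Each leading zero byte is represented by a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def sha256_multihash(data: bytes) -> bytes:
    """Return <0x12 = sha2-256><0x20 = 32-byte length><digest>."""
    digest = hashlib.sha256(data).digest()
    return bytes([0x12, len(digest)]) + digest


if __name__ == "__main__":
    # Full 34-byte multihash: the text form starts with "Qm".
    print(b58encode(sha256_multihash(b"hello")))
    # Only the two prefix bytes: not "Qm".
    print(b58encode(bytes([0x12, 0x20])))
```

So Q does not map to SHA-256 and m to 32 individually; the two letters fall out of the base58 conversion of the whole value, and they stay the same for every 32-byte digest because the prefix occupies the most significant bytes.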

From @alexanderattar on Tue Apr 18 2017 19:55:42 GMT+0000 (UTC)

Maybe I just haven’t put all the pieces together, but I’m wondering if anyone here can explain offhand the steps for how one might precompute the IPFS hash (under the current build) given some JSON. Thanks in advance!

From @RichardLitt on Tue Apr 18 2017 19:57:02 GMT+0000 (UTC)

@alexanderattar You can do that using ipfs add -n, iirc. It doesn’t add it to IPFS, it merely spits out the hash for you.

From @alexanderattar on Tue Apr 18 2017 20:09:49 GMT+0000 (UTC)

Ah, thanks @RichardLitt! I guess my question is actually about how I would approach this programmatically, by running the JSON through the hashing and encoding algorithms without necessarily using the IPFS CLI. I am looking to take some JSON I have in JavaScript and precompute the hash before sending it to IPFS, if that helps explain the use case.

From @alexanderattar on Tue Apr 18 2017 21:12:32 GMT+0000 (UTC)

I just want to add that I have tried hashing the JSON with SHA-256 to get a digest such as 4f72333148622e4ae56e9c65d57aee47186cd6910ca080757ab72cc0c650f6bb, and have prefixed this with 1220 and then taken the entire string with the prefix:

122000c75938d356b000b34e7f7885f8982f29d89af76c234a8d439486b40fdc5469

After running that through a base58 encoding, I get a hash that resembles an IPFS hash with the Qm prefix, but the hash is not consistent with what is returned from adding the same JSON to IPFS via the command line. I am wondering if I am doing something wrong, or missing a step. Thanks again!

From @Kubuxu on Tue Apr 18 2017 21:45:08 GMT+0000 (UTC)

Try doing the add with the --raw-leaves option instead, but then you have to add a CID prefix at the front: https://github.com/ipld/cid

The raw block has a multicodec of 0x55.
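As a rough sketch of what that could look like in Python (the function names cidv1_raw and b58encode are hypothetical, chosen for this example): a CIDv1 for a raw block is the version byte 0x01, the raw multicodec 0x55, and then the sha2-256 multihash of the bytes, with the whole thing base58btc-encoded and given the multibase prefix "z". This assumes the data fits in a single chunk, so that the raw leaf is also the root object; larger files get a tree on top and will not match.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as base58btc by treating them as one big-endian integer."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def cidv1_raw(data: bytes) -> str:
    """Build a base58btc CIDv1 for `data` treated as a single raw block.

    Assumption: the layout is <version 0x01><codec 0x55><multihash>.
    """
    digest = hashlib.sha256(data).digest()
    multihash = bytes([0x12, 0x20]) + digest  # sha2-256, 32 bytes
    cid_bytes = bytes([0x01, 0x55]) + multihash  # CIDv1, raw codec
    return "z" + b58encode(cid_bytes)  # 'z' marks base58btc in multibase


if __name__ == "__main__":
    print(cidv1_raw(b'{"hello": "world"}'))
```

The bytes passed in must be exactly the bytes of the file that would be added, so for JSON the serialization (whitespace, key order, trailing newline) has to be identical on both sides.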

From @alexanderattar on Tue Apr 18 2017 22:01:44 GMT+0000 (UTC)

Thanks @Kubuxu, but I just want to reiterate that I am not looking to use the ipfs CLI, but rather to go through the necessary hashing and encoding steps to get from JSON to an IPFS hash. So far my method has been:

Take the JSON and hash it with SHA-256 to get the digest, then prefix the digest with 1220 as described here, so that the entire hex string is composed of the prefix plus the digest, and then base58-encode it. Does any of this approach sound incorrect?

From @Kubuxu on Tue Apr 18 2017 22:03:24 GMT+0000 (UTC)

IPFS by default also wraps the file you give it in some metadata used by ipfs itself. That is why the hash is different. --raw-leaves uses a CID to communicate to others that the data under the hash has no wrapping.

From @alexanderattar on Wed Apr 19 2017 17:47:50 GMT+0000 (UTC)

Wow, I was not aware of that, but that does explain why the hash is different. Is there documentation on the metadata IPFS wraps the file in before generating the SHA256 digest? I tried using the --raw-leaves flag which indeed gives me a different hash:

{"Name":"myfile.txt","Hash":"zb2rhnyuQdBJVhb3j7FAL1NRUrQu4TMkb7zED9S5sh2YCKd62"}

but I have not found any documentation on what algorithms and processing the data goes through to produce this hash.
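For the default (non-raw) case, the following Python sketch shows roughly what the wrapping discussed above might look like for a small file. Everything here is an assumption to be checked against ipfs add -n rather than a description of the actual importer: it assumes non-empty data that fits in one chunk, a UnixFS protobuf message of type File carrying the data and its size, that message placed in the Data field of a dag-pb node with no links, and a sha2-256 multihash of the resulting bytes. The function names are invented for this example.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as base58btc by treating them as one big-endian integer."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def varint(n: int) -> bytes:
    """Protobuf-style unsigned varint: 7 bits per byte, low bits first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def unixfs_file_block(data: bytes) -> bytes:
    """Wrap non-empty, single-chunk data (assumed layout, see lead-in)."""
    unixfs = (
        b"\x08\x02"                            # field 1 (Type) = 2, i.e. File
        + b"\x12" + varint(len(data)) + data   # field 2 (Data) = file bytes
        + b"\x18" + varint(len(data))          # field 3 (filesize)
    )
    # dag-pb node: field 1 (Data) holds the UnixFS message; no links.
    return b"\x0a" + varint(len(unixfs)) + unixfs


def cidv0(block: bytes) -> str:
    """Base58btc sha2-256 multihash of the serialized block."""
    digest = hashlib.sha256(block).digest()
    return b58encode(bytes([0x12, 0x20]) + digest)


if __name__ == "__main__":
    print(cidv0(unixfs_file_block(b"hello world\n")))
```

If the assumptions hold, the printed value should agree with what ipfs add -n reports for a file with the same bytes; if it does not, the chunking or wrapping in that ipfs version differs from this sketch.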