Alpha conversion?

Is there a way of having the file-hash aka CID be computed for only some of the file contents, instead of all of it? Say I want to have only the first 100 bytes of the file contribute to the CID, but not the rest?

In my app, I’ve got data that is equivalent, up to name changes, and so if two different users post the same content, differing only in the names, I’d like to get the same CID back.

So for example, one user has a file whose contents are

int main(int argc, char **argv) { return 42; }

and another user has a file

int main(int foo, char **bar) { return 42; }

I’d like to somehow arrange to have these hash down to the same CID. And when retrieving content, either one could be retrieved, don’t care which. (My app wouldn’t be storing this, this is just a crude example of alpha-equivalence).

Sorry, no. A hash algorithm is purely deterministic: it only gives the same output hash for the same input data, and those are two different strings.
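To make that concrete, here is a minimal Python sketch. It uses plain SHA-256 from `hashlib` as a stand-in for the CID computation (a real CID wraps a multihash over the encoded blocks, so the actual values would differ). The `canonicalize` helper is hypothetical and not part of IPFS: it only illustrates that if the app itself renames identifiers to a fixed scheme before adding the content, the inputs become identical and so do the hashes.

```python
import hashlib
import re

a = "int main(int argc, char **argv) { return 42; }"
b = "int main(int foo, char **bar) { return 42; }"


def digest(text: str) -> str:
    # Plain SHA-256 as a stand-in; a real CID is not computed exactly like this.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def canonicalize(text: str, names: list[str]) -> str:
    # Hypothetical app-side step: rename the given identifiers, in order,
    # to v0, v1, ... so alpha-equivalent inputs become the same string.
    # Assumes none of the original names already look like v0, v1, ...
    for i, name in enumerate(names):
        text = re.sub(r"\b" + re.escape(name) + r"\b", f"v{i}", text)
    return text


# Different strings give different hashes.
raw_equal = digest(a) == digest(b)

# After app-side renaming, both inputs are the same string.
canon_a = canonicalize(a, ["argc", "argv"])
canon_b = canonicalize(b, ["foo", "bar"])
canon_equal = digest(canon_a) == digest(canon_b)

print(raw_equal, canon_equal)
```

The trade-off is that only the canonical form gets stored, so the original names would have to be kept somewhere else if they matter.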


Also, content in IPFS does not have a filename attached. Filenames are part of the wrapping folders. So if users add A/dog.jpg and B/hound.jpg and both are the same picture, they will get different root folder CIDs but both will have the same CID-link to the picture inside.
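A toy model of that folder behaviour, sketched in Python. This is not the real UnixFS or dag-pb encoding; `file_id` and `dir_id` are made-up helpers that hash with SHA-256 just to show why the picture's identifier is the same while the folder identifiers differ.

```python
import hashlib


def file_id(data: bytes) -> str:
    # The file's identifier depends only on its bytes, not on any name.
    return hashlib.sha256(data).hexdigest()


def dir_id(entries: dict[str, str]) -> str:
    # The folder's identifier covers each entry name plus the child's identifier.
    h = hashlib.sha256()
    for name in sorted(entries):
        h.update(name.encode("utf-8") + b"\0" + entries[name].encode("ascii") + b"\n")
    return h.hexdigest()


picture = b"the same picture bytes"

dog = file_id(picture)    # added by one user as A/dog.jpg
hound = file_id(picture)  # added by another user as B/hound.jpg

folder_a = dir_id({"dog.jpg": dog})
folder_b = dir_id({"hound.jpg": hound})

print(dog == hound)          # same picture, same identifier
print(folder_a == folder_b)  # different names, different folder identifiers
```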