File-chunk identification possibility

From @shalnoff on Fri Dec 25 2015 09:13:36 GMT+0000 (UTC)

Is it possible to identify the file that a given chunk belongs to (and vice versa)?


Copied from original issue: https://github.com/ipfs/faq/issues/85

From @RichardLitt on Thu Mar 10 2016 19:02:19 GMT+0000 (UTC)

I’m not sure. @diasdavid @whyrusleeping what do you think?

From @whyrusleeping on Mon May 02 2016 19:37:32 GMT+0000 (UTC)

@shalnoff could you clarify what you mean?

From @diasdavid on Tue May 03 2016 18:28:18 GMT+0000 (UTC)

@shalnoff if your question is ‘given a chunk, can I identify the file it belongs to, without any other information at all’, the answer is no, you can’t, that would violate the principle of a DAG.
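
To illustrate why the links only run one way, here is a minimal sketch of a content-addressed DAG in Python. It is a toy model, not the IPFS implementation: `put`, `add_file`, the in-memory `store` dict, and the fixed-size chunking are all assumptions made for illustration. A chunk's ID is derived only from its bytes, so the same chunk can sit under any number of files and holds no pointer back to any of them.

```python
import hashlib
import json

store = {}  # hypothetical in-memory block store: hash -> bytes


def put(data: bytes) -> str:
    """Store a block under the SHA-256 hash of its content."""
    key = hashlib.sha256(data).hexdigest()
    store[key] = data
    return key


def add_file(data: bytes, chunk_size: int = 4) -> str:
    """Split data into fixed-size chunks and store a root node linking to them."""
    links = [put(data[i:i + chunk_size]) for i in range(0, len(data), chunk_size)]
    root = json.dumps({"links": links}).encode()
    return put(root)


file_a = add_file(b"AAAABBBB")
file_b = add_file(b"CCCCBBBB")
shared = hashlib.sha256(b"BBBB").hexdigest()

# Both roots link to the shared chunk, yet the chunk itself is just b"BBBB":
# nothing in it names file_a or file_b.
print(store[shared])  # b'BBBB'
```

Going from a file root to its chunks means following the stored links; going from the chunk back to a file has nothing to follow.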

From @mcast on Tue Sep 13 2016 20:38:27 GMT+0000 (UTC)

> @shalnoff if your question is ‘given a chunk, can I identify the file it belongs to, without any other information at all’, the answer is no, you can’t, that would violate the principle of a DAG.

I would only add, “…unless you have some extra information”. Would these methods work?

  1. Fetch the chunk’s data and recognise some distinctive feature of its encoding, e.g. it’s plain text, or gzipped text with a convenient block start; then take five words and put them into a search engine to find the original.
  2. Enumerate (#155) a large enough swathe to discover (some of) the chunk’s parents.
  3. Be in possession of some kind of reverse-lookup index, constructed from such enumeration.
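
Methods 2 and 3 could look roughly like the sketch below. This is a toy model in Python, not an IPFS API: the `nodes` dict standing in for an enumerated swathe of the DAG, the made-up node IDs, and `build_reverse_index` are all hypothetical names chosen for illustration.

```python
from collections import defaultdict

# Hypothetical result of enumerating part of the DAG: node ID -> child IDs.
nodes = {
    "file-a": ["chunk-1", "chunk-2"],
    "file-b": ["chunk-2", "chunk-3"],
    "chunk-1": [],
    "chunk-2": [],
    "chunk-3": [],
}


def build_reverse_index(nodes):
    """Invert parent -> child links into child -> set of known parents."""
    parents = defaultdict(set)
    for parent, children in nodes.items():
        for child in children:
            parents[child].add(parent)
    return parents


index = build_reverse_index(nodes)
print(sorted(index["chunk-2"]))  # ['file-a', 'file-b']
```

The index only knows about parents that were actually enumerated; a file outside the enumerated swathe would not show up in it.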

In a different chunking system, which could ride on top of IPFS, it is impossible by design to identify chunks.

From @shalnoff on Wed Sep 28 2016 21:35:28 GMT+0000 (UTC)

Clear, that’s what I wanted to clarify. Thank you (sorry for the late reply).