Can IPFS respond to queries related to the content of a file?

Can IPFS respond to queries such as “HTML tutorial” and return all documents that contain these keywords?

Nope, IPFS responds to queries by the hash of the content.
If you are looking for a way to query IPFS data, you can use OrbitDB.
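
To illustrate what "by the hash of the content" means, here is a minimal Python sketch of a content-addressed store. It is a toy, not how IPFS is implemented: real IPFS uses CIDs (a multihash over chunked data), while this sketch assumes a bare SHA-256 hex digest, and the `store`, `add` and `get` names are made up for the example.

```python
import hashlib

# Toy content-addressed store: the key is derived from the bytes themselves.
# Assumption: a plain SHA-256 hex digest stands in for a real IPFS CID.
store = {}

def add(content: bytes) -> str:
    """Store content under the hash of its bytes and return that hash."""
    key = hashlib.sha256(content).hexdigest()
    store[key] = content
    return key

def get(key: str) -> bytes:
    """Fetch content by its exact hash; no other kind of lookup exists."""
    return store[key]

key = add(b"<h1>HTML tutorial</h1>")
print(key)       # the hash that identifies this exact content
print(get(key))  # the original bytes
```

The only question such a store can answer is "give me the content with this hash". Asking for "documents containing HTML" has no matching key.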

I’m new to IPFS, I just started reading papers about it.

So far, I know IPFS finds a document by its hash. Do you mean that if we hash the keywords, those hashes can be used to find any document containing them? Or is there no way to look into the files at all?
Can you explain a bit more?

What about OrbitDB? How can it help in this matter?

No, it can’t.

You would need to build an indexing service for that: it would read new files from time to time and build an index to make text search fast. Essentially this is what Google did 20 years ago.

What does this “index” look like? In its simplest form, it keeps a list of all files that contain “HTML” and a list of all files that contain “tutorial”, or rather a list of file hashes. However, such an index only matches words exactly, so a search for “html” would return nothing unless the index also normalizes case. Designing a good index is hard and designing a good distributed index is much harder.
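
A rough Python sketch of that simplest index, in case it helps. Everything here is illustrative: the file hashes are placeholders, and `build_index` and `search` are hypothetical helper names, not part of IPFS or any library.

```python
from collections import defaultdict

def build_index(files, normalize=False):
    """Map each word to the set of file hashes whose text contains it."""
    index = defaultdict(set)
    for file_hash, text in files.items():
        for word in text.split():
            if normalize:
                word = word.lower()
            index[word].add(file_hash)
    return dict(index)

def search(index, word):
    """Return the hashes of files containing exactly this word."""
    return index.get(word, set())

# Placeholder hashes; real keys would be IPFS content hashes.
files = {
    "hash-a": "HTML tutorial for beginners",
    "hash-b": "Advanced HTML tricks",
}

naive = build_index(files)
print(sorted(search(naive, "HTML")))  # both files match
print(sorted(search(naive, "html")))  # no match: the naive index is case-sensitive

normalized = build_index(files, normalize=True)
print(sorted(search(normalized, "html")))  # matches once words are lowercased
```

Lowercasing is only the first of many normalization steps a real search index needs (punctuation, stemming, ranking and so on), which is part of why this gets hard quickly.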


An indexing service is already there: IPFS has a DHT, which is used to index files across the network. Can’t this DHT be extended to include the keywords?

I think the DHT is just a distributed hash table: one can put a (key, val) pair in it and someone else can get it. The DHT doesn’t do any work beyond that, i.e. it doesn’t read files.
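
Seen from the outside, the interface is roughly the one below. This is a single-process stand-in written for illustration only; `ToyDHT` is a made-up class, and a real DHT spreads the table over many nodes.

```python
from typing import Optional

class ToyDHT:
    """Single-process stand-in for a DHT: stores opaque values under keys."""

    def __init__(self) -> None:
        self._table = {}

    def put(self, key: str, value: bytes) -> None:
        # The value is stored as-is; nothing inspects or parses it.
        self._table[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Lookup is by exact key only.
        return self._table.get(key)

dht = ToyDHT()
dht.put("file-hash-1", b"peer-A")  # placeholder key and value
print(dht.get("file-hash-1"))      # found: exact key
print(dht.get("HTML"))             # None: a keyword is not a key
```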

I understand.
So can’t we read the files when they are added to the DHT, index all keywords into a database, and then let the database respond to queries?

I don’t see a way to monitor all new entries in the DHT. When someone wants to add a new entry, the algorithm finds a few nodes that are suitable for this entry key. Unless we are one of those nodes, we won’t even see the new entry.
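
Here is a small Python sketch of that node selection, assuming a Kademlia-style rule where "suitable" means closest to the key by XOR distance. The node names, the `node_id` and `closest_nodes` helpers, and `k = 3` are all invented for the example; real networks use a larger replication factor and a multi-hop lookup rather than a global list of nodes.

```python
import hashlib

def node_id(name: str) -> int:
    """Derive a numeric ID from a name (assumption: SHA-256 of the name)."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big")

def closest_nodes(key: int, nodes, k: int = 3):
    """Pick the k nodes whose IDs are nearest to the key by XOR distance."""
    return sorted(nodes, key=lambda n: node_id(n) ^ key)[:k]

nodes = [f"node-{i}" for i in range(20)]
entry_key = node_id("some new file hash")  # placeholder for an entry's key

holders = closest_nodes(entry_key, nodes)
bystanders = [n for n in nodes if n not in holders]

print(holders)          # the only nodes that receive this entry
print(len(bystanders))  # every other node never sees it
```

An indexer running as a single node would be in the `holders` set only for the small fraction of keys that happen to land near its own ID.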
