Use Case: IPFS as local media file indexer

Hi, please excuse this very basic question:

Could IPFS be of use as a local media file indexing system?

Use Case

  • indexing of a (local) library of images, audio files, videos and their backups on several hard disks
  • discovery of duplicate files
  • verification that each file is present on at least 3 different hard disks
  • hard disks can be unmounted, but their index should stay queryable
  • media-specific metadata (image resolution, encoding, bit rate, etc.) should be queryable directly from the index

Extra Points

  1. if a media file could be represented as a Merkle tree which cleanly separates sub-blocks containing only metadata (e.g. EXIF for images) from raw content data. (This way, duplicate images with different metadata could be detected)
  2. if relationships between media files could be represented, i.e. if a photo is an edited version or a thumbnail from an original (of course, the info itself would come from the outside)
  3. the immediate need is that data is (and stays securely) local, but later it could be nice to easily share some media files with specific users

If this has been asked or treated before, please excuse my ignorance and point me towards relevant sources. One bit of info I could find was the recent option FilestoreEnabled:

Thanks!

Disclaimer: I’m not an IPFS dev. I could very well be wrong.

Could IPFS be used as a local media file indexing system? Nope. IPFS is not built as a media library. It sounds like you’re searching for something that gives you an API which you can then query to see if you have that media. IPFS is not the system for that.

Discovery of duplicate files? Yes and no. It’s just the nature of hashes: if you hash the same file twice, you get the same hash both times. A given hash can only occur once in IPFS, so if you add a file that IPFS already knows, it won’t store it again. Thus you effectively have deduplication.
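To illustrate the "same content, same hash" point outside of IPFS, here is a minimal sketch in Python that groups files by a SHA-256 digest of their bytes. The function names `file_digest` and `find_duplicates` are made up for this example, and a plain SHA-256 digest is an assumption chosen for simplicity: it is not an IPFS CID, since IPFS chunks files and hashes the resulting DAG.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Map digest -> list of paths for every digest seen more than once under root."""
    by_digest = defaultdict(list)
    for p in sorted(Path(root).rglob("*")):
        if p.is_file():
            by_digest[file_digest(p)].append(p)
    # Keep only the groups that actually contain duplicates.
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}
```

Note that this only catches byte-identical files. Two copies of the same image with different EXIF data would get different digests.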

Verifying that each file is present on at least 3 different hard disks? Definitely a big no! IPFS can tell you who has your file (look for findprovs in the CLI documentation). But even so, it can’t tell you whether that provider has it pinned. And it certainly has no way of knowing how the file is spread across HDDs.

An index that stays queryable while the disks are unmounted? As you ask it, it’s a no. But… if you have another application that manages your index, and you put that index in IPFS (IPFS is just the storage method), then you can effectively have your index available through some node that is online while your hard drives are unmounted.
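As a rough sketch of what such a separate index application could look like (this is not something IPFS provides), the Python below keeps one row per file copy in SQLite, so the index remains queryable when the disks are unmounted. The table layout, the disk labels, and the function names `open_index`, `record_copy` and `under_replicated` are all assumptions made up for illustration.

```python
import sqlite3


def open_index(db_path=":memory:"):
    """Open (or create) the index database with one row per file copy."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS copies ("
        " digest TEXT NOT NULL,"   # content hash of the file
        " disk   TEXT NOT NULL,"   # user-chosen label of the hard disk
        " path   TEXT NOT NULL,"   # path of the copy on that disk
        " PRIMARY KEY (digest, disk, path))"
    )
    return con


def record_copy(con, digest, disk, path):
    """Remember that a file with this digest exists at path on disk."""
    con.execute(
        "INSERT OR IGNORE INTO copies (digest, disk, path) VALUES (?, ?, ?)",
        (digest, disk, path),
    )
    con.commit()


def under_replicated(con, min_disks=3):
    """Return (digest, disk_count) for files on fewer than min_disks disks."""
    rows = con.execute(
        "SELECT digest, COUNT(DISTINCT disk) FROM copies"
        " GROUP BY digest HAVING COUNT(DISTINCT disk) < ?"
        " ORDER BY digest",
        (min_disks,),
    )
    return [(digest, count) for digest, count in rows]
```

The resulting database file is the kind of index you could then store in IPFS. The querying itself still happens in your own application.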

Media-specific metadata queryable directly from the index? Same. No.

I’m not even going to bother going over the “extra points” :)
Don’t see IPFS as a programmable indexing application. It’s not. If you have an index already, you can use IPFS to store it and make it accessible.

Also, please be aware that anything you put in IPFS needs to be stored somewhere. If you’re the only one storing it, then your node must stay online for you to get that index. You can also use pinning services to do this for you; they distribute your data among the nodes in their cluster.
