Minerva: Build the Hadoop-Hive on IPFS

maris205 · July 6, 2019, 12:29pm

Hi

IPFS still lacks a big data system, so we built Minerva, which can be regarded as Hive on IPFS. With Minerva, you can use standard SQL to query the content of files stored on IPFS (JSON and CSV formats).

Minerva is based on Drill and IPFS. Technically, it’s a Drill storage plugin that connects IPFS’s decentralized storage with Drill’s flexible query engine. Any data file stored on IPFS can be easily accessed from Drill’s query interface, just like a file stored on a local disk.
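As an illustration of what querying through Drill's interface could look like, here is a minimal Python sketch that submits SQL to Drill's REST endpoint (`POST /query.json`). It is not taken from the Minerva repository: the `ipfs` plugin name, the `/ipfs/<cid>#<format>` table path, the local URL, and the helper names are all assumptions, so check the project README for the exact syntax.

```python
import json
import urllib.request

# Assumption: a Drill instance with the plugin enabled is listening locally
# on Drill's default web port.
DRILL_URL = "http://localhost:8047/query.json"


def build_ipfs_query(cid: str, fmt: str = "json", limit: int = 10) -> str:
    """Build a SQL string for a file addressed by its IPFS content identifier.

    The `ipfs` schema name and the `#<format>` suffix are assumed here,
    not confirmed by the post.
    """
    if fmt not in ("json", "csv"):
        raise ValueError("only json and csv are mentioned in the post")
    if "`" in cid:
        raise ValueError("CID must not contain a backtick")
    return f"SELECT * FROM ipfs.`/ipfs/{cid}#{fmt}` LIMIT {int(limit)}"


def build_request(sql: str, url: str = DRILL_URL) -> urllib.request.Request:
    """Wrap the SQL in the JSON body that Drill's REST API expects."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql: str, url: str = DRILL_URL) -> list:
    """Send the query and return the result rows (needs a running Drill)."""
    with urllib.request.urlopen(build_request(sql, url)) as resp:
        return json.load(resp).get("rows", [])
```

With a node running, something like `run_query(build_ipfs_query("<your CID>", "csv"))` would return the rows as a list of dictionaries; the CID is left as a placeholder on purpose.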

The basic idea is very simple: run a Drill instance alongside the IPFS daemon, and you can connect to other IPFS users who are also running Minerva. If one of those users happens to store the file you are trying to query, Drill can send the execution plan to that node, which executes the operations locally and returns the results. Of course, other users can benefit from your node as well, if you are sharing the data they want. If enough people run Minerva, data sharing and querying become distributed and more efficient!
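To make the "send the plan to whoever has the data" idea concrete, here is a toy Python model of the assignment step. It is only a sketch under stated assumptions and not Minerva's actual planner: the `assign_fragments` function, the peer names, and the `holdings` map (which peer stores which CID) are hypothetical.

```python
from typing import Dict, List, Set


def assign_fragments(
    wanted: List[str],
    holdings: Dict[str, Set[str]],
    local_peer: str = "self",
) -> Dict[str, List[str]]:
    """Map each peer to the CIDs it should scan locally.

    A CID goes to the first peer (in sorted name order) that already
    stores it. If no known peer stores it, the local node handles it.
    """
    plan: Dict[str, List[str]] = {}
    for cid in wanted:
        holders = sorted(p for p, cids in holdings.items() if cid in cids)
        target = holders[0] if holders else local_peer
        plan.setdefault(target, []).append(cid)
    return plan


if __name__ == "__main__":
    # Hypothetical peers and CIDs, for illustration only.
    holdings = {"peerB": {"cid1", "cid2"}, "peerA": {"cid2"}}
    print(assign_fragments(["cid1", "cid2", "cid3"], holdings))
```

A real planner would also weigh load and network distance; this sketch only shows the locality rule described above.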

If you are interested, we have made a few slides that explain the ideas in detail:

Any suggestions are welcome.

Find the code on GitHub: https://github.com/bdchain/Minerva

A live demo: http://www.datahub.pub (may be unstable, please bear with it)

josselinchevalay · July 23, 2019, 1:24pm

Good project, this is very interesting.
