Centralized DB support?

From @pocketmax on Tue Sep 29 2015 03:59:57 GMT+0000 (UTC)

If I have an app that’s deployed in IPFS with a central DB via a REST API, does IPFS mess with the REST communication of my app?

Also, does IPFS have a solution to centralize large databases? Maybe split records across multiple nodes or is that outside the scope of IPFS?


Copied from original issue: https://github.com/ipfs/faq/issues/51

From @jbenet on Tue Sep 29 2015 06:52:23 GMT+0000 (UTC)

What do you mean by “a central DB via a REST api”? can you be much more specific?

“Also, does IPFS have a solution to centralize large databases? Maybe split records across multiple nodes or is that outside the scope of IPFS?”

out of scope of ipfs-core, but certainly in scope of protocols on top of IPFS that we’re designing.

imagine ipfs-cluster + an IPFS-KV or IPFS-SQL implementation.

From @pocketmax on Tue Sep 29 2015 08:13:08 GMT+0000 (UTC)

When I say “a central DB via a REST api”, I mean something like a MySQL DB behind a PHP/Apache web server. The public side of the server exposes a REST API on port 80 while the internal side connects to the MySQL DB. So for my app to make changes to my DB, it would make REST calls to this PHP/Apache server. I was just curious whether IPFS would mess with that traffic in any way. But I’m guessing it wouldn’t, since I would use the IP of that web server in my app.

From @jbenet on Tue Sep 29 2015 08:15:27 GMT+0000 (UTC)

correct

From @jbshirk on Thu Oct 29 2015 18:35:17 GMT+0000 (UTC)

@pocketmax

“Does IPFS have a solution to centralize large databases? Maybe split records across multiple nodes or is that outside the scope of IPFS?”

I’m no expert but years ago I dabbled with a particular column-oriented database called KDB (now: KDB+) which was, and still is, way ahead of its time. In 2000 the concept was relatively unknown to traditional RDBMS people, but not so today.

I know that with column-oriented databases in general (and specifically KDB) index files are not used and column data, stored in separate text files, can be, and is, split over many volumes. I just have a hunch that this kind of data structure would lend itself well to IPFS, and eventually updates and inserts could be automagically handled with version diffs.
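As a rough illustration of that hunch (not KDB's actual file format and not the IPFS API): if each column is stored as its own content-addressed blob, changing one column changes only that column's hash, and untouched columns are shared between versions. The sketch below uses a plain dict as the store and SHA-256 hex digests as a stand-in for IPFS hashes; `put` and `save_table` are hypothetical helper names.

```python
import hashlib
import json


def put(store: dict, data: bytes) -> str:
    """Store a blob under the SHA-256 of its content (stand-in for an IPFS hash)."""
    key = hashlib.sha256(data).hexdigest()
    store[key] = data
    return key


def save_table(store: dict, columns: dict) -> dict:
    """Write each column as its own blob; return a manifest of column name -> key."""
    return {
        name: put(store, json.dumps(values).encode())
        for name, values in columns.items()
    }


store = {}
v1 = save_table(store, {"id": [1, 2, 3], "price": [10.0, 11.5, 9.9]})
v2 = save_table(store, {"id": [1, 2, 3], "price": [10.0, 11.5, 12.0]})

# Only the column that actually changed gets a new key; "id" is stored once.
changed = [name for name in v1 if v1[name] != v2[name]]
print(changed)     # ['price']
print(len(store))  # 3
```

Comparing the two manifests gives a crude "version diff" at column granularity; a real system would also need to chunk large columns.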

I suppose that files like the ones Microsoft SQL Server uses would probably be a nightmare on IPFS.

Just a thought…

From @jbenet on Sun Nov 01 2015 22:11:50 GMT+0000 (UTC)

there are clean layerings of SQL (or restricted subsets) on top of these kinds of systems.

From @pocketmax on Mon Nov 02 2015 02:30:45 GMT+0000 (UTC)

@jbenet, can you send some links or give some names of the technologies that use these “restricted subsets”? I’ve never heard of them. I did some googling and can’t find any links to any tech like that. It sounds interesting.

From @jbenet on Mon Nov 02 2015 02:53:28 GMT+0000 (UTC)

one example: GQL is a restricted subset of SQL for bigtable/“appengine datastore”, layered over a multiversion distributed kv-store. look for those. all the “no sql but relational” dbs usually have a restricted SQL subset layered over a distributed kv store, even if they don’t call it that. this includes mongodb.
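To make the layering idea concrete, here is a minimal sketch of a restricted query language on top of a key-value store. It is not GQL and not any real database's syntax: the `"<table>/<id>"` key layout, the `select` function, and the tiny grammar (only `SELECT * FROM t [WHERE f = 'v']`) are all assumptions made for illustration, and an in-memory dict stands in for the distributed kv store.

```python
import re

# Hypothetical key layout: "<table>/<id>" -> row stored as a dict.
kv = {
    "users/1": {"name": "ann", "city": "Oslo"},
    "users/2": {"name": "bob", "city": "Lima"},
    "users/3": {"name": "cy", "city": "Oslo"},
    "orders/1": {"user": "ann", "total": 30},
}

# The whole supported grammar: no joins, no projections, one equality filter.
QUERY = re.compile(r"^SELECT \* FROM (\w+)(?: WHERE (\w+) = '([^']*)')?$")


def select(kv: dict, query: str) -> list:
    """Run a restricted SELECT by scanning the keys under the table's prefix."""
    m = QUERY.match(query.strip())
    if m is None:
        raise ValueError("unsupported query")
    table, field, value = m.groups()
    prefix = table + "/"
    rows = []
    for key in sorted(kv):
        if not key.startswith(prefix):
            continue
        row = kv[key]
        if field is None or (field in row and str(row[field]) == value):
            rows.append(row)
    return rows


oslo = select(kv, "SELECT * FROM users WHERE city = 'Oslo'")
print([row["name"] for row in oslo])  # ['ann', 'cy']
```

Anything outside the grammar is rejected rather than guessed at, which is the point of a restricted subset: the query layer only promises what the underlying kv store can serve with a prefix scan.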

From @jbshirk on Wed Nov 04 2015 22:37:00 GMT+0000 (UTC)

name isn’t pretty, but possibly a good candidate to adapt to ipfs: https://github.com/cockroachdb/cockroach
I’ll look more into it later.

From @jbshirk on Wed Nov 04 2015 22:46:04 GMT+0000 (UTC)

here is another that is very interesting:


http://hyperdex.org
http://hyperdex.org/papers/hyperdex.pdf

From @jbshirk on Wed Nov 04 2015 23:37:50 GMT+0000 (UTC)

A couple of interesting graph-oriented databases found on wikipedia. My criteria were basic: distributed, open source, NOT written in java.

From @pocketmax on Wed Nov 04 2015 23:46:21 GMT+0000 (UTC)

lol “NOT written in java.” so true… so true.

From @jbshirk on Fri Nov 06 2015 11:10:56 GMT+0000 (UTC)

An important concept that ought to work well with distributed hash tables:
“Handling ridiculous amounts of data with probabilistic data structures”
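
For readers who have not met the term, a Bloom filter is one well-known probabilistic data structure. The sketch below is a minimal version, assuming string items and SHA-256 as the hash source; the class name, sizes, and sample keys are illustrative and not taken from the linked material. It answers "definitely not present" or "possibly present" using a fixed number of bits, however many items are added.

```python
import hashlib


class BloomFilter:
    """Set membership with possible false positives but no false negatives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


seen = BloomFilter()
for key in ["block-a", "block-b", "block-c"]:
    seen.add(key)

print("block-a" in seen)  # True: added items are always reported as present
```

An item that was never added is usually reported as absent, but it can occasionally be reported as present; the false-positive rate depends on the bit count, the number of hashes, and how many items were added.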