Idea for storing flashcards and educational data in IPFS. Looking for advice and possible help

Background:

I’m a teacher who’s very interested in technology, and I’m planning on going to grad school for educational technology. I’ve been following the developments in decentralized networks and technology, but I’m not treating them as a new hammer that turns every problem into a nail. I also don’t believe in using technology for tech’s sake, especially in education.

I used flashcards as part of my Chinese studies to great success, and now I use Quizlet to help my students review and store knowledge points. There are third-party apps like Kahoot and Gimkit that let you host a live competition using your cards. My students love it, but the import/export process is kind of annoying and can generate errors like lost information or mixed-up cards.

Idea:

I propose to make a Universal Flashcard Standard. I want the cards to be hosted on a cloud server (or on IPFS, which is what I’m here to ask advice about) so that users can instantly access their cards on any platform without worrying that the platform they are using will disappear along with their data. Data about answers guessed correctly and incorrectly could also be included to make a seamless SRS (spaced repetition software) experience across platforms. The time you spend using the flashcards in a game would update your use history, so that cards you already know are presented to you less frequently in subsequent sessions. I’d prefer IPFS to a cloud server account or even a GitHub account that someone owns, because developers might be wary of joining the network if they feel they could suddenly be cut off from the data.
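A minimal sketch of how shared answer history could drive "show known cards less often", assuming a simple Leitner-style box scheme. The names ReviewState, record_answer and due_cards, and the interval values, are hypothetical and only for illustration; a real SRS app might use a different algorithm entirely.

```python
from dataclasses import dataclass, field

# Hypothetical review intervals in days, one per Leitner box (illustrative values).
INTERVALS_DAYS = [1, 2, 4, 8, 16]


@dataclass
class ReviewState:
    """Per-card review data that any app could read and update."""
    box: int = 0       # index into INTERVALS_DAYS
    due_day: int = 0   # day number on which the card should next be shown
    history: list = field(default_factory=list)  # (day, correct) pairs


def record_answer(state: ReviewState, day: int, correct: bool) -> ReviewState:
    """Update a card's state after one answer, from any app or game."""
    state.history.append((day, correct))
    if correct:
        # A correct answer moves the card up a box, so it comes back later.
        state.box = min(state.box + 1, len(INTERVALS_DAYS) - 1)
    else:
        # A wrong answer sends the card back to the first box.
        state.box = 0
    state.due_day = day + INTERVALS_DAYS[state.box]
    return state


def due_cards(states: dict, day: int) -> list:
    """Return the ids of cards that are due on or before the given day."""
    return sorted(cid for cid, s in states.items() if s.due_day <= day)
```

Because the state lives with the card data rather than inside one app, a quiz game and a plain reviewer could both call something like record_answer and see the same schedule.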

I believe this could unlock a lot of potential for innovation. When someone has a new game idea that uses flashcards, users could instantly try it out with their existing sets. New apps could also be developed for more convenient flashcard generation, and cross-platform functionality would improve.

Technical considerations:

  1. The format of the data. I’d likely start with something that can easily take in data from the most popular current sites, Quizlet and Anki. In my own experiments I’ve started using JSON files and have successfully imported Quizlet files.

  2. Cloud/IPFS hosting (what I could use your advice on!)
    No one’s going to want to use this service if it costs money, even micropayments. From my understanding of IPFS, once the cards are used and adopted they become effectively free for users, but someone has to pay for an initial node to store files that haven’t been accessed yet. If we allowed media files only in the form of links, the decks could be all text and therefore very small, so the costs wouldn’t be so high. I don’t know what the total data use would be, so I can’t estimate the cost of running the server. I’m debating between setting up my own server (Aliyun would be best because I’m in China, but I might need help setting it up) and using Cloudflare or Pinata. I wish they accepted bitcoin payments so I could take donations for the project and people could see the money going right to server upkeep, but it seems they don’t have that option.

  3. Data sharing methods and data privacy issues. As I understand it, all card sets on IPFS are publicly viewable by default, so people can clone your decks and change their own version. If you want a private deck, you could encrypt it before putting it on IPFS. I have some more ideas, though, about how people could make collaborative notes for shared decks, which might complicate this.

  4. Syncing issues. I think IPFS handles this pretty well with versioning? Client-side apps might need some kind of review mode for handling unwanted discrepancies.
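To make points 3 and 4 more concrete, here is a rough sketch of content addressing, the idea behind IPFS identifiers: the address of a deck is derived from its content, so an edited clone gets a new address and the original stays untouched. This uses a plain SHA-256 digest as a stand-in and does not produce real IPFS CIDs. The deck layout and the deck_address and clone_and_edit helpers are assumptions made up for this example.

```python
import hashlib
import json


def deck_address(deck: dict) -> str:
    """Return a SHA-256 hex digest of a canonical JSON encoding of the deck.

    This only imitates content addressing; it is not a real IPFS CID.
    """
    canonical = json.dumps(
        deck, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def clone_and_edit(deck: dict, index: int, new_back: str) -> dict:
    """Copy a deck, change one card, and record which deck it was cloned from."""
    clone = json.loads(json.dumps(deck))  # simple deep copy of JSON-like data
    clone["cards"][index]["back"] = new_back
    clone["parent"] = deck_address(deck)  # hypothetical field linking versions
    return clone


original = {"title": "Greetings", "cards": [{"front": "你好", "back": "hello"}]}
edited = clone_and_edit(original, 0, "hello / hi")
print(deck_address(original) != deck_address(edited))  # the clone has its own address
```

A "parent" link like this is one possible way for a client app to show a user how their copy diverged from the deck they cloned.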

I’m still rather new to programming. I’m learning Python and already trying to test out some of my ideas. In the simple text-based flashcard program I wrote, I included an importer that converts Quizlet export files into the format I use.
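As an illustration of what such an importer might look like (not the code in the linked repository), here is a small sketch. It assumes the export text has one card per line with a tab between term and definition, and it uses hypothetical "front" and "back" field names for the JSON deck.

```python
import json


def quizlet_export_to_deck(text: str, title: str) -> dict:
    """Convert tab-separated export text into a JSON-ready deck dictionary."""
    cards = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        term, separator, definition = line.partition("\t")
        if not separator:
            # Fail loudly instead of silently losing or merging a card.
            raise ValueError(f"line {line_number} has no tab separator: {line!r}")
        cards.append({"front": term.strip(), "back": definition.strip()})
    return {"title": title, "cards": cards}


sample = "你好\thello\n谢谢\tthank you\n"
deck = quizlet_export_to_deck(sample, "Greetings")
print(json.dumps(deck, ensure_ascii=False, indent=2))
```

Raising an error on a malformed line is a deliberate choice here, since lost information and mixed-up cards are the import problems described earlier.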

To bootstrap adoption, I want to make a web crawler that takes all of the public decks currently on Quizlet and AnkiWeb, converts them to our universal format and puts them on IPFS. That way, as soon as someone makes an app that uses our format, users instantly have access to all of their cards. As a new programmer, I could also use some guidance on how to achieve this.

Current project (haven’t really started implementing any of these ideas yet) https://github.com/Jewcub/PythonFlashCards

Thanks!

I like the idea, and from my understanding of your proposition, the small data pairs that make up flash cards would be well suited for IPFS storage. You did point out the biggest flaw: data is only “stored” on the network if other people wish to store it or access it regularly, in which case it’s more “cached” than “stored”. I’ll admit to not having done anything beyond some extensive reading on the topic, but I think Ethereum may be best suited for a master record, a sort of “this is a flash card database, here’s the list of cards”, so that the (for lack of a better term) bootstrapping point for the program is universally hosted regardless of anyone’s interest in the project. That Ethereum applet would likely do no more than link to two things on IPFS (this is how I would do it): an original description file for the program (what it is, plus a list of all data files originally included, which would be small) and the most recent list of data files (and the program description, if that should ever change). Basically, this solves the problem of having to maintain versions: for archivists’ sake I wouldn’t replace existing data, only ever create new data and declare old files obsolete. In case that wasn’t too clear (I sort of pieced it together in my head while typing)…

User wants to use the program.
In some way or another, user retrieves the Ethereum stored hash for the latest program version. (Or an older version if the latest is no longer retrievable via IPFS)
Uses hash to retrieve said program “file” from the IPFS network.
Said file contains an HTML/JavaScript webpage (that’s what I would do) for a nicer interface, which then displays a list of all data files (courses, sets, individual cards, etc…)
User can then choose a specific data set to download.
Data set is downloaded from IPFS.
Each data set should be distributed with display code (i.e. the webpage) to show the data in a happy human format.

Future revisions to anything stored on IPFS should have a new hash generated for (at the least) the main data file (the one that lists the others and shows the initial webpage), and that hash should be stored on Ethereum. Hypothetically it would cost a small (very small) amount of money to add the hash to the Ethereum network, but that would store it in a reliable decentralized place where it couldn’t just be removed. And of course the hashes for any IPFS files could be stored and distributed elsewhere, but Ethereum would provide a fallback, a sort of unerasable list of past versions.
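The retrieval flow above can be sketched roughly as follows. Everything here is a toy model: FakeIpfs is an in-memory dictionary keyed by SHA-256 (not real IPFS), VersionLog is a plain append-only list standing in for the record on Ethereum, and all names are hypothetical. No real IPFS or Ethereum API is used.

```python
import hashlib
import json


class FakeIpfs:
    """In-memory stand-in for IPFS: stores bytes under the hash of their content."""

    def __init__(self):
        self._blocks = {}

    def add(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self._blocks[key] = data
        return key

    def get(self, key: str):
        return self._blocks.get(key)  # None when nobody hosts the content

    def drop(self, key: str) -> None:
        self._blocks.pop(key, None)  # simulates content no longer being hosted


class VersionLog:
    """Append-only list of manifest hashes, standing in for the Ethereum record."""

    def __init__(self):
        self._hashes = []

    def publish(self, manifest_hash: str) -> None:
        self._hashes.append(manifest_hash)  # old entries are never replaced

    def newest_first(self) -> list:
        return list(reversed(self._hashes))


def publish_manifest(store: FakeIpfs, log: VersionLog, deck_hashes: list) -> str:
    """Store a new manifest listing the deck hashes and record its hash in the log."""
    manifest = json.dumps({"decks": sorted(deck_hashes)}).encode("utf-8")
    manifest_hash = store.add(manifest)
    log.publish(manifest_hash)
    return manifest_hash


def load_latest_manifest(store: FakeIpfs, log: VersionLog):
    """Return the newest manifest still retrievable, falling back to older ones."""
    for manifest_hash in log.newest_first():
        data = store.get(manifest_hash)
        if data is not None:
            return json.loads(data)
    return None
```

In this model the log only ever remembers hashes; if every copy of a manifest disappears from the store, the lookup falls back to an older version or returns None.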

Also, I’m only slightly familiar with Quizlet and haven’t heard of the other, so I don’t know enough to say what it would take to crawl their sites for the data. The only thing is that you could run into issues if they catch you publicly distributing data that was scraped from their sites (I’m unfamiliar with their copyright policies). I wouldn’t stop you; personally I think any data stored in a central location can and should be duplicated elsewhere and made publicly available, so it can’t just be erased by whatever organization holds it.

Interesting. I’m not sure how Ethereum would help solve the hosting problem. Even if it links to the hash of the files, if the files aren’t on the network anywhere you’re still SOL. It seems that you still need someone interested enough in the project to support it at the beginning and host all the files even when no one’s using them. At least as far as I understand it. I’m probably just going to use Pinata for now and keep the project within the free tier, under a GB, unless I get some kind of funding.

One distinction I should point out that might be causing confusion is that I don’t primarily want to make a flashcard app, but rather to make the flashcards and the user’s SRS data live independently from the apps. Therefore any app could use the data and there could be an open market for functionalities built on top of them, much in the way that Gimkit is a game built on using Quizlet imports.

Apart from the Ethereum part, which I think might be a bit unneeded, I like your implementation description. A user-friendly web interface for choosing the files will probably be one of the core programming jobs to do.

As far as copyright goes, I couldn’t find anything on the Quizlet website. They just talk about how not to infringe copyright while uploading content. Maybe I should lay off the large-scale web crawling and just allow users to give a URL for their sets, and have the app crawl only that page. That would surely not be a problem, because you are getting express permission from the user. Quizlet has an API but they aren’t giving away new keys. Their API FAQ page says API data is not to be used to make a platform competing with their core services…

Flashcards might not just be ‘pairs’ but could also have more than two ‘sides’. Quizlet has a side for pictures, and I’d like a side for notes. I think it would be really cool if, when making a card, you could see other people’s notes/definitions and decide whether or not to add them to your reviewer, perhaps also with a ranking system for the most useful notes. Notes help with mnemonics and contextualizing the information, something often missing from overly simple cards.
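One possible shape for such cards, as a rough sketch only: a card holds any number of named sides plus a list of notes that can be ranked by votes. The Card and Note classes, the side names and the vote counts are all hypothetical and not part of any existing format.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    author: str
    text: str
    votes: int = 0  # hypothetical usefulness score from other users


@dataclass
class Card:
    # Side name -> content, e.g. "term", "definition", "picture" (a link).
    sides: dict
    notes: list = field(default_factory=list)

    def top_notes(self, limit: int = 3) -> list:
        """Return the most-voted notes first, at most `limit` of them."""
        return sorted(self.notes, key=lambda n: n.votes, reverse=True)[:limit]

    def adopt_note(self, note: Note) -> None:
        """Copy someone else's note onto this card."""
        self.notes.append(Note(note.author, note.text, note.votes))


shared = Card(
    sides={"term": "好", "definition": "good"},
    notes=[
        Note("ana", "woman (女) + child (子)", votes=12),
        Note("ben", "sounds like 'how'", votes=3),
    ],
)
mine = Card(sides={"term": "好", "definition": "good", "picture": "https://example.com/hao.png"})
mine.adopt_note(shared.top_notes(limit=1)[0])
```

Keeping the picture as a link rather than embedded data keeps the deck all text and therefore small.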