Avoid hosting of illegal material

This is a major issue with all P2P networks: hosting illegal material. As far as I understand, IPFS solves this by not hosting content by default, but only content that the user has opted in to.
Does this mean that a node (like, say, my laptop) will never share any content until I tell it to share something? Or does my laptop still share content, just without pinning it (i.e. old content is replaced with new content as I download more)?

IPFS solves this problem by not “pushing” content to other nodes; only content the node has visited is shared.
IPFS does share content that you have visited but not pinned. To minimize this, run a repo GC (garbage collection) often.

Also, it’s worth noting that in go-ipfs, automatic garbage collection is disabled by default due to an inefficient implementation. This will be fixed, but for now it can be enabled by starting the IPFS daemon with the --enable-gc flag.
Manually triggering a GC with ipfs repo gc works either way.
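For reference, the two commands mentioned above:

```
# Start the daemon with automatic garbage collection enabled
# (disabled by default in go-ipfs, as noted above):
ipfs daemon --enable-gc

# Or trigger a garbage collection manually at any time; this removes
# cached (unpinned) blocks from the local repo:
ipfs repo gc
```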

6 Likes

I think it would be useful to have finer-grained control over this. For example, let nodes decide what to share:

A) default (share what you download)
B) all except a blacklist
C) only from a whitelist
D) only pinned content
E) all from a specific user, namespace, or license (for example, I’m OK with re-sharing everything from Archive.org, because I know they’re going to share free content and not child porn, but I don’t want to whitelist every one of their items individually)
F) share everything regardless
G) don’t share anything, ever
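For what it’s worth, newer go-ipfs releases already expose a partial version of (A) and (D) via the `Reprovider.Strategy` config option (assuming your build has it). Note it only controls which blocks the node *announces* to the network, not the full policy list above:

```
# Announce everything in the local repo (the default; roughly option A):
ipfs config Reprovider.Strategy all

# Announce only pinned content (roughly option D):
ipfs config Reprovider.Strategy pinned
```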

4 Likes

It may actually already work this way.

Is there any similar feature request on Github?

Law is not cut and dried. It is all arguable.

The main issue with law is that laws vary widely around the world: country A may deem something legal while country B does not. It is not possible to know every law, so the legality question is not determinable.

There are also issues such as Wikileaks. Would IPFS allow Wikileaks to post material? Wikileaks has given rise to considerable controversy, and the US government has pursued various people connected with it. So would IPFS simply prohibit Wikileaks?

Many US sites are already banned around the world; for example, YouTube is banned (and blocked) in UK schools as unsuitable. Google is also banned in UK schools as unsuitable. So if Google tried to post material, would IPFS allow that?

@zillerium, IPFS is a protocol, like HTTP. The protocol itself is content-agnostic. Asking “Would IPFS allow Wikileaks to post material?” or “If Google tried to post material, would IPFS allow that?” is like asking whether HTTP would allow those things. It’s not about the protocol allowing these things; it’s about individual nodes on the network allowing or disallowing them.

Instead, @martha is asking the right question: how can I avoid hosting content I disapprove of on my own node? You can also extend that to ask how communities, or whole networks of peers, can choose which content they will or won’t host on their nodes.

I think @martha is also getting at the distinction between simply being a node vs. participating in some effort to pin copies of the good content. As @Mateon1 pointed out, your node will only hold content that you pulled onto it, so the real question might be “how can I help pin and serve the good content?” The answer here is for communities to create pinsets that list the content they care about. Anyone who wants to support that community’s content can participate in pinning parts of their pinset(s).
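In practice, participating in a community pinset could be as simple as a shell loop. A minimal sketch, where `pinset.txt` is a hypothetical file of content hashes published by the community, one per line:

```
# Pin every hash listed in the community's pinset file, so this node
# keeps and serves those objects:
while read -r hash; do
  ipfs pin add "$hash"
done < pinset.txt
```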

There is also the issue of running gateways. If you are exposing your node as a public IPFS-HTTP gateway, anyone can request any content through that gateway by default. This means they can use your gateway as a repeater for any content they want. The simplest fix to prevent this is for nodes to keep a blacklist (and eventually whitelists) of hashes that they will/won’t allow on their gateways, but solutions will get much more nuanced than that over time.

In the long run, I also think that communities and networks of nodes will develop network-wide blacklists, where the participants in that network identify the content that they mutually agree not to host.

3 Likes

I can understand how blacklists would work: any content is allowed, and all content is equal, until a rights owner etc. requests a specific hash to be blocked, either in the whole network or only under a certain jurisdiction. An easy system, kind of, and probably simpler than today’s. But how would whitelists work? What would they be?

Example of a whitelist: a research institution’s library runs an IPFS gateway that serves only content that has been created, pinned, or cited by its researchers. They want to use ipfs.animpressiveinstitution.edu as the canonical gateway address for the content that they publish, so any time their researchers link to the data they can use institution-specific links. They are eager to serve all of their created/pinned/cited content but absolutely don’t want unapproved content to be served through that gateway.

To support this use case, IPFS should allow you to set whitelists on a gateway. In this case, the whitelist for the gateway running at ipfs.animpressiveinstitution.edu would be all of the hashes of the content pinned on the institution’s IPFS node(s), plus the hashes of any content cited by the institution’s researchers. That way, non-approved content simply won’t be available through the gateway.
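A rough approximation is possible today, assuming a go-ipfs version that supports the `Gateway.NoFetch` config option: with fetching disabled, the gateway can only serve what is already in the local repo, so the institution’s pinned hashes become the effective whitelist. (Note this is repo-wide, not a strict per-hash list.)

```
# Stop the gateway from fetching arbitrary hashes from the network
# on demand; it will only serve content already in the local repo:
ipfs config --json Gateway.NoFetch true

# Then pin the approved content (placeholder hash, for illustration):
ipfs pin add <hash-of-approved-content>
```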

4 Likes

Why don’t we do it like GNUnet and make it impossible to know what you host? It’s called plausible deniability. That’s the ultimate solution.

So, people in China can’t read the god damn New York Times, because of your god damn whitelists, basically. Who comes up with this stuff?

Not correct. As @flyingzumwalt wrote, a whitelist is just for certain nodes/gateways that only want to host their own stuff (which, by the way, can be co-hosted by other nodes too). The NYT needs permanence, global availability, and fast loading times, so their pages would eventually be hosted all over the place: officially by themselves (whitelisted or not), by Filecoin nodes, but also by the people who actually read the pages, who will therefore (by design) auto-reseed the content through their in-browser nodes for a certain amount of time, unless they’ve explicitly added the NYT to their in-browser node’s blacklist. So not to worry, the NYT would be available in China. :slight_smile:

What about the terrorist handbook? Surely few people will host that. Also, if someone can prove that you are hosting the terrorist handbook in Maui, they’ll chop your dick off, so naturally, if blacklists exist, you’d want to use them. We’d also have thousands of blacklists that people can apply, so it’ll be a mess figuring out which ones you need. And if it can be determined that you’re hosting a specific file which is illegal in some country, you’re fubar. So why have this problem at all, if you could do like GNUnet and make it impossible to determine what is actually on your system? You’d only know the price you’ll get for hosting a certain fragment of whatever; that does seem like a sounder solution. I don’t want to be bothered looking all the time for which goddamn blacklist I should be applying. Also, you can’t count on dumbasses on the internet with their dumbass blacklists, so you’ll end up with some kind of illegal content on your system one day, and the NYPD SWAT team or German SS breaks down the door and cuts off your balls, because what’s on your system can be figured out.

That’s why IPFS needs complete and totally secure IP anonymity; built-in Tor should just be an intermediate solution. (There are probably better ways.)

As for hosting objects, I’m still not completely sure how that works. I know that if you pin an object to local storage, you’re hosting it. Same goes for accessing an object without manually pinning… the only difference being that you’re only hosting it temporarily.

As for the routine background tasks of an IPFS node, I believe you’re temporarily storing and hosting at least parts of files, but I’m not sure. Maybe you’re storing and hosting full objects. (?) I’m just seeing that I have an IPFS up- and downstream of about 1 GB per day each. So there’s a lot of stuff coming in and going out. Not sure, though, what the node is doing with it.
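You can at least check what your node is currently holding (and therefore able to serve) with two commands:

```
# Content you explicitly pinned (kept until you unpin it):
ipfs pin ls --type=recursive

# Every block currently in the local repo, pinned or merely cached;
# all of it can be served to peers until a GC removes the unpinned parts:
ipfs refs local
```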

Well, Tor is useless. You need a constant upstream equal to your downstream, so that other hosts can’t determine where data is coming from: whether you’re the originator or just an intermediate node. You also need connections to respected nodes, to make sure you’re not on some dumbass’s local network ;)

I only see problems with IPFS, but I still don’t understand it all, yet.

2 Likes

There are (to my knowledge) no respected/trusted nodes in IPFS. I think every user’s node will by default trust all the nodes around it. So if they want to use Tor, which I’m against as well, then it would have to work without the trust aspect.

But maybe there’s a solution in this “local network” thing? When I connect to a VPN, I first receive a VPN address like 10.10.47.178. Other users who connect to that VPN server are in principle accessible via that VPN server’s subnet. I once had a VPN provider who forgot to block local protocols like AFP, and I could actually look into the public folders of users who hadn’t disabled local file sharing. Maybe there’s a similar way to do that with IPFS, i.e. all nodes form one big ad-hoc “local network” and just talk locally, and when you try to scan any given user, you just get a (virtual) local address. I don’t know if that’s possible without an actual server, though.

There are also some ideas on data exchange without revealing any IP address here:

@madorian A gentle reminder to keep discussion civil. We’re all here working together, trying to build something great; non-constructive comments have no place here.

7 Likes

In defense of @madorian: he worded his thoughts quite “raw”, but I guess he did so due to the weight of these actually valid concerns.

What IPFS represents even on a technological level can easily be “converted” into an ideological representation, to put it in abstract terms. More specifically, the principles IPFS carries (like decentralization and transparency) may, and likely will, manifest in real life, causing societal conflicts.

These conflicts may result in compromising data being shared on the network, be it WikiLeaks data or the Anarchist Cookbook. These forms of data can pose a huge threat to those possessing AND sharing them (depending on location and time period), and I think addressing these issues should have a high priority in the IPFS workflow.

IPFS is intended to be the new World Wide Web, meaning it must address the flaws of the current one. Security is one of them. Of course, the ideological details are opinion-based, but to be honest, I think we mostly agree that no one should be accountable for the very data they own - not in the context I highlighted above, at least.

Don’t get me wrong, I heavily appreciate all the efforts on the protocol, but @madorian had some very real concerns that I decided to emphasize in a much more civil manner.

4 Likes

@Katamori I definitely agree, I just want to make sure that we keep discussion here civil; with issues like anonymity it’s easy to get carried away. I just want to ensure that anyone who might have good input in the conversation isn’t discouraged from participating by the tone we set.

3 Likes

This doesn’t make any sense, but you have a lot of very good points there, so let’s try to address some of them.

I think the U.S. government has better ways to censor a newspaper than the Chinese government does, so let’s ditch the cold war agenda and take Venezuela as an example:
Suppose I’m a Frank Zappa living in Venezuela and I want to read every newspaper I can get my hands on; the more, the better. But my national government says that’s wrong and will do whatever it can to prevent me from reading the New York Times, for instance.
IPFS can help me, because if I know the hash of the newspaper’s PDF file, I can fetch it with my IPFS node. How would I find out the hash of the file beforehand if my government won’t allow me to do even that? Well, IPFS can also help with that, because I’m clever and I often refresh my friend’s blog at its IPNS address, which sometimes publishes the hash for this week’s newspaper.
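A sketch of that workflow, with placeholders instead of real names and hashes:

```
# Resolve the friend's IPNS address to find this week's pointer
# (the peer ID below is a placeholder):
ipfs name resolve <friends-peer-id>

# Fetch the newspaper by its content hash (placeholder hash):
ipfs get <newspaper-hash> -o newspaper.pdf
```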

Given the above scenario, I’m asking anyone: how could one censor even that? If someone somehow manages to block the network in a way that a specific hash is blocked in the IPFS network (which is very unlikely - they’ll probably try to block IPFS itself instead), someone with access to the original file could add junk to the end of it, therefore producing a new hash that can be seeded again in the network, and so on.
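To make the re-hashing trick concrete (the file name is illustrative):

```
# Adding the original file yields some hash:
ipfs add newspaper.pdf        # -> Qm... (some hash)

# Append a single junk byte; the content hash changes completely,
# so a blacklist entry for the old hash no longer matches:
printf '\0' >> newspaper.pdf
ipfs add newspaper.pdf        # -> an entirely different hash
```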

That would hold true for the Anarchist Cookbook, the terrorist handbook, Mein Kampf, Solomon’s holy bible, etc.


As for plausible deniability, it is a concept that still has to be proven, and so far it hasn’t really saved anyone from being arrested. We have people who have been arrested, prosecuted, and even driven to suicide. What does work is anonymity.

If you want to securely host and access stuff, there’s already Freenet, which has been in production for many years and just works™. Also, more recently, we have Whonix, which is more versatile than Tails for hosting content.

I’m personally running tests with Whonix and IPFS (and some other communication tools) in encrypted volumes. If one knows what one is doing, it can be done even on Tails.

This wouldn’t solve the problem of your local despots beating you up until you hand over the encryption password (as is happening in South America), or the governments you’ve mentioned that would perform surgical operations on one’s body whenever they find content they don’t like. For that matter, we would still have to use Freenet or GNUnet, which from the ground up have been developed specifically with those matters in mind.

So, to clarify my point: in my opinion, putting too much responsibility on the IPFS software doesn’t make sense, because that’s not what would secure your node from the perspective of the privacy-concerned. Even with much effort in this area, it would still boil down to relying on external tools, like disk encryption and IP address hiding/bouncing.

2 Likes

[quote=“desci, post:19, topic:48”]
If someone somehow manages to block the network in a way that a specific hash is blocked in the IPFS network (which is very unlikely - they’ll probably try to block IPFS itself instead), someone with access to the original file could add junk to the end of it, therefore producing a new hash that can be seeded again in the network, and so on.[/quote]

I think that’s what the blacklists will be for: to block specific files. Star Wars Episode XII 4K H.265 mkv has been added to IPFS, and Lucasfilm or Disney will have the mkv’s IPFS hash added to a blacklist that’s valid either for specific jurisdictions or even the whole global IPFS. (It depends on whether they gave away licenses for other territories.) Then, of course, someone appends a carriage return to the end of the mkv and re-adds it, and it turns into the usual whack-a-mole.

But will other players actually try to block IPFS altogether? I’m not sure that’s even possible (you can block the ipfs.io site and other gateways, which e.g. my employer is already doing with WatchGuard), but it’s probably easier to block specific hashes. Today, people make copies, and you’d have to go to every torrent site, every one-click hoster, and every cloud provider where users offer a specific file, and have the files and torrent listings taken down. Simply adding an IPFS hash to an official blocklist is much easier and faster.

And when I imagine that the whole web could one day be based on IPFS, then what’s the use of blocking IPFS? You’d be blocking the web itself.

Agreed. Which is why anonymity in IPFS should be higher on the to-do list.