How are blocklists maintained?

From @mcast on Thu Sep 08 2016 21:12:33 GMT+0000 (UTC)

“Bad data” blocklists are introduced in #106, #36.

In https://github.com/ipfs/notes/issues/21#issuecomment-167912686

Official, default block lists are dangerous: they make the maintainer a target of law enforcement requests and bring lots of political hassle with them over what belongs on such a list and what doesn’t.

Block lists should require explicit and separate user opt-in to reduce censorship incentives. Having them maintained by a 3rd party also helps to unburden developers from such politics.

Sadly, this seems likely true in at least some regions.

So how are blocklists to be

  • generated and maintained?
    • The big “official” organisations can publish lists of any data they care to declare “bad”.
    • If I assert copyright on my data, can I publish a blocklist for unauthorised derivatives I find?
  • tagged?
    • for fetching some files, you could go to prison or even just disappear
    • for copyrighted files (e.g. .mp3), maybe it’s OK to have a copy if you paid for it, but not to re-share it
    • some files you just prefer not to see at work, or show to children
  • provided as default, or discovered?
    • by users/servers in an administrative region; like CRDA
    • according to the carriers’ preference e.g. businesses, for their staff
    • by user preference, e.g. things many people find they want to un-see
  • audited?

to be sure to keep coders out of legal & political trouble?


Copied from original issue: https://github.com/ipfs/faq/issues/176

From @jbenet on Fri Sep 09 2016 22:17:27 GMT+0000 (UTC)

Great questions. We can get all of these answered over time. But the first thing to say is that we care about doing this very right, and are thoughtful in our approach.

In short, for now, we’ll translate DMCA requests and the like into a blocklist and publish it somewhere (we haven’t had to do this yet). Clients can follow it in their configs (there will be some easy way to configure it and turn it on or off).

We will likely provide our own as the default for now. In the long term it would be our hope that independent bodies like EFF and Berkman Center would get involved in making sure these lists are correct. (I personally would like to reduce the “false positives” problem in DMCA, but it’s not easy to do this at scale given the huge volumes; YouTube receives tons of these.) Regardless, all blocklists would be accessible through ipfs itself to audit, and should carry a reason (usually the original legal request from whatever entity). This can be audited independently.

From @mcast on Tue Sep 13 2016 21:33:53 GMT+0000 (UTC)

TL;DR I’m optimistic about the technical side, but not so much the social.

[…] the first thing to say is that we care about doing this very right, and are thoughtful in our approach.

Blocklists are composable: the union of a list of blocklists is itself a blocklist, which could be useful. To take best advantage of this, maybe the UI needs some mechanisms for feedback to, and pushback from, the user?
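To make the composability point concrete, here is a minimal sketch in Python. It assumes a blocklist is just a set of content ids; the list names and ids are made up for illustration and are not any real IPFS format. Recording which list blocked each id is one possible hook for the user feedback mentioned above.

```python
from typing import Dict, Set


def compose(blocklists: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Union several named blocklists.

    Returns a mapping from each blocked id to the names of the lists
    that block it, so a UI could tell the user who asked for the block.
    """
    combined: Dict[str, Set[str]] = {}
    for name, ids in blocklists.items():
        for content_id in ids:
            combined.setdefault(content_id, set()).add(name)
    return combined


# Hypothetical lists and ids, for illustration only.
subscribed = {
    "legal-requests": {"QmExampleA", "QmExampleB"},
    "workplace": {"QmExampleB", "QmExampleC"},
}

combined = compose(subscribed)
for content_id in sorted(combined):
    print(content_id, "blocked by", sorted(combined[content_id]))
```

The keys of `combined` are the union of the input lists, so the result can itself be treated as a blocklist and composed again.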

If the spec will need iterating, it makes sense to push censorship out into a daemon or subprocess, as Squid can do.

Maybe a tool to check your pinned list for new blocked data, and give some warning?
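A rough sketch of what such a check could look like, in Python. It assumes the blocklist maps each blocked id to a reason string and that the pinned ids are already available as a list (for example, collected from `ipfs pin ls`); the function name, ids, and reasons are hypothetical.

```python
from typing import AbstractSet, Dict, Iterable, List, Tuple


def newly_blocked_pins(
    pins: Iterable[str],
    blocklist: Dict[str, str],
    already_warned: AbstractSet[str] = frozenset(),
) -> List[Tuple[str, str]]:
    """Return (id, reason) for pinned ids that are on the blocklist
    and that the user has not been warned about before."""
    return [
        (cid, blocklist[cid])
        for cid in pins
        if cid in blocklist and cid not in already_warned
    ]


# Hypothetical data, for illustration only.
pins = ["QmPinA", "QmPinB", "QmPinC"]
blocklist = {
    "QmPinB": "example legal request",
    "QmPinC": "example workplace policy",
}

warnings = newly_blocked_pins(pins, blocklist, already_warned={"QmPinC"})
for cid, reason in warnings:
    print(f"warning: pinned item {cid} is now blocked ({reason})")
```

The sketch only warns; what to do with a blocked pin is left to the user.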

In the long term it would be our hope that independent bodies like EFF and Berkman Center would get involved in making sure these lists are correct. […]

So getting the blocklists “right” or “right enough” implies impartial curation? This sounds expensive, and yet the copyright holders probably won’t want to fund a charity to do it, and neither will most of the readers.

At least it can be done in an open way; and any attempt to get round the “one man’s blocklist is another man’s pin list” problem, e.g. by hashing the ids again or putting them in a bloom filter, comes back to https://github.com/ipfs/faq/issues/85#issuecomment-246816012.
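For illustration, the “hashing the ids again” idea could be sketched in Python as below. This is an assumption about how such a scheme might work, not a description of any IPFS feature; the function names and ids are made up. The published list holds only SHA-256 digests of the ids, so it cannot be read directly as a pin list, but anyone who already holds a candidate id can still test whether it is on the list.

```python
import hashlib
from typing import Set


def obscure(content_id: str) -> str:
    """Hash an id again so the published list does not contain it directly."""
    return hashlib.sha256(content_id.encode("utf-8")).hexdigest()


def is_blocked(content_id: str, hashed_blocklist: Set[str]) -> bool:
    """Check a candidate id against a list of re-hashed ids."""
    return obscure(content_id) in hashed_blocklist


# Hypothetical ids, for illustration only.
hashed_blocklist = {obscure("QmExampleBlocked")}

print(is_blocked("QmExampleBlocked", hashed_blocklist))
print(is_blocked("QmExampleOther", hashed_blocklist))
```

The original id does not appear in `hashed_blocklist`, yet the membership test works for anyone who can guess or obtain it, which is the limit of this kind of obscuring.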