Defining the goals is a key aspect. If re-invention is what we desire, then I would like to take a shot at outlining the positive aspects of Usenet, as well as the negatives.

Positive:

* Anonymity possible (to an extent)

* Moderation possible (to an extent)

* Caching of desired content at the network edge

* Binary data (though obviously no more uuencode/yEnc/etc.)

* Libre (as in freedom of speech)

* Free (as in beer)

* Useful, if probably illegal, content

* Distributed

The negatives:

* Impersonation and other false claims of identity

* Spam

* Illegal content (to whom? how to identify? intractable)

* Flame wars

* Difficulty of setting up a 'feed'

I'd like to take a small stab at these various problems.

For identification I would specify the use of public key cryptography; it's the only decentralized option I know of. OpenPGP with some extensions (e.g. Ed25519 signing keys) seems to be the obvious choice.

With identification in place, the spam problem can also be addressed using spam filtering technologies. Have users 'file' copies of messages into several training bins via flags. Flags would be ternary state entities (true/false/null): Liked, On Topic, and 'harmful content'. The last is a catch-all that, in a design sense, would include any type of illegal content; however, for some groups that content /is/ the signal, so the flag is meant to inform users so they can choose, not to be a nanny for them.
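A minimal sketch of the ternary flags, assuming Python and using None for the null state. The names Flags and training_bins are made up for illustration; nothing above fixes a data format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Flags:
    """Hypothetical per-message flags; None means the user expressed no opinion."""
    liked: Optional[bool] = None
    on_topic: Optional[bool] = None
    harmful: Optional[bool] = None


def training_bins(flags):
    """Return the (flag name, value) bins a message should be filed into.

    Null flags are skipped, so an unflagged message trains nothing.
    """
    bins = []
    for name in ("liked", "on_topic", "harmful"):
        value = getattr(flags, name)
        if value is not None:
            bins.append((name, value))
    return bins
```

For example, a message flagged Liked and explicitly not harmful, with no opinion on topicality, would be filed into two bins and leave the On Topic filter untouched.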

The above tagging would allow for aggregation to determine the 'health' of a data-pool, as well as how useful it was to the user base of a given server.

Data pools would, in themselves, be another type of tag. The built-in base tags defined above would be the only 'required' ones, but a firehose of all data is crazy, so further tags (similar to keywords) would also be attached. Advanced users (any that provide 'detailed' feedback) could 'vote' on the accuracy of applied tags, including the base tags (which would be inferred as necessarily existing).

Base tags become 'groups' in this distributed database.

Critically, servers aggregate, and thus anonymize, the tag weighting of their own userbase (even from that userbase itself).

Every tag sync period, an enumeration of all non-default tags (and their yes/no vote counts) would be computed and the result published.
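Roughly, the per-period aggregation could look like the sketch below. It assumes votes arrive as (user, tag, vote) tuples with vote being True, False or None, and it reads 'non-default' as 'has at least one non-null vote'; both are my assumptions, as is the function name aggregate_votes.

```python
from collections import Counter


def aggregate_votes(votes):
    """Collapse per-user votes into per-tag yes/no counts.

    votes: iterable of (user_id, tag, vote) with vote True, False or None.
    The result carries no user ids, which is the anonymizing step.
    """
    yes, no = Counter(), Counter()
    for _user, tag, vote in votes:
        if vote is None:
            continue  # null means no opinion, so it is not counted
        (yes if vote else no)[tag] += 1
    return {
        tag: {"yes": yes[tag], "no": no[tag]}
        for tag in sorted(set(yes) | set(no))
    }
```

Only the returned counts would be published; who voted which way never leaves the server.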

Also published would be a list of the other 'servers' which this current server is aware of. SOME of these would be replication servers (which would have a non-zero weight that isn't required to be published), while others are just servers known by other servers. Each entry would have an age; this would be the time since the tag stats of that remote server were last successfully polled (thus entries with a low age are likely to be replication sources, BUT might be 'validation' of other servers as obfuscation).
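As a sketch of that published list, assuming each known server is tracked with a last successful poll time and a private replication weight. Peer, published_peer_list and the field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Peer:
    """Hypothetical record for a server this server knows about."""
    address: str
    last_polled: Optional[float]  # unix time of last successful poll, None if never
    weight: float = 0.0           # private replication weight, not published


def published_peer_list(peers, now):
    """Build the public peer list: address and age only, youngest first.

    The weight is deliberately left out, so a reader cannot tell a real
    replication source from a server that is merely polled as obfuscation.
    """
    entries = []
    for peer in peers:
        age = None if peer.last_polled is None else now - peer.last_polled
        entries.append({"address": peer.address, "age": age})
    # Never-polled peers (age None) sort after all polled ones.
    return sorted(entries, key=lambda e: (e["age"] is None, e["age"] or 0.0))
```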

Servers might only share post contents with authorized connections. Anyone able to connect would be able to use the other server as a source, and therefore replicate the tagged data that it chooses to cache. The other server may require something in return, like account data, so that it can sync your server's userbase stats to itself. Comparing the relative accuracy of stats would enable it to determine whether your userbase is real or not, as well as how your userbase votes on things its own userbase does not. This would be the reason that (semi-anonymous) peering between even not-like-sized servers would be permitted; particularly if your own server is frugal and normally doesn't download things that its users haven't voted on.
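One very rough way to compare stats between two servers, assuming both publish per-tag yes/no counts: take the mean absolute difference of the yes-fraction over shared tags, and list the tags only the remote side has votes on. The metric and the names here are my own guess at what 'relative accuracy' could mean, not a worked-out design.

```python
def yes_fraction(counts):
    """Fraction of yes votes, or None when there are no votes at all."""
    total = counts["yes"] + counts["no"]
    return counts["yes"] / total if total else None


def compare_stats(local, remote):
    """Compare two {tag: {"yes": n, "no": m}} tables.

    Returns (divergence, remote_only): the mean absolute difference in
    yes-fraction over shared tags (None if nothing is comparable), and the
    sorted tags that only the remote userbase has voted on.
    """
    diffs = []
    for tag in local.keys() & remote.keys():
        a, b = yes_fraction(local[tag]), yes_fraction(remote[tag])
        if a is not None and b is not None:
            diffs.append(abs(a - b))
    divergence = sum(diffs) / len(diffs) if diffs else None
    remote_only = sorted(remote.keys() - local.keys())
    return divergence, remote_only
```

A low divergence on shared tags would be weak evidence that the remote userbase is real, and the remote-only tags are where its votes add information the local server lacks.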

Obviously server to server communication would involve the automated use of signing keys /for the server/.