Just throwing this out there: What if you tagged every payload posting/response with a (non-brute-forceable) index key that lets everyone who knows the index key find all the messages? Then rely on users inviting others by sending metadata objects containing the thread index key and the session key to the thread payloads? I know you give up unique per-public-message-in-thread session keying, but only when someone becomes a full participant (they need the index key and the session key) in the thread. And in either scenario, a compromised thread member with access to public messages in the thread results in a disclosure of all public messages in the thread.
This is something I have been recently considering as well. In many ways it would be an improvement if we didn't break things down to such a fine degree, by which I mean we had forum shared index tags and group shared keys, rather than pairwise indexing tags and pairwise keys. It would be a lot more like a traditional forum. Instead of Alice tagging a message with an index string, and then telling Bob and Carol about it with metadata packets (one indexed with a secret tag shared between Alice and Bob, another indexed with a secret tag shared between Alice and Carol), Alice could tag a message with a single group index tag and the message could be decrypted with a shared group key. This would have several advantages:
1. As people are invited to the forum, they can easily download all the old posts of the forum that are still in the cache. All the posts are indexed by the same group index tag, and all of them are encrypted with the same key. Someone who has the group index tag and the key can easily download all forum messages, without having to be pointed to each of them. This would make it a lot more like a traditional forum.
2. In being more like a traditional forum, it would probably make group organization a lot easier. Like you said earlier, a public forum is one-to-any, not one-to-many. If someone has the group index tag and the group shared encryption key, they can make a post and immediately know that anyone in the group can read it, even if they don't know everybody in the group. I don't know everybody on this forum, but when I make a post on this forum I know anybody who knows about this forum can read it.
3. It would make things less complicated. Pretty much, we would be nearly done with everything after implementing private stream searching. The actual forum itself would mostly just be a GUI; we wouldn't need to implement systems for people being able to point to posts, etc. In general it would be much simpler and easier to understand.
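To make the first advantage concrete, here is a minimal sketch of a group shared index tag. Everything in it is an assumption for illustration: `cache` stands in for the server-side cache, `Invitation` is a hypothetical metadata object, and the blobs are treated as already-encrypted opaque bytes. The lookup is also not private, so this only shows the indexing idea, not PSS.

```python
import os
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical stand-in for the server cache: index tag -> stored blobs.
cache = defaultdict(list)


@dataclass(frozen=True)
class Invitation:
    """Hypothetical metadata object sent to a new member."""
    group_tag: bytes   # shared index tag, random so it cannot be guessed
    group_key: bytes   # shared decryption key (not used in this sketch)


def new_group() -> Invitation:
    # 32 random bytes each; far too large to brute force.
    return Invitation(group_tag=os.urandom(32), group_key=os.urandom(32))


def post(group_tag: bytes, blob: bytes) -> None:
    # Blobs are assumed to be encrypted with the group key before posting.
    cache[group_tag].append(blob)


def fetch_all(group_tag: bytes) -> list:
    # Anyone holding the tag gets every post still in the cache.
    return list(cache.get(group_tag, []))


invite = new_group()
for blob in (b"post-1", b"post-2", b"post-3"):
    post(invite.group_tag, blob)

# A late joiner who receives `invite` sees the whole backlog in one lookup.
backlog = fetch_all(invite.group_tag)
```

The point is that one invitation is enough: nobody has to point the new member at each old post individually.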
There are also several really big problems with such a system though.
1. It is less compartmentalized. If Alice makes a post and she only wants Bob and Carol to be able to read it, well she cannot tag it with a group shared tag anymore, unless only Bob and Carol are part of the group. If they have a group shared tag between them, it is just complicating the original system more. What if Alice then wants to make a post for only Carol and Doug? Do they need a new group tag between them? It would be easier in such circumstances if they all have individual pairwise tags between them, and build up groups by adding pairwise shared index tags to the messages they send. This makes it easier to dynamically create groups on the fly.
A. On the other hand, Alice could tag the post for the group, but include in it a decryption key encrypted only for Bob and Carol. It could also be tagged with one shared secret known only to Bob and another known only to Carol. This would allow the entire group to see that someone posted a message, but only Bob and Carol would be able to decrypt it and know it is for them. This approach has some problems of its own, though.
2. This is probably the biggest issue. If the messages to the group are all indexed by the same tag, then every single time Alice searches for posts, she will get all of them! By indexing messages with a one use shared secret string, Alice knows she will only get new messages with each search, because she will not search for tags she has already found messages associated with. How could we have unique group tags for every message, if we assume that Alice doesn't know everybody in the group? We could hash out the original group tag I suppose. But there are some anonymity issues with this.
3. How do we protect from spammers? If messages are searched by a group shared tag, whitelists will not work. If a spammer learns the group shared tag, what is to stop them from spamming thousands of messages? Everybody in the group will download the messages. The only way around this is to have two keywords and to only download items that match both, the first being a group tag and the second being some sort of individual identifier. But it is a question in itself how we could make the individual identifier in such a way that it can not be forged. The first solution that comes to mind is a pairwise shared secret between the poster and every single person they expect to read the message. Any solution likely requires that every member of the group either has a whitelist of known posters, or that they are vulnerable to being spammed to death. If they go with a whitelist, then it is back to one-to-many instead of one-to-any, and we have the same social bootstrapping problem as before.
4. Groups of any significant size will not be able to put much faith in the encryption key of their messages. This goes back to the loss of compartmentalization. If there is a group shared key, that means for a large group thousands of people could have the decryption key. And if a single key is compromised, all messages encrypted with that key can be decrypted, so with a group shared key all of the group communications can be decrypted. This isn't so much a problem for public groups, since they will not really be concerned about the encryption of their messages anyway (although they will still be encrypted at least so the PSS servers can have some deniability). But what about a private group with twenty members? I suppose that even if each message is encrypted with a new key, if one of the clients is compromised or malicious, all messages that they can decrypt are already compromised. From a cryptographic point of view, it is much easier to break a single key shared by 100,000 messages than it is to break a separate key for each of 100,000 messages, but I suppose we hope that it is impossible to break any of the encryption even a single time.
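On problem 2, "hashing out" the group tag could look something like the sketch below. This is just one way to read that idea, not a settled design: it assumes the group shares a secret and that posts can be numbered with a counter, and the names `message_tag` and `unseen_tags` are made up here. It does nothing about the anonymity issues mentioned above, and two people posting at the same counter value would collide.

```python
import hashlib
import hmac


def message_tag(group_secret: bytes, counter: int) -> str:
    """Derive a one-use index tag for post number `counter`.

    Assumption: tag_n = HMAC-SHA256(group_secret, n). Only holders of the
    group secret can compute or recognise the tags.
    """
    msg = counter.to_bytes(8, "big")
    return hmac.new(group_secret, msg, hashlib.sha256).hexdigest()


def unseen_tags(group_secret: bytes, next_counter: int, window: int = 5) -> list:
    """Tags to search for next, skipping everything already fetched."""
    return [
        message_tag(group_secret, n)
        for n in range(next_counter, next_counter + window)
    ]


secret = b"example group secret"   # placeholder value for the sketch
first_search = unseen_tags(secret, next_counter=0)
# After finding posts 0..4, Alice only searches for tags 5..9.
second_search = unseen_tags(secret, next_counter=5)
```

With something like this, Alice never re-downloads old posts, because each search only covers tags she has not yet found messages for.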
I suppose that what you are saying is a bit of a mix between the extreme singularity of a public forum and the extreme granularity of what I suggested. You suggest per thread encryption and indexing, whereas I was originally suggesting per message encryption and indexing, and the other alternative would be per group indexing and encryption. Perhaps per thread will be the best compromise. It will certainly make organization easier, which will be one of the biggest hassles with per message indexing/encryption. It still has some issues we would need to think on though.
I think this is one of the areas that we still need to discuss, and I appreciate any feedback from anyone else reading this. Pretty much at this point I believe we are done with posting messages to the forum, and we are done with encryption of messages, and we are done with all of the ground work. At this point the primary thing to work on is receiving messages from the forum, which is very likely going to be done with Private Stream Searching, as One Way Indexing didn't turn out to be as cool as I hoped it would. Actually using PSS presents one problem, in that it is not resistant to censorship. But back to the main point:
Assuming we have a strong system for making posts and a strong system for receiving posts, how do we actually make a system for group communications on top of this? At one extreme we can have a totally public forum type system, but then we need to take care of at least several of the points I made above. At the other extreme we can have an extremely compartmentalized per-post indexed messaging system that users sort of work into a forum looking thing themselves. In the middle we have things like per-thread indexing. We still have time to figure out what will be best, because no matter what we decide on it can use the same fundamental infrastructure. Actually, as the design work on the fundamental infrastructure is nearly complete at this point (it pretty much is done unless we can find something with the properties of PSS that is also censorship resistant), the best way people who are not programmers can help is by thinking of how we want the actual forum/communication part of the system to work, as that is something that is less solidified in design at this point.
I'm sure my suggestion gets rid of some of the granularity in the invitation process in exchange for simplicity/less searches, but regardless of method, anyone with a full view to the thread can disclose the contents of the thread however they want anyway. And you should kick the first person to suggest DRM as an answer firmly in the nuts.
Yes, the system I originally suggested is extremely granular. In some ways that is a good thing, in other ways it is probably a bad thing. It is particularly bad when it comes to keeping a single perspective of a single forum, and organization will be a challenge to say the least. Keep in mind that single searches can include multiple keywords and return multiple documents. Indeed one of the huge advantages of PSS over PIR is that we can return all documents tagged with either "From Bob" or "From Carol", rather than the single document at position 321 in the PIR database (which requires a nymserver to make sure only messages to Alice are put at that position).
I'm having trouble imagining a scenario where I want a new thread participant to only see new messages going forward, or to not see some public messages because he doesn't know the author (except for his WoT/whitelist/etc settings, which are on him). And any control I can imagine to keep that from happening is trivially subvertible by any member of the thread with a full view. Or even with just "a better view" than the new participant, since they can send him the decrypted messages if all else fails.
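Just to show the search semantics I mean (and only the semantics), here is a plain, non-private sketch. Real PSS does this matching without the server learning the keywords; none of that is modeled here, and the `search` function and the sample documents are made up for illustration.

```python
def search(documents, keywords):
    """Return every document carrying at least one of the keywords.

    `documents` is a list of (tags, body) pairs. This is ordinary OR
    matching in the clear, not private stream searching.
    """
    wanted = set(keywords)
    return [body for tags, body in documents if wanted & set(tags)]


documents = [
    (["From Bob"], "hello from Bob"),
    (["From Carol"], "hello from Carol"),
    (["From Doug"], "hello from Doug"),
    (["From Bob", "From Carol"], "joint post"),
]

# One query, two keywords, three matching documents.
results = search(documents, ["From Bob", "From Carol"])
```

Contrast that with PIR, where one query pulls back the single item at one position.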
Yes that is a problem with the system I suggested as well, and another advantage I could add toward the less granular designs. In the less granular designs, it will be much more likely for new participants to see messages made in the past, whereas with the more granular design it will be much less likely but still possible. In the more granular design, it will be much less likely that an individual can see all posts in the thread even, but rather might only be able to see some subsection of them, although we would hope they can see everything people want them to see.
I think you're going to just have to rely on PoW as a limiting factor. You can't have any Freenet-style caching of content, because there's no concept of last-access-time (without massively diluting the blindness of server owners that's 99% of the point of PIR), so you just have to eat the oldest data first.
I think that POW is the only real solution as well, unfortunately. It will at least make it significantly harder for a single person to spam the shit out of the system. We could have it so users upload posts in popular threads though. The users know if a thread is popular.
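For anyone who hasn't seen it, a hashcash-style PoW is only a few lines. This is a generic sketch, not the scheme we have settled on: SHA-256, the 8-byte nonce encoding and the difficulty value are all assumptions picked for the example.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def _digest(payload: bytes, nonce: int) -> bytes:
    return hashlib.sha256(payload + nonce.to_bytes(8, "big")).digest()


def solve(payload: bytes, difficulty: int) -> int:
    """Poster's side: brute-force a nonce. Cost doubles per difficulty bit."""
    for nonce in count():
        if leading_zero_bits(_digest(payload, nonce)) >= difficulty:
            return nonce


def verify(payload: bytes, nonce: int, difficulty: int) -> bool:
    """Server's side: a single hash, so checking is cheap."""
    return leading_zero_bits(_digest(payload, nonce)) >= difficulty


payload = b"encrypted post blob"
nonce = solve(payload, difficulty=12)   # roughly 4096 hashes on average
accepted = verify(payload, nonce, difficulty=12)
```

The asymmetry is the whole trick: a spammer pays the solve cost for every message, while the server pays one hash per message to check it.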
Everything is a tradeoff, and in this case, storage is the Achilles' heel of being completely blind to content in any form. And that's the whole point of the PIR/EKS/etc system. It's worth that tradeoff. But as soon as somebody can give you 4TB worth of flood, your database has been effectively emptied. All messages have been lost, thanks for playing. Move along.
All messages on the server have been lost, but users keep content client side. It would be like if I have a complete mirror of SR, and then an attacker wipes the SR server. Well, I still have a copy of it!
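A toy model of the two points above, with made-up names and a capacity counted in items instead of bytes to keep it short: the server can only evict oldest-first, so a flood pushes out every real post, but a client that already fetched them still has its own copy.

```python
from collections import deque


class OldestFirstStore:
    """Hypothetical server store that evicts the oldest blob when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = deque()

    def put(self, blob: bytes) -> None:
        self.items.append(blob)
        # No last-access-time is available, so eviction is strictly by age.
        while len(self.items) > self.capacity:
            self.items.popleft()

    def contents(self) -> list:
        return list(self.items)


server = OldestFirstStore(capacity=3)
server.put(b"real post A")
server.put(b"real post B")

# The client mirrors what it has fetched so far.
client_copy = server.contents()

# An attacker floods the server with junk until the real posts are gone.
for i in range(3):
    server.put(b"junk %d" % i)

server_after_flood = server.contents()
```

After the flood the server holds nothing but junk, while `client_copy` still contains both real posts.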
Fantastic.. The more of the CPU load you can shift to the client, the better I like it. You get some PoW-like benefits, and your server scalability improves.
Indeed
Being honest here: no, yes, hell yes, no, and barely.
I'm not saying I have to be djb before I can help, I'm saying programming has never been my focus. Again, I'm starting to work on that, but I'm a long ways off. I've been duct taping things together for a few decades, and I'm reasonably comfortable reading C code up to a point, and will definitely look at anything you post on github, but I'm just setting proper expectations. I could probably design and implement something like Tails from scratch (great example of integrating and duct-taping other people's code not actually developing anything with an attack surface from scratch) without a technical problem. But if I wanted to write Bitmessage from scratch, that'd take me a year and a shitload of learning. And you'd laugh your ass off at it when I got done.
I'm happy to help however I can.
Something tells me you could write BitMessage in much less than a year. It is actually a really simple system. Keep in mind that a lot of programming stuff is indeed gluing other people's code together. I didn't write ECDH or AES, but I am making extensive use of both. Perhaps the best way you can help is by helping on the design of the forum component. Assume we have an anonymous system for making posts and an anonymous system for receiving posts by keyword, and anything can be encrypted strongly. How do we go from this to a group communication system? How do we go from the group communication system to a full-fledged marketplace? Largely, those are the remaining design questions.