High-latency networks (like mixnets, and Freenet to some degree) don't provide instant gratification. You request a resource (a message, a copy of a static website, whatever), that request gets bounced around for a while, and the content works its way back to you. Maybe in 30 seconds, maybe in a minute, maybe in a day. But the key here is that it's not establishing a *connection* between two points (you and your server/destination) and grabbing the content. It's you, sending a message and waiting for a response. Email is high-latency. Usenet is high-latency. Downloading a video and then watching it is high-latency.
How long traffic is mixed for is less important than how much other traffic it mixes with. If a message is delayed for two weeks but nobody else sends traffic through that node, it might as well have been mixed for two seconds. If your traffic is mixed for two seconds but ten thousand other people sent traffic over that node in that window, it is just as good as mixing for two weeks with ten thousand other people sending traffic over the node. So the more heavily used a mixnet is, the faster it can go while still providing the same level of anonymity. Another technique is called alpha mixing, where the messages themselves carry user-defined time delays per hop. Older mixnets used various other strategies where the delay was decided by the individual mix nodes and the user was not in control of their own latency. Alpha mixing lets some users lower their latency a little as well, provided not everybody does: people routing high-latency traffic over the mixnet give an anonymity benefit to everybody using it, including the people routing lower-latency traffic over it. This isn't to say that you can use it with no time delays and be fine, but there are techniques to shave some latency away while still keeping anonymity intact.
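To make the alpha mixing idea concrete, here is a minimal sketch of a timed mix where every message arrives with its own sender-chosen hold count. The names `AlphaMix`, `accept`, and `tick` are made up for illustration and the round-based timing is an assumption; it is not any deployed mixnet's code.

```python
import random
from dataclasses import dataclass, field


@dataclass
class AlphaMix:
    """Toy timed alpha mix: each message carries its own hold count (alpha)."""
    # SystemRandom by default; a seeded random.Random can be passed for testing.
    rng: random.Random = field(default_factory=random.SystemRandom)
    pool: list = field(default_factory=list)  # entries are [rounds_left, payload]

    def accept(self, payload, alpha):
        # alpha is chosen by the sender, not by the mix node.
        if alpha < 0:
            raise ValueError("alpha must be >= 0")
        self.pool.append([alpha, payload])

    def tick(self):
        """Run one round: release messages whose alpha reached 0, age the rest."""
        ready = [payload for alpha, payload in self.pool if alpha == 0]
        self.pool = [[alpha - 1, payload] for alpha, payload in self.pool if alpha > 0]
        self.rng.shuffle(ready)  # output order does not depend on arrival order
        return ready
```

A message sent late with a small alpha can leave in the same round as messages that have been waiting for several rounds, which is how the patient senders end up giving cover to the impatient ones.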
When you're talking about an adversary the size of the NSA, they're always going to be able to see enough network traffic to follow *connections* around the Internet. High-latency mix networks and the like separate the connection from the actual conversation. This lets them chop the conversation up into pieces, toss it in with pieces of other people's conversations, and keep an adversary from saying "Aha! That specific request is from X! He's sending a message to Y!"
The issue is that, because the NSA can watch most of the links between nodes on any given network, they can indeed follow the packets around. Low-latency networks do not mix traffic; they get packets and forward them on as they come. If the attacker watches Alice send a packet to a server and then Bob send a packet to the same server, he knows that the first packet out of the server belongs to Alice and the second packet out belongs to Bob. So he can follow their traffic around easily. In a mix network, Alice and Bob send their packets to the server as before, but the mix holds them long enough that it can randomize their output order before sending them on. Now the first packet out has a 50% chance of belonging to Alice and a 50% chance of belonging to Bob. Throw in some randomly generated dummy packets between mixes, and it becomes even harder for the attacker to tell what is going on.
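The Alice and Bob example can be sketched in a few lines. This is a toy model, assuming a mix that simply waits for the whole batch and shuffles it; `fifo_relay`, `threshold_mix`, and `first_out_is_alice` are illustrative names, not part of any real relay software.

```python
import random


def fifo_relay(arrivals):
    """Low-latency relay: forward packets in the order they arrived."""
    return list(arrivals)


def threshold_mix(arrivals, rng=None):
    """Hold the whole batch, then release it in a random order."""
    if rng is None:
        rng = random.SystemRandom()
    batch = list(arrivals)
    rng.shuffle(batch)
    return batch


def first_out_is_alice(trials=10_000, seed=1):
    """Estimate how often the first packet out of the mix is Alice's."""
    rng = random.Random(seed)  # seeded only so the estimate is repeatable
    hits = sum(
        threshold_mix(["alice", "bob"], rng)[0] == "alice" for _ in range(trials)
    )
    return hits / trials
```

With the FIFO relay an observer can match output order to input order every time. With the mix, the estimate from `first_out_is_alice()` should sit close to one half, which is the 50/50 guess described above.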
No technology is going to keep the NSA from seeing that *your IP* is making connections and participating in an anonymity network. Nor is it going to keep someone that size from seeing what servers you're connecting to, or seeing who else is connecting to the same server and possibly correlating that traffic.
There are covert channel technologies that could make it harder for them to tell, something sort of like bridges on steroids could be made, but it is going to be really hard to protect from a global external attacker anyway. There are systems for Alice and Bob to talk with an extremely low probability of their act of communication being identified, even by an attacker who watches the links of the entire internet. But I think these techniques would work better for a spy to funnel information back to his home country than they would work for bridges to an anonymity network.
But it can provide massively better protection against those agencies identifying what you're doing. In some perfect, "everybody uses a perfect high-latency anonymity technology" world, the NSA can still see that you're using anonymity technologies. Just not what you're doing. Maybe you're sending a message. Or participating in a group chat. Or downloading a catalog of products from a vendor and sending a message to transmit an order.
Pretty much.
However, we're not just one missing piece of technology away from that happening this week. kmfkewm's project to build a PIR/etc system is fantastic. But even if he finishes it tomorrow, and it's ready for production, and he stands one up on the Internet, what you have is a single PIR server, sitting on the Internet, able to securely route messages from Client A to Client B, with nobody else able to see what they're doing. But they'll still be able to see all the clients connecting to the server. They can't know what they're doing, but they'll know they're using that server.
A single mix is only good to protect from external attackers anyway. If the mix is bad it can link communicating parties. But a single good mix on a message's path can buy it significant anonymity. With CPIR (which PSS seems to be a type of) it doesn't matter if the server is bad. It is essentially cryptographic anonymity: nobody can tell which messages you download unless they can solve a hard math problem. Also, people would connect to the server via Tor anyway, so nobody who cannot break Tor can tell they are connecting to the server. But hopefully a network of volunteer nodes springs up pretty quickly.
One single perfect PIR server doesn't fix the problem. It's a key part of the equation, but it's nowhere near the whole equation. Sure, it's cryptographically awesome, but from a practical anonymity perspective, if there's just one single server sitting on the Internet, doing amazing crypto stuff, it's really not that much better than a hidden webserver just routing PGP messages between users. If someone seizes that server, and backdoors it, they're going to be able to see that Client A sent something encrypted to Client B, and because of their NSA-level view of the world, they probably will know who Client A & B are. Not what they said, but everything else about their conversation.
Well, one of the reasons it is better is that if a server routes GPG messages between users, it knows who is talking to who, and unless the users connect to it with Tor it can easily tell which IP address belongs to which person. With a single PIR server, the server cannot tell who communicates with who and it cannot link messages to IP addresses. No, even if somebody seizes the server they cannot tell anything. PSS assumes a malicious server the entire time. Unless they can solve a hard math problem, having the server buys them next to nothing. On the other hand, if they seize a single mix it is game over, because the owner of a mix can follow traffic through their own mix. A mix network needs at least two nodes operated by different individuals to protect from internal attackers, although a single mix can protect from external attackers like the NSA, unless they take the mix over.
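The reason a seized PIR server learns nothing is easiest to see in the trivial, bandwidth-hungry form of PIR, where the client downloads everything and picks out its own message locally. The sketch below shows only that trivial form. It is not CPIR or PSS, which aim for the same property with far less bandwidth by using cryptography, and `MirrorServer` and `private_fetch` are hypothetical names.

```python
class MirrorServer:
    """Untrusted server: stores opaque blobs and logs every request it sees."""

    def __init__(self, blobs):
        self.blobs = dict(blobs)  # slot id -> ciphertext (opaque bytes)
        self.log = []             # everything the server is able to observe

    def fetch_all(self):
        self.log.append("fetch_all")
        return dict(self.blobs)


def private_fetch(server, wanted_slot):
    """Download everything, then select locally; wanted_slot never leaves the client."""
    everything = server.fetch_all()
    return everything.get(wanted_slot)
```

Whichever slot the client wants, the server's log contains the same single entry, so taking over the server after the fact reveals nothing about who was reading what.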
A hundred isolated PIR servers, acting as little individual islands of communication, still basically have the same problem. They have to be able to communicate with each other, forming a meshed network of mixing and content delivery that actually decentralizes the network.
Certainly they need to form a mesh network. A hundred isolated PIR servers wouldn't work very well.
kmfkewm, I'd love to know where you see those technologies evolving, and how you see the world working after the actual implementation of PIR/etc technologies. Is every user node routing traffic for others? Or are they connecting to central servers? Do PIR storage servers talk to each other, or are they islands?
I see things evolving past the point of browser-based applications in many ways, and toward custom security-oriented software packages for specific goals. The anonymity of a mix network is actually hurt if it is too big, pretty much the opposite of Tor. From a traffic analysis perspective, the theoretically ideal mix network would consist of one node (or two nodes if you want protection from internal attackers), but in practice it needs more nodes to ensure the two people running nodes don't turn to the dark side etc. The more concentrated traffic is over the mix nodes the better, and the more mix nodes there are, the less concentrated traffic over them is. If all users are mix nodes, traffic won't be mixing with much other traffic at any given hop.
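A rough back-of-the-envelope version of that concentration argument, assuming messages spread evenly over the nodes and counting a single hop. The user count is a made-up figure and `messages_per_node` is an illustrative helper.

```python
def messages_per_node(total_messages, mix_nodes):
    """Average number of messages sharing one node in a single mixing round."""
    if mix_nodes <= 0:
        raise ValueError("need at least one mix node")
    return total_messages / mix_nodes


users = 10_000  # hypothetical: each user sends one message per round

print(messages_per_node(users, 50))     # 50 dedicated mix nodes -> 200.0
print(messages_per_node(users, users))  # every user is a mix node -> 1.0
```

Under those assumptions a message passing through one of 50 nodes hides among about 200 others per round, while in the everybody-is-a-node layout it has on average nothing to mix with.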
I envision a mesh network of maybe 50 or so nodes, each node being a mix and a PIR server, with messages being distributed through the servers with everybody-gets-everything PIR or something. The shittiest part of this system is the fact that all PIR servers need to have the same database, which means they need to share all messages they get with each other, similar to how BitMessage shares all messages with all users of the system. It would be much nicer if we could have messages segmented and spread across the network to different nodes, instead of a single database mirrored over each node. Doing it this way is kind of crappy: for one, it wastes probably hundreds of terabytes of storage space dedicated to the same mirror; for two, it opens up the risk of DDoS attacks, since sending a packet to a single node echoes it to all nodes. It is also hard to keep good content that is accessed a lot, because the protocol itself prevents anybody from knowing what is being accessed and what is not. So this is not ideal, but I cannot think of a better system that doesn't introduce traffic analysis vulnerabilities. Certainly we cannot have different messages tied to different PIR servers, or else Alice could cause Bob to access the various servers in a pattern that she can then identify. Bob could have a single server associated with his pseudonym where all messages to him are sent, but then his anonymity set size immediately falls to the users using that node, and what happens when that node is taken down? If it is malicious it doesn't matter because of PIR, but if it is taken down he needs to go to a new server, and this will introduce traffic analysis vulnerabilities as well. So I cannot think of a way to do it other than a mirrored database over all the servers. But then what to do about the risk of DDoS? Plus it is a fucking shame to waste so many terabytes of space mirroring the same thing over and over again.
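To put rough numbers on the mirroring cost, here is a small sketch. It assumes the node that receives a message relays it directly to every other node, and the 4 TB database size is a hypothetical figure picked only to show the scale; `mirror_cost` is an illustrative helper.

```python
def mirror_cost(nodes, db_terabytes):
    """Storage and per-message fan-out for a database mirrored on every node."""
    total_storage_tb = nodes * db_terabytes  # every node keeps a full copy
    forwards_per_message = nodes - 1         # receiving node relays to all others
    return total_storage_tb, forwards_per_message


# 50 nodes, hypothetical 4 TB database
storage_tb, fan_out = mirror_cost(50, 4)
print(storage_tb, fan_out)  # 200 49
```

So one junk packet sent to one node turns into 49 more packets inside the network, which is the DDoS amplification worry, and the 50 mirrors together hold 200 TB to store 4 TB of distinct data.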
The biggest win from mirroring the database instead of having it on a single server is that the bandwidth load of clients downloading messages will be distributed, but in reality a single CPIR server is no more insecure than 100 CPIR servers. It doesn't matter if your CPIR server is compromised or not.
If you can think of a better way to manage inter-CPIR server communications etc please let me know.
And everyone who said "We need an easy way to do that" is hitting the nail perfectly on the head. Even if the world's best anonymity network is built, if only two people are using it, then it's still not anonymous. You need more widespread adoption to get the true benefit from mix technologies. Basically, I have to have enough "other" traffic to mix my traffic with before I can hide in that traffic.
Yeah, one of the hardest issues will be bootstrapping an anonymity set to start with. I will probably suggest people run it sending dummy traffic, but not use it for anything, until it has at least a thousand members.
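One common way to do dummy traffic is to have every client send exactly one fixed-size packet per tick whether or not it has anything to say. The sketch below shows that idea only. The packet size and the `CoverTrafficClient` name are assumptions, and a real client would also encrypt both kinds of packet so they look alike on the wire, which is left out here.

```python
import os
from collections import deque

PACKET_SIZE = 512  # hypothetical fixed packet size in bytes


class CoverTrafficClient:
    """Emit exactly one fixed-size packet per tick, real or dummy."""

    def __init__(self):
        self.outbox = deque()

    def queue(self, payload: bytes):
        if len(payload) > PACKET_SIZE:
            raise ValueError("payload too large for one packet")
        # Pad to the fixed size; encryption (omitted) would hide the padding.
        self.outbox.append(payload.ljust(PACKET_SIZE, b"\0"))

    def next_packet(self) -> bytes:
        # Real message if one is waiting, otherwise random filler of the same size.
        if self.outbox:
            return self.outbox.popleft()
        return os.urandom(PACKET_SIZE)
```

Calling `next_packet()` on a timer gives the same packet count and size whether the user is idle or active, so early members who only run the client still add to the set everyone else hides in.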