Feel free to do some research:

http://freehaven.net/anonbib/cache/alpha-mixing:pet2006.pdf <--- done
http://www.abditum.com/pynchon/sassaman-wpes2005.pdf <--- not doing, talks about the idea of using PIR + protecting from intersection attacks
http://www.esat.kuleuven.ac.be/~cdiaz/papers/cdiaz_inetsec.pdf <--- discusses mixes in general
http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA456185 <--- need to do
http://research.microsoft.com/en-us/um/people/gdane/papers/sphinx-eprint.pdf <--- done
http://spar.isi.jhu.edu/~mgreen/ZerocoinOakland.pdf <--- talks about Zerocoin, the first distributed blind mix. Might integrate, others have coded an alpha version
http://www.cl.cam.ac.uk/~rja14/Papers/bear-lion.pdf <--- block cipher for Sphinx, done

That should get you started.

Why it is superior to Tor:

A. Tor is vulnerable to end to end correlation attacks

Client <-> Client ISP <-> Entry Node ISP <-> Entry Node <-> Entry Node ISP <-> Middle Node ISP <-> Middle Node <-> Middle Node ISP <-> Exit Node ISP <-> Exit Node <-> Exit Node ISP <-> Destination ISP <-> Destination

If the attacker has control over any of the members of {Client ISP, Entry Node ISP, Entry Node} AND any of the members of {Exit Node ISP, Exit Node, Destination ISP, Destination}, then Tor offers no anonymity to the client. Tor tries to protect from this primarily by having a very large network of nodes, which makes it hard to own both the client's entry node and exit node. As more nodes are added to the Tor network, it becomes increasingly difficult to do a fully internal attack, but it is still entirely possible. The big thing is that it is easy to monitor the destination site at its ISP, or even to seize the destination server and monitor from that point. Even hidden services have relatively crappy anonymity.

So the Tor security model is little more than this: if you have a good entry node you might not be fucked but still could be, and if you have a bad entry node you probably will be fucked but might not be.
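The condition above (one observation point on the entry side AND one on the exit side) can be written down as a tiny check. This is only a sketch of the rule as I stated it; the labels and the function name `circuit_is_linkable` are made up for illustration and are not part of Tor.

```python
# Hypothetical labels for the observation points named in the path above.
ENTRY_SIDE = {"Client ISP", "Entry Node ISP", "Entry Node"}
EXIT_SIDE = {"Exit Node ISP", "Exit Node", "Destination ISP", "Destination"}


def circuit_is_linkable(attacker_controls):
    """Return True if the attacker watches at least one point on each end.

    attacker_controls: iterable of point labels the attacker can observe.
    """
    controlled = set(attacker_controls)
    sees_entry_side = bool(controlled & ENTRY_SIDE)
    sees_exit_side = bool(controlled & EXIT_SIDE)
    return sees_entry_side and sees_exit_side


if __name__ == "__main__":
    # One point on each end: the client can be linked to the destination.
    print(circuit_is_linkable({"Client ISP", "Destination ISP"}))
    # Only the middle of the path: not enough for this attack.
    print(circuit_is_linkable({"Middle Node", "Middle Node ISP"}))
```

Note that the middle node never shows up in the check, which is the whole point: nothing in the middle of the circuit changes the outcome.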
Attackers like the NSA monitor traffic at the ISP level, quite intensely apparently, and they could very well be capable of watching your traffic at Client ISP and Destination ISP, which means they can link you to your destination. Attackers with a lot of nodes on the network still stand a significant chance of owning both Entry Node and Exit Node, especially if they have high bandwidth nodes. It is estimated that there are already groups of node owners who can deanonymize huge parts of the Tor network in little time; right now the hope is that they are not malicious.

Adding more nodes to the Tor network, from diverse groups of node operators, can continue to protect from internal attackers nearly indefinitely. As more nodes are added by good people, it becomes less likely for bad people to have nodes on your circuit, unless bad people also keep adding more nodes. And even if bad people keep adding nodes, it becomes less likely that any given bad person will have a node on your circuit.

Anyway, despite the ability to do a somewhat decent job at protecting from internal attackers, there is a hard limit to the protection Tor can provide from external attackers. There are only so many different networks making up the internet, and most of them exchange traffic through a much, much smaller number of exchange points. Monitoring a few hundred key points on the internet is enough to monitor most traffic on the internet. No matter how many Tor nodes are added to the network, it doesn't matter, because once all Client/Entry/Middle/Exit/Destination ISPs are being monitored by an attacker, they can break Tor in 100% of cases.
But they don't even need to monitor all of these ISPs, or even all of the IXs they exchange traffic through, in order to deanonymize a huge subset of Tor users. First, Tor entry guards rotate over time, so eventually you will use a bad entry or a good entry with a bad ISP. Second, if the attacker monitors your entry and your destination externally, they can still deanonymize YOU even if they cannot deanonymize the entire internet. In addition to this, keep in mind that a tremendous amount of traffic on the internet crosses the US, and most international traffic goes through multiple countries. The path I drew above isn't even high enough resolution. In reality there are tons of points between these nodes, and a compromise ANYWHERE from client to entry plus a compromise ANYWHERE from exit to destination is enough to deanonymize a user.

High latency mix networks are not nearly as vulnerable to this sort of attack. The goal of anonymity is to remove any variation between one user and their communications and another user and theirs. Any variation can be turned into an anonymity attack. Tor takes a few measures to add uniformity to communications.

First of all, due to the encryption of onion routing, all streams are, at the content level, 'the same' in that they are all indistinguishable from random noise at each hop. Without that encryption, if you send the message "Example" down the following path:

Client -> Node 1 -> Node 2 -> Node 3 -> Destination

an attacker who owns Node 1 and Node 3 has no trouble at all linking the message from Node 1 to Node 3, even though there is Node 2 in the middle. This is because the content of the message is exactly the same at both nodes, and it sticks out from other messages. If onion encryption is used, the message "Example" might look like "I9aiPS1" at the first node, "9!@jU9A" at the second node, and "AHZ12(a" at the third node.
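Here is a toy sketch of that layering, just to show the same message looking different at every hop. It is NOT Tor's actual cryptography: the keystream built from SHA-256, the hard-coded hop keys, and the function names are all assumptions for illustration, and a bare XOR stream like this would be unsafe in a real system.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key + counter (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one layer; XOR is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(message: bytes, hop_keys) -> bytes:
    """Client side: apply the last hop's layer first, the first hop's last."""
    for key in reversed(hop_keys):
        message = xor_layer(key, message)
    return message


def route(onion: bytes, hop_keys):
    """Return (what each hop receives, what comes out after the last hop)."""
    seen = []
    for key in hop_keys:
        seen.append(onion)              # bytes as this hop receives them
        onion = xor_layer(key, onion)   # hop peels its own layer
    return seen, onion


if __name__ == "__main__":
    keys = [b"node1-key", b"node2-key", b"node3-key"]  # hypothetical keys
    seen, delivered = route(wrap(b"Example", keys), keys)
    for hop, blob in enumerate(seen, start=1):
        print("node", hop, "sees", blob.hex())
    print("delivered:", delivered)
```

Each node receives a different-looking blob, so an attacker at node 1 and node 3 cannot match the message by its content alone.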
Now the attacker at nodes 1 and 3 cannot so easily link the message, because it looks completely different at node 3 versus node 1. It blends in with all the other messages node 3 is handling, in that all of them look like completely random noise, and there is an extraordinarily low probability that the same pattern of noise has ever been seen anywhere else.

Tor also pads all information into packets that are exactly the same size, 512 bytes. If not for this, we would have problems like this:

Client -> Node 1 -> Node 2 -> Node 3 -> Destination

sending the stream

[1byte][6bytes][3bytes][10bytes]

That is 4 packets with different size characteristics. In reality we could expect many more packets with vast size differences, if not for the padding of Tor. The attacker at Node 1 and Node 3 can easily link the stream, because they see a correlation in the packet sizes of the stream they are handling, and chances are the pattern is pretty damn unique at that. With Tor it is like this:

Client -> Node 1 -> Node 2 -> Node 3 -> Destination

sending the stream

[512bytes][512bytes][512bytes][512bytes]

Each packet has the same actual payload data as before, but now it is padded so that all packets are the same size. Now the attacker at nodes 1 and 3 cannot use this characteristic to link the streams, because all the packets are the same size, and all streams that all the nodes handle have packets of the same size. Variation has been removed and replaced with uniformity.

So Tor does well to do these two things. They are indeed improvements over many VPN technologies, and they have in fact made Tor harder to attack at a technical level (just look at fingerprinting of VPN traffic, which approaches 100% accuracy, versus fingerprinting of Tor traffic, which has taken a long time to get past 60% accuracy). But Tor still has many problems left over, and most of them are inherent to the design of Tor.
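The padding idea above, sketched in a few lines. The 512 byte size is the one from the text; the 2 byte length prefix and random filler are my own assumptions to make the example reversible, and this is not Tor's real cell layout.

```python
import os

CELL_SIZE = 512   # fixed packet size from the text
LEN_PREFIX = 2    # assumed: 2-byte length header so padding can be removed


def pad_cell(payload: bytes) -> bytes:
    """Pad one payload up to exactly CELL_SIZE bytes."""
    room = CELL_SIZE - LEN_PREFIX
    if len(payload) > room:
        raise ValueError("payload does not fit in one cell")
    header = len(payload).to_bytes(LEN_PREFIX, "big")
    filler = os.urandom(room - len(payload))
    return header + payload + filler


def unpad_cell(cell: bytes) -> bytes:
    """Recover the original payload from a padded cell."""
    if len(cell) != CELL_SIZE:
        raise ValueError("not a full cell")
    length = int.from_bytes(cell[:LEN_PREFIX], "big")
    return cell[LEN_PREFIX:LEN_PREFIX + length]


if __name__ == "__main__":
    # The 1, 6, 3 and 10 byte packets from the example stream.
    stream = [b"a", b"bbbbbb", b"ccc", b"dddddddddd"]
    cells = [pad_cell(p) for p in stream]
    print([len(p) for p in stream], "->", [len(c) for c in cells])
```

On the wire every cell is the same length, so the size pattern of the original stream is gone. In a real system the cell would also be encrypted so the length header is not visible to an observer.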
There is still the possibility of variation in interpacket timing. Imagine each . equals a small unit of time, like 1ms:

Client -> Node 1 -> Node 2 -> Node 3 -> Destination

[packet].......[packet]..[packet].......[packet].........[packet]...[packet]

When the client sends a stream of packets down Tor, they are sent as they are constructed, not at fixed time intervals. Even if they were sent at fixed time intervals, any of the nodes could delay individual packets as much as it wants before sending them forward. An attacker doesn't need to do this, but can still do it to speed up this type of correlation attack. Now we are right back to where we started: the attacker at nodes 1 and 3 can link the stream by analyzing interpacket arrival times and looking at the packet spacing. Tor does not protect from this attack at all, short of natural network jitter, which is not enough.

Mix networks protect from this attack because the nodes completely wipe interpacket timing characteristics between nodes.

Client -> Mix 1 -> Mix 2 -> Mix 3 -> Destination

If Mix 1 sends a message to Mix 2 with the following packet properties (though on a mix network the entire message can be one very large padded packet):

[packet].......[packet]..[packet].......[packet].........[packet]...[packet]

it doesn't matter, because Mix 2 waits until it has the entire message, and then it sends it forward. It doesn't send packets as it gets them. This removes all of the interpacket arrival timing that Mix 1 inserts into the message.

[packet].......[packet]..[packet].......[packet].........[packet]...[packet]

becomes

[packet][packet][packet][packet][packet][packet]

prior to Mix 2 sending it to Mix 3. The fingerprint is sanitized.

Another issue with Tor is that there is no protection at all from packet counting.

Client -> Node 1 -> Node 2 -> Node 3 -> Destination

[p1][p2][p3]

The client sends a total of 3 packets to the destination.
Now the attacker at nodes 1 and 3 can use this to try to link the stream to the destination back to the client. If node 1 processes a 3 packet stream, and node 3 shortly after processes one, then there is a high probability that the streams are related. Likewise, node 3 knows that a 5 packet stream it just processed is not the 3 packet stream that node 1 just processed. Tor very minimally protects from this by padding single packets to the same size, but it doesn't pad entire streams to the same number of packets. Mix networks protect from this because every message is just one really large packet, so all 'streams' are one packet.

These are just two of many examples of how higher latency mix networks are superior to Tor. I was going to make a list so I started with A, but after typing this all out I realize that A by itself should be enough to show you why mix networks are superior to Tor. I could go on to B, C, D, and E, but for right now I am tired of typing.

To summarize point A: Tor adds packet size invariance (by padding packets) and payload content invariance (by onion encryption) but leaves stream size variance and interpacket arrival time variance. Any message variance leads to correlation attacks! Good mix networks remove stream size variance by sending every message as a single fixed size packet. They remove timing variance by the same method, and also by delaying at each hop long enough for the mix to wipe interpacket timing and make it invariant between mixes (even assuming multiple packets are used). A good mix network will remove *ALL* message variance at *EACH* mix. So it doesn't matter if:

Client -> Bad Node -> Good Node -> Bad Node -> Destination

happens, because the good node in the middle makes the message totally uniform and removes any possible fingerprint that could be identified between the first and last bad node.
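The timing and packet count points above can be sketched like this. It is a simplified model, not a real mix: the fixed message size of 8 packets, the 1ms output spacing, and the function names are assumptions I picked for the example.

```python
FIXED_PACKETS = 8   # assumed: every mix message is padded to this many packets
SPACING_MS = 1      # assumed: packets leave the mix back to back, 1ms apart


def gaps(times_ms):
    """Interpacket gaps, the fingerprint used by the timing attack."""
    return [later - earlier for earlier, later in zip(times_ms, times_ms[1:])]


def relay_forward(arrivals_ms, latency_ms=5):
    """Low latency relay: every packet is forwarded as soon as it arrives."""
    return [t + latency_ms for t in arrivals_ms]


def mix_forward(arrivals_ms, flush_ms):
    """Mix: hold the whole message, then emit a fixed-shape burst."""
    if not arrivals_ms or len(arrivals_ms) > FIXED_PACKETS:
        raise ValueError("message must have 1..FIXED_PACKETS packets")
    if flush_ms < max(arrivals_ms):
        raise ValueError("cannot flush before the whole message has arrived")
    return [flush_ms + i * SPACING_MS for i in range(FIXED_PACKETS)]


if __name__ == "__main__":
    stream_a = [0, 7, 9, 16, 25, 28]   # the dotted timing pattern from above
    stream_b = [0, 1, 2]               # a different stream, only 3 packets
    print("relay keeps gaps:", gaps(relay_forward(stream_a)) == gaps(stream_a))
    print("mix output a:", mix_forward(stream_a, flush_ms=100))
    print("mix output b:", mix_forward(stream_b, flush_ms=100))
```

The relay passes the gap pattern straight through, while the two very different input streams come out of the mix with the same count and the same spacing. A real mix would also reorder and delay messages relative to each other, which this sketch leaves out.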
On a good mix network, every single message either has exactly the same characteristics, or essentially the same characteristics in that all messages are totally randomized at each hop. Anything that is not invariant between hops is randomized at each hop. This alone massively increases the anonymity of a good mix network over Tor, but many other things come into play as well.

I was not talking out my ass when I said the attacks on mix networks are almost totally different from the attacks on Tor. A good mix network fixes all the attacks on Tor that can be fixed, and should also take measures to protect from the more advanced attacks that are in the realm of mix networks. The mix network threat model goes well beyond the Tor threat model.

As for the problems of mix networks, let's start by saying that the threat model against mix networks assumes that the attacker can view *ALL* links between mix nodes. Keep in mind that the threat model for Tor assumes that an attacker who can see all links between all Tor nodes can 100% compromise Tor. So the attacker that mix networks assume is the attacker that can deanonymize Tor traffic in real time.

One of the problems of older mix networks (all the currently implemented remailer networks, in and of themselves anyway) is the long term statistical disclosure intersection attack. Assume that the attacker can see all links between all mix nodes. Alice communicates with Bob over a mix network that has 1,000 members. The attacker can see how many messages every client sends and receives, because he can see the links between all nodes. In the simplest example, the attacker is Alice. Alice sends 1,000 messages to Bob. Now Alice waits to see which nodes on the network receive 1,000 messages in the period of time she knows it will take for all of her messages to be delivered.
Any node that doesn't receive at least 1,000 messages can be ruled out as Bob. Over enough (little) time, Alice can identify Bob in this way. Note that, like nearly all anonymity attacks (and many security attacks), the culprit here is variance. In this case, the variance is in the number of messages received. Not all nodes receive the same number of messages, and Alice can use this to her advantage in order to find Bob. This attack also allows third parties to link Alice to Bob, even if Alice and Bob are both good and neither is the attacker, although that takes a little bit more math to explain.

Anyway, the first solution presented to this problem was to use PIR. Now Alice sends 1,000 messages to Bob's PIR server. She has no way to directly send 1,000 messages to Bob. Instead of the network pushing 1,000 messages to Bob (as happens in the old designs), Bob now pulls messages from a rendezvous node (via PIR to protect his anonymity). The key difference is that Bob will only download 50 messages every day, regardless of how many messages he has waiting for him. And every other member on the network will only download 50 messages per day as well. They can communicate with their PIR server over the mix network if they want to tell it to delete certain messages, or to prioritize certain messages over others, but no matter what they will only download 50 messages per day.

Now it doesn't matter if Alice spams Bob with 1,000 messages. In the best case Bob can just delete them and never end up downloading them at all. In the worst case Bob will download 50 of them a day for 20 days, and end up obtaining 1,000 messages over a 20 day period, the same as every other user of the network. This protects from long term statistical disclosure attacks from third parties (linking Alice to Bob) and from a malicious Alice trying to locate Bob. However, it doesn't prevent a malicious Bob from trying to locate a non-malicious Alice.
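To make the difference between push and pull concrete, here is a small simulation of what an observer of all links would count. It only models message counts: the PIR protocol itself is not implemented, and the dummy slot filler (so that a user with fewer than 50 waiting messages still downloads 50), the user names, and the helper functions are my assumptions.

```python
DAILY_QUOTA = 50      # from the text: every user downloads 50 messages per day
DUMMY = "<dummy>"     # assumed filler so every download has the same size


def push_counts(mailboxes):
    """Old design: the network pushes everything, so counts are visible."""
    return {user: len(msgs) for user, msgs in mailboxes.items()}


def suspects(counts, threshold):
    """Intersection step: keep only users who received at least threshold."""
    return {user for user, n in counts.items() if n >= threshold}


def fetch_day(mailbox, quota=DAILY_QUOTA):
    """Pull design: take up to quota real messages, fill the rest with dummies."""
    real = mailbox[:quota]
    del mailbox[:quota]
    return real + [DUMMY] * (quota - len(real))


def make_mailboxes():
    """Four ordinary users with a few messages, plus Bob flooded by Alice."""
    boxes = {f"user{i}": [f"mail{i}-{j}" for j in range(i)] for i in range(1, 5)}
    boxes["bob"] = [f"from-alice-{j}" for j in range(1000)]
    return boxes


if __name__ == "__main__":
    print("push suspects:", suspects(push_counts(make_mailboxes()), 1000))
    boxes = make_mailboxes()
    for day in range(1, 21):
        observed = {user: len(fetch_day(box)) for user, box in boxes.items()}
        if day in (1, 20):
            print("day", day, "observed downloads:", observed)
    print("left in Bob's mailbox:", len(boxes["bob"]))
```

In the push model the flood singles Bob out immediately. In the pull model every user's observable download count is the same on every day, and Bob still drains the 1,000 messages in 20 days.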
Let's say that Bob notices he has obtained 25 messages from Alice over a period of ten days. Now, remember that Bob can also see the entire network's links. So Bob can do the intersection attack himself, saying that Alice must be one of the nodes that sent at least 25 messages over this period of time. And again, over enough time, Bob can locate Alice with this attack.

The only way to protect from this attack is for each client to send an invariant (yes, once again, variance was the culprit) number of messages per period of time. This can be accomplished by sending random dummy messages when legitimate messages are not being sent. Essentially Alice has 50 messages she can send in a 24 hour period. Every so often either one of the legitimate messages in her outgoing queue is sent or, if she has no messages in her outgoing queue, a dummy message is sent in its place. Now in every 24 hour period Alice and every other node on the network sends 50 messages, and the variance is erased, preventing Bob from carrying out his attack.

Both of these techniques are modern and neither has been implemented in a real network yet. Usenet + mix network has been used, aka everybody gets everything PIR, to protect from a malicious Alice and a malicious third party carrying out this attack. However, everybody gets everything PIR does not scale. Currently all mix networks are weak to this type of attack in themselves; when everybody gets everything PIR is layered on via Usenet or shared mail archives you can protect from the first attack.

Pynchon Gate was the first whitepaper to describe a (scalable) way to prevent this attack from locating Bob or a third party linking Alice to Bob. I cannot name a specific paper that was the first to suggest using dummy traffic to prevent Bob from locating Alice with this attack, but probably one of the early papers on dummy messages mentions something about it.
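The constant rate sending described above can be sketched as a simple slot scheduler. The 50 slots per day come from the text; the `DUMMY` marker and the function name `send_day` are made up for the example, and in a real network a dummy would have to be encrypted and padded so it is indistinguishable from a real message on the wire.

```python
from collections import deque

SLOTS_PER_DAY = 50    # from the text: every client sends 50 messages per day
DUMMY = "<dummy>"     # assumed placeholder for a cover message


def send_day(outgoing: deque, slots: int = SLOTS_PER_DAY):
    """Fill every slot: a queued real message if there is one, else a dummy."""
    sent = []
    for _ in range(slots):
        sent.append(outgoing.popleft() if outgoing else DUMMY)
    return sent


if __name__ == "__main__":
    alice = deque(f"to-bob-{i}" for i in range(25))   # 25 real messages queued
    carol = deque()                                   # nothing to say today
    for name, queue in (("alice", alice), ("carol", carol)):
        day = send_day(queue)
        real = sum(1 for m in day if m != DUMMY)
        print(name, "sent", len(day), "messages,", real, "real")
```

An observer counting messages per client sees 50 from Alice and 50 from Carol, so Bob's "sent at least 25" filter no longer rules anyone out. Anything queued beyond 50 simply waits for the next day's slots.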