« on: December 22, 2012, 10:24 am »
Here's a good overview of known attacks on Tor
https://lists.torproject.org/pipermail/tor-dev/2012-September/003992.html
> > - "Traffic confirmation attack". If he can see/measure the traffic flow
> > between the user and the Tor network, and also the traffic flow between
> > the Tor network and the destination, he can realize that the two flows
> > correspond to the same circuit:
> > http://freehaven.net/anonbib/#SS03
> > http://freehaven.net/anonbib/#timing-fc2004
> > http://freehaven.net/anonbib/#danezis:pet2004
> > http://freehaven.net/anonbib/#ShWa-Timing06
> > http://freehaven.net/anonbib/#murdoch-pet2007
> > http://freehaven.net/anonbib/#ccs2008:wang
> > http://freehaven.net/anonbib/#active-pet2010
It depends on how you want to become more precise.
I think the #SS03 paper might have the simplest version of the attack
("count up the number of packets you see on each end"). The #timing-fc2004
paper introduces the notion of a sliding window of counts on each side.
The #murdoch-pet2007 one looks at how much statistical similarity you
can notice between the flows when you are only sampling a small fraction
of packets on each side.
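To make that concrete, here's a toy sketch (not taken from any of the papers) of the simplest "count packets in matching time windows and correlate" idea; the window size, timestamps, and everything else are made up for illustration:

```python
from collections import Counter

def window_counts(timestamps, window=1.0):
    """Bucket packet timestamps into fixed-size windows and count them."""
    counts = Counter(int(t // window) for t in timestamps)
    if not counts:
        return []
    hi = max(counts)
    return [counts.get(i, 0) for i in range(hi + 1)]

def correlation(a, b):
    """Plain Pearson correlation between two count vectors (toy version)."""
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (va * vb) if va and vb else 0.0

# If the two flows belong to the same circuit, their windowed counts track
# each other and the correlation comes out near 1.
client_side = [0.1, 0.2, 1.1, 1.3, 2.4, 2.5, 2.6]       # seen near Alice
exit_side   = [0.15, 0.25, 1.2, 1.35, 2.45, 2.55, 2.7]  # seen near the destination
print(correlation(window_counts(client_side), window_counts(exit_side)))
```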
> > - "Congestion attack". An adversary can send traffic through nodes or
> > links in the network, then try to detect whether the user's traffic
> > flow slows down:
> > http://freehaven.net/anonbib/#torta05
> > http://freehaven.net/anonbib/#torspinISC08
> > http://freehaven.net/anonbib/#congestion-longpaths
Section 2 and the first part of Section 3 in #congestion-longpaths is
probably your best bet here. It actually provides a pretty good overview
of related work, including the passive correlation attacks above.
If by 'more precise' you mean you want to know exactly what the threat
model is for this attack, I'm afraid it varies by paper. In #torta05
they assume the adversary runs the website, and when the target user starts
to fetch a large file, they congest (DoS) relays one at a time until they
see the download slow down.
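Roughly, the #torta05 loop looks like the sketch below. observe_rate and load_relay are hypothetical stand-ins for the adversary's measurement and clogging machinery (not real tools), and the 20% threshold is arbitrary:

```python
import statistics

def relay_is_on_alices_path(relay, observe_rate, load_relay, samples=10):
    """Compare Alice's download rate with and without extra load on `relay`.

    observe_rate() -> Alice's current download rate as seen by the
    adversary's web server; load_relay(relay, enabled) toggles the
    clogging traffic. Both are hypothetical hooks for illustration.
    """
    baseline = [observe_rate() for _ in range(samples)]
    load_relay(relay, enabled=True)
    loaded = [observe_rate() for _ in range(samples)]
    load_relay(relay, enabled=False)
    # A relay that is actually on Alice's circuit should show a clear
    # slowdown under load; 20% is an arbitrary illustrative threshold.
    return statistics.mean(loaded) < 0.8 * statistics.mean(baseline)
```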
In #congestion-longpaths they assume the adversary runs the exit relay
as well, so they know the middle relay, and the only question is which
relay is the guard (first) relay.
In #torspinISC08 on the other hand, they preemptively try to DoS the
whole network except the malicious relays, so the target user will end
up using malicious relays for her circuit.
> > - "Latency or throughput fingerprinting". While congestion attacks
> > by themselves typically just learn what relays the user picked (but
> > don't break anonymity as defined above), they can be combined with
> > other attacks:
> > http://freehaven.net/anonbib/#tissec-latency-leak
> > http://freehaven.net/anonbib/#ccs2011-stealthy
> > http://freehaven.net/anonbib/#tcp-tor-pets12
These are three separate attacks.
In #tissec-latency-leak, they assume the above congestion attacks work
great to identify Alice's path, and then the attacker builds a parallel
circuit over the same relays, measures the latency from themselves to the
(adversary-controlled) website that Alice went to, and then subtracts
their own measurements from Alice's total to estimate the latency between
Alice and the first hop.
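The subtraction step itself is simple arithmetic; here it is with made-up numbers (nothing below comes from the paper's actual measurements):

```python
# Made-up numbers to illustrate the subtraction; the paper's real
# measurement technique is more involved.
alice_total_rtt = 850.0        # ms: Alice -> guard -> middle -> exit -> website and back
attacker_circuit_rtt = 820.0   # ms: attacker's parallel circuit over the same relays
attacker_to_guard_rtt = 30.0   # ms: attacker's own direct RTT to the guard
# The shared portion (guard -> ... -> website and back) is roughly the
# attacker's circuit RTT minus the attacker's own hop to the guard.
shared_path_rtt = attacker_circuit_rtt - attacker_to_guard_rtt   # ~790 ms
alice_to_guard_estimate = alice_total_rtt - shared_path_rtt      # ~60 ms
print(alice_to_guard_estimate)   # constrains where Alice can plausibly be
```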
#ccs2011-stealthy actually proposes a variety of variations on these
attacks. They show that if Alice uses two streams on the same circuit,
the two websites she visits can use throughput fingerprinting to
realize they're the same circuit. They also show that by looking at
the throughput Alice gets from her circuit, you can rule out a lot of
relays that wouldn't have been able to provide that throughput at that
time. And finally, they show that if you build test circuits through
the network and then compare the throughput your test circuit gets with
the throughput Alice gets, you can guess whether your circuit shares a
bottleneck relay with Alice's circuit. Where "show" should probably be
in quotes, since it probably works sometimes and not other times, and
nobody has explored how robust the attack is.
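The "rule out relays" observation is the easiest one to picture. A toy version, with invented capacities (real measurements are far noisier than this):

```python
# If Alice's circuit is seen pulling 400 KB/s, any relay that could only
# sustain less than that at the same time can't be on her path. The relay
# names and capacities here are invented for illustration.
observed_alice_throughput = 400  # KB/s seen by the malicious website
measured_relay_capacity = {      # attacker's own probes at the same time
    "relayA": 150, "relayB": 900, "relayC": 300, "relayD": 2000,
}
candidates = [r for r, cap in measured_relay_capacity.items()
              if cap >= observed_alice_throughput]
print(candidates)  # ['relayB', 'relayD'] -- the others are ruled out
```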
#tcp-tor-pets12 has the adversary watching Alice's local network, and
wanting to know whether she visited a certain website. The adversary
exploits vulnerabilities in TCP's window design to spoof RST packets
between every exit relay and the website in question. If they do it
right, the connection between the exit relay and the website cuts its
TCP congestion window in response, leading to a drop in throughput on
the flow between the Tor network and Alice. In theory. It also works
in the lab, sometimes.
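On the detection side, the adversary is essentially asking whether dips in Alice's local Tor throughput line up with the moments he disturbed the exit-to-website connection. A toy version of that check, with invented samples and an invented dip test:

```python
def dips_match(disturbance_times, alice_samples, window=2.0, factor=0.5):
    """alice_samples: list of (timestamp, throughput) seen on Alice's link.

    Returns the fraction of disturbances that coincide with a throughput
    dip on Alice's side. Window and factor are arbitrary illustrations.
    """
    rates = [r for _, r in alice_samples]
    typical = sorted(rates)[len(rates) // 2]          # median throughput
    hits = 0
    for d in disturbance_times:
        nearby = [r for t, r in alice_samples if abs(t - d) <= window]
        if nearby and min(nearby) < factor * typical:
            hits += 1
    return hits / len(disturbance_times)

samples = [(t, 200) for t in range(30)]
samples[10] = (10, 60); samples[20] = (20, 55)        # dips after disturbances
print(dips_match([10.0, 20.0], samples))              # 1.0 -> consistent with "she visited it"
```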
I also left out
http://freehaven.net/anonbib/date.html#esorics10-bandwidth
which uses a novel remote bandwidth estimation algorithm to try to
estimate whether various physical Internet links have less bandwidth when
Alice is fetching her file. In theory this lets them walk back towards
Alice, one traceroute-style hop at a time. In practice they need an
Internet routing map (these are notoriously messy, as the Decoy Routing
people are also discovering), and also Alice's flows have to be
quite high throughput for a long time.
> > - "Website fingerprinting". If the adversary can watch the user's
> > connection into the Tor network, and also has a database of traces of
> > what the user looks like while visiting each of a variety of pages,
> > and the user's destination page is in the database, then in some cases
> > the attacker can guess the page she's going to:
> > http://freehaven.net/anonbib/#hintz02
> > http://freehaven.net/anonbib/#TrafHTTP
> > http://freehaven.net/anonbib/#pet05-bissias
> > http://freehaven.net/anonbib/#Liberatore:2006
> > http://freehaven.net/anonbib/#ccsw09-fingerprinting
> > http://freehaven.net/anonbib/#wpes11-panchenko
> > http://freehaven.net/anonbib/#oakland2012-peekaboo
#oakland2012-peekaboo aims to be a survey paper for the topic, so it's
probably the right one to look at first.
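For intuition, the basic setup behind all of these papers looks roughly like the toy below: build a profile for each page from training traces, then label a new trace with the closest profile. Real attacks use much richer features and classifiers; the histogram-plus-nearest-neighbor here is only an illustration:

```python
from collections import Counter

def profile(trace):
    """trace: list of (direction, packet_length) pairs -> normalized histogram."""
    counts = Counter(trace)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def distance(p, q):
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0) - q.get(k, 0)) for k in keys)

def classify(trace, database):
    """database: {page_name: training_trace}; returns the closest page."""
    target = profile(trace)
    return min(database, key=lambda page: distance(target, profile(database[page])))

db = {
    "page-a": [("out", 600), ("in", 1500), ("in", 1500), ("in", 400)],
    "page-b": [("out", 600), ("in", 800), ("out", 600), ("in", 800)],
}
print(classify([("out", 600), ("in", 1500), ("in", 1500), ("in", 450)], db))  # likely "page-a"
```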
> > - "Correlating bridge availability with client activity."
> > http://freehaven.net/anonbib/#wpes09-bridge-attack
If you run a bridge or relay and also use it as a client, the fact that the
adversary can route traffic through you lets him learn about your
client activity. Section 1.1 summarizes:
2. A bridge always accepts connections when its operator is using
Tor. Because of this, an attacker can compile a list of times when
a given operator was either possibly or certainly not using Tor, by
repeatedly attempting to connect to the bridge. This list can be used to
eliminate bridge operators as candidates for the originator of a series
of connections exiting Tor. We demonstrate empirically that typically,
a small set of linkable connections is sufficient to eliminate all but
a few bridges as likely originators.
3. Traffic to and from clients connected to a bridge interferes with
traffic to and from a bridge operator. We demonstrate empirically that
this makes it possible to test via a circuit-clogging attack [17, 15]
which of a small number of bridge operators is connecting to a malicious
server over Tor. Combined with the previous two observations, this
means that any bridge operator that connects several times, via Tor,
to a web-site that can link users across visits could be identified by
the site's operator.
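Point 2 above is just an intersection attack. A toy version of the elimination step, with invented probe data:

```python
def surviving_bridges(connection_times, downtime):
    """downtime: {bridge: set of hours the bridge was observed offline}.

    Any bridge that was unreachable when one of the linked connections
    was made can't belong to the operator who made them.
    """
    return [b for b, offline_hours in downtime.items()
            if not any(t in offline_hours for t in connection_times)]

downtime = {
    "bridge1": {3, 4, 5},        # offline during hours 3-5
    "bridge2": {20, 21},
    "bridge3": set(),            # never observed offline
}
linked_connection_hours = [4, 21, 9]
print(surviving_bridges(linked_connection_hours, downtime))  # ['bridge3']
```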
> > I tried to keep this list of "excepts" as small as possible so it's not
> > overwhelming, but I think the odds are very high that if the ratpac comes
> > up with other issues, I'll be able to point to papers on anonbib that
> > discuss these issues too. For example, these two papers are interesting:
> > http://freehaven.net/anonbib/#ccs07-doa
Traditionally, we calculate the risk that Alice's circuit is controlled
by the adversary as the chance that she chooses a bad first hop and a bad
last hop. They're assumed to be independent. But if an adversary's relay
is chosen anywhere in the circuit yet he *doesn't* have both the first
and last hop, he should tear down the circuit, forcing Alice to make a
new one and roll the dice again. Longer path lengths (once thought to
make the circuit safer) *increase* vulnerability to this attack, since
every extra hop is another chance for the adversary to end up on the
circuit and force Alice to retry.
I think the guard node design helps here, but whether that's true is an
area of active research.
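A back-of-the-envelope simulation shows the effect. It assumes a fixed fraction of bad relays chosen uniformly at random (which is not how Tor actually picks paths, and ignores guards), and that the adversary tears down any circuit he is only partly on:

```python
import random

def compromised_after_retries(frac_bad=0.1, path_len=3, trials=100_000):
    wins = 0
    for _ in range(trials):
        while True:
            path = [random.random() < frac_bad for _ in range(path_len)]
            if path[0] and path[-1]:
                wins += 1          # adversary sees entry and exit: game over
                break
            if not any(path):
                break              # clean circuit survives; adversary gives up
            # partially compromised: adversary tears it down, Alice retries
    return wins / trials

for plen in (3, 5, 8):
    print(plen, round(compromised_after_retries(path_len=plen), 3))
# Without the teardown trick the entry+exit probability stays at ~1%;
# with it, the longer the path, the more often Alice ends up compromised.
```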
> > http://freehaven.net/anonbib/#bauer:wpes2007
If you lie about your bandwidth, you can get more traffic than you
"should" get based on bandwidth investment. In theory we've solved this by
doing active bandwidth measurement:
https://blog.torproject.org/blog/torflow-node-capacity-integrity-and-reliability-measurements-hotpets
but in practice it's not fully solved:
https://trac.torproject.org/projects/tor/ticket/2286
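The incentive is easy to see if clients weight relay selection by self-reported bandwidth (numbers invented; today's Tor uses measured consensus weights instead):

```python
# A relay that advertises 10x its real capacity gets picked 10x as often.
claimed_bandwidth = {"honest1": 100, "honest2": 100, "liar": 1000}  # KB/s as advertised
total = sum(claimed_bandwidth.values())
for relay, bw in claimed_bandwidth.items():
    print(relay, f"{bw / total:.0%} of circuits")   # liar: ~83%, despite a real 100 KB/s
```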
-----
All that being said, as far as anyone knows, no Tor user has ever been identified through a direct attack on the Tor network itself. There are lots of ways to give up your identity, but if you behave safely, Tor won't betray you.