Silk Road forums
Discussion => Security => Topic started by: astor on August 31, 2013, 06:25 pm
-
Figured some people would be interested in this.
http://dedis.cs.yale.edu/2010/anon/
The Dissent project is a research collaboration between Yale University and UT Austin to create a powerful, practical anonymous group communication system offering strong, provable security guarantees with reasonable efficiency. Dissent's technical approach differs in two fundamental ways from the traditional relay-based approaches used by systems such as Tor:
Dissent builds on dining cryptographers and verifiable shuffle algorithms to offer provable anonymity guarantees, even in the face of traffic analysis attacks of the kinds likely to be feasible for authoritarian governments and their state-controlled ISPs, for example.
Dissent seeks to offer accountable anonymity, giving users strong guarantees of anonymity while also protecting online groups or forums from anonymous abuse such as spam, Sybil attacks, and sockpuppetry. Unlike other systems, Dissent can guarantee that each user of an online forum gets exactly one bandwidth share, one vote, or one pseudonym, which other users can block in the event of misbehavior.
Dissent offers an anonymous communication substrate intended primarily for applications built on a broadcast communication model: for example, bulletin boards, wikis, auctions, or voting. Users of an online group obtain cryptographic guarantees of sender and receiver anonymity, message integrity, disruption resistance, proportionality, and location hiding.
See our CCS '10, OSDI '12, and USENIX Security '13 papers describing the experimental protocols underlying Dissent. Also feel free to check out the source code at the link below, but please keep in mind that it is an experimental prototype that is not yet ready for widespread deployment by normal users.
-
Interesting! The future of anonymous networking, perhaps? Thanks for posting, astor ;D
-
Dissent can guarantee that each user of an online forum gets exactly one bandwidth share, one vote, or one pseudonym, which other users can block in the event of misbehavior.
so a platform called dissent is designed to squash dissent?
-
so a platform called dissent is designed to squash dissent?
Exactly! But, were you expecting any different from the american nazis?
-
so a platform called dissent is designed to squash dissent?
Exactly! But, were you expecting any different from the american nazis?
not sure why anyone would call them 'american nazis' when they are more like 'american bolsheviks'
funny tricks those Zionists do eh?
-
ill be your 600th positive astor :D
-
not sure why anyone would call them 'american nazis' when they are more like 'american bolsheviks'
Because the US is a textbook example of fascism? Nationalism, militarism and big business running the show = fascism.
-
how the fuck can they guarantee everyone one bandwidth share without knowing who you are or where you're coming from?
-
Dissent can guarantee that each user of an online forum gets exactly one bandwidth share, one vote, or one pseudonym, which other users can block in the event of misbehavior.
so a platform called dissent is designed to squash dissent?
It "can" be used that way. Makes it sound like an optional feature, for people who want to prevent spam on their forum, for example. I guess each client is assigned a unique identifier, but I don't know how it works to preserve anonymity.
-
Dissent can guarantee that each user of an online forum gets exactly one bandwidth share, one vote, or one pseudonym, which other users can block in the event of misbehavior.
If you want a world where people can communicate and discuss issues anonymously on a large scale, mechanisms for creating anonymous accountability are an absolute necessity.
And guaranteeing that each user gets one share/vote/pseudonym would be a fantastic technical achievement. It seems like it would be censorship, but it's actually the complete opposite. I'll explain why I think that in a second.
Think about Silk Road Forums. Right now, it's a nice little community built on some forum software. If you want to join and start sharing your opinion, you're welcome. But you need to post fifty times in the Newbie section. That acts as a very low-level "proof of work" to keep the spammers from tearing the place up; beyond that, a simple CAPTCHA is the only actual technical control protecting SRF.
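For comparison, here is a toy hashcash-style proof of work in Python (purely illustrative, not anything SRF actually runs; the names are mine). The poster grinds a nonce; the forum verifies it with a single hash:

import hashlib

def solve_pow(challenge: bytes, difficulty_bits: int = 20) -> int:
    # Grind nonces until SHA-256(challenge || nonce) has `difficulty_bits`
    # leading zero bits. Expected cost doubles with each extra bit.
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

def verify_pow(challenge: bytes, nonce: int, difficulty_bits: int = 20) -> bool:
    # Verification is one hash, no matter how expensive solving was.
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

The catch, as I lay out below, is that any work cheap enough for a legitimate user is cheap enough for a motivated attacker to buy in bulk.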
With a non-anonymous (i.e. clearnet) forum, you manage spammers with IP-based bans and very active moderation. With an anonymous (all you know about the people you're dealing with is who they SAY they are) forum, you can't ban them effectively.
Let me lay out a fairly simple, relatively low-cost attack against any message-based anonymous community. You flood it. Epicly flood it. Millions and millions of new messages a day. If there's a CAPTCHA acting as a "proof of work", you write a quick API to outsource solving those. Either host a porn site whose visitors, while thinking they're filling out your CAPTCHA, are actually filling out the target's. Or just pay people in shitty countries nine bucks an hour to solve them.
There's a point where almost every known message-based anonymous community just shrivels up and collapses under this scenario. It happened to Frost, the message board system on Freenet.
It's not about "restricting" someone to one voice, it's actually about restricting assholes to less than fifty million voices. A community gets to define what "one voice" is. Maybe it's legitimately a thousand posts a day. But not a million a day.
Haven't read the paper; probably will shortly. But I'm excited to see that people are working on the problem. For as much effort as people put into building anonymous transport mechanisms, there are bigger, Layer 8 problems that prevent widespread adoption of anonymous communications. And accountability is a big one.
-
yeah but with true freedom there comes a price, we should all know that. the price of true freedom on a message board is spam. srf does alright, we get a bit of spam but its not so bad for the value that we gain. at least i know i have some form of security. if there was something in the code somewhere that was identifying me to just one account then it would always be on my mind that it could be exploited to reveal my real identity or ip address. i think i'm ok with the spam thanks. its a bit annoying but its not that annoying.
-
yeah but with true freedom there comes a price, we should all know that. the price of true freedom on a message board is spam. srf does alright, we get a bit of spam but its not so bad for the value that we gain.
Yeah, mainly because of the newbie forum and a lot of work by the mods. Think about how many man-hours are wasted on that bullshit, when it could be automated, as an optional feature built into the anonymity protocol.
at least i know i have some form of security. if there was something in the code somewhere that was identifying me to just one account then it would always be on my mind that it could be exploited to reveal my real identity or ip address. i think i'm ok with the spam thanks. its a bit annoying but its not that annoying.
They claim to offer "accountable anonymity, giving users strong guarantees of anonymity while also protecting online groups or forums from anonymous abuse such as spam, Sybil attacks, and sockpuppetry."
Again, not sure how they did it, haven't read the papers yet, but they address your concern.
-
yeah but with true freedom there comes a price, we should all know that. the price of true freedom on a message board is spam. srf does alright, we get a bit of spam but its not so bad for the value that we gain.
I agree. But we're talking about methods to build systems that inherently withstand threats that SRF could not.
SRF works as an anonymous forum not because it withstands the sorts of attacks that I'm talking about trying to mitigate, but because it doesn't face them. Sure, there are spammers. There are people who probably wish it would go away. But they're not heavily motivated or skilled enough, or they're resource-constrained. Remove those restrictions, and SRF couldn't exist. Tor's network + hidden services + an HTTP server + Simple Machines forum code don't give DPR and company the tools necessary to withstand attacks on that scale.
For a clearnet example, look at DoS attacks against Mastercard or PayPal. They have a very viable economic motivation for withstanding attacks, significant resources to counter those attacks, and every first-world government (and LE agency) on their side. They have technical advantages on their side that existing Tor/i2p/etc infrastructure does not (the ability to cause incoming traffic to be filtered before it gets to them, etc.). And they still get knocked the fuck offline every once in a while. They stop those attacks (assuming they don't just have to wait them out) using a combination of effective upstream filtering and LE kicking down doors. Neither is a viable protection method for anonymous communities.
at least i know i have some form of security. if there was something in the code somewhere that was identifying me to just one account then it would always be on my mind that it could be exploited to reveal my real identity or ip address.
Again, I agree. But any mechanism providing effective anonymous accountability is going to allow you to choose to either "be" an identity, or shed that skin when you need to. We have that here through simple, manual methods. You either choose to log into SRF as "isallmememe" (by entering your password to SRF and proving you're the anonymous individual who is accountable for isallmememe's actions) or you don't. You're making the choice to assume that specific anonymous identity. If you just want to read the stupid shit that I'm posting, you could do that without assuming that isallmememe identity. You get to decide, and that's the way it should work in any system providing anonymity.
It's when someone wants to post a few million messages to the SRF Security forum in the next ten minutes (making it unusable for all of us), or wants to post kiddie porn images to a message board, that we need the ability to hold them accountable for those actions. Because regardless of anarchist/libertarian beliefs, at some point, you have to at least have the option to shun someone from contributing to your community, or it's not your community. Or more accurately, the other members of the community won't be able to find the core community within the steaming pile of rubble.
-
coming out of a university they're gonna have to convince me hard that the govt don't have some secret backdoor into this thing. until then i'm remaining skeptical.
-
yeah but with true freedom there comes a price, we should all know that. the price of true freedom on a message board is spam. srf does alright, we get a bit of spam but its not so bad for the value that we gain.
Yeah, mainly because of the newbie forum and a lot of work by the mods. Think about how many man-hours are wasted on that bullshit, when it could be automated, as an optional feature built into the anonymity protocol.
as i said freedom comes at a price.
-
@ECC_ROT13 - again its the price of freedom. whether that's millions of men and women dying in a war, a bunch of kids being shot up in a school, or having to occasionally see and hear words and images we don't really want to, that's what freedom is. its not free to have freedom, it costs us once in a while.
being able to live without those things is the exact opposite of freedom. why can't everyone just accept that. its when you try to offer an alternative without those things that we end up with tyranny. only a form of tyranny and control can offer that reality.
this board doesn't get spammed up that much because it isn't cost effective to spam it up. by the time you've accumulated 50 posts you just go and get your account banned within the hour. that isn't really worth the little bit of money you can make from it.
-
this board doesn't get spammed up that much because it isn't cost effective to spam it up.
This board exists as an anonymous community only because, right now, there's a gap: the available technologies (Tor, hidden services, and Bitcoin, supporting SR, which supports this board) have outpaced the capabilities of the people who don't want it to exist.
At the point where that balance changes, this board will cease to exist. Flood attacks, LE kicking down doors and shutting down servers for SR's activities, you name it.
What research into PIR, EKS, accountable anonymity, and all the other stuff (annoyingly technical bullshit to most people here) buys us all is a chance to still have an anonymous conversation after that shift in balance occurs.
-
any platform that doesnt allow for child pornography, terrorist organizations, drug dealing and general bullshittery is not any platform i want to be a part of.
-
"accountable anonymity, giving users strong guarantees of anonymity while also protecting online groups or forums from anonymous abuse such as spam, Sybil attacks, and SOCKPUPPETRY"
SOCKPUPPETRY
Do I need to spell out what that means?
-
any platform that doesnt allow for child pornography, terrorist organizations, drug dealing and general bullshittery is not any platform i want to be a part of.
SR doesn't allow child porn, so bye!
-
not sure why anyone would call them 'american nazis' when they are more like 'american bolsheviks'
Because the US is a textbook example of fascism? Nationalism, militarism and big business running the show = fascism.
it does have its own special blend though... seems like they have perfected some neo version of a few of these all rolled into their NWO
I still think India is the black swan in all this.. which way they go will be the domino (go along with the BRICS once they really try for that secondary economy, or bow to the West)
Corporate America is so reliant on China & India it seems only Russia is an outsider and TPTB seems to be fine moving all their chips into China & India while a lot of America goes to hell
-
Fascists and commies are no doubt close relatives so I understand why you'd call the yanks bolsheviks. I would do it too if I wanted to upset the typical retards who babble about the US being a free market.
Sure, american fascism may have some particular features of its own, but it still is fascism.
As to india, I don't know. I haven't been paying much attention to it (my fault obviously). But my uninformed opinion is that china is more important than india in the grand scheme of things.
-
So, back to this "dissent" thing which is indeed a system to squash dissent.
They make it clear that they are not going to allow sockpuppetry - that is, after censoring all the people who challenge their nazi politics they'll make sure that those people are completely unable to post anything anymore.
Right! "freedom is slavery"
-
Subbed ;)
Peace & Hugs,
Chem
O0
-
As to india, I don't know. I haven't been paying much attention to it (my fault obviously). But my uninformed opinion is that china is more important than india in the grand scheme of things.
India just nuked its currency by 25% recently which makes it even more attractive for outsourcing corp america jobs.. but what if China finally unpegs their yuan and it competes with the dollar? Indians might start chasing yuan.. and also manufacturing gets more expensive for America... things can get really ugly.. WWIII can happen
I have to admit I need to read more on Dissent project instead of making a turn on this thread ;)
-
Can this be run on top of Tor? Should be possible, I suppose.
On OVDB there were plans to create some kind of p2p system similar to Dissent, which would run on top of Tor. It would basically be a hidden decentralized communication infrastructure which can't get DDoS'd or taken down like hidden services. Does anyone know what happened to these plans? Is it still getting developed?
that is, after censoring all the people who challenge their nazi politics they'll make sure that those people are completely unable to post anything anymore.
It says the users will be able to block other users. It doesn't say that some dictator group decides which users get blocked.
-
It says the users will be able to block other users.
Right. People with dissenting opinions will be blocked by the mob.
It doesn't say that some dictator group decides which users get blocked.
The mob does. Also, in the name of fighting "spam" and "sockpuppetry" user accounts will somehow be linked to real individuals in a way that can't be circumvented.
-
People with dissenting opinions will be blocked by the mob.
People with dissenting opinions are *always* blocked by the mob.
It's just a matter of how dissenting the opinion has to be before the mob turns on the author. Some asshole shows up here and spends all day posting guides to molesting children, and you'll see this mob block his dissenting opinion. They'll delete his posts, nuke his account. Then we can go back to talking about how ANY censorship is a breach of our fundamental human rights, and anyone who would ban someone over a dissenting opinion is a piece of shit.
It's just how the world works. All I'm saying is that better technical mechanisms for doing that (preferably with a scalpel instead of a chainsaw) are helpful. And I'm not saying Dissent is it. I'm saying I'm glad somebody is coming up with new ideas to address the problem. I'm still reading through the Dissent stuff. And I'm still not understanding how they propose to restrict or not restrict initial group membership, and that's somewhat important in their scheme. Any good method should allow some sockpuppetry, but should make it expensive. If it takes you twelve hours of active work to build a sockpuppet, you'll use it more carefully.
-
any platform that doesnt allow for child pornography, terrorist organizations, drug dealing and general bullshittery is not any platform i want to be a part of.
SR doesn't allow child porn, so bye!
wow so u dont like free speech, what a tool. FYI SR isnt a platform its a site, but u would know that being a computer expert pshfmpt!!
-
i think you'll find less people will block each other than you think. on totse the only time anyone would block another user was through personal conflict, but it was always seen as a pretty faggy thing to do to block another user. which it is really. just because you read somebody's view that you don't agree with, no matter how out there it is, doesn't mean you have to accept what that user says. cool people accept that others have opinions that differ from their own. so just read it, disagree and move on. simple.
-
I don't like the way Dissent goes about kicking users or whatever. I prefer whitelisting by default, meaning only people you whitelist can send you messages that you actually retrieve. There is no reason for a group of people to decide who gets to talk, it is up to you to decide who you should listen to.
-
If it is based on DC-nets it is probably indeed provably anonymous to the set size, but the set size will be small. Also they will really need to be able to boot misbehaving spammers, because only one person at a time can send a message on a DC-net.
edit: though you can have redundant DC-nets using the same infrastructure to get around that. It still isn't considered a very scalable solution.
I should also point out that a DC-net actually isn't totally immune to traffic analysis over time. If three people get together and want to anonymously answer the question "did one of us pay for the meal we are sharing", then yeah, if it is implemented correctly it is indeed perfect anonymity in that the question can be answered without the possibility of anybody knowing who paid for the meal. But once you get to the point of anonymously communicating pseudonyms in a large network that changes over time, although you cannot tell who sent a bit as a certain pseudonym from a single snapshot of the network, over time intersection attacks are still possible with node churn. So a DC-net is only perfect anonymity to the set size if the set size starts at X members and never loses a single member or gains a single member. Theoretically they can indefinitely be perfectly anonymous to the maximum degree possible (to the entire set size) but in the real world the anonymity they provide doesn't stay perfect for long.
I suppose it is almost but not quite like how a one time pad is perfectly secure encryption... until you send the key encrypted with RSA. Or send only two messages, of different sizes, and forget to pad them ;). A DC-net is perfectly anonymous communication, until people leave the network, or new people join the network, or the NSA cuts the internet to your country and waits to see if your pseudonym keeps communicating on the network.
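To make the meal example concrete, here is a toy one-bit, three-diner round in Python. This is just a sketch of the classic construction, not Dissent's code, and the names are made up:

import secrets

def dc_net_round(payer=None):
    # coins[i] is the secret coin shared by diners i and (i + 1) % 3.
    coins = [secrets.randbits(1) for _ in range(3)]
    announcements = []
    for i in range(3):
        # Each diner announces the XOR of their two shared coins,
        # flipping the bit if they are the one who paid.
        bit = coins[i] ^ coins[(i - 1) % 3]
        if payer == i:
            bit ^= 1
        announcements.append(bit)
    # XORing all announcements cancels every coin (each appears twice),
    # leaving only the anonymous message bit.
    result = 0
    for a in announcements:
        result ^= a
    return result

assert dc_net_round(payer=1) == 1  # "someone paid", but not who
assert dc_net_round() == 0         # nobody paid

Nothing in the transcript reveals which diner flipped their bit, which is exactly the perfect-anonymity property, and also exactly why one disruptor flipping bits at random can jam every round.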
-
wow so u dont like free speech, what a tool.
No, I don't like free speech, and I've made my opinion on this subject abundantly clear before. Ironically, the long thread where I debated this with Roky Erickson about 8 months ago was deleted. Free speech is a bullshit idea that mostly Americans are brainwashed to believe in. It sounds nice in theory, but in practice it leads to 99% noise. Furthermore, nobody actually believes in free speech. Even on this forum, which is touted as a free speech playground, spam, scams, doxing, and cp stories are banned. Find a truly unfettered speech platform and I will show you 99% bullshit and noise.
You learn and grow through the judicious consumption of information, not by consuming every piece of input you encounter. Your own brain works that way. You filter out the vast majority of information hitting your senses. You would be overwhelmed if you tried to take it all in. Schizophrenia is a condition where they lose that filter, and it is sometimes described as "watching 500 TV channels at once". There's nothing enlightening about that. As ECC alluded to above, censorship isn't about blocking dissenting views. It's about filtering out someone shouting nonsense a million times into the void. It's about filtering out the 99% noise that invariably pervades, consumes, and destroys any truly "free speech" platform.
Every healthy community and discourse REQUIRES moderation.
-
BTW, this forum would be a useless shithole if it didn't ban the spams, scams, and doxing.
Find an abandoned forum, filled with thousands of spam messages. That's free speech. That's what no censorship or moderation gives you.
If you hate censorship so much, you can hang out there. Have fun on your "free speech" platform.
-
Wow. This is getting stupider and stupider.
Or should I rather say that little nazis like Astor are showing their true colors?
By the way, it's funny how SR is touted as some kind of libertarian enterprise, but the kind of people it attracts are hardly libertarians at all.
edit : Oh, and it looks like some security experts never heard of Usenet. Unmoderated.
-
coming out of a university they're gonna have to convince me hard that the govt don't have some secret backdoor into this thing. until then i'm remaining skeptical.
Lol dude if you don't trust security software coming out of university you better pack up your bags and go home, or over to I2P at least ;). Even Freenet has ties to the academic community.
-
Wow. This is getting stupider and stupider.
Or should I rather say that little nazis like Astor are showing their true colors?
By the way, it's funny how SR is touted as some kind of libertarian enterprise, but the kind of people it attracts are hardly libertarians at all.
Right, so if I posted your personal info on this forum, you wouldn't want it deleted?
If a spammer posted 100 spam posts in a row like he used to do, to where the first real thread was 5 pages deep, you wouldn't want it deleted?
You are in favor of censorship too. I'm just not a hypocrite about it.
We are qualitatively the same, we only differ in matters of degree, how judiciously we would apply that censorship.
And calling me a Nazi for saying that just proves you are the noise.
-
this board doesn't get spammed up that much because it isn't cost effective to spam it up.
This board exists as an anonymous community only because, right now, there's a gap: the available technologies (Tor, hidden services, and Bitcoin, supporting SR, which supports this board) have outpaced the capabilities of the people who don't want it to exist.
At the point where that balance changes, this board will cease to exist. Flood attacks, LE kicking down doors and shutting down servers for SR's activities, you name it.
What research into PIR, EKS, accountable anonymity, and all the other stuff (annoyingly technical bullshit to most people here) buys us all is a chance to still have an anonymous conversation after that shift in balance occurs.
Indeed, it is time for the new generation of security software. No doubt about it. The paradigm of Tor and I2P style networks is over, and we need to move fast too I am afraid. The software we are using now can be replaced by so much better designs that it is kind of scary really. Even GPG is using, what, RSA and CAST5 by default? Time for some ECDH and AES-256 up in this bitch. The future of anonymity is going to be based on mixes and systems that are "PIR-like", I very strongly believe this. We need to march forward and we need to do it fast, because I don't think the old model is going to last much longer. And indeed I think it is already near its end. CP has been the canary in the coal mine for Tor for a long time. Right now there is no CP on Tor; I think that speaks volumes.
However I dislike the idea of revocable anonymity etc. I like the idea of whitelist by design a lot better. As I said before, it shouldn't be about some group picking who can talk but rather about individuals picking who they want to listen to.
any platform that doesnt allow for child pornography, terrorist organizations, drug dealing and general bullshittery is not any platform i want to be a part of.
Any platform that tries to restrict information is something I want nothing to do with.
wow so u dont like free speech, what a tool.
No, I don't like free speech, and I've made my opinion on this subject abundantly clear before. Ironically, the long thread where I debated this with Roky Erickson about 8 months ago was deleted. Free speech is a bullshit idea that mostly Americans are brainwashed to believe in. It sounds nice in theory, but in practice it leads to 99% noise. Furthermore, nobody actually believes in free speech. Even on this forum, which is touted as a free speech playground, spam, scams, doxing, and cp stories are banned. Find a truly unfettered speech platform and I will show you 99% bullshit and noise.
You learn and grow through the judicious consumption of information, not by consuming every piece of input you encounter. Your own brain works that way. You filter out the vast majority of information hitting your senses. You would be overwhelmed if you tried to take it all in. Schizophrenia is a condition where they lose that filter, and it is sometimes described as "watching 500 TV channels at once". There's nothing enlightening about that. As ECC alluded to above, censorship isn't about blocking dissenting views. It's about filtering out someone shouting nonsense a million times into the void. It's about filtering out the 99% noise that invariably pervades, consumes, and destroys any truly "free speech" platform.
Every healthy community and discourse REQUIRES moderation.
Can't help but disagree with this. Free Speech is good. Free Speech means anybody can say whatever they want, it doesn't mean that you are forced to listen. CP stories being banned is hilarious, but at least DPR lets us have free speech regarding CP which I highly appreciate. This forum is quite libertarian and free speech seems to be pretty accepted here. But the problem is you are thinking in an old paradigm. See, we don't need a group of moderators to tell people what they can say, we need to give people the power to only listen to those they want to while effortlessly ignoring the others. See, I don't care if a forum is full of spam, so long as I can press a button and not have to see any of it. I want people to be allowed to spam, I don't want to censor them. But I don't want them to censor me either, by making it so I cannot navigate to what I want through their spam. The answer is not to tell them what they can or cannot do but rather to empower the user so that the user can get to what they want without running into things they don't want. You confuse free speech with some other concept I think.
Censorship = telling others what they can say
Empowerment = helping others only listen to who they want to
Censorship is *always* bad.
BTW, this forum would be a useless shithole if it didn't ban the spams, scams, and doxing.
Find an abandoned forum, filled with thousands of spam messages. That's free speech. That's what no censorship or moderation gives you.
If you hate censorship so much, you can hang out there. Have fun on your "free speech" platform.
Censorship is one way to try to solve the problem of spam and noise. Another solution is to let people select to only listen to things they want to, while ignoring the noise. I personally am in favor of empowering users rather than censoring them.
-
If a spammer posted 100 spam posts in a row like he used to do, to where the first real thread was 5 pages deep, you wouldn't want it deleted?
Yep. The security expert is clueless about Usenet: full of spam, and yet useful for discussion.
And calling me a Nazi for saying that just proves you are the noise.
Every healthy community and discourse REQUIRES moderation.
OK, and now for the kicker :
Every healthy community and discourse REQUIRES BANNING DRUGS.
Spoken like a true nazi eh.
-
For the most part, centralized moderation has been used to control floods, opinions/content that dissent too greatly from the accepted norm within the community, etc. Forums, including this place, operate on that basic concept.
Longterm, a core problem with centralized moderation in anonymous communities is that to centralize moderation, you have to centralize a level of control over the community at one point. And when that central point disappears, so does the community. With somewhere like SRF, it's obvious. Remove the forum software, Tor, and the moderators, and the SRF community is removed.
I think the advantage of PIR systems, and other decentralized architectures, is that the individual decides what to view and what to censor. Managing that successfully is difficult. One problem is the encrypted transfer mechanism, and a working PIR/etc would address that.
The other problem is managing trust. Freenet's Web of Trust is a good example of a step in the right direction. I decide who I trust enough to listen to their messages, and by extending that trust to who *they* choose to trust to a controlled degree, I end up with a filtered view of all of the messages that's relatively free from spam and content that I don't want, and *I* am the censor for what I view. That's the way it should be. They are anonymous, but they are accountable.. to me. If they create fifty sockpuppets, what do I care? I'll choose not to trust their sockpuppets.
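A crude sketch of that inheritance in Python (one possible rule among many; this is not Freenet's actual algorithm, and the names are mine). My own rating always wins; otherwise I average the opinions of the people I directly trust, out to a limited depth:

def effective_trust(me, target, ratings, depth=2):
    # ratings maps (rater, ratee) -> score in [-10, +10]
    if (me, target) in ratings:
        return ratings[(me, target)]  # a direct rating always wins
    if depth == 0:
        return 0                      # too far removed: neutral
    friends = [f for (rater, f), score in ratings.items()
               if rater == me and score > 0]
    opinions = [effective_trust(f, target, ratings, depth - 1) for f in friends]
    opinions = [o for o in opinions if o != 0]
    return sum(opinions) / len(opinions) if opinions else 0

ratings = {("me", "alice"): 8, ("alice", "spammer"): -10}
assert effective_trust("me", "spammer", ratings) == -10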
But for true resistance to attacks, my whitelist has to control which messages I actually retrieve.
Usenet is a great example to throw into the mix. I'd argue that effectively, Usenet actually has central administration, in the form of the admins over the NNTP servers, who choose how to handle control messages. Anyone can spoof a cancel message to remove a dissenting opinion. Anyone can flood a group with tons of messages. Then people can spoof cancel messages for the spam. Or they can add the sender to killfiles.. but in that case, they're still downloading the messages, then their client is filtering them. I can tell you firsthand that that doesn't work well when a group gets flooded beyond a certain level and you have a finite amount of bandwidth.
Also, most of the technical controls that you can enforce on Usenet ultimately boil down to your ability to leverage the source address of the originating sender. Throw in an anonymous transport mechanism, and they become largely worthless.
Freenet FMS, from my limited experience with it, appears to be Usenet with a Web of Trust dropped on top, all transported over Freenet. And clients only retrieve messages from authors they trust to some degree. That's probably the closest combination available today. Want more messages? Extend that trust. Oops, that one friend of that guy's is posting CP. Fair enough. Lesson learned. Block that guy.
I've never taken a serious look at Freenet, but I think I need to rectify that. I always threw it in the "just for downloading warez and movies and shit" category, but that's unfair.
-
Afaik with Usenet it is essentially everybody-gets-everything PIR, in that everybody gets every message for the group or whatever. I am not sure though; I never used Usenet, but have read about it in the context of remailer networks. I think the original design for a PIR based anonymity system was actually intended to be an improvement on Usenet. In the old model it was remailer to Usenet, then download the entire Usenet set of messages and filter for the ones you want. Pynchon Gate brought down the bandwidth requirements by orders of magnitude by replacing Usenet + everybody-gets-everything with a simple distributed PIR that was not everybody-gets-everything.
I agree the only person that should censor what you see is yourself. And you should be able to do so easily and in a way that is scalable (i.e. not downloading all messages and then ignoring them, but never downloading messages you want to ignore in the first place). All of these things can be done with PIR-like systems (I am more interested in PSS and OWI but really all of these fancy new things are like extensions of PIR, other than EKS I suppose, which is like PSS without the PIR-like aspect). But really it isn't even a requirement for it to be PIR-like, it just needs to be pull instead of push. The same thing helps for anonymity as well: the main reason Pynchon Gate is superior to Single Use Reply Blocks (SURBs) is because with a SURB based remailer an attacker can push messages to you, but with Pynchon Gate PIR the clients pull messages of their own choice. This prevents an attacker from spamming you with a billion messages and then watching to see which node gets a billion messages pushed to it, since with Pynchon Gate PIR, if the attacker spams you with a billion messages you just don't pull any of them.
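To show what pull-with-privacy looks like, here is a toy two-server XOR PIR in Python. It is a sketch of the general idea only, not Pynchon Gate's actual construction, and it assumes the two servers never collude and that all records are the same length:

import secrets

def pir_query(db_size, index):
    # Two random-looking bit vectors that differ only at the wanted index.
    mask_a = [secrets.randbits(1) for _ in range(db_size)]
    mask_b = list(mask_a)
    mask_b[index] ^= 1
    return mask_a, mask_b

def pir_answer(db, mask):
    # Each server XORs together every record its mask selects; neither
    # mask alone reveals which record the client wants.
    out = bytes(len(db[0]))
    for record, bit in zip(db, mask):
        if bit:
            out = bytes(x ^ y for x, y in zip(out, record))
    return out

db = [b"msg0", b"msg1", b"msg2", b"msg3"]
qa, qb = pir_query(len(db), 2)
# XORing the two answers cancels everything except the wanted record.
got = bytes(x ^ y for x, y in zip(pir_answer(db, qa), pir_answer(db, qb)))
assert got == b"msg2"

And note the pull property: if an attacker floods my mailbox with a billion messages, nothing forces my client to ever select any of them.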
I think I made a pretty good basic design for a whitelisting messaging system; it is inherently whitelisting in that, by the very nature of the system, users cannot communicate with each other until they have both accepted to whitelist each other. Invitations to whitelist can be carried out through shared contacts acting as rendezvous points, after an initial out-of-band social bootstrapping to get a few social contacts using the system.
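The core invariant is tiny. In Python it is something like this (a sketch of the rule only, not the actual project code; the names are invented):

class WhitelistStore:
    # Nothing is delivered unless BOTH sides have approved each other.
    def __init__(self):
        self.approved = {}  # owner -> set of approved contacts

    def approve(self, owner, contact):
        self.approved.setdefault(owner, set()).add(contact)

    def can_message(self, sender, recipient):
        return (sender in self.approved.get(recipient, set())
                and recipient in self.approved.get(sender, set()))

store = WhitelistStore()
store.approve("alice", "bob")
assert not store.can_message("bob", "alice")  # one-sided: no delivery
store.approve("bob", "alice")
assert store.can_message("bob", "alice")      # mutual: delivery allowed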
I think Freenet is the most interesting and innovative of the anonymity networks personally.
-
I agree the only person that should censor what you see is yourself.
Censorship is not antithetical to libertarianism or any other form of minarchism or anarchism, only to one brand of brainwashed American Libertarianism. I never once mentioned the government. Censorship can and should be applied by the person who owns the property. If I ran a forum, I would moderate it, and it would be a far healthier place than a forum run by anyone here who advocates no censorship at all (even though in practice there's all kinds of shit they would censor). Here it's spam, scams and doxing, but they let trolls ruin the place to some extent.
BTW, Tor relays are allowed to set their own exit policies. That's censorship! If anyone here hates censorship that much, they should get off Tor now, because it is a platform that allows censorship. In fact, you can set ExitPolicy reject *:*, so you don't have to allow anyone to access clearnet sites from your relay. The relay is your personal property, and I and the Tor developers advocate the right to do what you want with it, and not to subjugate yourself and your property to whatever other people want to shout on it.
Right now we are experiencing the fantastic merits of no censorship, because the Tor network is being crippled by a botnet that we can't stop. That is 2 million voices shouting without moderation. Fuck that noise. We must censor that botnet to keep the Tor network alive.
-
Nobody is allowed to shout nonsense on my lawn. I have every right to censor them by kicking them off my property. Censorship is compatible with anti-statism, because it is subordinate to property rights. There are also perfectly legitimate laws that censor speech, particularly when it causes direct harm, such as yelling fire in a crowded theater and causing a stampede that causes physical injury to people, or libeling someone and destroying their life through false accusations. Yes, there are many good reasons to censor, and the free speech absolutists have a childish understanding of free speech and censorship.
-
OK, and now for the kicker :
Every healthy community and discourse REQUIRES BANNING DRUGS.
Spoken like a true nazi eh.
This once again has nothing to do with the government. Any business owner with half a brain would ban drug use on the job, while you can do whatever the fuck you want on your property.
In my last few posts, I have made several arguments and used specific examples to back them up, and you've addressed none of that. Your only response was to repeat mindless mantras that you've been brainwashed with ("derp, anyone against free speech is Nazi!"). Nothing you've said has been edifying or advanced the conversation, which is why I say you are the noise.
-
OK, and now for the kicker :
Every healthy community and discourse REQUIRES BANNING DRUGS.
Spoken like a true nazi eh.
This once again has nothing to do with the government. Any business owner with half a brain would ban drug use on the job, while you can do whatever the fuck you want on your property.
In my last few posts, I have made several arguments and used specific examples to back them up, and you've addressed none of that. Your only response was to repeat mindless mantras that you've been brainwashed with ("derp, anyone against free speech is Nazi!"). Nothing you've said has been edifying or advanced the conversation, which is why I say you are the noise.
How's this: spamming is not speech and is a form of censorship, blocking all other communications to proffer your own.
Calling a forum "your property" just because you run it is like the government saying that it can do what it wants with your property because it owns it. Haha it does do that, sucks doesnt it?
Trolls offer dissenting opinions that reflect the underlying problem in the issue, hence why people get pissed about it. This is still a form of contribution whether or not you see any value in it. Editing and removing discussion because it doesnt fit in with YOUR view of how things should be done is classic fascism, everything must be done towards a specific end and everything else must be suppressed. Read a history book and tell us how thats worked out for various countries over the years.
Or do what every other fascist wannabe fuck does and create a forum for people who think just like you so u all circle jerk eachother and feel supreme.
-
How's this: spamming is not speech and is a form of censorship, blocking all other communications to proffer your own.
Spamming, like all advertising, is a form of speech. These things are not mutually exclusive. When you shout at the top of your lungs, other people can't hear me, so your speech censors my speech. Your own argument proves that in order to have useful speech you have to censor that shitty speech that would otherwise drown out everything else, which is what I call noise.
Calling a forum "your property" just because you run it is like the government saying that it can do what it wants with your property because it owns it.
You own the forum because you legitimately paid for the server that runs it, and yes you can do whatever you want with it. I am not your slave. I don't have to let you do anything with my property.
But if your argument is true, that's cool. I'll be crashing on your couch tonight. I mean, you can't just kick me off and do whatever you want with it just because you "own" it.
Trolls offer dissenting opinions that reflect the underlying problem in the issue, hence why people get pissed about it.
LOL, I bet you think you are some kind of intellectual revolutionary.
This is still a form of contribution whether or not you see any value in it. Editing and removing discussion because it doesnt fit in with YOUR view of how things should be done is classic fascism, everything must be done towards a specific end and everything else must be suppressed. Read a history book and tell us how thats worked out for various countries over the years.
Nazis AND Fascism, we are on a roll. Is that your great contribution to this debate?
There's a difference between dissenting opinion and trolls. You can offer a dissenting opinion in a way that is edifying to other people. How many people here have been enlightened by trolls?
Trolls are noise. The majority of trolls here don't offer an "opinion" that challenges status quo assumptions, they instigate in order to get a reaction, because they are bored little boys in their basements.
-
I agree the only person that should censor what you see is yourself.
Censorship is not antithetical to libertarianism or any other form of minarchism or anarchism, only to one brand of brainwashed American Libertarianism. I never once mentioned the government. Censorship can and should be applied by the person who owns the property. If I ran a forum, I would moderate it, and it would be a far healthier place than a forum run by anyone here who advocates no censorship at all (even though in practice there's all kinds of shit they would censor). Here it's spam, scams and doxing, but they let trolls ruin the place to some extent.
BTW, Tor relays are allowed to set their own exit policies. That's censorship! If anyone here hates censorship that much, they should get off Tor now, because it is a platform that allows censorship. In fact, you can set ExitPolicy reject *:*, so you don't have to allow anyone to access clearnet sites from your relay. The relay is your personal property, and I and the Tor developers advocate the right to do what you want with it, and not to subjugate yourself and your property to whatever other people want to shout on it.
Right now we are experiencing the fantastic merits of no censorship, because the Tor network is being crippled by a botnet that we can't stop. That is 2 million voices shouting without moderation. Fuck that noise. We must censor that botnet to keep the Tor network alive.
Well, censorship can be seen in two ways. If you have private property I agree you should be able to restrict what people do on it. SR can ban CP, etc.; it is morally fine in my opinion, since they own the server. So I am fine with censorship in such cases. The censorship I am against is when a government tells people what they can or cannot say on their own property. And I think the best solution is volunteer community property that allows anyone to talk and say what they want, but nobody has to listen.
Astor why would your forum be healthier? Why do you even want to run a centralized forum? Wouldn't it be better if everybody networks with who they want to, and the only forum is the way the threads are organized by the individual user? You seem to advocate for a hierarchical system where some designated person is in charge of what can be said, I am advocating for a non-hierarchical system where every individual is in charge of what they see. I could outsource this to you and you censor spam, or I could just not select to listen to people who spam.
-
Nobody is allowed to shout nonsense on my lawn. I have every right to censor them by kicking them off my property. Censorship is compatible with anti-statism, because it is subordinate to property rights. There are also perfectly legitimate laws that censor speech, particularly when it causes direct harm, such as yelling fire in a crowded theater and causing a stampede that causes physical injury to people, or libeling someone and destroying their life through false accusations. Yes, there are many good reasons to censor, and the free speech absolutists have a childish understanding of free speech and censorship.
I am not a free speech absolutist, I think that you have the right to tell people not to scream stuff on your lawn. I think in the cyber environment though that it is better if we allow people to remove their own perception of people screaming on their lawn. It is more like someone is screaming on your lawn, but you and everybody else can block out all perceptions of them. So if you don't want them screaming on your lawn, you press a magic button and they vanish from your own perception without a trace, but other people can decide for themselves if they want to hear and see the screaming on your lawn or not. In the cyber environment we can give much more fine grained control to people and I think this is superior. I don't imagine a single forum with leaders and such, I imagine a shared forum-space where every user is the leader of their own perception.
-
I am not a free speech absolutist, I think that you have the right to tell people not to scream stuff on your lawn. I think in the cyber environment though that it is better if we allow people to remove their own perception of people screaming on their lawn. It is more like someone is screaming on your lawn, but you and everybody else can block out all perceptions of them. So if you don't want them screaming on your lawn, you press a magic button and they vanish from your own perception without a trace, but other people can decide for themselves if they want to hear and see the screaming on your lawn or not. In the cyber environment we can give much more fine grained control to people and I think this is superior. I don't imagine a single forum with leaders and such, I imagine a shared forum-space where every user is the leader of their own perception.
That sounds good to me, but will it protect against flooding of the network that ends up censoring all the good speech? We are witnessing it now with the botnet or whatever it is. I wasn't able to connect to the forum for about a day and it's still intermittent for me, so I am being censored by 2 million idiots shouting circuits into the network, even though I can't see them.
-
Astor why would your forum be healthier? Why do you even want to run a centralized forum? Wouldn't it be better if everybody networks with who they want to, and the only forum is the way the threads are organized by the individual user? You seem to advocate for a hierarchical system where some designated person is in charge of what can be said, I am advocating for a non-hierarchical system where every individual is in charge of what they see. I could outsource this to you and you censor spam, or I could just not select to listen to people who spam.
Like I said, that sounds good to me, but my comments were based on the way most forums work. If I could granularly choose which parts of a forum to see, I would do it. There is an ignore feature on this forum, but it appears to be nonfunctional. I've added people like joywind and tedrux to it, but I still see their posts.
-
Spamming, like all advertising, is a form of speech. These things are not mutually exclusive. When you shout at the top of your lungs, other people can't hear me, so your speech censors my speech. Your own argument proves that in order to have useful speech you have to censor that shitty speech that would otherwise drown out everything else, which is what I call noise.
spamming/advertising is speech in the same way that me pounding your girlfriend is community service.
You own the forum because you legitimately paid for the server that runs it, and yes you can do whatever you want with it. I am not your slave. I don't have to let you do anything with my property.
the fuck, u like 50 something? its like having a discussion with a vet, what, what u say, back in my day we used to carry pickles around on easter sunday. yeah a bunch of revolutionaries wrote the fucking constitution that doesnt make it theirs.
But if your argument is true, that's cool. I'll be crashing on your couch tonight. I mean, you can't just kick me off and do whatever you want with it just because you "own" it.
u can crash if u want but u got to make dinner. or i can be a dick like u and kick u out cause i dont like what u made.
Nazis AND Fascism, we are on a roll. Is that your great contribution to this debate?
censorship is a slippery slope my friend.
There's a difference between dissenting opinion and trolls. You can offer a dissenting opinion in a way that is edifying to other people. How many people here have been enlightened by trolls?
so the diff between trolling and debating is how hard u rub the other guys cock?
Trolls are noise. The majority of trolls here don't offer an "opinion" that challenges status quo assumptions, they instigate in order to get a reaction, because they are bored little boys in their basements.
thats funny, i thought bored little boys in basements was ur thing. i mean, why else would u know so much about computers.
-
I think everyone's perfect middle ground is close to Usenet with anonymity and a trust concept wrapped around it. You define, through simple technical means, whose messages you want to see and not see. Generally, most people would start out wanting to see everyone.
Something like the Web of Trust (WoT) concept is great, but taken too far, it prevents new members from joining unless they know an existing member. That might be great for a handful of forum types (carders, CP, etc), but not for the ones I'd be interested in.
The only real downsides to a WoT concept are the technical difficulties with making it easy to manage, and the fact that the more flood/DoS/spam a group gets, the harder it is for new members to be trusted and have people see what they have to say (because people will grow to have digitally "distrustful" settings, and new users who don't know someone will have difficulty bridging that gap).
kmfkewm is perfectly correct that we do, in fact, need some ECDH+AES256 up in this bitch. There's an increasing risk that everyone is currently spending all their time building clever architecture but letting RSA/DSA/DH/ElGamal secure the key exchanges, and I think they're going to regret that. I really was floored by the $11B + 35,000 people going towards NSA's Consolidated Cryptologic Program in that leaked budget. That's a lot of money, and a lot of people.
Of course, with the way the world works, once somebody builds some amazing, PIR/EKS-based system relying on elliptic curve based schemes, as soon as the US government gets sick of it, I can see them encouraging Blackberry/Certicom to push patent infringement suits against the node operators. It's weird, it's almost like they're all up in everyone's business, all the time. I wonder why I get that feeling.
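For the record, the ECDH+AES-256 we keep invoking is only a few lines with a decent library. A sketch using Python's cryptography package, with X25519 for the key agreement and AES-256-GCM for the symmetric layer (illustrative only; a real protocol also needs to authenticate the public keys):

import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Each side generates an ephemeral keypair and swaps public keys.
alice = X25519PrivateKey.generate()
bob = X25519PrivateKey.generate()

# Both sides derive the same shared secret and stretch it to a 256-bit key.
shared = alice.exchange(bob.public_key())
key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
           info=b"demo handshake").derive(shared)

# AES-256-GCM gives confidentiality plus integrity in one pass.
nonce = os.urandom(12)
ciphertext = AESGCM(key).encrypt(nonce, b"attack at dawn", None)
assert AESGCM(key).decrypt(nonce, ciphertext, None) == b"attack at dawn"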
-
Something like the Web of Trust (WoT) concept is great, but taken too far, it prevents new members from joining unless they know an existing member. That might be great for a handful of forum types (carders, CP, etc), but not for the ones I'd be interested in.
The only real downsides to a WoT concept are the technical difficulties with making it easy to manage, and the fact that the more flood/DoS/spam a group gets, the harder it is for new members to be trusted and have people see what they have to say (because people will grow to have digitally "distrustful" settings, and new users who don't know someone will have difficulty bridging that gap).
A WOT or a whitelist is basically an invite-only forum. kmf's unique take on it is that everyone builds their own invite-only forum. As you point out, one issue is how do you find new people and content?
I think a whitelist is too restrictive. I'm actually fine with 95% of people who post on this forum, for example. I'd just like an ignore function that works. When you add someone to the ignore list, it would hide the following (sketched in code after the list):
1. Any thread started by that person
2. Any post by that person in other people's threads
3. Any post that quotes that person
4. PMs from that person
The Philosophy, Economics and Justice subforum would be pretty empty for me in that case, but at least I would enjoy the content that I did see. :)
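Here is roughly what I mean, in Python (the data model is hypothetical, nothing the actual forum software exposes):

from dataclasses import dataclass

@dataclass
class Post:
    author: str
    thread_starter: str
    quoted_authors: list

def visible(post, ignored):
    if post.thread_starter in ignored:  # 1. threads started by that person
        return False
    if post.author in ignored:          # 2. their posts in other threads
        return False
    if any(q in ignored for q in post.quoted_authors):
        return False                    # 3. posts that quote them
    return True

def filter_pms(pms, ignored):
    return [pm for pm in pms if pm["sender"] not in ignored]  # 4. their PMs

ignored = {"joywind", "tedrux"}
post = Post(author="someone", thread_starter="joywind", quoted_authors=[])
assert not visible(post, ignored)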
-
A WOT or a whitelist is basically an invite-only forum. kmf's unique take on it is that everyone builds their own invite-only forum. As you point out, one issue is how do you find new people and content?
I think a whitelist is too restrictive. I'm actually fine with 95% of people who post on this forum, for example. I'd just like an ignore function that works [...]
That's what's beautiful about a WoT and a whitelist done right. As long as you can have the concept of negative trust (i.e. distrust) that's below the baseline level of trust you operate at, you can get exactly that.
Pretend all senders start with a trust value of zero, and you can set a trust threshold (the minimum trust score you'll see messages from) and can assign a trust level to senders ranging from -10 to +10. You set your trust threshold to 0. You find that you're getting more spam and assholes than you like. You raise your threshold to 2. They disappear, except for that one guy you really hate listening to. Assign him a trust of -5. He's gone from your view.
Meanwhile, Kiwikiikii is in the same forum. He wants to see all messages, because he feels that any censorship is bad. He sets his trust threshold to -10, and presto, there are all the posts. He realizes that he's sick of the Viagra spam that keeps showing up, so he sets his threshold to -9 instead. Then he distrusts the Viagra spammer to a value of -10, and he has a perfect world. Later, he logs in from a slow connection and realizes that he doesn't have the patience to download twenty million messages that some spammer has uploaded. He temporarily raises his threshold to say.. 0 or 2, and he can see the posts from his friends. Later on, he can log in from a fast connection with his threshold set to -9, and sort through the 20 million messages to see if any have value.
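Client-side, the whole rule is basically one line (a sketch only; the names are invented):

from collections import namedtuple

Post = namedtuple("Post", "author text")

def visible_posts(posts, my_ratings, threshold):
    # Unknown senders default to 0; ratings run -10..+10; the threshold is mine.
    return [p for p in posts if my_ratings.get(p.author, 0) >= threshold]

ratings = {"friend": 10, "that_guy": -5, "viagra_spammer": -10}
posts = [Post("friend", "hi"), Post("stranger", "hello"),
         Post("viagra_spammer", "BUY NOW")]
assert [p.author for p in visible_posts(posts, ratings, 2)] == ["friend"]
assert [p.author for p in visible_posts(posts, ratings, -9)] == ["friend", "stranger"]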
It's not a perfect world, and the weirdest part is that everyone would have to get used to NOT making the assumption that everyone can see every message in a thread. But, completely without ANY central administration of any sort, it's pretty damned close. Obviously, you'd probably want more granularity than -10 to +10 in a real system, but it's very doable.
And people only get censored by listeners who don't want to hear what they have to say.. and that censoring only affects that single listener. It's the listener's loss. Or their right. Same difference.
-
The karma system on this forum is a WOT of sorts. Your proposal would make it actually useful, as opposed to the vanity that it is now.
For example, by blocking all users who have net -20 karma or lower, I would catch almost everyone that I would like to ignore.
I think it's an excellent idea.
Anyone who wants to read everything can go ahead.
-
What's great about the right WoT is that over time, the more you post things that other people think are worth hearing, the more your inherited trust score goes up. And more people see your messages. The more you post worthless things that everybody wants to ignore, the more your inherited trust score goes down. And fewer people see your messages. Individually, they can configure their own settings to see whatever they want, so if they have a high appetite for frustration, they're happy. The lazy masses who don't want to screw with it can just rely more heavily on the values they get from their WoT.
If there are two completely opposed groups (say.... predatory pedophiles and Dads with Guns), they will have radically different Webs of Trust. And they'll have radically different (and more appropriate) trust/distrust scores assigned to the same sender. The more moderate parties in either camp will begrudgingly end up with more moderate scores from the opposition, and somewhat civil conversation might even ensue. With the extreme sides of either party never seeing each other unless they chose to.
One of the keys to really making the experience an improvement is to be able to dynamically adjust your trust threshold. You know how sometimes, you go to a message board and just want to catch up on threads with the people you enjoy the most? Set that threshold to 8, 'cause you already set them to +10. Later on, you have more time, and decide you want to broaden your horizon. Set it to 2. Okay, more to read. What the hell, let's set it to -5 and see what else is out there. Oh my god, that's bearded midget leper porn! Back up to a 1. Much better.
-
It's not a perfect world, and the weirdest part is that everyone would have to get used to NOT making the assumption that everyone can see every message in a thread. But, completely without ANY central administration of any sort, it's pretty damned close. Obviously, you'd probably want more granularity than -10 to +10 in a real system, but it's very doable.
Ideally the concept of a thread would be some loose and dynamic thing decided by individual posters. For example, I send a message to fifty of my friends talking about a certain topic. Later I get a message from someone else talking about the same subject with 50 other people. To aid in my own organization of information, I merge the two threads together into one, but a response to a message from one thread only goes to the people who started participating in that thread to begin with. The different posts could be color coded based on which base thread they are a part of, although if somebody is part of both it would be problematic. We could have a color that represents a person who is part of every base thread, but then it is still problematic if there are three base threads and a person is only part of two of them. What if we want to merge the thread into a single base thread where everybody can see the entire discussion? How would that be managed? We probably shouldn't let Alice decide that Bob should see a message Carol sent her, even if Alice and Bob and Alice and Carol are talking about a similar topic. Maybe Carol can mark her posts as 'open' or 'private' and an open post could be merged by Alice into a conversation with Bob and all of the history of the thread becomes available to all participants. But a private message is marked as something that someone only wants viewable by the people they selected to view it, ever. Or maybe there can be some other system that decides this.
These details are largely not to do with the underlying cryptography of the system. A lot of them are GUI problems (how do we represent the interwoven threads? color coded posts?). A lot of them are organizational problems. Most of them are higher level issues that don't need to be worried about a whole lot until we have the fundamental cryptographic components taken care of. But they are still important and substantially unanswered problems.
Right now a few people are working on coding a system like this with me. I think we should go public with the code that is already done and show it to people here, and invite people like Astor, SS, ECC_ROT13 etc to participate and audit what is done. We still have unanswered questions, we still have parts to code. Would anybody be interested in seeing the code that is done so far and helping contribute to the project in an organized fashion? What we are working on is not illegal and is not being built for illegal communities; it is merely software for use by those who like the features. But I personally see nothing wrong with including people from this forum, although some others working on it may be hesitant for it to have any apparent connection to illegal activity (because why link something that is not illegal to criminals). Unfortunately I already kind of fucked that up by being involved with it and having the original idea for it :P.
-
One option would be to find a way to thread all of the messages together (maybe via a base thread ID that gets created at First Post, and supports the ability to append child thread IDs to the prefix) as a view. Merge or don't merge, it's the client's call.
And come to think of it, if you could adjust your trust threshold using a slider on a per-thread basis (have to watch precedence for threshold, but thread>board/group>community is probably close), you'd be in much better shape. It more accurately reflects real-world group messaging. Generally, I don't care what X has to say, but in this thread, I want to see what he's saying for context if nothing else.
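A sketch of that precedence chain (hypothetical data layout; the point is just that the most specific override wins):

def effective_threshold(thread_id, board_id, overrides, community_default=0):
    # overrides: {"thread": {id: t}, "board": {id: t}}
    # precedence: thread > board/group > community
    if thread_id in overrides.get("thread", {}):
        return overrides["thread"][thread_id]
    if board_id in overrides.get("board", {}):
        return overrides["board"][board_id]
    return community_default

overrides = {"thread": {"t42": -5}, "board": {"security": 2}}
print(effective_threshold("t42", "security", overrides))   # -5: show X here
print(effective_threshold("t99", "security", overrides))   # 2: hide X elsewhere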
-
We probably shouldn't let Alice decide that Bob should see a message Carol sent her, even if Alice and Bob and Alice and Carol are talking about a similar topic. Maybe Carol can mark her posts as 'open' or 'private' and an open post could be merged by Alice into a conversation with Bob and all of the history of the thread becomes available to all participants. But a private message is marked as something that someone only wants viewable by the people they selected to view it, ever. Or maybe there can be some other system that decides this.
I think that mixing private and group messages too closely in any interface is dangerous, via user mistakes if nothing else. While behind the scenes, they're probably the same thing at a storage/transport level, they probably should be treated as separate in the user experience. I assume they'll be encrypted differently.
If Carol sends Alice a private message, it should be encrypted so only Alice can read it. If she wants to send it on to Bob, she should RE-send the decrypted text, possibly with commentary, encrypted so only Bob can see it. At that point, if Alice has betrayed Carol's trust, that's on Alice. And since she's only sending Carol's alleged text, Carol has full repudiation, "Alice made that shit up. Bob, I don't know what she's talking about.". Private messages shouldn't ever thread with group messages. If you want to win the UI contest, maybe you insert a red placeholder (like a comment bubble) at that point in the group thread for context, but I think the bubble takes you obviously to PrivateMessageLand. And the private message content shouldn't be quotable to group via GUI. Make them paste it in if they want to quote private messages into group discussion.
If she's referencing a group thread in her private message, it should be with the equivalent of a link (the key, hash, whatever) to the thread that allows the recipient to go see it.
-
Right now a few people are working on coding a system like this with me. I think we should go public with the code that is already done and show it to people here, and invite people like Astor, SS, ECC_ROT13 etc to participate and audit what is done. We still have unanswered questions, we still have parts to code. Would anybody be interested in seeing the code that is done so far and helping contribute to the project in an organized fashion? What we are working on is not illegal and is not being built for illegal communities; it is merely software for use by those who like the features. But I personally see nothing wrong with including people from this forum, although some others working on it may be hesitant for it to have any apparent connection to illegal activity (because why link something that is not illegal to criminals). Unfortunately I already kind of fucked that up by being involved with it and having the original idea for it :P.
Fuckin A, absolutely! Could we invite other people over time, like get a few invites per week, to see how the system scales?
Oh well, I guess we'll worry about that later.
BTW, is this the system that could evolve into a market with bitcoin/zerocoin integration?
-
I think that mixing private and group messages too closely in any interface is dangerous, via user mistakes if nothing else.
Well, the software would prevent a user from rebroadcasting a private message to unauthorized users. And if users want to circumvent the software in order to do so, well, we cannot prevent leaks.
While behind the scenes, they're probably the same thing at a storage/transport level, they probably should be treated as separate in the user experience. I assume they'll be encrypted differently.
The encryption is the same for either.
If Carol sends Alice a private message, it should be encrypted so only Alice can read it.
Indeed, it will be. But I take the concept of private messaging to mean more than one on one. Carol can send a private message for Alice and Bob to read and respond to. In such a case it will be encrypted for both of them.
If she wants to send it on to Bob, she should RE-send the decrypted text, possibly with commentary, encrypted so only Bob can see it. At that point, if Alice has betrayed Carol's trust, that's on Alice. And since she's only sending Carol's alleged text, Carol has full repudiation, "Alice made that shit up. Bob, I don't know what she's talking about.". Private messages shouldn't ever thread with group messages.
Right now we are not planning to include deniable signatures, although I suppose that isn't a bad idea. Currently all messages are signed with a private key and are impossible for the author to deny having written. This is also how identity is managed: a user essentially is their private ECC key. The threading is entirely up to the user. A user could have one giant thread consisting of all private and public messages if they wanted to, although it wouldn't be very well organized, I imagine. The organization of the messages is entirely up to the user, with support from the software. It is up to the user to organize the information into their own perception of a forum; however, the software should help them not shoot themselves or others in the foot.
If you want to win the UI contest, maybe you insert a red placeholder (like a comment bubble) at that point in the group thread for context, but I think the bubble takes you obviously to PrivateMessageLand. And the private message content shouldn't be quotable to group via GUI. Make them paste it in if they want to quote private messages into group discussion.
Yes, certainly. The issue I am thinking of is this: Alice wants to talk about a subject and she doesn't really care who reads what she has to say, similar to how people posting here obviously don't care who reads what they say. But Alice only has five contacts on her buddy list, so she can send them each her message, but that is the extent to which her message propagates. The idea I had is that Alice can mark the message as public, in which case her five contacts can choose to propagate her message to their contacts as well, and to introduce their contacts to Alice via the message.

So Alice writes a message and marks it as public, and the message is sent to her only contact, Bob. The message is about the effects of a certain drug, and Bob happens to be having a conversation about this very same topic with twenty of his other friends. So Bob adds the message from Alice to this thread. Now nobody else Bob is talking with can see the post from Alice or any responses Bob makes to it, even though to Bob's perception they are part of the same thread. But since Alice marked the message as public, Bob decides it is a good idea to make the other people he is talking with aware of Alice's post, so they can see the information Alice has to contribute. So Bob presses a button and it merges Alice's post into his original conversation with the twenty others; when this happens, Bob rebroadcasts Alice's message and contact information to his twenty other friends. He also rebroadcasts the previous messages and contact information in the thread from his other twenty friends to Alice, provided that their messages are marked public as well. Now Bob's peers see the new message from Alice rebroadcast from Bob, and if they like the content of the message they can click a button to whitelist Alice so they can see future posts from her on this or other subjects. The same happens with Alice: she sees the posts from the others and can whitelist them as well. Now all of them can continue to talk with each other about the subject at hand, Alice has added new people to her contact list, and Bob's friends have all added Alice to their contact lists.

But let's say one of Bob's friends said something he only wanted the original twenty people (including Bob) to be able to read, so he marked his message private. In this case, Bob does not rebroadcast this specific message to Alice, and if it is the only message from that poster in the thread, Alice is never introduced to him, although he is introduced to Alice since her post was public. But as whitelisting needs to be 1:1, this means neither of them will be able to carry out a conversation with each other or see each other's posts in the thread, which is not totally ideal, as maybe Bob's friend wants to be introduced to Alice but doesn't want her to see the message he marked as private. So perhaps two settings would be the best option: public/private for posts and introduce/hide for the thread in general (public = share this post with your friends, private = this post is just for you, introduce = tell others I am in this thread and help us communicate with each other, hide = don't tell anyone I am in this thread). But the actual thread itself is really the composite of several base threads, so we could call one a Weave and the other a Thread.
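Those two settings are small enough to sketch directly (names hypothetical, mirroring the public/private and introduce/hide semantics just proposed):

from enum import Enum

class Visibility(Enum):
    PUBLIC = "public"     # friends may propagate this post onward
    PRIVATE = "private"   # only the recipients I chose, ever

class Presence(Enum):
    INTRODUCE = "introduce"  # tell others I'm in this thread, help us connect
    HIDE = "hide"            # don't reveal my participation to anyone

def may_forward_post(visibility):
    # Bob may rebroadcast (or point others to) Alice's post only if public.
    return visibility is Visibility.PUBLIC

def may_share_contact(presence):
    # Bob may pass along the author's contact info only if they said introduce.
    return presence is Presence.INTRODUCE

print(may_forward_post(Visibility.PUBLIC), may_share_contact(Presence.HIDE))
# True False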
-
Essentially Bob acts as the filter between Alice and his other friends. Bob's friends have whitelisted Bob already, and Bob can rebroadcast Alice's public marked messages to them if he so chooses to. If Alice is posting some dumb shit or spam, not only will Bob remove her from his whitelist but he will never rebroadcast her messages. And if he does rebroadcast her spam, the friends he rebroadcasts it to are going to remove him from their Whitelists. And it can keep going outward as well, because Bob's friends can then introduce Alice to their other friends after learning about her via Bob. So it is like a massive group managed whitelist, you can only GET messages from people who you whitelist, but they can send you messages from people you have not whitelisted if they choose to do so and the messages are marked public.
-
Right now a few people are working on coding a system like this with me. I think we should go public with the code that is already done and show it to people here, and invite people like Astor, SS, ECC_ROT13 etc to participate and audit what is done. We still have unanswered questions, we still have parts to code. Would anybody be interested in seeing the code that is done so far and helping contribute to the project in an organized fashion? What we are working on is not illegal and is not being built for illegal communities; it is merely software for use by those who like the features. But I personally see nothing wrong with including people from this forum, although some others working on it may be hesitant for it to have any apparent connection to illegal activity (because why link something that is not illegal to criminals). Unfortunately I already kind of fucked that up by being involved with it and having the original idea for it :P.
Fuckin A, absolutely! Could we invite other people over time, like get a few invites per week, to see how the system scales?
Oh well, I guess we'll worry about that later.
BTW, is this the system that could evolve into a market with bitcoin/zerocoin integration?
Well, what would happen is we would set up a GitHub page or a website for it and just make the code public as it is evolving, but point people here to it so they can look at what is done and contribute if they want to. Pretty sure this is going to happen pretty soon; I just need to talk with some others. Maybe in a bit over a week I will post a link to the project.
Also, yes a market could easily be built on top of this. After we have forward anonymity (via mixing) and receive anonymity (via PIR) and the encryption and networking etc all wrapped up, we can use this infrastructure for anything. It doesn't need to be specific to a forum, it could be used for file sharing of small files, for E-mail like messaging, for a market, whatever. In this way it will be more similar to Freenet and even I2P, where there is the foundational software (mixes, PIR, etc) but then the other systems that are built on top of it (forum, e-mail, market). And the systems built on top of it will all be so similar that they can really be a single program, or at least be managed from a single GUI.
-
Also, Bob doesn't even need to rebroadcast Alice's public message for his friends to get it, he just needs to share the key for it as well as point them to it. As long as the messages in the PIR-like cache do not get wiped extremely quickly as new messages come in, this will require much less bandwidth. Since Alice's 50 KB message is already encrypted and stored/indexed by the PIR-like servers, Bob can just send to his friends the index tag that points to Alice's message, as well as the encrypted symmetric key to decrypt it.
Here is the rough idea right now:
Alice makes a public message for Bob. This is broken into two parts: the metadata and the actual payload.
Metadata:
A. An ephemeral ECDH key
B. A shared secret contact tag
C. The symmetric key that decrypts the payload
D. The tag that the payload is indexed by
Payload:
The payload is the actual encrypted message, it is indexed by the index tag included in the metadata (so that is the keyword people search for to get it).
A. Suggested title
B. Private/public
C. Introduce/hide
D. Message, signed
E. Sender's contact details (if message is public or introduce is set)
F. Information allowing the people who obtain the message to determine which of their other contacts have been pointed to the message
First Bob engages in the keyword-based PIR-like system (whatever that ends up being; PSS and OWI are both options right now, EKS actually isn't, as it allows the storage server to see which encrypted document is returned, just not the keyword searched for or the plaintext of the returned document) searching for any metadata packets that are tagged with the shared contact tag between him and any of his contacts. This allows Bob to obtain all metadata packets for all messages anybody points him to. We need to take care to protect from traffic analysis during this process, but because of the PIR-like system no third party or the server itself can tell which metadata packets Bob searched for or obtained.
Next Bob uses the included ECDH ephemeral public key and his private ECDH key to derive a shared secret. He uses this shared secret to decrypt the index tag of the message he is pointed to as well as the key used to decrypt it.
Next Bob engages in the PIR-like protocol again in order to obtain the payload data (now that he knows the tag it is indexed under). Again, the server cannot tell the tag of the message he searches for or the message returned, but we need to take care to protect from traffic analysis.
Next Bob uses the key from the metadata packet to decrypt the message. He then checks who the message is from (he knows Alice pointed him to it, but not whether she wrote it). If the message is from one of his contacts, then he verifies this by signature verification. Bob's client then uses the metadata from the message to ask him some questions. Perhaps it compares the suggested title of the message (that Alice picked) to content Bob already has in his local cache, and asks him if he would like to *perceptually* merge this post into an existing thread or keep it independent. Since the message is talking about the effect of a certain drug, and because Bob is already engaging in a conversation about this drug in another thread with twenty of his friends, the software suggests that Bob *perceptually* merge this new post with his current thread, and he does so.
Now at this point if Bob replies to Alice's post in the thread, only Alice will see the message (since Alice wrote the message in addition to pointing Bob to it). But since the message is marked public, and since Bob likes the content of the message, he decides to *socially* merge it into the thread in such a way that all participants can see it. He does this by making metadata packets for each of his twenty friends as follows:
A. An ephemeral ECDH key
B. A shared secret contact tag between Bob and one of the twenty posters in the thread
C. The symmetric key that decrypts Alice's message, which is itself encrypted with the shared secret derived from Bob's ephemeral public key and his contact's private ECDH key
D. The tag that Alice's message is indexed by.
He also does the same thing for each of the other public posts in the thread for Alice, so she can see the previous posts as well. Now Bob's contacts engage in the PIR-like protocol and obtain the metadata packets that Bob pointed them to. At this point they download and decrypt Alice's message, but since they don't know who Alice is they cannot verify her signature. At this point they can choose to whitelist Alice, which entails loading the contact information her message has included in it. This allows them to generate shared secret contact tags between themselves and Alice, so they can tag messages for Alice. Since Alice is also introduced to them, she also gets their contact information, which allows her to generate the shared contact strings for them as well. If they whitelist each other they now have a dynamic (per-message) secret contact string that they can use to point each other to metadata that itself points to message content.
Keep in mind that this is a rough protocol, but something like this is what I am picturing. We are just now starting on the PIR-like part of the system, as we just finished forward anonymity and wrapped up all the crypto, networking, database, etc.
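To make the metadata packet concrete, here is a rough Python sketch using X25519 for the ephemeral ECDH, HKDF for key derivation, and AES-GCM for the symmetric layer. These primitive choices are my assumptions for illustration, not the project's actual wire format:

import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def _wrap_key(shared_secret):
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"metadata-wrap").derive(shared_secret)

def make_metadata(recipient_pub, contact_tag, payload_key, index_tag):
    # payload_key must be 32 bytes so the receiver can split the blob below.
    eph = X25519PrivateKey.generate()                       # A. ephemeral key
    box = AESGCM(_wrap_key(eph.exchange(recipient_pub)))
    nonce = os.urandom(12)
    return {"ephemeral_pub": eph.public_key(),              # A
            "contact_tag": contact_tag,                     # B. shared-secret tag
            "nonce": nonce,
            "wrapped": box.encrypt(nonce, payload_key + index_tag, None)}  # C+D

def open_metadata(my_priv, pkt):
    box = AESGCM(_wrap_key(my_priv.exchange(pkt["ephemeral_pub"])))
    blob = box.decrypt(pkt["nonce"], pkt["wrapped"], None)
    return blob[:32], blob[32:]       # C. payload key, D. index tag

bob = X25519PrivateKey.generate()
pkt = make_metadata(bob.public_key(), b"alice-bob-tag", os.urandom(32), b"idx123")
payload_key, index_tag = open_metadata(bob, pkt)   # Bob recovers C and D

Note the packet itself never names the recipient; only someone holding the matching private key can unwrap the payload key and index tag, and the contact tag is meaningful only to the two parties who share it.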
-
Sorry for so many posts in a row, but just one more thing to point out if it is not clear:
Whitelisted people can point you to posts to download, but you can still download posts made by people who you have not whitelisted, provided a whitelisted person points you to the post. People who are not whitelisted cannot point you to posts, and this means the only way you can see their posts is if somebody you have whitelisted points you to them. But after you are pointed to the post of someone you have not whitelisted and you obtain it, you can whitelist that person directly from the post by clicking a button. If they also whitelist you, then you can point each other to messages, and easily communicate with each other. If somebody points you to spam, you can just remove them from your whitelist.
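Stated as code, the pointing rules are just this (a sketch, hypothetical names):

def accept_pointer(pointed_by, whitelist):
    # Metadata pointers are only accepted from whitelisted contacts.
    return pointed_by in whitelist

def can_read(author, pointed_by, whitelist):
    # A stranger's post is readable iff someone whitelisted pointed to it;
    # the reader can then whitelist the author directly from the post.
    return author in whitelist or pointed_by in whitelist

whitelist = {"bob"}
print(accept_pointer("bob", whitelist))       # True
print(can_read("alice", "bob", whitelist))    # True: Bob vouched by pointing
print(accept_pointer("alice", whitelist))     # False: she can't point me yet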
-
Essentially Bob acts as the filter between Alice and his other friends. Bob's friends have whitelisted Bob already, and Bob can rebroadcast Alice's public marked messages to them if he so chooses to. If Alice is posting some dumb shit or spam, not only will Bob remove her from his whitelist but he will never rebroadcast her messages. And if he does rebroadcast her spam, the friends he rebroadcasts it to are going to remove him from their Whitelists. And it can keep going outward as well, because Bob's friends can then introduce Alice to their other friends after learning about her via Bob. So it is like a massive group managed whitelist, you can only GET messages from people who you whitelist, but they can send you messages from people you have not whitelisted if they choose to do so and the messages are marked public.
yeah that's called facebook. we got that already.
-
Essentially Bob acts as the filter between Alice and his other friends. Bob's friends have whitelisted Bob already, and Bob can rebroadcast Alice's public marked messages to them if he so chooses to. If Alice is posting some dumb shit or spam, not only will Bob remove her from his whitelist but he will never rebroadcast her messages. And if he does rebroadcast her spam, the friends he rebroadcasts it to are going to remove him from their Whitelists. And it can keep going outward as well, because Bob's friends can then introduce Alice to their other friends after learning about her via Bob. So it is like a massive group managed whitelist, you can only GET messages from people who you whitelist, but they can send you messages from people you have not whitelisted if they choose to do so and the messages are marked public.
yeah that's called facebook. we got that already.
Facebook users' friends point them to relevant messages made by arbitrary posters, allowing high quality posts to propagate through the entire community, and high quality posters to expand their social networks, while spam is filtered by users and spammers are cut out by whitelists? I didn't realize that.
-
A few comments on your structure. And remember, I'm just now sitting down to really think about this, and it's obvious you've been thinking about this for a long time, so I'll be a little behind the curve.
I really like the overall mechanism for encrypted multiple recipient communication, and the mechanism to grow the list of recipients while still maintaining the confidentiality of the payload is something I've never seen before. It's a really good approach.
Completely splitting metadata and payload into two separate objects is definitely the right approach as well. The payload doesn't have to be deep copied around. One thing concerns me a little.. "F" in your payload description. I think I get why it's there (so you can know which of your friends has already seen it), but here's what worries me about "F":
1. It can't be an actual list, because the payload stays static after it's written.
2. So it has to be something that your client can use to derive search keys and loop through your contacts, seeing who has received a link to it via more searches. The problem is, what are they searching for? Are clients going to post a read-receipt?
And any time you only require (something from the Payload) and (a list of contacts) to pull that list of who has links, an adversary should be able to loop through *all* known contacts in the system and see if each has seen it.
It prevents you from resending the links to people who have seen the payload, but most of the downside of accidentally resending it to someone who's seen it (deep copies) is negated by the tiny size of the metadata object. You'll know which of your friends were on the threads.
How would a public forum (aka newsgroup, whatever) work? You'd need a way to search for all metadata objects tagged as belonging to that group, grab them, apply WoT/etc, then download payload for whatever you actually want. But public forums aren't one-to-many messaging, they're one-to-ANY messaging, and that's actually significantly different from your described use case.
The main reason that I think that public forums are so important to anonymous communities is that they're one of the few ways for new people to join in discussion. Otherwise you might end up with your own slowly-growing circle of trusted friends, and that works great for you, but the new guy can only talk to himself. Plus, you need a plausible way to introduce your sockpuppets to your friends. :)
As an example (of a new guy, not sockpuppetry), I just showed up on SRF because I enjoyed browsing some of the security-related threads. I'm probably not the target market here. Don't buy/sell on SR, haven't seen anything for sale there that fits my boring lifestyle. But this place has an impressively diverse, open-minded crowd, and one that's actively *applying* anonymity technology.
-
One thing concerns me a little.. "F" in your payload description. I think I get why it's there (so you can know which of your friends has already seen it),
Yes, because of a few reasons. For one, because we don't want to waste bandwidth pointing people to things they already know about. For two, in the case of private messages between multiple parties, we want each party to know who the other communicators are, so they can tag responses to the message for each of them as well, and keep the conversation synchronized.
F. Information allowing the people who obtain the message to determine which of their other contacts have been pointed to the message
1. It can't be an actual list, because the payload stays static after it's written.
That is true. We could have it be an initial list of people who received the message, but if more people are pointed to the message there is no way to update the initial list. We need to keep tabs on who knows about which messages, otherwise people will waste bandwidth pointing their friends to messages they have already been pointed to a thousand times. Also, nobody will know who can already view a message they make a reply to, and thus things will become totally disorganized.
2. So it has to be something that your client can use to derive search keys and loop through your contacts, seeing who has received a link to it via more searches. The problem is, what are they searching for? Are clients going to post a read-receipt?
Alternatively, we could make it so only the initiator of a post can authorize new people to see the resulting thread. Except what if somebody made a reply to the initial post that they don't want new people to see? We would need to add another level of control: public, deferred, private. Public posts can be pointed to by anyone, deferred posts defer to the starter of the thread in regards to who can be pointed to them in the future, and private posts cannot be pointed to by anybody. However, we really don't want to give users so many options, because it will just make a confusing system for them. Nobody wants to mark every single post they make as [Public, Deferred, Private] and [Introduce, Hide]. But if only the initiator of the thread can invite new people to it, then we can have the initial list of people the message is sent to, and any new people who are invited can be included in a message sent by the person who started the thread.
But this will not work for public posts, because by their very nature we need to let anybody point people to them. But we do not want to have a hundred people point somebody to a post that they already know about, simply because it is a waste of bandwidth to upload and download the metadata packet, as well as a waste of processing power for the PIR-like servers etc.
What we could do for public posts is this. First of all let's remove section F from the payload and make it part of the metadata packet. When Alice makes a public post, she sends it to Bob and Carol. So she includes Bob's and Carol's names in section F (or perhaps a bloom filter or something will be better; we can think about the technical details of how to do this part next). At this point Bob and Carol both get the message from Alice, and they can determine that the message is viewable by Alice, Bob, and Carol.

Now Bob is having a conversation with Doug about the same subject, so he merges Alice's post perceptually with Doug's post to make a 'weave', but since Alice's post is marked public Bob decides to socially merge the weave together into a single thread. Now Bob points Doug to the original message, and his metadata packet includes that Alice, Carol and Bob can see it. At this point Bob needs to point Alice and Carol to the previous post by Doug as well, and when he does this he sends them the metadata packet including the pointer to the post from Doug as well as the information that Bob, Alice, Carol and Doug can see it. Now Doug likes this new thread and wants to include his contact Earl in the conversation. Since Doug already knows that the message can be seen by Bob, Alice, Carol, and Doug, when he points Earl to the messages making up the thread he includes in the metadata packet that the previously mentioned names can see these posts, and when Earl makes a response to the thread Doug's client points Alice, Carol, Bob, and Doug to the post by Earl, including Alice, Carol, Bob, Doug, and Earl in the metadata packet.
Something to that extent anyway; I kind of confused myself writing that, honestly. Organization of multi-party communications without a centralized administration, or even a forum, or even a shared perception of what a 'weave' is, is going to be tricky, and is obviously something that still needs thought given to it. But I don't think it will be harder than it has been to implement all the crypto crap we have done, or to implement all the crypto crap that we still need to do.
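On the bloom filter idea floated above, a toy version is below. Purely illustrative: a real one would salt each entry with a per-message secret, otherwise an adversary could test every known contact against the filter, which is exactly the reverse-mapping concern raised earlier. Bloom filters give false positives but never false negatives, so the worst case is skipping a pointer that was actually needed:

import hashlib

class BloomFilter:
    def __init__(self, bits=1024, hashes=4):
        self.bits, self.hashes = bits, hashes
        self.array = bytearray(bits // 8)

    def _positions(self, item):
        for i in range(self.hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:4], "big") % self.bits

    def add(self, item):
        for p in self._positions(item):
            self.array[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.array[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

seen = BloomFilter()
seen.add(b"carol-key-fingerprint")          # hypothetical contact identifier
print(b"carol-key-fingerprint" in seen)     # True
print(b"mallory-key-fingerprint" in seen)   # False (with high probability)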
And any time you only require (something from the Payload) and (a list of contacts) to pull that list of who has links, an adversary should be able to loop through *all* known contacts in the system and see if each has seen it.
Anybody who is able to decrypt a post needs to know who else has been able to decrypt the post so that they know to point them to any responses made to the post if they want to. On the other hand, we don't want people to learn that people they do not know have seen the post. There are cryptographic solutions to this problem. The problem being: Alice sends a message to Bob and Carol, and she wants Bob to know to send replies he makes to Carol if he knows who Carol is, but doesn't want Bob to learn anything at all about Carol if he doesn't already know who she is.
It prevents you from resending the links to people who have seen the payload, but most of the downside of accidentally resending it to someone who's seen it (deep copies) is negated by the tiny size of the metadata object. You'll know which of your friends were on the threads.
Yes the metadata can be quite small compared to the payload. However if 100 people point somebody to something that he already knows about, it could still add up and be a big waste of bandwidth and processing capacity of the PIR-like server as well. But even more importantly, there will be no way to organize threads if nobody knows who is a part of them. If Alice sends a message to Bob and Carol, Bob and Carol need to know that each other can see the message to be able to make replies to the original message that both of them can see. Otherwise Bob will only respond to Alice and Carol will only respond to Alice, and there is no group messaging taking place but rather a running conversation between Alice and Bob and between Alice and Carol about the same subject matter.
How would a public forum (aka newsgroup, whatever) work? You'd need a way to search for all metadata objects tagged as belonging to that group, grab them, apply WoT/etc, then download payload for whatever you actually want. But public forums aren't one-to-many messaging, they're one-to-ANY messaging, and that's actually significantly different from your described use case.
There are not group tags but rather individual tags between each user. Picture it more like E-mail, I imagine. Okay, using Usenet as an example: Alice sends a message to a Usenet newsgroup, and it is encrypted and tagged. This is the message on the PIR-like server. But now Alice needs to let her two contacts, Bob and Carol, know about the message. So she sends an E-mail to Bob with the tag of the message and the key to decrypt it, and a different E-mail to Carol with the tag of the message and the key to decrypt it. She also says in her E-mail to Bob that she also sent the message to Carol, and in her E-mail to Carol that she also sent the message to Bob. So now Bob goes to the newsgroup and finds the message with the tag concatenated to it, then he decrypts it and reads the message. Now when Bob responds to the message he posts a brand new message to the Usenet group, and now he can either only tell Alice about the message if he does not want Carol to read it, or he can tell both Alice and Carol about the message. And the communication carries on like this: each participant makes a new post to the Usenet group and then E-mails everybody they want to see the post, telling them its tag as well as the key to decrypt it. There is not a 'group' that Alice and Bob and Carol are part of, and there is no tag that is shared between them; rather they act like a group and can communicate like a group because they all know each other's E-mail addresses and can inform each other about the posts they make in the 'thread' that they all know the others know about. Does that make sense?
The main reason that I think that public forums are so important to anonymous communities is that they're one of the few ways for new people to join in discussion. Otherwise, while you might end up with your own slowly-growing circle of trusted friends, and that works great for you. But the new guy can only talk to himself. Plus, you need a plausible way to introduce your sockpuppets to your friends. :)
Yes there is some difference between this and a truly public forum like SR; it will be harder to bootstrap into a system like this. Hopefully not by much. If you can think of a better design please share it; right now we still need to finish the fundamental cryptographic systems before we even really start on the forum that uses them. Next on the list is implementing a PIR-like system for message retrieval. Already done with mix code for forward messages, including a pretty sophisticated cryptographic packet format (will post my code in a week or so after I discuss this with some others who have contributed code as well, but I do hope to bring you and Astor and SS on board and release all the current code publicly).
Mix packet format implemented: https://research.microsoft.com/en-us/um/people/gdane/papers/sphinx-eprint.pdf
Also have modified Sphinx to support Alpha Mixing which is implemented: http://freehaven.net/doc/alpha-mixing/alpha-mixing.pdf
Also have modified Sphinx to use bloom filter to protect from a potential timing attack: https://en.wikipedia.org/wiki/Bloom_filter
So forward anonymity is totally done at this point, as well as a bunch of other stuff (i.e., the related ECC algorithms are wrapped very nicely, Tor is interfaced with, etc.). Right now I am looking for the PIR-like system to implement. This looks interesting:
www.cs.berkeley.edu/~dawnsong/papers/2009%20new%20techniques%20a16-bethencourt.pdf
so does this
https://www.usenix.org/conference/foci12/one-way-indexing-plausible-deniability-censorship-resistant-storage
Both of these look like they will be a challenge for me to implement; thankfully some others helping have more skill than I do. Do you think you would be able to help implement either of these, after looking through the papers briefly? (In the case of the Berkeley paper, note that we would need to use the dictionaryless system, as we will not have a premade dictionary of words to search for, and in the case of One Way Indexing some modifications will need to be made as well.) The Berkeley one at least needs a homomorphic encryption algorithm; thankfully there is this already done: http://hms.isi.jhu.edu/acsc/libpaillier/ . Also there is a rough implementation of that PSS already done in, I think, C++, but it is not production worthy and I would probably just use it as a reference. However, something more similar to the one way indexing is probably better, as it looks like we can prevent the PIR-like server from censoring specific content.
As an example (of a new guy, not sockpuppetry), I just showed up on SRF because I enjoyed browsing some of the security-related threads. I'm probably not the target market here. Don't buy/sell on SR, haven't seen anything for sale there that fits my boring lifestyle. But this place has an impressively diverse, open-minded crowd, and one that's actively *applying* anonymity technology.
Indeed :).
-
Facebook users' friends point them to relevant messages made by arbitrary posters, allowing high quality posts to propagate through the entire community, and high quality posters to expand their social networks, while spam is filtered by users and spammers are cut out by whitelists? I didn't realize that.
yeah you got me on the high quality posts bit i must admit, lol
-
This may be the first thread I've ever seen get rescued from a bunch of politics and trolling to something useful and informative. Really enjoying reading this.
-
Syndie is back under development https://syndie.de/index.html
It uses sort of the same spam controls, is decentralized, can be as anonymous as you want including message delays.
-
This may be the first thread I've ever seen get rescued from a bunch of politics and trolling to something useful and informative. Really enjoying reading this.
Agree - Very interesting.
Re: 'Dissent' (funded in part by DARPA) may well be the future. Pretty sure DARPA assisted in the early funding of Tor also. 'Breaking Good'.
-
Syndie can only be as anonymous as the infrastructure it uses. Pretty much we started from the bottom up, making a mix network (done) and PIR-like system (need to do) first, with plans to add a forum on top of it; they started from the top down, making a forum first and never making a strongly anonymous system underneath it (instead relying on existing anonymity infrastructure).
The mix network we have coded is provably more secure than Mixminion or Mixmaster, and it gives the user control over the trade-off between latency and anonymity, whereas Mixminion and Mixmaster both set the time delay for the user. Also, Mixmaster is considered deprecated at this point, Mixminion was an improvement on it in every way, and the mix network we have is an improvement on Mixminion in every way.
In fact, Syndie could probably be interfaced to use our system (alpha-mix-net + PIR-like system) with little trouble. However, I still want to make different forum software, especially because the UI for Syndie is pretty ugly. I don't know what encryption they are using, but I doubt it is ECC. I cannot even tell if they automatically handle key exchange. It looks like they automatically use encrypted zip files, but there is no documentation on how key exchange is done.
Syndie is cool, so is Frost for Freenet, even BitMessage is kind of neat despite being full of flaws (and it uses PIR even, but in a totally different way than we are going to) but I think we can make huge improvements over all of these systems.
For example, look at the Syndie graphic on the page you linked to. It can make use of already existing anonymizers such as Tor, I2P, Mixmaster, Mixminion, etc. It can use pretty much any anonymity infrastructure. It can also make use of already existing content archives, including HTTP archives, Mail archives and Usenet archives. The content storage systems it can currently use either do not support anonymous receiving of messages (other than via Tor etc), or they only support it via Everybody Gets Everything PIR. So you can delay forward messages to an archive with Mixmaster or Mixminion, but you still need to either use Tor to obtain them or mix many groups together using the same content storage server and then use Everybody Gets Everything PIR to download the messages.
This is in contrast to what we are doing. We are making a new anonymizer (alpha-mixing with sphinx, which is already implemented), and a new content archive (the PIR-like system, which also allows for anonymous receiving of messages without needing Everybody Gets Everything), and then after these things are done we are making a component that is more similar to Syndie (although hopefully with a better user interface, and probably with better automated security features, and other features such as integrated Bitcoin/Zerocoin) to make use of them.
So where their diagram has Syndie -> Mixminion -> Mail Archive, ours will have New Forum Software -> New Mix Network -> PIR-Like-System
and also ours will not have the various other lines connecting it to all the other infrastructure (anonymizer, jap, i2p, mixminion, mixmaster, HTTP archive, Mail archive, Usenet archive, etc). Although technically ours will be:
Forward Messages: New Forum Software -> Tor -> Specific New Mix Network -> Tor -> Specific New PIR-like-system
Message Retrieval: New Forum Software <-> Tor <-> Specific New PIR-like-system
Syndie is:
Forward Messages: Syndie -> Any Or No Anonymizer -> Any content archive
Message Retrieval: Syndie <-> Any Or No Anonymizer <-> Any content archive
-
kmfkewm - I really do get the desire to see which of your friends have seen links to the payload, and the desire to reduce unneeded duplication of metadata to share that.
I'm kinda stuck at this question:
How does the overhead of performing searches relate to the overhead of duplicated metadata objects?
At some point, every viewer of every thread will be performing an "is it read?" search one time for each contact they have that they want an answer for. You can cache results on the client, so it's just a "What about the folks that hadn't seen it last time?" set of searches every time you reopen the client.
You're trading an increase in CPU-time (to perform additional searches) to effect a decrease in storage (duplicated metadata objects).
The DoS/Flood exposure of the PIR/etc method seems to be twofold: CPU, in terms of search flood, and storage. Storage is what actually concerns me the most, without the ability to share the load of storage, ala Freenet. Proof of Work gives you a rate limiting mechanism that can be ratcheted up to an appropriate level, of course. But PoW works well to limit storage.. not so much with searches (CPU). "Please fill out a CAPTCHA to see if Alice has read this... please fill one out for Bob..." Obviously, you could trade a non-human-interactive PoW (some concept of hashcash, etc).
I just don't have a feel for how CPU-intensive each individual search is. If it's trivial, I think I like your method, but remain concerned about the Layer 7 reverse-mapping of content viewers. I'm afraid it exposes too much data about who has seen what, which would be a shame since PIR is starting with the best available method to keep that from happening. But I'll freely admit that I haven't thought this out all the way yet.
Writing or definitively auditing code that people will trust their lives to just isn't something I'm personally comfortable with doing, so I'm not much use to audit your code or implement algorithms from whitepapers. I have a lot of skillsets, but that one is too weak to trust anybody else's life with. Although the more I think about the direction the Internet/surveillance/etc is heading, the more I think that maybe it ought to be my focus moving forward.
-
subbed
-
kmfkewm - I really do get the desire to see which of your friends have seen links to the payload, and the desire to reduce unneeded duplication of metadata to share that.
It is more than just a desire to reduce bandwidth, although that does come into play as well. The primary reason why users need to be able to tell which of their friends have seen a payload, is so they know who to respond to when they make a response to the message in the payload. Picture it with E-mail and a mail archive:
Alice sends a tagged encrypted message to a mail archive. This is the payload. Now she wants Bob and Carol to see the message and be able to respond to it. So she sends Bob an E-mail with the tag of the message and a key to decrypt the message, and she tells Bob that Carol can also see the message. Alice sends the same E-mail to Carol, but letting her know that Bob can see the message. The E-mails Alice sends to Bob and Carol can be seen as the metadata packets. Now, if Bob knows who Carol is, when he makes a response to the message he can know to tell Carol about it. If Alice never told Bob that Carol knew about the message, Bob could only make a response and tell Alice about it. But in this case there is not group communication taking place, rather it is like Alice holds a conversation with Bob and independently holds a conversation with Carol, about the same topic. So it is required for group communication that Bob knows Carol is part of the group communicating. Now there are cryptographic tricks we can do to make this more secure, for example we don't want Bob to learn anything about Carol if he doesn't already know who she is. Additionally, we don't want Bob to even know how many people Alice pointed to the message, unless he knows all of the people that she pointed to the message.
So the most important reason for clients to know who all has seen a message is so that they know who to respond to when they make a response to the message. Saving bandwidth by not resending a ton of metadata packets is just an added advantage of this.
I'm kinda stuck at this question:
How does the overhead of performing searches relate to the overhead of duplicated metadata objects?
I am not sure I understand this question. If there are duplicated metadata objects (although each one is a bit different, even if it points to the same message), that will increase the number of searches that need to be performed as well. If Alice points Bob to a message he already knows about, then Bob still needs to search for that metadata object, because he doesn't know what it points to until he downloads it, and downloading it requires him to search for it. He is capable of searching for it because he has a shared secret search string between him and Alice, but until he actually downloads it he has no idea what it is he is downloading.
Here is an abstract from one of the papers that looks like a suitable candidate (however I still need to read the one way indexing paper, it looks like it might be better actually in that it could provide censorship resistance as well) :
www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA456185
A system for private stream searching, introduced by Ostrovsky and Skeith [18], allows a client to provide an untrusted server with an encrypted search query. The server uses the query on a stream of documents and returns the matching documents to the client while learning nothing about the nature of the query. We present a new scheme for conducting private keyword search on streaming data which requires O(m) server to client communication complexity to return the content of the matching documents, where m is the size of the documents. The required storage on the server conducting the search is also O(m). Our technique requires some metadata to be returned in addition to the documents; for this we present a scheme with O(m log(t/m)) communication and storage complexity. In many streaming applications, the number of matching documents is expected to be a fixed fraction of the stream length; in this case the new scheme has the optimal O(m) overall communication and storage complexity with near optimal constant factors. The previous best scheme for private stream searching was shown to have O(m log m) communication and storage complexity. In applications where t/m > m, we may revert to an alternative method of returning the necessary metadata which has O(m log m) communication and storage complexity; in this case constant factor improvements over the previous scheme are achieved. Our solution employs a novel construction in which the user reconstructs the matching files by solving a system of linear equations. This allows the matching documents to be stored in a compact buffer rather than relying on redundancies to avoid collisions in the storage buffer as in previous work. We also present a unique encrypted Bloom filter construction which is used to encode the set of matching documents. In this paper we describe our scheme, prove it secure, analyze its asymptotic performance, and describe several extensions.

The Internet currently has several different types of sources of information. These include conventional websites, time sensitive web pages such as news articles and blog posts, real time public discussions through channels such as IRC, newsgroup posts, online auctions, and web based forums or classified ads. One common link between all of these sources is that searching mechanisms are vital for a user to be able to distill the information relevant to him.

Most search mechanisms involve a client sending a set of search criteria to a server and the server performing the search over some large data set. However, for some applications a client would like to hide his search criteria, i.e., which type of data he is interested in. A client might want to protect the privacy of his search queries for a variety of reasons ranging from personal privacy to protection of commercial interests.

A naive method for allowing private searches is to download the entire resource to the client machine and perform the search locally. This is typically infeasible due to the large size of the data to be searched, the limited bandwidth between the client and a remote entity, or to the unwillingness of a remote entity to disclose the entire resource to the client.

In many scenarios the documents to be searched are being continually generated and are already being processed as a stream by remote servers. In this case it would be advantageous to allow clients to establish persistent searches with the servers where they could be efficiently processed. Content matching the searches could then be returned to the clients as it arises. For example, Google News Alerts system [1] emails users whenever web news articles crawled by Google match their registered search keywords. In this paper we develop an efficient cryptographic system which allows services of this type while provably maintaining the secrecy of the search criteria.

Private Stream Searching. Recently, Ostrovsky and Skeith defined the problem of "private filtering", which models the situations described above. They gave a scheme based on the homomorphism of the Paillier cryptosystem [19, 9] providing this capability [18]. First, a public dictionary of keywords D is fixed. To construct a query for the disjunction of some keywords K ⊆ D, the user produces an array of ciphertexts, one for each w ∈ D. If w ∈ K, a one is encrypted; otherwise a zero is encrypted. A server processing a document in its stream may then compute the product of the query array entries corresponding to the keywords found in the document. This will result in the encryption of some value c, which, by the homomorphism, is non-zero if and only if the document matches the query. The server may then in turn compute E(c)^f = E(cf), where f is the content of the document, obtaining either an encryption of (a multiple of) the document or an encryption of zero.

Ostrovsky and Skeith propose the server keep a large array of ciphertexts as a buffer to accumulate matching documents; each E(cf) value is multiplied into a number of random locations in the buffer. If the document matches the query then c is non-zero and copies of that document will be placed into these random locations; otherwise, c = 0 and this step will add an encryption of 0 to each location, having no effect on the corresponding plaintexts. A fundamental property of their solution is that if two different matching documents are ever added to the same buffer location then we will have a collision and both copies will be lost. If all copies of a particular matching document are lost due to collisions then that document is lost, and when the buffer is returned to the client, he will not be able to recover it.

To avoid the loss of data in this approach one must make the buffer sufficiently large so that this event does not happen. This requires that the buffer be much larger than the expected number of required documents. In particular, Ostrovsky and Skeith show that a given probability of successfully obtaining all matching documents may be obtained with a buffer of size O(m log m), where m is the number of matching documents. While effective, this scheme results in inefficiency due to the fact that a significant portion of the buffer returned to the user consists of empty locations and document collisions.
Note that we would need to use the extension in section 5.2 of the paper, which eliminates the globally public dictionary, as the keywords searched for are random, secret, and long.
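To make the quoted Ostrovsky-Skeith mechanism concrete, here is a toy of the core homomorphic trick using the python-paillier ("phe") package as a stand-in for libpaillier. It uses the public-dictionary variant for brevity, not the dictionaryless extension we would actually need. Note that phe exposes the Paillier homomorphism additively, so the paper's ciphertext product corresponds to '+' here, and E(c)^f corresponds to '*' by the scalar f:

from phe import paillier   # python-paillier, standing in for libpaillier

pub, priv = paillier.generate_paillier_keypair(n_length=1024)

dictionary = ["tor", "mix", "pir", "viagra"]      # public dictionary D
interest = {"pir"}                                # client's secret keywords K
query = {w: pub.encrypt(1 if w in interest else 0) for w in dictionary}

def server_side(doc_words, doc_as_int):
    # Combining the query entries for the words present yields E(c);
    # the server never learns c or whether the document matched.
    c = sum((query[w] for w in doc_words if w in query), pub.encrypt(0))
    return c * doc_as_int                          # E(c * f)

matching = server_side(["mix", "pir"], 424242)
boring = server_side(["tor", "viagra"], 999999)
print(priv.decrypt(matching))   # 424242: one keyword matched, c = 1
print(priv.decrypt(boring))     # 0: no keyword of interest was present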
At some point, every viewer of every thread will be performing an "is it read?" search one time for each contact they have that they want an answer for. You can cache results on the client, so it's just a "What about the folks that hadn't seen it last time?" set of searches every time you reopen the client.
Sure
You're trading an increase in CPU-time (to perform additional searches) to effect a decrease in storage (duplicated metadata objects).
Sort of, but the CPU time required for a client to decrypt a search is actually quite large. Decryption of search results can take many minutes and a lot of processing power at the client's end. So not searching for things you already have is probably a significant reduction in required client CPU time, as well as certainly a decrease in storage at the PSS (in this case) database.
The DoS/Flood exposure of the PIR/etc method seems to be twofold: CPU, in terms of search flood, and storage. Storage is what actually concerns me the most, without the ability to share the load of storage, ala Freenet. Proof of Work gives you a rate limiting mechanism that can be ratcheted up to an appropriate level, of course. But PoW works well to limit storage.. not so much with searches (CPU). "Please fill out a CAPTCHA to see if Alice has read this... please fill one out for Bob..." Obviously, you could trade a non-human-interactive PoW (some concept of hashcash, etc).
Yes, storage concerns me the most as well. Pretty much we need to have several PSS servers sharing the load to gain the benefit of decentralization (i.e., taking down a single server has no effect on the overall system). But it seems very wasteful to have, say, two servers with 4TB of storage capacity that need to be exact mirrors of each other. Adding more servers doesn't really increase storage capacity in this case, it only increases redundancy. But at the same time, having different messages go to different servers will very likely hurt the anonymity of the system and make it weak to traffic analysis. Just because PIR/PSS/OWI/whatever makes it so the clients can download messages without the server knowing which message they downloaded does not make it perfectly immune to traffic analysis. It sets a strong foundation upon which we need to build up our traffic analysis resistance. I think that we should certainly have proof of work to send messages, to at least discourage spam and flooding of the PIR-like servers. It is trivial to implement hash-based proof of work; I could make such a system in half an hour tops. I don't much care for the idea of CAPTCHAs: they are easy to circumvent and make things much less nice for the legitimate users.
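Since hash-based proof of work came up as the anti-flooding mechanism, here is roughly what I mean; a minimal hashcash-style stamp in Python. The difficulty constant and the message-plus-nonce framing are arbitrary choices for the sketch:

import hashlib
import itertools

DIFFICULTY_BITS = 16   # low so the demo runs fast; ratchet up as needed

def pow_stamp(message: bytes) -> int:
    """Find a nonce so sha256(message || nonce) has DIFFICULTY_BITS
    leading zero bits; costs ~2**DIFFICULTY_BITS hashes on average."""
    target = 1 << (256 - DIFFICULTY_BITS)
    for nonce in itertools.count():
        digest = hashlib.sha256(message + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce

def pow_verify(message: bytes, nonce: int) -> bool:
    """Verification costs one hash, so the server's check is nearly free."""
    digest = hashlib.sha256(message + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - DIFFICULTY_BITS))

msg = b"encrypted forum post"
nonce = pow_stamp(msg)       # the sender burns CPU once per message
assert pow_verify(msg, nonce)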
I just don't have a feel for how CPU-intensive each individual search is. If it's trivial, I think I like your method, but remain concerned about the Layer 7 reverse-mapping of content viewers. I'm afraid it exposes too much data about who has seen what, which would be a shame since PIR is starting with the best available method to keep that from happening. But I'll freely admit that I haven't thought this out all the way yet.
For clients, searching is very CPU-intensive. It will likely take several minutes (depending on CPU, of course) for the client to decrypt their obtained search results. I don't think searching is actually very CPU-intensive for the servers, but let me read the entire document through again before I make that claim. The PIR for Pynchon Gate is extremely CPU-intensive for the servers but not very much so for the clients. Most PIR systems I have read about are very CPU-intensive for the servers, and there tends to be a direct trade-off between CPU and bandwidth. For example, the easiest PIR is that of BitMessage, everybody-gets-everything, where no significant processing needs to be done by the server but it needs to send enormous amounts of data to every client; Pynchon Gate PIR is the opposite, in that the server needs to do enormous amounts of computation but only needs to send small amounts of data to every client. But in the PSS papers and other PIR-like systems I have read about, it seems that most of the load is put on the clients, which is actually fantastic.
Writing or definitively auditing code that people will trust their lives to just isn't something I'm personally comfortable with doing, so I'm not much use to audit your code or implement algorithms from whitepapers. I have a lot of skillsets, but that one is too weak to trust anybody else's life with. Although the more I think about the direction the Internet/surveillance/etc is heading, the more I think that maybe it ought to be my focus moving forward.
Do you think you are a better programmer / know more about security than the BitMessage people? Do you know not to directly encrypt payloads with RSA? Are you pretty good at math? Do you know C programming? The thing is, yeah, it is best if only top experts program stuff like this. But to help people we only need to be better than the "competition". Right now the competition for systems like this is not that good. The only systems similar to this are Syndie, BitMessage and Frost (and I suppose Dissent as well, but I don't think anyone is using that yet). Syndie doesn't include a network but only an interface that can be used with any network:storage pair. BitMessage is horrible and obviously not made by people who know what they are doing. Frost I know little about, but it is essentially relying on the anonymity of Freenet, which we can improve upon. Dissent I don't know enough about to comment on, but quite likely it is the best of the bunch considering it is coming from academics instead of hobbyists. Also we have solutions like Tor and PHP forums, and I2P and the same, but that is so far below the aspirations of this that it can hardly even be compared. We need to make this because the solution of Tor + PHP forum, or I2P and the same, is simply not anywhere near good enough. Even if Tor is programmed perfectly and the PHP forum has no flaws, by its very design it is just not going to be resistant enough to strong attackers.
Pretty much what I am trying to say is that even if you are not the top security expert in the world, you can still contribute to this and still look over code that is done. People's lives will not depend on you, they will depend on everybody who contributes to the project. The more people who contribute to writing code and to auditing code, the better it is going to be. At first I tried doing this all by myself, and two years later I realized I had bitten off more than I could chew and other people became involved. Things improved! Not all of the people involved are experts, I consider myself only to be a hobbyist as well, but I still implemented a cryptographic packet format, and the expert people who I asked to audit my code said it looked correct. That was the first cryptographic system I implemented, and it was hard work, but by sticking to the whitepaper and researching the shit out of things, I was able to produce a system that true experts in the field said I had produced correctly.
-
I'm afraid it exposes too much data about who has seen what, which would be a shame since PIR is starting with the best available method to keep that from happening. But I'll freely admit that I haven't thought this out all the way yet.
There are two different problems. The first problem is preventing people from learning who (IP) has seen what. The second problem is preventing unwanted people from learning who (pseudonym) has seen what. PIR protects from the first problem: it allows an IP address to download messages for a pseudonym without anybody being able to determine which pseudonym's messages they are downloading. Problem two is more a concern from a network analysis perspective. But in many cases we do want many people to know which pseudonym has seen what, so they know who to respond to.
I can give an example with Pynchon Gate. Pynchon Gate is designed for person to person communications, not for group communications (unfortunately, because it is comparatively easy to implement; I could be done with it in like a month given what we already have done). Pynchon Gate makes use of the following components:
Mix Network -> To send forward messages
Nymserver -> To receive messages for users, it processes messages sent to a user and bunches them together into buckets for the PIR server
PIR Server -> To store messages for distribution, to let users download message buckets anonymously
Alice wants to talk to Bob. So she makes him a message and sends it through the mix network to the Nymserver, addressed for Bob. The Nymserver gathers many messages for Bob over a period of time called a cycle. At the end of each cycle, the Nymserver groups together all messages for Bob into what is called a bucket, then pads the bucket to a fixed size and uploads it to the PIR servers. The PIR servers put Bob's bucket in a certain numbered position in their database (PIR is usually indexed by number; if it is indexed by keyword then it is usually something else based on PIR. The only exception I can think of is everybody-gets-everything PIR), and then they release a list of every pseudonym and its current bucket number, which everybody downloads with everybody-gets-everything PIR (pseudonym-to-bucket-position is downloaded with everybody-gets-everything PIR, actual message buckets are downloaded with a more sophisticated PIR). Every cycle, every client downloads the entire list of pseudonym:bucket_number pairings. Then they find their current bucket's index number and perform the more sophisticated PIR protocol with the PIR servers to download their messages. Because of the PIR, the PIR servers cannot tell which IP is linked to Bob: even though Bob gets the message bucket at the position known to be 'Bob's messages', the PIR servers cannot tell this.
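As a rough illustration of the end-of-cycle bucketing step just described, here is a sketch; the bucket size, the length-prefix framing, and the overflow behavior are all my assumptions, not the actual Pynchon Gate format:

BUCKET_SIZE = 64 * 1024   # assumed fixed bucket size

def make_bucket(messages):
    """Concatenate one pseudonym's messages for this cycle and pad to a
    fixed size, so no bucket leaks how much mail its owner received."""
    blob = b"".join(len(m).to_bytes(4, "big") + m for m in messages)
    if len(blob) > BUCKET_SIZE:
        raise ValueError("over capacity; a real design must defer or split")
    return blob + b"\x00" * (BUCKET_SIZE - len(blob))

bucket = make_bucket([b"message from alice", b"message from carol"])
assert len(bucket) == BUCKET_SIZE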
Anyway this is fine and great for person to person communications. The Nymserver cannot link Alice to Bob, or anybody else to Bob, because they don't need to say who they are when they send messages to Bob. But if you try to turn this into a group communication system, a lot of problems come up. Let's say that Alice wants to send a message to Bob and Carol. Okay, there are two options here now. Either Alice can send a copy of the message to Bob, and then re-encrypt it and send a brand new copy to Carol, or Alice can send a single message to a node that sends a copy to Bob and Carol.
In the first case, it is simply bandwidth prohibitive. If Alice communicates with 100 people, we cannot have her sending 100 copies of the same message. That drains bandwidth from the mix network in such a huge way that it is not feasible. The second option greatly reduces the bandwidth requirements. In this case, Alice sends the same message to her 100 contacts down the same circuit, and only the key to decrypt the message is encrypted uniquely to each of her contacts. Alice would need to include for the final mix node each of the contacts to send the message to, and at the final mix node the message can be split off to the different nymservers (and where many people share a single nymserver, only one message needs to be sent to it with the list of recipients). This brings the bandwidth requirements into the realm of the reasonable: Alice doesn't send 100 messages individually through the mix network, each to be individually delivered to a nymserver; she sends a single message down the mix network, and then it is sent one time to each nymserver from the final mix. So this is much better, but there is still a huge problem! The nymservers can see that the same ciphertext has arrived for multiple people, and now they can socially link those people together. This is a social network analysis problem. Since PIR is being used, the PIR servers still cannot link an IP address to a pseudonym, but pseudonyms communicating can be linked together by a third party (the nymserver, the final mix).
Using Keyword Search instead of a nymserver can eliminate all of these problems. Now instead of sending a message to Bob's nymserver and telling the nymserver the message is for Bob, Alice merely sends a message tagged with a shared secret between her and Bob directly to the PIR-like server. The PIR-like server no longer associates this message with "A message for Bob", but rather can only tell that it is some message for somebody. So now Bob can download the message by doing a keyword search. And this actually scales to group communications. Because when Alice sends the same message for all 100 of her contacts, yes there is a single ciphertext still but it is indexed by 100 different arbitrary strings. It is not a message for Bob and Carol and Doug etc, it is a message for 100 random strings that will never be used in the future and have never been used in the past. So this solves the problem of network analysis and actually allows us to totally remove the nymserver.
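A sketch of how those one-use tags could be derived, assuming both sides ratchet a counter over a pairwise shared secret (the HMAC construction and the label are my assumptions, not a specified protocol):

import hashlib
import hmac

def one_time_tag(shared_secret: bytes, counter: int) -> bytes:
    """The counter-th index tag for a pairwise shared secret. Both sides
    compute the same sequence; each tag is searched for exactly once."""
    msg = b"index-tag|" + str(counter).encode()
    return hmac.new(shared_secret, msg, hashlib.sha256).digest()

# Alice uploads ONE ciphertext, indexed under a fresh tag per contact:
next_counters = {b"alice-bob secret": 17, b"alice-carol secret": 4}
tags = [one_time_tag(s, c) for s, c in next_counters.items()]

To the server, each tag is just a random-looking string that has never appeared before and will never appear again.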
But of course if Alice sends a message to Bob and Carol, she wants Bob to know to tell Carol about any replies he makes to it. Otherwise no group communication is taking place. So the problem is third-party network analysis; the communicating parties themselves should be able to determine who all is involved in the communication.
So yeah, there are a few closely related things here, but they are distinct: internet network analysis and social network analysis. PIR takes care of the first problem, but not the second. I think we can use keyword-indexed PIR-like systems to take care of the second problem. But in some cases it is not even a problem, because we need communicating parties to know that they are communicating with each other, we just don't want unwanted third parties to know this. If we allowed unwanted third parties to learn who is communicating with whom from a *social network* perspective, and only not from a *computer network* perspective, we could just use a modified Pynchon Gate and be done with this a hell of a lot faster :).
-
It is more than just a desire to reduce bandwidth, although that does come into play as well. The primary reason why users need to be able to tell which of their friends have seen a payload, is so they know who to respond to when they make a response to the message in the payload. [..]
So the most important reason for clients to know who all has seen a message is so that they know who to respond to when they make a response to the message. Saving bandwidth by not resending a ton of metadata packets is just an added advantage of this.
That makes sense. I was still thinking of this as a private message thread between X individuals, but it's actually something different that's a mix between that and a Usenet posting.
(excuse long example.. have to go a ways to explore this) You, I, and Alice can have a discussion thread between the three of us, and about ten messages in, you decide, "You know, Bob would have a good opinion on this", then invite Bob to the thread. Bob doesn't have anything to say, but he thinks Carol and Dave might. So he sends it to Carol and Dave, who both post responses to the thread. I decide to send it to Frank. At this point, Frank thinks, "Bob would like this", and you're wanting a way to prevent Frank from resending it to Bob. At this point, everyone who can see the thread can see that you, me, Alice, Carol, and Dave are participating, because we've posted messages in the thread. But they don't know who else has *seen* it but not posted a public response.
Just throwing this out there: What if you tagged every payload posting/response with a (non-brute-forceable) index key that lets everyone who knows the index key find all the messages? Then rely on users inviting others by sending metadata objects containing the thread index key and the session key to the thread payloads? I know you give up unique per-public-message-in-thread session keying, but only when someone becomes a full participant (they need the index key and the session key) in the thread. And in either scenario, a compromised thread member with access to public messages in the thread results in a disclosure of all public messages in the thread.
I'm sure my suggestion gets rid of some of the granularity in the invitation process in exchange for simplicity/less searches, but regardless of method, anyone with a full view to the thread can disclose the contents of the thread however they want anyway. And you should kick the first person to suggest DRM as an answer firmly in the nuts.
I'm having trouble imagining a scenario where I want a new thread participant to only see new messages going forward, or to not see some public messages because he doesn't know the author (except for his WoT/whitelist/etc settings, which are on him). And any control I can imagine to keep that from happening is trivially subvertable by any member of the thread with a full view. Or even with just "a better view" than the new participant, since they can send him the decrypted messages if all else fails.
Sort of, but the CPU-time required for a client to decrypt a search is actually quite large. Decryption of search results can take many minutes and a lot of processing power at the client's end. So not searching for things you already have is probably a significant reduction in required client CPU-time, as well as certainly a decrease in storage at the PSS (in this case) database.
Yes, storage concerns me the most as well. ... But it seems very wasteful to have, say, two servers with 4TB of storage capacity that need to be exact mirrors of each other. Adding more servers doesn't really increase storage capacity in this case, it only increases redundancy. But at the same time, having different messages go to different servers will very likely hurt the anonymity of the system and make it weak to traffic analysis.
I think you're going to just have to rely on PoW as a limiting factor. You can't have any Freenet-style caching of content, because there's no concept of last-access-time (without massively diluting the blindness of server owners, which is 99% of the point of PIR), so you just have to eat the oldest data first. Everything is a tradeoff, and in this case, storage is the Achilles heel of being completely blind to content in any form. And that's the whole point of the PIR/EKS/etc system. It's worth that tradeoff. But as soon as somebody can give you 4TB worth of flood, your database has been effectively emptied. All messages have been lost, thanks for playing. Move along.
For clients, searching is very CPU-intensive. It will likely take several minutes (depending on CPU, of course) for the client to decrypt their obtained search results. I don't think searching is actually very CPU-intensive for the servers, but let me read the entire document through again before I make that claim.
Fantastic.. The more of the CPU load you can shift to the client, the better I like it. You get some PoW-like benefits, and your server scalability improves.
Do you think you are a better programmer / know more about security than the BitMessage people? Do you know not to directly encrypt payloads with RSA? Are you pretty good at math? Do you know C programming?
Being honest here: no, yes, hell yes, no, and barely. :)
I'm not saying I have to be djb before I can help, I'm saying programming has never been my focus. Again, I'm starting to work on that, but I'm a long ways off. I've been duct-taping things together for a few decades, and I'm reasonably comfortable reading C code up to a point, and will definitely look at anything you post on github, but I'm just setting proper expectations. I could probably design and implement something like Tails from scratch (a great example of integrating and duct-taping other people's code, not actually developing anything with an attack surface from scratch) without a technical problem. But if I wanted to write Bitmessage from scratch, that'd take me a year and a shitload of learning. And you'd laugh your ass off at it when I got done.
Pretty much what I am trying to say is that even if you are not the top security expert in the world, you can still contribute to this and still look over code that is done.
I'm happy to help however I can.
-
Well, I read the One Way Indexing whitepaper and it is not what we are looking for. Their abstract was much more impressive-sounding than the rest of their paper.
-
Just throwing this out there: What if you tagged every payload posting/response with a (non-brute-forceable) index key that lets everyone who knows the index key find all the messages? Then rely on users inviting others by sending metadata objects containing the thread index key and the session key to the thread payloads? I know you give up unique per-public-message-in-thread session keying, but only when someone becomes a full participant (they need the index key and the session key) in the thread. And in either scenario, a compromised thread member with access to public messages in the thread results in a disclosure of all public messages in the thread.
This is something I have been recently considering as well. In many ways it would be an improvement if we didn't break things down to such a fine degree, by which I mean we had forum-shared index tags and group-shared keys rather than pairwise indexing tags and pairwise keys. It would be a lot more like a traditional forum. Instead of Alice tagging a message with an index string and then telling Bob and Carol about it with metadata packets (one indexed with the secret tag Alice shares with Bob, another with the tag she shares with Carol), Alice could tag a message with a single group index tag, and the message could be decrypted with a shared group key. This would have several advantages:
1. As people are invited to the forum, they can easily download all the old posts of the forum that are still in the cache. All the posts are indexed by the same group index tag, and all of them are encrypted with the same key. Someone who has the group index tag and the key can easily download all forum messages, without having to be pointed to each of them. This would make it a lot more like a traditional forum.
2. In being more like a traditional forum, it would probably make group organization a lot easier. Like you said earlier, a public forum is one-to-any, not one-to-many. If someone has the group index tag and the group shared encryption key, they can make a post and immediately know that anyone in the group can read it, even if they don't know everybody in the group. I don't know everybody on this forum, but when I make a post on this forum I know anybody who knows about this forum can read it.
3. It would make things less complicated. Pretty much, we would be nearly done with everything after implementing private stream searching. The actual forum itself would mostly just be a GUI; we wouldn't need to implement systems for people being able to point to posts, etc. In general it would be much simpler and easier to understand.
There are also several really big problems with such a system though.
1. It is less compartmentalized. If Alice makes a post and she only wants Bob and Carol to be able to read it, well she cannot tag it with a group shared tag anymore, unless only Bob and Carol are part of the group. If they have a group shared tag between them, it is just complicating the original system more. What if Alice then wants to make a post for only Carol and Doug? Do they need a new group tag between them? It would be easier in such circumstances if they all have individual pairwise tags between them, and build up groups by adding pairwise shared index tags to the messages they send. This makes it easier to dynamically create groups on the fly.
A. On the other hand, Alice could tag the post for the group, but include in it a decryption key only encrypted for Bob and Carol. It could also be tagged with a shared secret between only Bob and one for only Carol. This would allow the entire group to see that someone posted a message, but only Bob and Carol would be able to decrypt it and know it is for them. That said, this also has some problems of its own.
2. This is probably the biggest issue. If the messages to the group are all indexed by the same tag, then every single time Alice searches for posts, she will get all of them! By indexing messages with a one-use shared secret string, Alice knows she will only get new messages with each search, because she will not search for tags she has already found messages associated with. How could we have unique group tags for every message, if we assume that Alice doesn't know everybody in the group? We could iterate a hash over the original group tag, I suppose (see the sketch after this list). But there are some anonymity issues with this.
3. How do we protect from spammers? If messages are searched by a group shared tag, whitelists will not work. If a spammer learns the group shared tag, what is to stop them from spamming thousands of messages? Everybody in the group will download the messages. The only way around this is to have two keywords and to only download items that match both, the first being a group tag and the second being some sort of individual identifier. But it is a question in itself how we could make the individual identifier in such a way that it cannot be forged. The first solution that comes to mind is a pairwise shared secret between the poster and every single person they expect to read the message. Any solution likely requires that every member of the group either has a whitelist of known posters or is vulnerable to being spammed to death. If they go with a whitelist, then it is back to one-to-many instead of one-to-any, and we have the same social bootstrapping problem as before.
4. Groups of any significant size will not be able to put much faith in the encryption key of their messages. This goes back to the loss of compartmentalization. If there is a group shared key, that means for a large group thousands of people could have the decryption key. And if a single key is compromised, all messages encrypted with that key can be decrypted; if there is a group shared key, that means all of the group communications can be decrypted. This isn't so much a problem for public groups, since they will not really be concerned about the encryption of their messages anyway (although they will still be encrypted at least so the PSS servers can have some deniability). But what about a private group with twenty members? I suppose that even if each message is encrypted with a new key, if one of the clients is compromised or malicious, all messages that they can decrypt are already compromised. From a cryptographic point of view, breaking a single key that protects 100,000 messages is much more damaging than breaking a key that protects only one, but I suppose we hope that it is impossible to break any of the encryption even a single time.
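Here is a sketch of the "iterate a hash over the group tag" idea from problem 2: derive a chain of per-message tags from the group tag, so members can compute every tag but never search for the same one twice. The construction is my assumption, and it also illustrates the anonymity issue, since anyone who learns the group tag can compute the whole chain:

import hashlib

def tag_chain(group_tag: bytes, count: int):
    """Per-message tags tag_1..tag_count by iterated hashing."""
    tags, t = [], group_tag
    for _ in range(count):
        t = hashlib.sha256(t).digest()
        tags.append(t)
    return tags

# A client that has already seen 40 messages searches only for tags 41-50:
unseen = tag_chain(b"group tag for forum X", 50)[40:]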
I suppose that what you are saying is a bit of a mix between the extreme singularity of a public forum and the extreme granularity of what I suggested. You suggest per thread encryption and indexing, whereas I was originally suggesting per message encryption and indexing, and the other alternative would be per group indexing and encryption. Perhaps per thread will be the best compromise. It will certainly make organization easier, which will be one of the biggest hassles with per message indexing/encryption. It still has some issues we would need to think on though.
I think this is one of the areas that we still need to discuss, and I appreciate any feedback from anyone else reading this. Pretty much at this point I believe we are done with posting messages to the forum, and we are done with encryption of messages, and we are done with all of the ground work. At this point the primary thing to work on is receiving messages from the forum, which is very likely going to be done with Private Stream Searching, as One Way Indexing didn't turn out to be as cool as I hoped it would. Actually using PSS presents one problem, in that it is not resistant to censorship. But back to the main point:
Assuming we have a strong system for making posts and a strong system for receiving posts, how do we actually make a system for group communications on top of this? At one extreme we can have a totally public forum type system, but then we need to take care of at least several of the points I made above. At the other extreme we can have an extremely compartmentalized per-post indexed messaging system that users sort of work into a forum looking thing themselves. In the middle we have things like per-thread indexing. We still have time to figure out what will be best, because no matter what we decide on it can use the same fundamental infrastructure. Actually, as the design work on the fundamental infrastructure is near completely done at this point (it pretty much is done unless we can find something with the properties of PSS that is also censorship resistant), then the best way people who are not programmers can help is by thinking of how we want the actual forum/communication part of the system to work, as that is something that is less solidified in design at this point.
I'm sure my suggestion gets rid of some of the granularity in the invitation process in exchange for simplicity/less searches, but regardless of method, anyone with a full view to the thread can disclose the contents of the thread however they want anyway. And you should kick the first person to suggest DRM as an answer firmly in the nuts.
Yes, the system I originally suggested is extremely granular. In some ways that is a good thing, in other ways it is probably a bad thing. It is particularly bad when it comes to keeping a single perspective of a single forum, and organization will be a challenge to say the least. Keep in mind that single searches can include multiple keywords and return multiple documents. Indeed, one of the huge advantages of PSS over PIR is that we can return all documents tagged with either "From Bob" or "From Carol", rather than the single document at position 321 in the PIR database (which requires a nymserver to make sure only messages for Alice are put at that position).
I'm having trouble imagining a scenario where I want a new thread participant to only see new messages going forward, or to not see some public messages because he doesn't know the author (except for his WoT/whitelist/etc settings, which are on him). And any control I can imagine to keep that from happening is trivially subvertable by any member of the thread with a full view. Or even with just "a better view" than the new participant, since they can send him the decrypted messages if all else fails.
Yes, that is a problem with the system I suggested as well, and another advantage I can add in favor of the less granular designs. In the less granular designs, it will be much more likely for new participants to see messages made in the past, whereas with the more granular design it will be much less likely but still possible. In the more granular design, it will be much less likely that an individual can even see all posts in the thread; rather, they might only be able to see some subsection of them, although we would hope they can see everything people want them to see.
I think you're going to just have to rely on PoW as a limiting factor. You can't have any Freenet-style caching of content, because there's no concept of last-access-time (without massively diluting the blindness of server owners, which is 99% of the point of PIR), so you just have to eat the oldest data first.
I think that PoW is the only real solution as well, unfortunately. It will at least make it significantly harder for a single person to spam the shit out of the system. We could have it so users re-upload posts in popular threads, though; the users know if a thread is popular.
Everything is a tradeoff, and in this case, storage is the Achilles heel of being completely blind to content in any form. And that's the whole point of the PIR/EKS/etc system. It's worth that tradeoff. But as soon as somebody can give you 4TB worth of flood, your database has been effectively emptied. All messages have been lost, thanks for playing. Move along.
All messages on the server have been lost, but users keep content client side. It would be like if I have a complete mirror of SR, and then an attacker wipes the SR server. Well, I still have a copy of it!
Fantastic.. The more of the CPU load you can shift to the client, the better I like it. You get some PoW-like benefits, and your server scalability improves.
Indeed
Being honest here: no, yes, hell yes, no, and barely. :)
I'm not saying I have to be djb before I can help, I'm saying programming has never been my focus. Again, I'm starting to work on that, but I'm a long ways off. I've been duct-taping things together for a few decades, and I'm reasonably comfortable reading C code up to a point, and will definitely look at anything you post on github, but I'm just setting proper expectations. I could probably design and implement something like Tails from scratch (a great example of integrating and duct-taping other people's code, not actually developing anything with an attack surface from scratch) without a technical problem. But if I wanted to write Bitmessage from scratch, that'd take me a year and a shitload of learning. And you'd laugh your ass off at it when I got done.
I'm happy to help however I can.
Something tells me you could write BitMessage in much less than a year. It is actually a really simple system. Keep in mind that a lot of programming stuff is indeed gluing other people's code together. I didn't write ECDH or AES, but I am making extensive use of both. Perhaps the best way you can help is by helping on the design of the forum component. Assume we have an anonymous system for making posts and an anonymous system for receiving posts by keyword, and anything can be encrypted strongly. How do we go from this to a group communication system? How do we go from the group communication system to a full-fledged marketplace? Largely, those are the remaining design questions.
-
for future reference: (ps: I hope to post code soon)
http://spar.isi.jhu.edu/~mgreen/correlation.pdf
1 Introduction
The past decade has seen growing interest in techniques for protecting critical data even in the face of catastrophic storage failure. Recently, a number of loosely-related approaches have surfaced which guarantee data availability by massively replicating records across decentralized, potentially untrusted, “survivable” storage networks. To ensure the continued availability of content after storage nodes fail or leave the network, survivable storage networks continuously re-distribute replicas from machine to machine. Ideally, content redistribution is provided as a service of the network, and should not require the active participation of content publishers.
Recently, Srivatsa et al. [29] showed that a number of survivable storage systems (e.g., [20, 1]) are vulnerable to targeted denial of service attacks, as these systems make no attempt to hide the location of content replicas within the network. An adversary can locate selected file replicas via the network’s search mechanism, or by manually examining stored collections for identical instances of a replica. Once located, the adversary can limit access to the selected files (and defeat survivability) by disabling the small subset of storage nodes which host the target content.
In this work, we propose techniques for correlation-resistant storage, which protect content replicas from targeted attacks while allowing for continuous re-distribution by the storage network. The approach we describe allows untrusted nodes to dynamically re-encrypt (i.e., randomize) file replicas such that an adversary cannot link the new replicas to others within the system. Simultaneously, we provide a flexible search mechanism which allows authorized receivers to locate any matching replica by querying storage nodes on information such as a keyword or other identifier. We note that maintaining correlation-resistance while achieving this remote search facility is challenging when storage nodes are untrusted, as one must prevent malicious nodes from re-using search queries to locate matching replicas at other locations in the network. In that regard, the primary contribution of this paper is a new form of searchable public key encryption scheme which allows for node-targeted keyword search, i.e., queries sent by users to a specific node cannot be re-played by that node to locate files stored elsewhere. Our keyword search scheme is related to the schemes of [12, 34], but enables randomization of indexes and is provably secure in the standard model.
-
This paper discusses the bandwidth and processing costs of a PSS algorithm
www.cs.berkeley.edu/~dawnsong/papers/stream-search.pdf
Right now the only missing piece of the puzzle is censorship resistance. I still have not found a way to allow a server to host content that it might itself be able to download, without it being able to remove the content from itself. I am not sure if it is even possible. Perhaps we will need to use multiple PSS servers and hope they do not all censor content, but this requires us to bring traffic analysis into more consideration. Anybody have any thoughts?
Our system for private stream searching allows a range of applications not previously practical. In particular, we have considered the case of conducting a private search on essentially all news articles on the web as they are generated, estimating this number to be 135,000 articles per day. In order to establish the private search, the client has a one time cost of approximately 10 MB to 100 MB in upload bandwidth, based on various tradeoffs. Several times per day they download about 500 KB to 7 MB of new search results, allowing up to about 500 articles per time interval. After receiving the encrypted results, the client spends under a minute recovering the original files, or up to about 15 minutes if many files were retrieved. This performance would be typical of a desktop PC; a mobile device would be capable of handling a somewhat less demanding scenario. To provide the searching service, the server keeps about 10 MB to 100 MB of storage for the client and spends roughly 500 ms processing each new article it encounters. These costs are comparable to many free services currently available on the web (e.g., email and webhosting), so it is likely the private searching service could be provided for free. With high probability, the client will successfully obtain all articles matching their query, and in any case the server will remain provably oblivious to the nature of their search.
Most of the parameters of this scenario (e.g., the number of distinct articles generated per day, the number of distinct words per file, the size of a file, etc.) are probably less than one or two orders of magnitude different than for the other online searching situations mentioned in Section 1 (such as blog posts, USENET, online auctions). We expect our techniques to be applicable to many of these searching applications. The complete algorithms for the private searching scheme are presented along with complexity analysis and formal security proofs in [2].
-
So long-term, how do you see this implemented from a client/server perspective? A server or two on public IP addresses, with clients connecting directly, not hiding the fact that they're connecting to the service, but with the specific details of their traffic obviously hidden by the protocol and underlying encryption? Or letting the server ride on top of Tor/i2p/etc to provide a layer of (decorative?) transport anonymity?
I'll start thinking about the groups structure.. haven't really put much/any thought into anonymous marketplaces (weird, right?), but will start putting some thought into that as well.
-
He said a long time ago it would work over Tor.
-
I've been thinking quite a lot about group messaging and the whole PIR/EKS/etc architecture (will just call it PIR), and I think that the PIR architecture may be the wrong way to implement any "public" group messaging involving Web of Trust and thresholds like we've discussed earlier in this thread.
Group discussion is one-to-any communication at its core. This means that payload encryption is decorative at best (if you're really going to give everyone access to decrypt it, why are you encrypting it?). Verifying integrity, authenticity, and probably non-repudiability through crypto are all great features for group discussion, but the payload itself doesn't support effective encryption.
Without some method of trivially usable read/write storage shared by all clients, many easy things become hard. Namely, we can't share WoT easily.
Because of PIR's strong anonymity protections against tying client actions to content, there's no effective mechanism to edit existing content. Which means that whenever kmfkewm's WoT changes, he has to automatically post a full version with the update. For me to inherit WoT properly, I have to periodically perform a search for the latest version of the WoT for each of the trusted members of my own WoT. The more CPU-intensive each search is, the bigger of a deal this is (probably to the client more than the server, from the sounds of it).
Implementing WoT requires a read/write location for the storage and public-visibility of individual Identities' Web of Trust databases. If I trust kmfkewm, and extend some level of trust to whomever he trusts, I need a way to view his WoT database. And tomorrow, when he adds astor to it, or just increases his level of trust for astor, I need to be able to see the change to his WoT to properly calculate my own inherited value of trust for astor.
Without some method of trivially usable read/write storage shared by all clients (that we totally take for granted in traditional architectures), some easy things become hard. And I think WoT is one of those things.
Additionally, the largest benefit of decentralized WoT-based group messaging is that it's a whitelist-driven activity without any central administration. You don't download messages from people you don't trust, or, phrased a better way, you don't download messages that are below your threshold of trust for your current view. The PIR makes this especially complicated, since I need to query something roughly like this:
For Message in (QUERY CommunityBoardThread == "SRF:Security:Dissent.." && MessagePostDate > myLastReadMsgDate):
    If Get_WoT_Value(Message.Sender) > MyTrustThreshold:
        (download message)
I think that's impossible with PIR. Particularly the server calculating my WoT values for each message and doing the filtering for me.
So I need to swap it around, and do the query by my WoT, but that's an even larger set of queries as my WoT (and inherited trust) expands exponentially:
1. Pre-calculate all Identities who would have a trust value above my Trust Threshold and store that in InterestingIdentities
2. For Identity in (InterestingIdentities):
       For Message in (QUERY Message.Sender == Identity && MessagePostDate > myLastReadMsgDate):
           Filter based on Community/Board/Topic/etc.
The problem with the second option is that it prevents me from downloading messages from unknown identities (so I can't choose to view messages from people I don't know). It forces a quandary: "Either download everything then throw away stuff from people you distrust, or download only messages from people you know". You either end up with a killfile, or end up living in a fishbowl.
And WoT will end up with a huge number of queries if I'm looping through my identities, picking up possible messages from each. I trust ten people. My ten people trust ten people. Etc. (That's 10 identities at one hop, 100 at two, 1,000 at three: over 1,100 queries just to walk three hops.) It's a significant number of queries pretty quickly.
The only other way to do it that I can see would be:
1. Download all metadata objects for group messages created after MyLastReadMsgDate with a Community/Board/Topic of "SRF:Security:*"
2. Make a unique list of Senders (authors) of those messages.
3. For each Sender, calculate WoT. If above Threshold, download payload.
WoT can be stored two ways.. you can either have everyone publish their full WoT database, with values they've assigned, or you can have everyone publish digitally signed statements of trust for each Identity they trust, then bulk query all available signed statements for a given Identity. When they change, I guess you take the one with the latest version.
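For the signed-statement variant, something like this sketch is what I picture, using Ed25519 via PyNaCl; the field names, trust scale, and versioning rule are placeholders, not a settled format:

import json
from nacl.signing import SigningKey

sk = SigningKey.generate()
statement = json.dumps({
    "truster": sk.verify_key.encode().hex(),   # my identity key
    "trustee": "<astor's public key hex>",     # placeholder value
    "level": 75,                               # trust value, scale assumed
    "version": 4,                              # highest version wins (assumed)
}, sort_keys=True).encode()

signed = sk.sign(statement)    # publish this blob to the shared store
sk.verify_key.verify(signed)   # anyone can verify it without contacting me

Bulk-querying all such statements for a given trustee then gives you the inputs for your own inherited-trust calculation.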
So I'm starting to think that the only two ways you could do WoT-based group messaging are either with a central server handling the WoT calculations (so you've built a "decentralized" system that runs on a central server), or with extremely large numbers of queries being required to get new messages (I'm guessing Freenet FMS falls into this category). In the latter case, queries have to be trivially cheap to perform, because clients make a lot of them.
PIR is perfect for one-to-one messaging, and you can extend that to one-to-many with clever mechanisms, but I think for one-to-any messaging, it may not be as good of a fit.
-
is this a real thing or one of those pipe dream projects
-
Group discussion is one-to-any communication at its core. This means that payload encryption is decorative at best (if you're really going to give everyone access to decrypt it, why are you encrypting it?). Verifying integrity, authenticity, and probably non-repudiability through crypto are all great features for group discussion, but the payload itself doesn't support effective encryption.
Group discussion isn't always one-to-any, but public discussion is. There are plenty of private forums with screened memberbases; they might want to encrypt communications to prevent outsiders from being able to see them while allowing members to see them. Some forums have 80 members, others have 600. In the past there have even been groups that encrypted all of their messages with GPG with a group shared key; this would essentially do the same thing, but automatically and better. Also, in some instances three or four people might want to talk together, they would consist of a group, but they don't want outsiders to see their communications. Group OTR would be an example of something that would solve this problem, and I think it is a problem worth solving and something that people would need.
The only case where the encryption becomes decorative is in the case of public messages that anybody can see, but even there it still serves a purpose. Freenet message level encryption can be thought of as largely decorative as well: keys to decrypt the content on Freenet are widely available in many cases. In many cases nodes can identify the encrypted content passing through them if they want to, and they can decrypt it as well with the publicly available decryption keys. But the message level encryption still serves the purpose of plausible deniability, because even though a node could decrypt a message, it doesn't mean it actually has, just like Freenet nodes could decrypt a lot of the content they host but it doesn't mean they went out and got the publicly available decryption keys.
So in this case public message encryption would serve the function of protecting the PIR servers by providing them some plausible deniability. Let's say the government knows CP has been uploaded to the network, and then they seize a PIR node and say they found CP on it. They can say: look, it was right here in plaintext! You must have known about it! But if it is encrypted and the same thing happens, the person operating the PIR node can say: well, sure there is CP there, but I am not involved with CP so I never looked for the keys to decrypt it! It might not buy a lot to encrypt publicly viewable messages, but given how trivial it is to do, and the fact that it seems to buy a little, I think it is worth it.
Without some method of trivially usable read/write storage shared by all clients, many easy things become hard. Namely, we can't share WoT easily.
Trivially usable read/write storage isn't anonymous or secure.
Because of PIR's strong anonymity protections against tying client actions to content, there's no effective mechanism to edit existing content. Which means that whenever kmfkewm's WoT changes, he has to automatically post a full version with the update. For me to inherit WoT properly, I have to periodically perform a search for the latest version of the WoT for each of the trusted members of my own WoT. The more CPU-intensive each search is, the bigger of a deal this is (probably to the client more than the server, from the sounds of it).
I don't see a way around having to redownload my entire WoT in any scheme, short of me keeping track of who has my WoT and which version of it, and only sending them newly added people.
Implementing WoT requires a read/write location for the storage and public-visibility of individual Identities' Web of Trust databases. If I trust kmfkewm, and extend some level of trust to whomever he trusts, I need a way to view his WoT database. And tomorrow, when he adds astor to it, or just increases his level of trust for astor, I need to be able to see the change to his WoT to properly calculate my own inherited value of trust for astor.
Without some method of trivially usable read/write storage shared by all clients (that we totally take for granted in traditional architectures), some easy things become hard. And I think WoT is one of those things.
Tons and tons of easy things become hard when you make something that is cryptographically secure. The solution is not to compromise security for ease of use and implementation. Nothing with trivial read/write storage is secure enough. The closest thing you get is hidden services with PHP scripts on them, similar to this forum. But in obtaining that trivial read/write, you lose all of the benefits of mixing and all of the benefits of PIR. It would be great if a trivial system like Tor existed with strong security and anonymity properties, but it is an open research question in academia if that is even possible, and most people think it isn't. There is no known system that is as easy to use as Tor hidden services that comes anywhere near the level of anonymity that can be provided by Mixing and PIR. Tor is a BB gun, Mixing and PIR are assault rifles.
To put things into context, the attacks against Tor are almost entirely different from the attacks against mix networks, because mix networks have solved almost all of the attacks against Tor and then some. The only attacks that work against Tor that also work against mix networks are attacks that are inherent to all anonymity networks and entirely impossible to defend against, such as long term pseudonym/IP intersection attacks carried out by a GPA, which also work against DC-nets.
The problem with the second option is that it prevents me from downloading messages from unknown identities (so I can't choose to view messages from people I don't know). It forces a quandary: "Either download everything then throw away stuff from people you distrust, or download only messages from people you know". You either end up with a killfile, or end up living in a fishbowl.
People you know can point you to messages from people you don't know.
PIR is perfect for one-to-one messaging, and you can extend that to one-to-many with clever mechanisms, but I think for one-to-any messaging, it may not be as good of a fit.
PIR is one of the only highly anonymous ways to receive data long term.
BTW there are updateable PIR schemes:
Private Keyword-Based Push and Pull with Applications to Anonymous Communication
We propose a new keyword-based Private Information Retrieval (PIR) model that allows private modification of the database from which information is requested. In our model, the database is distributed over n servers, any one of which can act as a transparent interface for clients. We present protocols that support operations for accessing data, focusing on privately appending labelled records to the database (push) and privately retrieving the next unseen record appended under a given label (pull). The communication complexity between the client and servers is independent of the number of records in the database (or more generally, the number of previous push and pull operations) and of the number of servers. Our scheme also supports access control oblivious to the database servers by implicitly including a public key in each push, so that only the party holding the private key can retrieve the record via pull. To our knowledge, this is the first system that achieves the following properties: private database modification, private retrieval of multiple records with the same keyword, and oblivious access control. We also provide a number of extensions to our protocols and, as a demonstrative application, an unlinkable anonymous communication service using them.
Another option would be to allow the forum to operate like a normal public forum for public messages. Meaning messages in plaintext uploaded through the mixnet to the PIR server, indexed by things such as the subforum they are in. Then I could make a message and post it to the security forum, and I would do that just by uploading it (through the mix net) to the PIR servers, indexed with a tag like ForumA::Security-Subforum-day. Then someone could obtain all messages in ForumA::Security-Subforum by searching for all messages tagged with ForumA::Security-Subforum-day, where day is the current day (or the last day since they got messages).
This makes things easier for public forums, but it also has a few problems. For one, it makes spamming easier, because nothing stops anybody from using the tag, and people won't know who a message is from until they download it. In the case of public messaging the encryption is kind of decorative anyway, but some level of deniability is lost from the PIR servers. The biggest problem I see is that it makes spamming much easier.
Also it makes it harder to differentiate messages with the same tag; we don't always want to get every message tagged with ForumA::Security-Subforum-day, we might only want to get NEW messages with that tag, since we already downloaded half of the messages last cycle. Keep in mind that there is a limit to the number of messages a client can download per period of time, and that this is required to protect from a class of intersection attacks. If a client cannot download all ForumA::Security-Subforum-day messages in one go, how are they able to get the remaining messages the next time they try? It seems like they will end up getting the old messages they already got, and will miss all the messages for that time period that they cannot fit into one of their buffers.
Solutions like tagging messages ForumA::Security-Subforum-day-a, ForumA::Security-Subforum-day-b, etc. would work, but they would introduce some anonymity attacks if there are collisions, and since it is high latency there will be (i.e., Alice sends the first message of the day, ForumA::Security-Subforum-day-a, and has it mix for an hour before making it to the PIR server; at the same time Bob notices there are no new messages for the day, so he makes the same post ForumA::Security-Subforum-day-a). One solution would be for the PIR servers to append the 'a' suffix themselves, but since there are many servers getting messages at different times, what happens when server 1 gets its first message and labels it message A, but then another server gets its first message and labels it message A, when they are different messages?
Perhaps something like the push-pull PIR above would make the most sense. Pretty much no matter what it needs to be PIR based though, because nothing else is anonymous enough.
-
Seems like a lot of talk without any substance to it. I think everyone should look into this more to understand all the features of this program and how it is different from Tor, and how those differences make it superior to Tor. But "we guarantee anonymity" is the same thing many failed sites have claimed. I think we should be as skeptical as possible, although the idea is very nice.
Also the "bandwidth share" sentence was quite disconcerting.
-
Seems like a lot of talk without any substance to it. I think everyone should look into this more to understand all the features of this program and how it is different from Tor, and how those differences make it superior to Tor. But "we guarantee anonymity" is the same thing many failed sites have claimed. I think we should be as skeptical as possible, although the idea is very nice.
Also the "bandwidth share" sentence was quite disconcerting.
Feel free to do some research
http://freehaven.net/anonbib/cache/alpha-mixing:pet2006.pdf <--- done
http://www.abditum.com/pynchon/sassaman-wpes2005.pdf <---- not doing, talks about the idea of using PIR + protecting from intersection attacks
http://www.esat.kuleuven.ac.be/~cdiaz/papers/cdiaz_inetsec.pdf <---- discusses mixes in general
http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA456185 <---- need to do
http://research.microsoft.com/en-us/um/people/gdane/papers/sphinx-eprint.pdf <---- done
http://spar.isi.jhu.edu/~mgreen/ZerocoinOakland.pdf <--- Talks about Zerocoin, the first distributed blind mix. Might integrate, others have coded an alpha version
http://www.cl.cam.ac.uk/~rja14/Papers/bear-lion.pdf <--- block cipher for Sphinx, done
that should get you started
Why it is superior to Tor:
A. Tor is vulnerable to end-to-end correlation attacks
Client <-> Client ISP <-> Entry Node ISP <-> Entry Node <-> Entry Node ISP <-> Middle Node ISP <-> Middle Node <-> Middle Node ISP <-> Exit Node ISP <-> Exit Node <-> Exit Node ISP <-> Destination ISP <-> Destination
If the attacker has control over any of the members of {Client ISP, Entry Node ISP, Entry Node} AND any of the members of {Exit Node ISP, Exit Node, Destination ISP, Destination}, then Tor offers no anonymity to the client. Tor tries to protect from this primarily by having a very large network of nodes, which makes it hard to own both the client's entry node and exit node. As the Tor network has more nodes added, it becomes increasingly difficult to do a fully internal attack, however it is still entirely possible. The big thing is that it is easy to monitor the destination site at its ISP, or even to seize the destination server and monitor from that point. Even hidden services have relatively crappy anonymity. So the Tor security model is little more than: if you have a good entry node you might not be fucked but still could be, and if you have a bad entry node you probably will be fucked but might not be.
Attackers like the NSA monitor traffic at the ISP level, quite intensely apparently, and they could very well be capable of watching your traffic at Client ISP and Destination ISP, which means they can link you to your destination. Attackers with a lot of nodes on the network still stand a significant chance of owning both Entry Node and Exit Node, especially if they have high bandwidth nodes. It is estimated that there are already groups of node owners who can deanonymize huge parts of the Tor network in little time; right now the hope is that they are not malicious. Adding more nodes to the Tor network, from diverse groups of node operators, can continue to protect from internal attackers nearly indefinitely: as more nodes are added by good people, it becomes less likely for bad people to have nodes on your circuit, unless bad people also keep adding more nodes. Also, even if bad people keep adding nodes, it makes it less likely that any given bad person will have a node on your circuit.
Anyway, despite the ability to do a somewhat decent job of protecting from internal attackers, there is a hard limit to the protection Tor can provide from external attackers. There are only so many different networks making up the internet, and most of them are exchanging traffic through a much, much smaller number of exchange points. Monitoring a few hundred key points on the internet is enough to monitor most traffic on the internet, and no matter how many Tor nodes are added to the network it doesn't matter, because once all Client/Entry/Middle/Exit/Destination ISPs are being monitored by an attacker, they can break Tor in 100% of cases. But they don't even need to monitor all of these ISPs, or even all of the IXs they exchange traffic through, in order to deanonymize a huge subset of Tor users: for one, Tor entry guards rotate over time, and eventually you will use a bad entry or a good entry with a bad ISP; and for two, if the attacker monitors your entry and your destination externally, they can still deanonymize YOU even if they cannot deanonymize the entire internet. In addition, keep in mind that a tremendous amount of traffic on the internet is crossing over the US, and most international traffic is going through multiple countries. The graph I drew above isn't even high enough resolution; in reality there are tons of points between these nodes, and a compromise ANYWHERE from client to entry plus a compromise ANYWHERE from exit to destination is enough to deanonymize a user.
High latency mix networks are not nearly as vulnerable to this sort of attack. The goal of anonymity is to remove any variation between one user and their communications and another user and theirs. Any variation can be turned into an anonymity attack. Tor takes a few measures to add uniformity to communications. First of all, due to the encryption of onion routing, all streams are, at the content level, 'the same' in that they all are indistinguishable from random noise at each hop. If you send the message "Example" down the following path:
Client -> Node 1 -> Node 2 -> Node 3 -> Destination
an attacker who owns Node 1 and Node 3 has no trouble at all linking the message from Node 1 to Node 3, even though there is Node 2 in the middle. This is because the content of the message is exactly the same, and it sticks out from other messages. If onion encryption is used, the message "Example" might look like "I9aiPS1" at the first node, "9!@jU9A" at the second node, and "AHZ12(a" at the third node. Now the attacker at node 1 and 3 cannot so easily link the message, because it looks completely different at node 3 versus node 1. It blends in with all the other messages node 3 is handling, in that all of them look like completely random noise, and there is an extraordinarily low probability that the same pattern of noise has been seen anywhere else ever in the existence of the entire universe.
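To make the layering concrete, here is a toy Python sketch of onion encryption. It uses the Fernet scheme from the `cryptography` package purely for illustration; real Tor uses AES counter mode inside its own cell format, so treat this as the idea, not the implementation:

from cryptography.fernet import Fernet

# Hypothetical per-hop keys the client shares with Node 1, Node 2 and Node 3.
keys = [Fernet.generate_key() for _ in range(3)]
hops = [Fernet(k) for k in keys]

message = b"Example"

# The client wraps the message in three layers, innermost layer first.
onion = message
for hop in reversed(hops):
    onion = hop.encrypt(onion)

# Each node peels one layer. The bytes a node forwards look nothing like the
# bytes it received, so a watcher at Node 1 and Node 3 cannot match them up.
for i, hop in enumerate(hops):
    print("ciphertext entering node %d: %s..." % (i + 1, onion[:16]))
    onion = hop.decrypt(onion)

assert onion == message  # Node 3 recovers the plaintext for the destination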
Tor also pads all information into packets that are exactly the same size, 512 bytes. If not for this, we would have problems like this:
Client -> Node 1 -> Node 2 -> Node 3 -> destination
sending the stream
[1byte][6bytes][3bytes][10bytes]
4 packets with different size characteristics. In reality we could expect many more packets with vast size differences, if not for the padding of Tor. The attacker at Node 1 and Node 3 can easily link the stream, because they see that there is a correlation in the packet sizes of the stream they are handling, and chances are it is pretty damn unique at that. With Tor it is like this:
Client -> Node 1 -> Node 2 -> Node 3 -> destination
sending the stream
[512byte][512bytes][512bytes][512bytes]
Each packet has the same actual payload data as before, but now it is padded such that all packets are the same size. Now the attacker at node 1 and 3 cannot use this characteristic to link the streams, because all the packets are the same size; every stream every node handles consists of packets of the same size, so variation has been removed and replaced with uniformity.
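As a toy sketch of the padding idea (the 2-byte length prefix is my own invention, so the padding can be stripped at the other end; Tor's real cell format differs):

import os

CELL_SIZE = 512  # fixed cell size in bytes, as in Tor

def to_cells(payload: bytes) -> list[bytes]:
    # Split a payload into fixed-size cells, padding the last one so every
    # cell on the wire is exactly the same size.
    header_len = 2  # hypothetical length prefix so a node can strip padding
    body = CELL_SIZE - header_len
    cells = []
    for i in range(0, len(payload) or 1, body):
        chunk = payload[i:i + body]
        cell = len(chunk).to_bytes(header_len, "big") + chunk
        cell += os.urandom(CELL_SIZE - len(cell))  # pad to the fixed size
        cells.append(cell)
    return cells

# 1, 6, 3 and 10 byte payloads all leave as indistinguishable 512-byte cells.
for size in (1, 6, 3, 10):
    assert all(len(c) == CELL_SIZE for c in to_cells(os.urandom(size)))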
So Tor does do these two things, and they are indeed improvements over many VPN technologies; they have in fact ended up making Tor harder to attack at a technical level (just look at fingerprinting of VPN traffic, which approaches 100% accuracy, versus fingerprinting of Tor traffic, which has taken a long time to get past 60% accuracy). But Tor still has many problems left over, and most of them are inherent to its design. There is still the possibility of variation in interpacket timing.
imagine each . equals a small unit of time, like 1ms
Client -> Node 1 -> Node 2 -> Node 3 -> destination
[packet].......[packet]..[packet].......[packet].........[packet]...[packet]
When the client sends a stream of packets down Tor, they are sent as they are constructed, not at fixed time intervals. And even if they were sent at fixed intervals, any of the nodes could delay individual packets as much as it wants before sending them forward; a malicious node doesn't need to do this, but it can, to speed up this type of correlation attack. So we are right back where we started: the attacker at node 1 and node 3 can link the stream by analyzing interpacket arrival times and looking at the packet spacing. Tor does not protect from this attack at all, short of natural network jitter, which is not enough.
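To see how little natural jitter helps, here is a toy version of the correlation an attacker at node 1 and node 3 would run. The numbers are made up, and it assumes the attacker recorded interpacket gaps at both nodes (and that numpy is available):

import numpy as np

rng = np.random.default_rng(0)

gaps_at_node1 = rng.exponential(scale=20.0, size=200)  # the stream's timing fingerprint, in ms
jitter = rng.normal(loc=0.0, scale=2.0, size=200)      # natural network jitter
gaps_at_node3 = gaps_at_node1 + jitter                 # the fingerprint survives the jitter

unrelated = rng.exponential(scale=20.0, size=200)      # some other stream at node 3

print(np.corrcoef(gaps_at_node1, gaps_at_node3)[0, 1])  # close to 1.0: linked
print(np.corrcoef(gaps_at_node1, unrelated)[0, 1])      # close to 0.0: not linked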
Mix networks protect from this attack because the nodes completely wipe interpacket timing characteristics between nodes.
Client -> Mix 1 -> Mix 2 -> Mix 3 -> destination
If Mix 1 sends a message to Mix 2 with the following packet properties (though on a mix network the entire message can be one very large padded packet):
[packet].......[packet]..[packet].......[packet].........[packet]...[packet]
it doesn't matter, because mix 2 waits until it has the entire message, and only then sends it forward. It doesn't send packets as it gets them. This removes all of the interpacket arrival timing information that mix 1 inserts into the message.
[packet].......[packet]..[packet].......[packet].........[packet]...[packet]
Becomes
[packet][packet][packet][packet][packet][packet], prior to mix 2 sending it to mix 3. The fingerprint is sanitized.
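A toy mix in Python, just to show the store-and-forward idea (the batch size and randomized flushing are my own simplifications; deployed mixes use more careful batching strategies):

import random

class Mix:
    def __init__(self, batch_size: int):
        self.batch_size = batch_size
        self.pool = []  # complete messages waiting to be flushed

    def receive(self, message: bytes) -> list:
        # Hold every message until the batch fills; nothing is forwarded
        # as it arrives, so incoming timing fingerprints are destroyed.
        self.pool.append(message)
        if len(self.pool) < self.batch_size:
            return []
        batch, self.pool = self.pool, []
        random.shuffle(batch)  # outgoing order is unrelated to arrival order
        return batch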
Another issue with Tor is that packet counting is not protected against at all.
Client -> Node 1 -> Node 2 -> Node 3 -> destination
[p1][p2][p3]
The client sends a total of 3 packets to the destination. The attacker at node 1 and node 3 can use this to link the stream back to the client: if node 1 processes a 3 packet stream, and node 3 shortly after processes a 3 packet stream, there is a high probability that the streams are related. Conversely, if node 3 processes a 5 packet stream, it knows that stream is not the 3 packet stream node 1 just processed. Tor very minimally protects from this by padding single packets to the same size, but it doesn't pad entire streams to the same number of packets. Mix networks protect from this because every message is just one really large packet, so all 'streams' are one packet.
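In toy form, the packet counting attack is nothing more than this (stream names and counts are made up):

packets_seen_at_node1 = 3  # the target stream's packet count

candidates_at_node3 = {"stream A": 5, "stream B": 3, "stream C": 12}
possible = [s for s, n in candidates_at_node3.items() if n == packets_seen_at_node1]
print(possible)  # only "stream B" survives; padding whole streams would hide this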
These are just two of many examples of how higher latency mix networks are superior to Tor. I was going to make a list so I started with A, but after typing this all out I realize that A by itself should be enough to show you why mix networks are superior to Tor. I could go on to B, C, D, and E, but for right now I am tired of typing.
To summarize point A:
Tor adds packet size invariance (by padding packets) and payload content invariance (by onion encryption), but leaves stream size variance and interpacket arrival time variance. Any message variance leads to correlation attacks! Good mix networks remove stream size variance by making every message a single fixed size packet, and they remove timing variance by the same method, as well as by introducing enough delay at each hop for mixes to strip interpacket timing characteristics and enforce interpacket timing invariance between mixes (even if multiple packets are used). A good mix network will remove *ALL* message variance at *EACH* mix. So it doesn't matter if:
Client -> Bad Node -> Good Node -> Bad Node -> Destination
happens, because the good node in the middle makes the message totally uniform and removes any possible fingerprint that could be identified between the first and last bad node. On a good mix network, every single message either has exactly the same characteristics, or essentially the same characteristics in that all messages are totally randomized at each hop. Anything that is not invariant between hops is randomized at each hop. This alone massively increases the anonymity of a good mix network over Tor, but many other things come into play as well. I was not talking out my ass when I said the attacks on mix networks are almost totally different from the attacks on Tor. This is because a good mix network fixes all the attacks on Tor that can be fixed, and also should take measures to protect from more advanced attacks that are in the realm of mix networks. The mix network threat model is totally beyond the Tor threat model, in that it fixes the problems of Tor that can be fixed, and moves on to the problems of mix networks.
And as for the problems of mix networks, let's start by saying that the threat model for mix networks assumes that the attacker can view *ALL* links between mix nodes. Keep in mind that the threat model for Tor assumes that an attacker who can see all links between all Tor nodes can 100% compromise Tor. So the mix network's assumed attacker is the same attacker that can deanonymize Tor traffic in real time.
One of the problems of older mix networks (all the currently implemented remailer networks, in and of themselves anyway) is the long term statistical disclosure intersection attack. Assume that the attacker can see all links between all mix nodes. Alice communicates with Bob over a mix network that has 1,000 members. The attacker can see how many messages every client sends and receives, because he can see the links between all nodes. In the simplest example, the attacker is Alice. Alice sends 1,000 messages to Bob, then waits to see which clients on the network receive 1,000 messages in the period of time she knows it will take for all of her messages to be delivered. Any client that doesn't receive at least 1,000 messages can be ruled out as Bob. Given enough time, and it takes surprisingly little, Alice can identify Bob this way.
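Here is a toy simulation of that attack, with made-up background traffic volumes, just to show how fast the candidate set collapses:

import random

random.seed(1)
clients = list(range(1000))
bob = 42  # the secret: which client is Bob

# Hypothetical message counts the attacker observes each client receiving.
received = {c: random.randint(0, 300) for c in clients}
received[bob] += 1000  # Alice's 1,000-message flood is delivered on top

# Anyone who received fewer than 1,000 messages cannot be Bob.
suspects = [c for c in clients if received[c] >= 1000]
print(len(suspects), bob in suspects)
# Here one round suffices; with less distinctive volumes it takes
# repeated rounds of intersecting the surviving candidate sets.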
Note that like nearly all anonymity attacks (and many security attacks), the culprit here is variance. In this case, the variance is in the number of messages received. Not all clients receive the same number of messages, and Alice can use this to her advantage in order to find Bob. This attack also allows third parties to link Alice to Bob, even when Alice and Bob are both honest and the attacker is an outside observer, although that takes a little more math to explain. Anyway, the first solution presented to this problem was to use PIR.
Now Alice sends 1,000 messages to Bob's PIR server. She has no way to directly send 1,000 messages to Bob. Instead of the network pushing 1,000 messages to Bob (as happens in the old designs), Bob now pulls messages from a rendezvous node (via PIR to protect his anonymity). The key difference is that Bob will only download 50 messages every day, regardless of how many messages he has waiting for him. And every other member on the network will only download 50 messages per day as well. They can communicate with their PIR server over the mix network if they want to tell it to delete certain messages, or to prioritize certain messages over others, but no matter what they will only download 50 messages per day. Now it doesn't matter if Alice spams Bob with 1,000 messages, because in the best case Bob can just delete them and never end up downloading them at all, and in the worst case Bob will download 50 of them a day for 20 days, and end up obtaining 1,000 messages over a 20 day period, the same as every other user of the network.
This protects from long term statistical disclosure attacks by third parties (linking Alice to Bob) and from a malicious Alice trying to locate Bob. However, it doesn't prevent a malicious Bob from trying to locate a non-malicious Alice. Let's say that Bob notices he has obtained 25 messages from Alice over a period of ten days. Remember that Bob can also see the entire network's links. So Bob can run the intersection attack himself, reasoning that Alice must be one of the clients that sent at least 25 messages over this period of time. And again, given enough time, Bob can locate Alice with this attack. The only way to protect from this is for each client to send an invariant (yes, once again, variance was the culprit) number of messages per period of time. This can be accomplished by sending random dummy messages when legitimate messages are not available. Essentially, Alice has 50 messages she can send in a 24 hour period, and every so often either a legitimate message from her outgoing queue is sent, or, if her queue is empty, a dummy message is sent in its place. Now in every 24 hour period Alice, and every other client on the network, sends exactly 50 messages; the variance is erased, preventing Bob from carrying out his attack.
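A sketch of the invariant sending rate (the quota and message size are arbitrary numbers for illustration):

import os, random

DAILY_QUOTA = 50  # every client sends exactly this many messages per day

def todays_batch(outgoing_queue: list) -> list:
    # Take up to 50 real messages and top up with dummies, so an observer
    # counting messages on Alice's link always sees exactly 50 per day.
    real = outgoing_queue[:DAILY_QUOTA]
    dummies = [os.urandom(512) for _ in range(DAILY_QUOTA - len(real))]
    batch = real + dummies  # dummies are indistinguishable once encrypted
    random.shuffle(batch)
    return batch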
Both of these techniques are modern, and neither has been implemented in a real network yet. (Usenet + mix network, aka everybody gets everything PIR, has been used to protect from a malicious Alice or a malicious third party carrying out this attack; however, everybody gets everything PIR does not scale.) Currently all mix networks are weak to this type of attack in themselves; when everybody gets everything PIR is layered on via Usenet or shared mail archives, you can protect from the first attack. Pynchon Gate was the first whitepaper to describe a (scalable) way to prevent this attack from locating Bob, or a third party from linking Alice to Bob. I cannot name the specific paper that first suggested using dummy traffic to prevent Bob from locating Alice, but probably one of the early papers on dummy messages mentions something about it.
-
One attack works against any sort of network with any sort of pseudonymity. Pretty much the only way to protect from it is for all posters to be anonymous in the true sense, meaning without a name. It simply consists of the attacker (who, again, can see the links between all nodes on the network) taking note of which clients are on the network around the time the attacker receives a message from a user. Because messages are time delayed, the attacker cannot simply say that the sender must be one of the clients currently on the network, but they can guess with high probability that the message was sent by a client they observed sending a message in the past week, or month, and realistically probably the last couple of hours or days. If the attacker is Alice, and she assumes that Bob has his messages delayed for no more than two days, then after she gets a message from Bob she can cross out all of the clients she did not see send a message in the previous two days. Ideally all clients send the same number of messages every day, but realistically clients will not be online every single day of the year. Maybe Carol went on vacation for two weeks and didn't connect her client at all, but Bob kept sending messages during this time period. Alice can find the maximum time delay of Bob's messages by simply seeing how long it takes to get a response to one of her own: if she gets a response from Bob in one day, she knows he did not delay that message for more than one day, and so he must be one of the clients that sent a message in the previous day.
This sort of attack even works against otherwise information theoretically secure systems such as DC-nets. In a network that maintains ideal conditions it can be protected from, but in the real world networks don't maintain ideal conditions. Also, a very powerful attacker can force a network to not maintain ideal conditions, by cutting internet access to certain countries even if need be. Oh Bob still wrote me messages while I had cut internet to Iran? Bob is probably not in Iran after all!
The only way to really fully protect from this attack is to not have pseudonyms or other identifying characteristics, including writeprint. This attack only works if the attacker can see at least two different snapshots of the network, associated with at least two actions of a target. If a second action cannot be tied to the target, the attack cannot work. Other techniques could involve the use of covert channel networks to hide from Alice the fact that Bob's IP is in communication with an anonymity network at all, even if Alice can monitor the entire traffic of the entire internet. Note that once again the weakness is a result of variance: the set of clients that were online changed from action one to action two, and this led to an attack on anonymity. Invariance between actions, i.e. the network having exactly the same clients using it when both messages were sent, would protect from this attack.
Thankfully, in practice this attack can be made to take a long time to carry out, provided the network has a substantial number of users and bootstraps itself into its initial anonymous state. But it depends on the uptime of the clients using the network. Ideally clients would start sending dummy traffic to the network, but not posting on it, for a period of several days to a month before actually using the network to send messages. This allows them to blend in with other new clients who join in the same time frame, and with people already using the network who decide to make new identities. Maybe there will be a hundred or so people in your anonymity set from this attack, and it could take quite a while for enough of them to go offline at the wrong moments for you to be identified as not them. But slowly, over time, it is likely that your initial anonymity set will be chipped away, until in the end only you are left in it. Even anonymity systems that are proven to be as anonymous as any system can possibly be are weak to this attack if pseudonyms are used.
-
Let's be realistic here, we don't really need a system that protects against everything. Let's prioritize some things first (as they pertain to SR):
1. Hidden services are a must. While I enjoy browsing porn with Tor, the only real reason I use it is SR. If the system can't support the hosting of sites then there is very little usefulness to it and we might as well resort to Freenet.
2. Low latency. If people are to make any real use of this it needs to be accessible and responsive. Having a big clunky system that protects against SuperSaiyanNSA9000 won't do us any good if it takes forever to do anything.
3. Invariance or more variance? You make the point that if all traffic were uniform in size and timing, then it would be all but impossible for even two bad nodes in the chain to track it. But this creates massive overhead that is impracticable with current tech. So let's fudge that idea: we have packet encryption and packet size invariance, which protects our contents, but when it comes to timing what we need is more variance. That is, if every node added a random number of seconds to most of their packets, then while it might slow down the network a bit, it would be impossible for even a global adversary to determine the flow of a packet just based on timing. Packet numbers can be fudged this way too: add a few more or fewer packets here and there. Not everything needs to be manipulated, just enough that an attacker won't know what has or hasn't been manipulated.
What I want to know is: what exactly protects against an attacker that owns the majority of relays in a network?
-
Let's be realistic here, we don't really need a system that protects against everything.
Or, phrased another way, "If we want to protect against everything, we need more than one system". I agree.
As we head into esoteric anonymity discussions, everyone tends to forget that each individual wants something slightly different. Many people here want a more secure replacement for SR. kmfkewm wants a way to share content with private groups of people that can't be traced or censored, and won't put the server owner in jail for hosting it. Personally, I want a decentralized platform for pseudonymous group discussion. Many other users just want a way to surf the clearnet anonymously (those people are going to be increasingly disappointed as time moves on, I think).
SR works as a marketplace because:
1. Bitcoin allows for payments between entities in a less-than-obvious manner.
2. Tor (particularly TBB and Tails) provides end-users an easy method to connect. It's instant gratification, even if everybody bitches about how slow it is. Go to coffee shop. Boot Tails. Wait a minute. You're on SR. Everybody focuses on the low-latency part, but it's actually the fire-and-forget part (just fire it up, use it for a half hour, shut it down, and move on) that makes it usable. That's why Tor has a bazillion users, and i2p and Freenet don't.
3. PGP allows the securing of one-to-one communications between buyers and sellers.
4. SR is centrally administered instead of decentralized, allowing that central administration to act as both escrow service and housekeeper.
Your concern is mostly around #2. If SR required users to download Freenet, then configure lots of Java programs eating up a few gigs of RAM, then required users to stay online and connected 24/7 (pretty much requiring that it be at their house at that point), it wouldn't have any users. So you're probably right about usability. If everyone who uses SR ends up tossed in jail, I'm guessing it's not going to have any users, either. So kmfkewm is probably right, too.
What I want to know is: what exactly protects against an attacker that owns the majority of relays in a network?
Nothing prevents deanonymization or some level of traffic correlation. Encryption prevents content inspection en route. Tor's central directory service structure (and methods for choosing relays) dampen the threat, but can't negate it. When the core of your anonymity protocol is "I'll route through lots of relays so nobody can see what I'm doing", then when somebody owns all of those relays, you're not fooling that guy. Sybil attacks are a bitch, and they always will be to decentralized anonymous networks. The alternative is to centralize a network, and have one entity vet every relay. Which is damned handy, because that central entity is exactly where the National Security Letter (or death threat from Los Zetas..pick your poison) will be delivered to.
-
Why can't the one central entity in your scenario hide within the network itself?
-
Why can't the one central entity in your scenario hide within the network itself?
I'm assuming you mean a central authority to vet relays. Central authority hiding behind hidden services/anonymity networks seems to have worked well for SR's central administration.
It works well for them because they aren't trying to hide while touching the non-hidden world at a million measurable points. All of their actions occur behind the shroud of Tor and Bitcoin.
However, when it comes to using a hidden central authority to bless entry/relay/guard/exit nodes on an anonymity network, things change pretty quickly.
The central authority has to be trusted implicitly by everyone participating. A benevolent dictator of sorts.
They need a way to actually vet the relay before allowing it to join. How do they do that?
1. They can pick people they know and trust. That's great, but since relays have to be discoverable by clients, I can always make a list of relays. And if I put on my evil NSA/Zetas hat, I can perform an intersection attack against the operators of those relays (they have public IPs, are at hosting providers that someone's paying for, etc). What is their real life network? What do they have in common? Who's paying for their bandwidth? When did they first join the network? What do their other communications look like for the month preceding that joining?
2. They can inspect the server configuration or something. And it can change as soon as they're done. That one's worthless. If somebody suggests that they be given a backdoor to inspect all servers all the time, that's great, but every time they log into a known, public relay, they're at risk of being deanonymized a million different ways. Fire up a relay, wait a month, do suspicious shit as bait, watch them log in.
3. They can find ways to measure the relays externally, and blacklist them at will. This part is probably somewhat doable, but it makes for an arms race of "can you catch all the evil shit you can do with a relay?". And in anonymity technology, arms races are almost always won by the most motivated group with the most significant resources. This basic concept is one of the two reasons you always see the Tor developers talking nonstop about the Tor Metrics project (the other is that they need those metrics to justify funding). Not that they blacklist nodes to any significant degree.
The biggest problem with building anonymity systems around clusters of people "you can trust" is that even if you *really* can trust them right now, you can't actually trust them forever. People end up in situations where they're no longer controlling their own actions. Sabu with Lulzsec/Antisec is a great example. There's a nice, tight cluster of people who all said they could trust each other. They're almost all in jail. Here, you might end up with a technically smart, fairly trusted person who's all about protecting SR. But the dude loves his CP. He gets busted for that, and the FBI owns him, his servers, and his trusted relays. Another relay owner gets heavily in debt to Los Zetas in a year over some business arrangements gone wrong, and now they're operating him.
-
Good reading on this thread.
Whichever platform becomes the next transport, it will have to deal with the shift away from the peer-to-peer model of consumer connectivity to the slowly arriving broadcast-only model: a model where our endpoints are unable to establish connections with other arbitrary endpoints on the Internet, and we can only connect to approved addresses on approved ports. This has been slowly coming for the last 10 years or so. Tor will have difficulty surviving that change in its current form.
Other desirable characteristics:
1) The ability to relay/mix traffic 'off net' - e.g. if I have a relay I may choose to wirelessly pump some of that data to another relay via a private wireless link or similar - then back onto the net. This would frustrate adversaries with the ability to conduct widespread traffic analysis of the Internet (assuming they're not also monitoring my private networks).
2) Broadcast model for message delivery - think the Bitcoin ledger system. All clients see all messages but can only decrypt the messages addressed to them. An adversary, even one who can see all traffic, cannot say where the recipients are without access to their keys. Very inefficient, but much better anonymity for recipients. I think a satellite would serve us well for this - pricey though.
-
1. Hidden services are a must. While I enjoy browsing porn with Tor, the only real reason I use it is SR. If the system can't support the hosting of sites then there is very little usefulness to it and we might as well resort to Freenet.
Freenet isn't a bad choice. Tor is a featherweight boxer, quick but easily knocked out by a heavyweight attacker. Freenet is kind of a middleweight boxer. Mix networks are the heavyweights, slow but powerful. But it is likely that if the NSA can pwn Tor, they can pwn Freenet as well. It is a bit harder to pwn Freenet, but an attacker who can watch most links on the internet and who owns a decent number of Freenet nodes can break Freenet. We are assuming that the NSA can watch most links on the internet; it isn't going to be hard for them to own a decent number of Freenet nodes as well. In the case of Tor, the attacker doesn't need to own a single Tor node if they can watch the majority of links on the internet. So the cost to attack Freenet in a major way is more than the cost to attack Tor in a major way, but it might in this case be the difference between someone in the Forbes top 100 buying a ten million dollar house or a twenty million dollar house.
2. Low latency. If people are to make any real use of this it needs to be accessible and responsive. Having a big clunky system that protects against SuperSaiyanNSA9000 won't do us any good if it takes forever to do anything.
This is not a realistic goal to have if you want strong anonymity. The only systems that allow for strong bidirectional anonymity and low latency require so much bandwidth that they only exist on paper. Strong unidirectional anonymity is in the realm of possibility, but even that doesn't scale so well in most cases. BitMessage is a good example: since everybody gets every message, it has ideal receive anonymity, and it isn't very high latency either, since messages are only slightly delayed at each node. But it won't scale very large and has a host of other problems with it.
3. Invariance or more variance? You make the point that if all traffic were uniform in size and timing, then it would be all but impossible for even two bad nodes in the chain to track it. But this creates massive overhead that is impracticable with current tech. So let's fudge that idea: we have packet encryption and packet size invariance, which protects our contents, but when it comes to timing what we need is more variance. That is, if every node added a random number of seconds to most of their packets, then while it might slow down the network a bit, it would be impossible for even a global adversary to determine the flow of a packet just based on timing. Packet numbers can be fudged this way too: add a few more or fewer packets here and there. Not everything needs to be manipulated, just enough that an attacker won't know what has or hasn't been manipulated.
It is not impractical for many applications. For E-mail it certainly is not impractical; there have been deployed mix networks for E-mail that removed all message variance. I think in many cases it is not impractical at all. For uploading and downloading high definition movies it isn't really practical, but for posts on a forum? For small and simple websites? For E-mail? For sharing small files like PDFs? For a blog?
The general rule of thumb is that uniformity is always good, and randomness is good sometimes, but much less often. This can be seen in timing attacks against cryptographic systems as well. Random variance can often be filtered out; invariance can never be filtered. In the case of layered encryption, randomization is secure. When it comes to inserting random jitter at each hop, I think it would either be insecure (in that even if it makes attacks harder, they would still be realistic to carry out), or require such a massive range of time delays that it would actually be many orders of magnitude faster to just use a mix network and remove the variance. I am sure some of the literature on anonymity discusses the idea of adding random jitter in depth, but I cannot off the top of my head think of a paper for you. The following paper on flow watermarking demonstrates that attackers can filter substantial amounts of jitter in low latency networks, though:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.123.3789&rep=rep1&type=pdf
Fortunately, our interval centroid-based watermarking could self-synchronize the decoding offset with the encoding offset even if 1) the clocks of the watermark encoding host and decoding host are not synchronized; 2) there is substantial network delay, delay jitter or timing perturbation on the watermarked flows.
They actually did an analysis on a rather sophisticated low latency system that utilized mixing, artificial jitter, dummy packets, etc., and were still able to insert readable interpacket fingerprints. So I think we will not have luck with this approach, and rather must take care to make messages totally uniform between hops (which means obtaining all packets before sending them forward to the next mix, to remove any interpacket timing fingerprint, which in turn means significant time delay).
Despite the significant flow transformations (i.e., repacketization, flow mixing, and packet dropping) and network delay jitter introduced by www.anonymizer.com to the Web traffic, we were able to achieve surprisingly good results in linking the information sender and receiver through our flow watermarking technique. When we decoded the 32-bit watermark from a network flow, we allowed a few bits mismatched with the watermark we were seeking. The number of allowed mismatched bits is called the Hamming distance threshold in our watermark decoding. Figure 11 shows that we can achieve a 100% watermark detection rate with Hamming distance thresholds of 5, 6, 7, and 8, respectively, and redundancy of 20 from the Web traffic received at the client side. This only requires less than 11 minutes of active browsing. With less than 6 1/2 minutes of active browsing traffic, we were able to achieve a 60% watermark detection rate with a Hamming distance threshold of 5.
What I want to know is: what exactly protects against an attacker that owns the majority of relays in a network?
Mix networks are actually much better protected from attackers who own a lot of nodes. This is because a single good mix on a message's path buys it significant anonymity, and there is no hard limit to the number of mixes on a path. This is in contrast to systems like Tor, where adding more nodes to a circuit doesn't help against some of the most dangerous and easiest to carry out attacks: whether you have three nodes or fifty doesn't matter if packet streams can be linked regardless of their location on the circuit. The attacker who owns the entry and exit can still link clients to destinations. In the case of a mix network, the attacker can own 49 of the 50 mixes on your message's path and still not be able to deanonymize you, provided your message went over a single good mix at any point.
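Back-of-the-envelope numbers for that claim, assuming the attacker owns a fraction c of the nodes and positions on a path are picked uniformly at random:

c = 0.49  # attacker controls 49% of the nodes

# Tor-style circuit: compromised as soon as entry AND exit are both bad,
# no matter how many middle hops sit in between.
print("entry+exit both bad: %.2f%%" % (100 * c * c))

# Mix-style path of n hops: compromised only if EVERY mix is bad.
for n in (3, 10, 50):
    print("all %d mixes bad: %.2e" % (n, c ** n))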
-
1) The ability to relay/mix traffic 'off net' - e.g. if I have a relay I may choose to wirelessly pump some of that data to another relay via a private wireless link or similar - then back onto the net. This would frustrate adversaries with the ability to conduct widespread traffic analysis of the Internet (assuming they're not also monitoring my private networks).
The good thing about mixing is that it essentially happens off net. That is the big advantage of mix networks, really. It happens inside the mix node's RAM. If the attacker cannot see the state of the RAM on the mix node, then they cannot learn what is happening, even if they can see all traffic into and out of the mix. Replace "my private networks" with "the state of my RAM" and you have a mix network.
2) Broadcast model for message delivery - think the Bitcoin ledger system. All clients see all messages but can only decrypt the messages addressed to them. An adversary, even one who can see all traffic, cannot say where the recipients are without access to their keys. Very inefficient, but much better anonymity for recipients. I think a satellite would serve us well for this - pricey though.
There are much more efficient systems that can obtain nearly the same anonymity as this. The general concept is called PIR; the type you advocate for is everybody gets everything PIR. There are types of PIR that require several orders of magnitude less bandwidth while still maintaining a high degree of anonymity. In fact, in some cases the anonymity of everybody gets everything can be matched while the bandwidth required is taken down by orders of magnitude.
-
Freenet isn't a bad choice. Tor is a featherweight boxer, quick but easily knocked out by a heavyweight attacker. Freenet is kind of a middleweight boxer. Mix networks are the heavyweights, slow but powerful. But it is likely that if the NSA can pwn Tor, they can pwn Freenet as well. It is a bit harder to pwn Freenet, but an attacker who can watch most links on the internet and who owns a decent number of Freenet nodes can break Freenet. We are assuming that the NSA can watch most links on the internet; it isn't going to be hard for them to own a decent number of Freenet nodes as well. In the case of Tor, the attacker doesn't need to own a single Tor node if they can watch the majority of links on the internet. So the cost to attack Freenet in a major way is more than the cost to attack Tor in a major way, but it might in this case be the difference between someone in the Forbes top 100 buying a ten million dollar house or a twenty million dollar house.
The more I think about the "What's doable today?" question for building anonymous groups or marketplaces, the more I like Freenet.
The primary downside is that opennet Freenet is susceptible to Sybil attacks. Every "open, anybody can participate" anonymous network is. Stand up n many nodes, watch the nodes around Person X, figure out what Person X is requesting or posting. Some sort of let's-all-get-together darknet Freenet is a bad idea, since the actual network *is* the intersection point.
But for a marketplace, it has a lot of advantages over Tor/i2p:
1. It funnels all user traffic into uniform static requests that are processed by the marketplace. The marketplace has a smaller attack surface.
2. Assuming strong encryption, an attacker could theoretically identify *who* is communicating with the marketplace, but not what they're doing.
3. Correlation should be much harder. You'd need to be able to see User X sending a request, and you'd need to see the Marketplace retrieving it. (EDIT: Actually, I'd have to do more research before I could say this is true.. as I think about it, your request to the marketplace may need an identifier so the marketplace could find it in order to receive it. )
4. It already has plenty of other traffic to carry the network. Mostly people sharing movies and warez and shit.
It has some downsides, though:
1. It's slow. It's a disconnected, download static content and wait a while technology.
2. Inventory management is impossible with high-latency networks. If I'm selling widgets, and I have five widgets left in inventory, I could receive fifty orders for widgets before anybody sees that I'm out of stock. Really have no idea if SR handles this at all, though, since I don't use it. May be a non-issue.
3. It's hard. This is a big one. Freenet is a pain in the ass to set up. It uses Java and eats lots of resources.
4. It requires network participation: to use it effectively, you have to agree to route traffic for everyone else, and not just for the 30 minutes a day you're using it. You lose the fire-and-forget nature of Tor + Tails/TBB. You're going to have to run it from somewhere that can stay connected. That's going to end up being your house. Running it from a VPS negates many of Freenet's advantages, and is even more of a pain in the ass to set up.
Freenet would require a shift in how many people think about darknets, and it requires a shift in security practices. Because of the architecture, you're trusting the technology more. It's always on. And you can't do a Tails-style amnesiac experience with it like you can with Tor. You're ultimately relying on full disk encryption, but trusting the technology to keep FDE from mattering, since you're much more difficult to identify in the first place. But the whole "aside from this Tails USB in my pocket, nothing in my house is even related to anonymous communications" option goes out the window.
It's possible to separate the Freenet routing node from the client (so you could use a Tails-like client OS with it), but it's not too easy. Most of Freenet seems to be built on the assumption that all services are available at 127.0.0.1 (the same box running the Freenet software).
My gut feeling is that in a pinch, you could move an existing marketplace with enough loyal userbase to Freenet. You probably couldn't start one from scratch, because it's too hard for new users to do properly. But if you boiled it down to just "install the Freenet jar" and "load this plugin", it's probably not a show-stopper.
-
The main problem with Freenet as it stands now, performance aside, is that all its content is static. If you update a website you have to reupload the whole site, and then everyone who wants to view it has to download the new version. It's just not an interactive environment. If mix networks require this design then they are doomed to fail.
Here's a crazy idea: why don't we fight the NSA/BOTNET/SPAMMER situation economically? If someone wants to connect to the network they have to pay $100/yr. Yeah, this would be beans for the NSA, but if for instance you only allowed CCs or some sort of ID-linked payment, how is the NSA going to sign up 1,000,000 different accounts to pwn the network? Also, at Tor's real-life usage numbers, 500k, development for the network would be substantially funded and we wouldn't have to rely on generous grants FROM OUR OPPONENTS to keep it going. I don't know what else to say; we can invent elaborate clunky networks to fight the free-for-all nature of these networks, but at one point or another we are going to have to make people pay for their consumption.
-
The good thing about mixing is that it essentially happens off net. That is the big advantage of mix networks, really. It happens inside the mix node's RAM. If the attacker cannot see the state of the RAM on the mix node, then they cannot learn what is happening, even if they can see all traffic into and out of the mix. Replace "my private networks" with "the state of my RAM" and you have a mix network.
Yes, and I think mix mechanisms are important, but mixing within a single node's RAM does not seem enough. Ideally I want to be able to take a message out of the network, potentially deliver it to the next hop over another out-of-band network, possibly even by physical USB stick transfer to another node, and then put it back into the network and on to its destination. The ability to add completely arbitrary time delays, and to take the data off-network or back on-network at any time, would potentially make traditional traffic analysis far less reliable.
If Apple and Android would allow it, I would love to see a peer-to-peer PAN (Bluetooth) app that basically acted as a variable-latency mixing relay in a loosely coupled mesh. Imagine: I would send a message from my phone, it would at some point be passed on to somebody nearby running the app, or possibly straight onto the Internet to a relay, zapped halfway around the world to a client endpoint running the app, then back into a local PAN for a hop or two on a busy street in Jakarta, back onto the Internet and over to some guy stood on a Metro platform in New York - over the tracks to a gal on the other platform, then back into the net and so on... finally getting to its destination n hops later... it will come.
There are much more efficient systems that can obtain nearly the same anonymity as this. The general concept is called PIR; the type you advocate for is everybody gets everything PIR. There are types of PIR that require several orders of magnitude less bandwidth while still maintaining a high degree of anonymity. In fact, in some cases the anonymity of everybody gets everything can be matched while the bandwidth required is taken down by orders of magnitude.
Yes, I have heard of this, but most of the material I have come across seems quite academic. I suppose one could adopt the concept of channels (effectively) and receive only a subset of the 'torrent' - but even with that, I do not see how you can maintain the same level of recipient anonymity as the everybody gets everything model.
-
Here's a crazy idea: why don't we fight the NSA/BOTNET/SPAMMER situation economically? If someone wants to connect to the network they have to pay $100/yr. ...
It seems tempting, but then we're back to all the previously discussed problems with central administration, and we've added all the problems of allegedly-anonymous-but-not-really VPN services:
1. Every Tor user would be trivially identifiable/deanonymizable by their billing information. They'll have to log in somewhere to prove they've paid. Every login is a snapshot of Real IP + Real Identity. How does each node know it's routing traffic for legitimately paying users? Do we send the billing information to each relay so they can verify it before they forward the connection?
2. We've lowered the bar to deanonymize a Tor user from "make your adversary go build a worldwide SIGINT interception system" (so far, we have only one known taker on that equation) down to "get any judge in any jurisdiction to subpoena the records". Or created a scenario where an NSL can just get dropped on the front doorstep and demand that they backdoor the software, i.e. Lavabit, Hushmail, etc.
3. Suddenly, all the relays are part of a centralized business venture. This doesn't seem like a big deal, but it's a huge one. At the 5000 user mark, you're talking about enough money that most first world governments will go crazy to get their cut from a tax perspective. At 500k users? Wow.
Pay-as-you-go just doesn't work for anonymity, and Bitcoin can't solve that one. It's just too easy for first-world governments to exert leverage to influence, coerce, and shut down business if they don't get what they want. But an open-source project, where people download the software and choose to donate their time and money to run it? Who do you issue that NSL to, exactly? Who do you force to backdoor the software? Tor is obviously concerned about it, thus their recent interest in deterministic builds, so you and I can see if the downloadable Tor binary matches the open source code.
-
Yes and I think mix mechanisms are important but mixing within a single nodes RAM does not seem enough. Ideally I want to be able to take a message out of the network, potentially deliver it to the next hop by another out of band network, possibly even by physical usb stick transfer, to another node and then back into the network and on to its destination. The ability to add completely arbitrary time delays and take the data off-network or indeed on-network at anytime potentially make traditional traffic analysis far less reliable.
Mixing inside RAM is enough. If an attacker notices a signal go into your machine, and you take this on a USB to another machine two countries away prior to forwarding it on, sure it could confuse the attacker. But if the attacker notices a signal go into your machine, and then a thousand other signals, and then they see 1001 signals leave your machine in a randomized order, they will be just as confused.
If Apple and Android would allow it, I would love to see a peer to peer PAN (bluetooth) app that basically acted as a variable-latency mixing relay in a loosely coupled mesh. Imagine - I would send a message from my phone it would at some point be passed on to somebody nearby running the App or possibly straight onto the Internet to a relay, zapped half way around the world to a client endpoint running the ap then back into a local PAN for a hop of two on a busy street in Jakarta, back into the Internet and over to some guy stood on a Metro platform in New York - over the tracks to a gal on the other platform then back into the net and so on....finally getting to it's destination n hops later.... it will come
Here is a paper you would probably be interested in, it discusses wireless meshnet anonymity systems:
http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA495688&Location=U2&doc=GetTRDoc.pdf
These systems are kind of esoteric. Not much research has been done on them that I am aware of. This paper discusses several of them, in addition to many other things. The networks in this category have names such as SDAR, AnonDSR, MASK, ARM, ODAR, AMUR, HANOR, ANODR, SDDR, ASR, ZAP, AODPR, AO2P, SAS, ASC and ASRPAKE. I know essentially nothing about any of these designs, other than that they are meant to use mobile wireless nodes.
Yes, I have heard of this, but most of the material I have come across seems quite academic. I suppose one could adopt the concept of channels (effectively) and receive only a subset of the 'torrent' - but even with that, I do not see how you can maintain the same level of recipient anonymity as the everybody gets everything model.
Well, if there are 5 PIR servers hosting a shared database for the PIR protocol from Pynchon Gate, a client can obtain data from them with information theoretic security unless all 5 of the nodes are owned by the same attacker. With everybody gets everything, though, they would have information theoretic security even if all of the nodes are owned by the same attacker. So in this instance the potential maximum anonymity is the same, but yeah, everybody gets everything is superior, because the absolute minimum anonymity is vastly different (nothing vs total anonymity).
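To show what I mean, here is a toy two-server version of the XOR trick behind this style of PIR (the database and record size are made up; the real designs generalize to more servers, and stay private unless every server colludes):

import secrets

db = [b"msg0", b"msg1", b"msg2", b"msg3"]  # every server holds this database
want = 2                                    # the index the client wants, kept secret

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Client: a uniformly random bit vector for server 1, and the same vector
# with the wanted bit flipped for server 2. Each vector alone reveals nothing.
q1 = [secrets.randbelow(2) for _ in db]
q2 = list(q1)
q2[want] ^= 1

def answer(query: list) -> bytes:
    # Server side: XOR together the records the query selects.
    out = bytes(len(db[0]))
    for bit, record in zip(query, db):
        if bit:
            out = xor(out, record)
    return out

# XORing the two answers cancels every record except the one the client wanted.
assert xor(answer(q1), answer(q2)) == db[want]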
Though in some, I admit rather contrived, examples it becomes less clear that everybody gets everything is superior. Imagine there is a standard centralized pynchon gate server cluster of 4 nodes. The clients using the system can download messages without the servers knowing the messages they downloaded, unless all of the servers are malicious. So 4 malicious servers is enough to break the security of this system. Now imagine an everybody gets everything network like BitMessage. Imagine only 6 nodes are part of the network. If 4 of the nodes are malicious, they can link messages from one client to another, since all messages are between the remaining two clients. In some instances they might even be able to tell which client sent which message (and from this deduce which client it was sent to), depending on the network topology. In this case both of the systems can fail with the same number of bad nodes. But this isn't really a fair comparison, since it isn't looking directly at the primitive used in the case of everybody gets everything, but rather looking at a way it could be used in a system that would weaken it. If anything this highlights the sometimes subtle difference between the types of security a system can have.
So in summary, I probably should not have said that there are systems equally secure to everybody gets everything. There are systems that can be as secure in practice, but theoretically they are still weaker, and this weakness could manifest in practice. And there are systems based on both primitives that can be equally insecure, but that is not really a fair analysis, because it looks not at the primitive directly but at the way it could be integrated into a system. When I made the comment you quoted, the idea in my mind was the example I gave of 4 bad nodes being enough for a compromise in both everybody gets everything PIR and Pynchon Gate PIR, as explained above. But on giving it more thought, this isn't really the correct way to look at things, because it judges a system that could be built on everybody gets everything rather than the primitive itself. The primitive is more secure than the Pynchon Gate PIR: it cannot be reduced below information theoretic security, whereas Pynchon Gate PIR can be information theoretically secure but can also, theoretically and practically, be reduced from this level of security.
-
Here's a crazy idea: why don't we fight the NSA/BOTNET/SPAMMER situation economically? If someone wants to connect to the network they have to pay $100/yr. ...
It seems tempting, but then we're back to all the previously discussed problems with central administration, and we've added all the problems of allegedly-anonymous-but-not-really VPN services:
1. Every Tor user would be trivially identifiable/deanonymizable by their billing information. They'll have to log in somewhere to prove they've paid. Every login is a snapshot of Real IP + Real Identity. How does each node know it's routing traffic for legitimately paying users? Do we send the billing information to each relay so they can verify it before they forward the connection?
2. We've lowered the bar to deanonymize a Tor user from "make your adversary go build a worldwide SIGINT interception system" (so far, we have only one known taker on that equation) down to "get any judge in any jurisdiction to subpoena the records". Or created a scenario where an NSL can just get dropped on the front doorstep and demand that they backdoor the software, i.e. Lavabit, Hushmail, etc.
3. Suddenly, all the relays are part of a centralized business venture. This doesn't seem like a big deal, but it's a huge one. At the 5000 user mark, you're talking about enough money that most first world governments will go crazy to get their cut from a tax perspective. At 500k users? Wow.
Pay-as-you-go just doesn't work for anonymity, and Bitcoin can't solve that one. It's just too easy for first-world governments to exert leverage to influence, coerce, and shut down business if they don't get what they want. But an open-source project, where people download the software and choose to donate their time and money to run it? Who do you issue that NSL to, exactly? Who do you force to backdoor the software? Tor is obviously concerned about it, thus their recent interest in deterministic builds, so you and I can see if the downloadable Tor binary matches the open source code.
Think bigger. I'm not talking about TORPROJECT LLC handling all of this; that would be way too easy of a target for them to become, and I don't think they want to deal with billing anyways. A number of independent companies would sign on with the Tor Project to accept subscriptions from users. The users, who could pay them in whatever way they want (BTC/CC/MO), would have a pair of login details that are solely theirs. Not only would the users get access to the network, they would get a unique and unpublished entry guard, which would prevent the NSA from being able to do shit to track them or break Tor. It's like having a bridge node, except there is no way for the NSA to map it out without controlling a number of middle nodes, which they wouldn't be able to do, because part of that $100 also goes to a fund to set up more middle and exit relays around the world.
This is the whole problem with Tor's design at the moment: the Tor Project actively publishes the IPs of all relays, and it's easy for the NSA to watch them; and because Tor is a free-for-all, bridge nodes are still published, even if in a roundabout way. If everyone had a private bridge then there could be no correlation attacks, as the entry point would not be known to your attacker.
-
It is actually pretty easy for an attacker who watches all the links on the internet to find all of the Tor bridges, even if they are not published.
-
Think bigger. I'm not talking about TORPROJECT LLC handling all of this; that would be way too easy of a target for them to become, and I don't think they want to deal with billing anyways. A number of independent companies would sign on with the Tor Project to accept subscriptions from users. The users, who could pay them in whatever way they want (BTC/CC/MO), would have a pair of login details that are solely theirs. Not only would the users get access to the network, they would get a unique and unpublished entry guard, which would prevent the NSA from being able to do shit to track them or break Tor. It's like having a bridge node, except there is no way for the NSA to map it out without controlling a number of middle nodes, which they wouldn't be able to do, because part of that $100 also goes to a fund to set up more middle and exit relays around the world.
This is the whole problem with Tor's design at the moment: the Tor Project actively publishes the IPs of all relays, and it's easy for the NSA to watch them; and because Tor is a free-for-all, bridge nodes are still published, even if in a roundabout way. If everyone had a private bridge then there could be no correlation attacks, as the entry point would not be known to your attacker.
NSA doesn't have to control *any* nodes. They just have to be able to watch traffic to/from them, even at a distance of a few hops away. Which requires them to identify nodes, and that seems to be what your suggestion aims to make harder.
The problem is.. NSA has a view of all traffic transiting a number of peering points in the US, as well as trans-continental connections.
So, let's pretend we're the NSA. We make a list of known Tor relays, since those are public. Then, we make a list of known "public bridges", which Tor tries to only hand out a handful at a time to keep folks like China from blocking all of them. Mostly, China gets these bridges the same way the users do, then blocks them, so it's an arms race on that front. Then we have "private bridges", which anybody can set up. It's a bridge that doesn't report back to Tor. So your suggestion is to create a for-pay Tor private bridge provider.
I'm guessing there are more clever ways to do it, but here's one easy one:
Take the list of all known Tor relays and bridges, and take the metadata from all their connections for a 24 hour period. Subtract all traffic with source/destination addresses that match other known Tor relays/bridges. Everything that's left is either a user, or it's a private bridge that's connecting a user. Sort that list by amount of total traffic or by number of connections over the 24 hour period (either would work fine). Everything above the x-th percentile on that list is a private bridge candidate. Put each candidate on a new list and view all src/dst traffic going into/out of it. If it's routing traffic for other people, it's definitely a bridge. If it's not, it's a Tor client.
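In rough Python, the winnowing looks something like this (the flow records, the cutoff, and all the names are invented for the illustration; a real SIGINT pipeline would obviously look nothing like a dict):

def find_private_bridges(flows, known_relays, traffic_cutoff):
    # flows: iterable of (src_ip, dst_ip, byte_count) records for 24 hours
    # Step 1: keep flows touching a known relay/bridge on exactly one end.
    edges = [(s, d, n) for s, d, n in flows
             if (s in known_relays) != (d in known_relays)]

    # Step 2: total the traffic for every unknown endpoint talking to relays.
    totals = {}
    for s, d, n in edges:
        unknown = d if s in known_relays else s
        totals[unknown] = totals.get(unknown, 0) + n

    # Step 3: heavy talkers are either busy clients or private bridges. A
    # bridge also carries traffic for peers that are not known relays.
    suspects = {ip for ip, n in totals.items() if n >= traffic_cutoff}
    return {ip for ip in suspects
            if any((s == ip or d == ip)
                   and s not in known_relays and d not in known_relays
                   for s, d, _ in flows)}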
-
Adding more nodes, including private bridges, is never the answer when it comes to protecting from an attacker that can see all links on the internet. Once you are up against that, you pretty much have five options to pick from: constant rate cover traffic, DC-nets, mix networks, PIR, and covert channels.
-
Why does a relay have to relay only Tor traffic? Why can't you confuse them by relaying things other than Tor, so whatever network map they build will be muddled by unrelated connections? If they can get the IP of a Tor relay and assume every connection going to and from it is related to Tor, then yes, it's going to be easy as hell to map out the network, but why can't we hide among the clearnet too?
Also what about obfsproxy, isnt this supposed to mask the tor connection? like how skyemorph is designed to make a tor connection look identical to a skype video call which is all p2p anyways. im not seeing whats the difference between invariant traffic and traffic that is identical to whitelisted traffic.
-
Why does a relay have to relay only Tor traffic? Why can't you confuse them by relaying things other than Tor, so whatever network map they build will be muddled by unrelated connections? If they can get an IP of a Tor relay and assume every connection going to and from it is related to Tor, then yes, it's going to be easy as hell to map out the network, but why can't we hide among the clearnet too?
Also, what about obfsproxy? Isn't this supposed to mask the Tor connection? Like how SkypeMorph is designed to make a Tor connection look identical to a Skype video call, which is all P2P anyway. I'm not seeing the difference between invariant traffic and traffic that is identical to whitelisted traffic.
I think you can get a small, incremental gain from some of those ideas. You can make it a little harder. But not enough harder.
As to why other traffic on a relay doesn't buy that much, they're building a network of Tor traffic and they're coming at the equation from the other direction. They're not looking at every node on the Internet, blindly looking for Tor traffic. They're taking known Tor nodes and just seeing who they communicate with, and on what ports. And following that trail until they map as much of the network as they can with their view of traffic.
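As a sketch of what that trail-following might look like (purely illustrative, assuming passive taps that yield (src, dst) pairs):

# Sketch of the "follow the trail" mapping: start from known Tor
# relays and iteratively pull in hosts that exchange traffic with
# anything already mapped. flows: iterable of (src_ip, dst_ip) pairs.
def expand_map(flows, seed_relays, rounds=3):
    known = set(seed_relays)
    for _ in range(rounds):
        new = set()
        for src, dst in flows:
            if src in known and dst not in known:
                new.add(dst)
            elif dst in known and src not in known:
                new.add(src)
        if not new:
            break
        known |= new  # candidates: relays, bridges, or clients
    return known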
Once you reach that point, and somebody is looking directly at the traffic in/out of one specific relay, it's not too hard to figure out which traffic is which. And if you can't, you can just throw out everything that you *can* identify, so you can focus on only what's left. They've already dug through billions of streams to get here, so another ten thousand connections of white noise also coming into the relay isn't going to make them quake in fear.
Obfsproxy and the whole category of obfuscating wrappers around traffic can make the traffic look different enough that it's harder to find. They're useful for trying to stay out of a net in the first place. But once you're in that net, once somebody is actively looking at the traffic on that relay, they'll immediately know it's NOT legitimate web or Skype traffic. It's something else.
There was a great white paper on that concept published last spring: "The Parrot is Dead: Observing Unobservable Network Communications": http://freehaven.net/anonbib/cache/oakland2013-parrot.pdf
It's not like Tor is just missing one clever little trick here, or one more hidden hop there, and presto, it's good. The risk is statistical analysis, and it's almost impossible not to have whatever you're hiding stand out from the rest of the noise once they find the right filter. So obfuscation buys you something, but not enough to actually hide lots of people behind it for very long. And it only lasts a little while, until your adversary discovers it.
-
Why does a relay have to relay only Tor traffic? Why can't you confuse them by relaying things other than Tor, so whatever network map they build will be muddled by unrelated connections? If they can get an IP of a Tor relay and assume every connection going to and from it is related to Tor, then yes, it's going to be easy as hell to map out the network, but why can't we hide among the clearnet too?
Also, what about obfsproxy? Isn't this supposed to mask the Tor connection? Like how SkypeMorph is designed to make a Tor connection look identical to a Skype video call, which is all P2P anyway. I'm not seeing the difference between invariant traffic and traffic that is identical to whitelisted traffic.
http://freehaven.net/anonbib/cache/morphing09.pdf
Recent work has shown that properties of network traffic that remain observable after encryption, namely packet sizes and timing, can reveal surprising information about the traffic's contents (e.g., the language of a VoIP call [29], passwords in secure shell logins [20], or even web browsing habits [21, 14]). While there are some legitimate uses for encrypted traffic analysis, these techniques also raise important questions about the privacy of encrypted communications. A common tactic for mitigating such threats is to pad packets to uniform sizes or to send packets at fixed timing intervals; however, this approach is often inefficient. In this paper, we propose a novel method for thwarting statistical traffic analysis algorithms by optimally morphing one class of traffic to look like another class. Through the use of convex optimization techniques, we show how to optimally modify packets in real-time to reduce the accuracy of a variety of traffic classifiers while incurring much less overhead than padding. Our evaluation of this technique against two published traffic classifiers for VoIP [29] and web traffic [14] shows that morphing works well on a wide range of network data, in some cases simultaneously providing better privacy and lower overhead than naïve defenses.
-
I see, so you are saying it's basically guilt by association, and unless it is a perfect darknet there is no way to hide both your connection to the network and, subsequently, your packet flow characteristics at both the entry and exit points that may link you, despite all attempts at obfuscating this.
Well, I can certainly say I don't understand how, if some intermediate set of nodes takes my unique signature flow, changes it to no longer resemble the original flow, and holds onto it for a minute or two, someone watching the exit traffic can possibly know that that flow is the same flow I sent into the network. That defies logic. It would be no different than talking to someone over VoIP: you speak clearly into the phone (your unique signature), the VoIP servers mangle your call, and out comes a bunch of static with bits and pieces of your voice. If I were in a room full of people and someone on the other end picked up the phone, even if they knew every person in that room and the sound of their voices, how could they tell it was me based on the garbled signature of my VoIP'd voice? This of course is assuming that I don't include my name, address, and social security number as part of my conversation, which I presume most Tor users don't. And of course an IP != a person, so where exactly is the risk here with someone monitoring the entire network? Weak correlation and speculative analysis?
Really though, Tor is only marginally more effective than public WiFi at hiding one's identity. What Tor has been great for is hidden services. So this discussion should really revolve around "if someone can monitor the entire internet, can they find SR's servers?"
-
I see, so you are saying it's basically guilt by association, and unless it is a perfect darknet there is no way to hide both your connection to the network and, subsequently, your packet flow characteristics at both the entry and exit points that may link you, despite all attempts at obfuscating this.
Well, I can certainly say I don't understand how, if some intermediate set of nodes takes my unique signature flow, changes it to no longer resemble the original flow, and holds onto it for a minute or two, someone watching the exit traffic can possibly know that that flow is the same flow I sent into the network. That defies logic. It would be no different than talking to someone over VoIP: you speak clearly into the phone (your unique signature), the VoIP servers mangle your call, and out comes a bunch of static with bits and pieces of your voice. If I were in a room full of people and someone on the other end picked up the phone, even if they knew every person in that room and the sound of their voices, how could they tell it was me based on the garbled signature of my VoIP'd voice? This of course is assuming that I don't include my name, address, and social security number as part of my conversation, which I presume most Tor users don't. And of course an IP != a person, so where exactly is the risk here with someone monitoring the entire network? Weak correlation and speculative analysis?
Really though, Tor is only marginally more effective than public WiFi at hiding one's identity. What Tor has been great for is hidden services. So this discussion should really revolve around "if someone can monitor the entire internet, can they find SR's servers?"
But if we're starting with the assumption that an NSA-level adversary can follow TCP connections around effectively based on their view of traffic, your VOIP analogy isn't exactly correct. It's more like you're talking into a tin can, with a string connecting you to another tin can, which they already can find. So they follow the string. Doesn't matter what silly voices you use, or what elaborate code phrases you rely on, there's a frickin string between the cans, and it's in your hand. :)
I'm just saying that sprinkling a lot of not-Tor traffic onto a server that's routing Tor traffic doesn't confuse things as much as everyone would hope. At a minimum you have (src ip, src port, dst ip, dst port, packet size, TCP sequence numbers if TCP, timing between packets, etc) as variables to filter on even without moving to the Deep Packet Inspection discussion.
If it adds a random minute or two of delay, then that makes it much more difficult, assuming you have enough "identical" other traffic that you don't know which is which. It's a simple mix at that point. But adding a random two minutes of delay between HTTP clients on a low-latency network probably isn't going to leave you with the low-latency network you wanted. And, back to kmfkewm's point about spreading load across mixnet servers, you'd actually be safer with the busier server than with a more private one that only you were using. Obviously, if in a five minute period only one set of connections came in and went out, the two minutes of random delay isn't buying you anything.
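For what it's worth, the mix being described is simple enough to sketch in Python; the parameters here are placeholders, and the last comment is the whole point: a batch of one is no anonymity set at all.

import random, time, threading

# Minimal sketch of a timed mix: buffer incoming messages, hold each
# for a random delay, and flush in shuffled batches. Parameters are
# illustrative, not from any deployed system.
class TimedMix:
    def __init__(self, max_delay=120.0):
        self.max_delay = max_delay
        self.pool = []
        self.lock = threading.Lock()

    def submit(self, msg):
        release_at = time.time() + random.uniform(0, self.max_delay)
        with self.lock:
            self.pool.append((release_at, msg))

    def flush_due(self):
        now = time.time()
        with self.lock:
            due = [m for t, m in self.pool if t <= now]
            self.pool = [(t, m) for t, m in self.pool if t > now]
        random.shuffle(due)  # anonymity set = messages mixed together
        return due           # if len(due) == 1, the delay bought nothing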
And obfuscation helps, but it just creates another type of conversation that the adversary needs to recognize. Once they can recognize it, it's no longer adding value. The more folks you share it with, the easier it is for them to find.
But it's also important to remember that just identifying Tor users or traffic doesn't buy that much. Without having a view into the other end of the conversation, there's nothing to correlate. The risk to Tor users from NSA's possible capabilities is the combination of their view of a conversation early (i.e. around the guard node) and late (before or after the exit node). This probably allows them to correlate the two connections and realize they're the same, deanonymizing the user and associating them with the specific traffic through the exit node.
For users connecting to hidden services, to correlate traffic, they have to be able to know where the hidden service is. Without that, they just have the entry guard. So everyone's best bet for Tor's long-term viability for darknet use (i.e. hidden services) is that hidden services continue to evolve to be increasingly more difficult to deanonymize. Because if you can't see the hidden service, you can't correlate the traffic to it.
The primary risk without a view of the hidden service is that they can do fingerprinting of the encrypted traffic to identify what it is. i.e. based on timing, content size, etc, it's likely traffic going to SRF. Or SR. There have been a number of whitepapers on this that operate in a <1000 website control group, identifying SSL traffic without decrypting it, and they've had decent success rates. But all they can really do is establish that it's *probable* that the traffic was going to a specific site, not that it definitely was. And that's an important distinction in most jurisdictions.
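A toy version of that fingerprinting idea, just to show the mechanics; the features, bucketing, and data are invented for illustration, and the published attacks are far more sophisticated:

# Toy website-fingerprinting sketch: classify an encrypted trace by
# comparing its packet-size histogram to labeled example traces.
from collections import Counter
import math

def histogram(trace):
    # trace: list of packet sizes observed for one page load
    return Counter(size // 100 for size in trace)  # 100-byte buckets

def distance(h1, h2):
    keys = set(h1) | set(h2)
    return math.sqrt(sum((h1[k] - h2[k]) ** 2 for k in keys))

def classify(unknown_trace, labeled_traces):
    # labeled_traces: {"site_a": [trace, ...], "site_b": [...], ...}
    h = histogram(unknown_trace)
    best_site, best_d = None, float("inf")
    for site, traces in labeled_traces.items():
        for t in traces:
            d = distance(h, histogram(t))
            if d < best_d:
                best_site, best_d = site, d
    return best_site  # only "probable", never proof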
Now if a hidden service's Introduction Points or the service itself are located well within their view, I'm guessing there may be methods of statistical analysis that could lead them to the hidden service. Which is why I'm guessing that SR and other darknet sites are very careful with their Introduction Points. They can't be anybody that's directly tied to them, but they can't be chosen at random, either. They have to be run by an entity that the hidden service owner trusts, but isn't linkable to.
And everyone is saying they *suppose* that NSA *could* do this. Not that they are. Or that NSA gives a shit about hidden services not involving terrorism. But the awareness is that while Tor can keep you safe from almost everyone, NSA probably isn't on the list of people it can keep you safe from.
-
Given that no government entity will ever have full access to the entire internet, what are some key points to follow in regards to one's own government? Say I'm located in the US and I access a site in the US: will the NSA be able to correlate my connection, even if my relays go through Iran, North Korea, and Russia? This seems to be what I am hearing.
Now let's say I want to visit a site in Romania and my connection goes through the same relays; I can't imagine the Five Eyes will be able to see anything. But if, say, when I connect to the Romanian site my connection goes through Canada, the UK, and Australia, then they would be able to correlate? Do they need to see the whole network of relays and endpoints to correlate traffic, or just particular points?
-
Given that no government entity will ever have full access to the entire internet, what are some key points to follow in regards to one's own government? Say I'm located in the US and I access a site in the US: will the NSA be able to correlate my connection, even if my relays go through Iran, North Korea, and Russia? This seems to be what I am hearing.
Now let's say I want to visit a site in Romania and my connection goes through the same relays; I can't imagine the Five Eyes will be able to see anything. But if, say, when I connect to the Romanian site my connection goes through Canada, the UK, and Australia, then they would be able to correlate? Do they need to see the whole network of relays and endpoints to correlate traffic, or just particular points?
They need both ends. The beginning and the end of the connection. Take a normal Tor clearnet connection through an exit node:
You->SomeNetworkHops1->EntryGuard->SomeNetworkHops2->MiddleRelay->SomeNetworkHops3->ExitNode->SomeNetworkHops4->Clearnet Destination
In that scenario, they need to monitor something between You and the EntryGuard (so something in "SomeNetworkHops1") to tie you to the Tor connection. If their closest monitoring point to you is at SomeNetworkHops2, they can't see who's using the EntryGuard.
But they need the other end to know what you're doing. They either need to monitor ExitNode, or (maybe) SomeNetworkHops4, or (maybe) the Clearnet Destination.
The "maybes" are because I haven't seen any formal research showing that, but logically, if their correlation is based on timing, timing patterns should be identical all the way to the destination. When your session is complete, it should get torn down at a similar rate across the entire path. The capture filter in the SomeNetworkHops4 area would look a lot like "show me all traffic to/from known Tor exit nodes".
Between your house in the US and Iran, I'd expect at least 2-3 points where they have visibility, probably more. Probably at one or more IX's in the US that your traffic traverses, almost certainly in the trans-continental fiber connections. Beyond that, who knows. Coming back after you bounce through Iran, same story. Might be different cables or different IX's, but they should have multiple views of your traffic coming back for the US server.
But none of that matters unless they can know where you're going. And they *have* to have the other end to get beyond thinking "you probably were going here" and actually knowing to a reasonable level (to them, not to a court) that you were REALLY going here.
Hidden services work differently. They have more hops, and more going on. You can find some nice graphics on the Tor Project site that explain them. In a nutshell, both the client and the hidden service build three-hop connections to a point in the middle. kmfkewm outlined an attack about a month ago where, by taking control of HSDir servers in conjunction with a view of the entry guard, you could probably see who was trying to make a connection to "somesite.onion". He's right, but it's still not *proof* of anything except that you looked up a hidden service's address.
But until they deanonymize the hidden service, they can't correlate your user traffic to it (ignoring the whole SSL fingerprinting concept). Once they have visibility into the traffic coming in/out of a hidden service, they just need to correlate that with the users at EntryGuards, and presto. Users of the hidden service who are coming from behind Entry Guards they *can't* see behind are not deanonymized.
Assuming they have both ends:
You->NSA->EntryGuard = Screwed
You->EntryGuard->NSA = Not screwed. If they don't have any visibility between you and the Entry, they can't find you this way.
You->NSAsEntryGuard = Screwed. Duh.
Logically, NSA has enough pieces to be able to do this. What nobody knows is how easy or hard it is for them, how much effort is involved, and what their internal boundaries are on how they can use the information. It could be mostly automated, delivering bulk deanonymization of all Tor users; it could be technically possible but require a dozen analysts two weeks of digging through a billion records in different databases to correlate a single Tor connection. It could be impossible because we're misunderstanding and overestimating their capabilities. Nobody knows.
-
sub
-
I see, so you are saying it's basically guilt by association, and unless it is a perfect darknet there is no way to hide both your connection to the network and, subsequently, your packet flow characteristics at both the entry and exit points that may link you, despite all attempts at obfuscating this.
Well, the paper I linked to actually showed that the technique you mentioned (called traffic morphing) is good enough to prevent website fingerprinting attacks. I imagine it is effective, to varying degrees, at reducing the effectiveness of interpacket watermarking attacks as well. But it isn't enough to save the day by itself (just like invariant traffic characteristics aren't enough to save the day by themselves; each is just one part of the puzzle). I will need to read more papers on traffic morphing to come to any set conclusion on it. It certainly is useful for a variety of things, but it isn't the route I want to go in any case, and it isn't going to be more effective than interpacket timing uniformity.
Well, I can certainly say I don't understand how, if some intermediate set of nodes takes my unique signature flow, changes it to no longer resemble the original flow, and holds onto it for a minute or two, someone watching the exit traffic can possibly know that that flow is the same flow I sent into the network. That defies logic.
If your flow is totally morphed, they cannot do that. The paper I linked to talked about traffic classifiers, which are used for website fingerprinting and other things. Morphing traffic helped prevent classifier attacks, as their paper shows. So if a relay loads Google and records the exact stream characteristics, and then, when your watermarked traffic passes through it, holds that traffic long enough to modify it so that it looks like the stream it loaded from Google, that should destroy the watermark in the stream and actually make it look like you are loading Google if somebody runs a classifier against that individual stream. But adding random interpacket delays to try to destroy the watermark will be more difficult than morphing the stream to look exactly like another stream, and it has been shown already that attempts to do this at low latency have failed to remove interpacket timing watermarks. The difference is between traffic morphing (making one stream look exactly like another previously seen stream) and adding random jitter (randomly delaying packets for small amounts of time). I think if traffic is perfectly morphed it destroys interpacket timing watermarks, but randomly added jitter can usually be filtered, especially if you want to maintain low latency.
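A toy sketch of that distinction, with invented numbers (real morphing also has to respect the fact that you can delay packets but not send them early):

import random

# Toy contrast between random jitter and traffic morphing on a
# flow's inter-packet delays (seconds). Values are invented.
watermark = [0.10, 0.10, 0.30, 0.10, 0.30, 0.30]         # embedded pattern
recorded_google = [0.05, 0.20, 0.05, 0.05, 0.20, 0.05]   # target stream

def add_jitter(delays, max_jitter=0.05):
    # Small noise on top of the pattern: the 0.1s vs 0.3s structure
    # of the watermark still leaks through.
    return [d + random.uniform(0, max_jitter) for d in delays]

def morph(delays, target):
    # Replace the timing wholesale with the target stream's timing;
    # nothing of the original pattern survives.
    return list(target[:len(delays)])

jittered = add_jitter(watermark)
morphed = morph(watermark, recorded_google)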
It would be no different than talking to someone over VoIP: you speak clearly into the phone (your unique signature), the VoIP servers mangle your call, and out comes a bunch of static with bits and pieces of your voice. If I were in a room full of people and someone on the other end picked up the phone, even if they knew every person in that room and the sound of their voices, how could they tell it was me based on the garbled signature of my VoIP'd voice? This of course is assuming that I don't include my name, address, and social security number as part of my conversation, which I presume most Tor users don't. And of course an IP != a person, so where exactly is the risk here with someone monitoring the entire network? Weak correlation and speculative analysis?
Because it only takes identification of a very few bits of information to identify a watermark. Even if it is horribly mangled, the few bits of the watermark that get through will betray you. Of course, if you morph one stream to have the exact characteristics of another, that will destroy the watermark. But if you just add random noise, that will not usually be enough to save the day, and it can often be filtered, just like cryptographic timing attacks can be slowed down by adding random delays to cryptographic functions but the random delays can still be filtered out. The solution with cryptographic functions is to make them constant time, ie: uniform time to complete regardless of input; the solution with traffic is to make it uniform or to morph it to look exactly like some other traffic. If you perfectly randomized the interpacket timing characteristics it would perhaps work, I will need to try to find some paper on this, but generally you don't want to add noise to gain security, because noise can often be filtered but uniformity cannot. The only time you would want to add noise as your primary security technique would be in cases where adding noise is required to obtain uniformity. Most of the research I found on traffic morphing is about countering classifiers, not countering watermarks, but I imagine it will work for either.
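Continuing the toy example above, here is why the noise filters out: a naive correlation detector still finds the watermark under jitter, while the morphed stream tells it nothing. Again, purely illustrative.

# Naive watermark detector: correlate observed inter-packet delays
# against the known embedded pattern.
def detect(observed, pattern):
    n = min(len(observed), len(pattern))
    mo = sum(observed[:n]) / n
    mp = sum(pattern[:n]) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, pattern))
    vo = sum((o - mo) ** 2 for o in observed[:n]) ** 0.5
    vp = sum((p - mp) ** 2 for p in pattern[:n]) ** 0.5
    return cov / (vo * vp) if vo and vp else 0.0

# detect(jittered, watermark) stays close to 1.0: the jitter is just
# noise around the pattern, so correlation averages it away.
# detect(morphed, watermark) is driven by the target stream instead
# and carries no information about the original pattern.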
Really though, Tor is only marginally more effective than public WiFi at hiding one's identity. What Tor has been great for is hidden services. So this discussion should really revolve around "if someone can monitor the entire internet, can they find SR's servers?"
Tor is hopefully a lot more effective than public WiFi at hiding identities. Tor hidden services have actually been notorious for being the least secure part of Tor.
-
Recent research in 2013 found that protocol morphing is not enough to prevent protocol fingerprinting in practice, only in theory (well, it might be possible in practice, but it is so hard they claim it is pretty much impossible). Keep in mind that this is looking at the technique and attack from a higher level, in which only the protocol being used is obfuscated or identified. Some traffic morphing schemes go to a lower level than this and try to obfuscate the entire traffic stream (using a cookie-cutter template of actual observed traffic and forcing the new stream into that mold, ie: "This is what traffic from Google looked like, let's make this new traffic stream look exactly like it") instead of the protocol (ie: "This is what the Skype protocol looks like, let's try to make our protocol look like this"), and the attackers try to identify the fingerprint of actual specific content (ie: he loaded Google over Tor!) rather than a protocol (he used Tor to load something!).
The Parrot Is Dead: Observing Unobservable Network Communications
Abstract: In response to the growing popularity of Tor and other censorship circumvention systems, censors in non-democratic countries have increased their technical capabilities and can now recognize and block network traffic generated by these systems on a nationwide scale. New censorship-resistant communication systems such as SkypeMorph, StegoTorus, and CensorSpoofer aim to evade censors' observations by imitating common protocols like Skype and HTTP.
We demonstrate that these systems completely fail to achieve unobservability. Even a very weak, local censor can easily distinguish their traffic from the imitated protocols. We show dozens of passive and active methods that recognize even a single imitated session, without any need to correlate multiple network flows or perform sophisticated traffic analysis.
We enumerate the requirements that a censorship-resistant system must satisfy to successfully mimic another protocol and conclude that "unobservability by imitation" is a fundamentally flawed approach. We then present our recommendations for the design of unobservable communication systems.
Of course once you consider bidirectional fingerprinting it becomes more of a challenge. Not only would your entry node need to make the traffic going to you look exactly like traffic from Google, but you would need to make traffic coming from you look exactly like traffic going to Google. It is interesting research for sure, at all levels, and for attacks and defenses, but again it isn't the route I would go. There is no debate in regards to interpacket timing uniformity fixing the problem of interpacket timing watermarks, and no debate in regards to packet/message size uniformity preventing message fingerprinting, so might as well go right to that instead of trying to implement much more complex systems, that in some cases are on uncertain grounding.
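That uniformity fix is simple enough to sketch directly: pad every message to a fixed cell size and emit cells at a constant rate, sending dummies when idle. The cell size and rate below are arbitrary placeholders, not values from any real design.

import os, time, queue

# Sketch of uniformity: fixed-size cells sent at a constant rate,
# with dummy cells filling idle slots.
CELL_SIZE = 512        # bytes; every cell on the wire is identical
SEND_INTERVAL = 0.1    # seconds; one cell per interval, always

outgoing = queue.Queue()

def pad(payload: bytes) -> bytes:
    assert len(payload) <= CELL_SIZE - 1
    # 1-byte marker distinguishes real cells from dummies; it sits
    # inside the (assumed) encryption, so an observer never sees it.
    return b"\x01" + payload.ljust(CELL_SIZE - 1, b"\x00")

def dummy() -> bytes:
    return b"\x00" + os.urandom(CELL_SIZE - 1)

def sender_loop(send):
    # An observer sees one CELL_SIZE-byte cell every SEND_INTERVAL,
    # regardless of whether anything real is being transmitted.
    while True:
        try:
            cell = pad(outgoing.get_nowait())
        except queue.Empty:
            cell = dummy()
        send(cell)  # would be encrypted in any real design
        time.sleep(SEND_INTERVAL)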
Also most of the quotes I have given so far are about using these techniques for different reasons. One reason to morph traffic is to hide that it is related to a certain anonymizer, so for example you would morph Tor traffic to make it look like Skype traffic. Another reason to morph traffic is to prevent content fingerprinting attacks, so for example you would make SR traffic look like Google traffic. None of the papers I have read so far mention watermarking attacks directly, but website fingerprinting attacks are close enough to watermarking attacks that I think it is safe to assume morphing traffic would indeed help prevent them (as one of the papers I linked to shows that this is effective to protect from fingerprinting/classifier attacks).
Again, interesting stuff, not something I want to spend too many brain cycles on. These problems (watermarking attacks and fingerprinting attacks anyway, not protocol obfuscation that is a current area of research) have all been solved with a simple technique: make everything uniform. If we wanted to have low latency with higher protection than Tor, we might be forced to look into these techniques more. But my goal is not to make a better Tor (which would still probably be broken by strong attackers anyway, since once we fix all of the remaining things that can be fixed we will still be left with all the things that cannot be), my goal is to make a good mixnet (which would stand a chance at preventing strong attackers). I am willing to have high latency in return for being able to use effective techniques to prevent these remaining attacks on systems like Tor, and I would rather spend my brain cycles thinking of ways to prevent the remaining attacks on mix networks that don't already have perfect solutions.
-
Oh, I see: kmfkewm and ECCROT13 are the same person. The writing style is the same, the use of commas, the grammar, the large parenthetical blocks, the post volume, the same general knowledge and commentary. Don't worry about that PM, posting.
-
Oh, I see: kmfkewm and ECCROT13 are the same person. The writing style is the same, the use of commas, the grammar, the large parenthetical blocks, the post volume, the same general knowledge and commentary. Don't worry about that PM, posting.
What. The. Fuck?
I've actually spent quite a bit of time trying to help explore some of your ideas. When you've posted questions, I've generally tried to help you.
I'm really not sure what to say in response, but here's a start:
1. I came to this forum specifically because it's the *only* anonymous forum I've seen online where there are people who actually *think* about this shit, as opposed to just reposting dumbassed ideas they've found somewhere. kmfkewm, astor, SS, comsec, and others usually have valid, well-thought-out points. I've learned a lot from all of them. I've also learned a lot from people who *aren't* as technical as I am, because they think about things differently than I do.
2. While I don't always agree with kmfkewm, I've ended up learning a lot through discussions with him here. He's obviously spent years thinking about this, and he's bright. He gets the technical issues involved at a very technical, detailed level. Some of what I've learned has been from things he's said, and much of it has been through my own thought processes while disagreeing with him and responding.
3. If I'm some sockpuppet of kmfkewm's (or is he my sockpuppet, not sure? He's been here longer, so I guess that makes me his.), what fucking multiple personality disorder would make him argue with himself ad infinitum about PIR and a million esoteric concepts? Just some masturbatory enjoyment on his part?
4. Just guessing here, but if our posts look similar, it's because: a. We probably both upload/paste our posts as part of get-in-then-get-the-fuck-out methods of interacting with this forum and other anonymous sites. b. We both tend to figure things out as we're typing, so you end up with too much thought process in each post. c. We're both just wordy motherfuckers. And d. We both seem to spend a lot of time doing research and thinking things out before we post. That last part is a really novel concept, BTW.
I guess I should probably expect stupid shit like this from some of the more random fucktards that post here, but really didn't expect it from you.
(edited to remove a paragraph that was mostly based on my misunderstanding of the Hail Mary.. thread)
-
This has been a great thread, but unfortunately kmf has retired from the community.
He was one of the smartest people on the forum, and I will miss his input.
-
Wow, I missed kmfkewm's going away post since I only read the security forum.
I guess railroakbill's sockpuppet accusation makes more sense now. Without context, it seemed like it was coming out of the blue.
But he posted it well before kmf posted his goodbye message. What am I missing here?
-
Oh, I see: kmfkewm and ECCROT13 are the same person. The writing style is the same, the use of commas, the grammar, the large parenthetical blocks, the post volume, the same general knowledge and commentary. Don't worry about that PM, posting.
I don't believe they are the same person. A distinguishing feature of kmf's writing is that he rarely uses contractions, while ECC_ROT13 uses a lot of contractions. Apparently kmf was unaware of this until I pointed it out to him, but I never got the impression they were the same person.
-
Here's a crazy idea: why don't we fight the NSA/botnet/spammer situation economically? If someone wants to connect to the network, they have to pay $100/yr. ...
It seems tempting, but then we're back to all the previously discussed problems with central administration, plus all the problems with allegedly-anonymous-but-not-really VPN services:
1. Every Tor user would be trivially identifiable/deanonymizable by their billing information. They'll have to log in somewhere to prove they've paid. Every login is a snapshot of Real IP + Real Identity. How does each node know it's routing traffic for legitimately paying users? Do we send the billing information to each relay so they can verify it before they forward the connection?
2. We've lowered the bar to deanonymize a Tor user from "make your adversary go build a worldwide SIGINT interception system" (so far, we have only one known taker on that equation) down to "get any judge in any jurisdiction to just subpoena the records". Or created a scenario where an NSL can just get dropped on the front doorstep and demand that they backdoor the software. i.e. Lavabit, Hushmail, etc.
3. Suddenly, all the relays are part of a centralized business venture. This doesn't seem like a big deal, but it's a huge one. At the 5000 user mark, you're talking about enough money that most first world governments will go crazy to get their cut from a tax perspective. At 500k users? Wow.
Pay-as-you-go just doesn't work for anonymity, and Bitcoin can't solve that one. It's just too easy for first-world governments to exert leverage to influence, coerce, and shut down business if they don't get what they want. But an open-source project, where people download the software and choose to donate their time and money to run it? Who do you issue that NSL to, exactly? Who do you force to backdoor the software? Tor is obviously concerned about it, thus their recent interest in deterministic builds, so you and I can see if the downloadable Tor binary matches the open source code.
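The deterministic-build idea boils down to a check anyone can run: rebuild from the published source and compare digests with the official download. A minimal sketch, with placeholder file names:

import hashlib

# Minimal sketch of the deterministic-build check: if builds are
# reproducible, your own build and the official download hash the
# same. File names are placeholders.
def sha256(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

official = sha256("tor-browser-downloaded.tar.xz")
rebuilt = sha256("tor-browser-built-from-source.tar.xz")
print("match" if official == rebuilt else "MISMATCH - investigate")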
Think bigger. I'm not talking about TORPROJECT LLC handling all of this; that would be way too easy a target for them to become, and I don't think they want to deal with billing anyway.
It would be interesting if we could have an open-source distributed autonomous corporation handle the financial transactions.
http://invictus-innovations.com/i-dac/