Silk Road forums
Discussion => Security => Topic started by: astor on May 09, 2013, 04:58 pm
-
Edit: If this is your first time reading this thread, skip my stream of consciousness, and read this post first:
http://dkn255hz262ypmii.onion/index.php?topic=157711.msg1112047#msg1112047
=================
The principal design feature of Tor is distributed trust. It could also be described as division of information. No single entity can see the whole network, and no single entity can see your whole circuit. If the Tor Project ran every relay, you would not be anonymous. If the same person ran your entry and exit nodes, you would not be anonymous. Instead, a diverse group of volunteers from around the world add their relays to the network. Your entry node can see who you are (your IP address), but not what you are doing (the sites you are visiting). Your exit node can see what you are doing but not who you are. There are known attacks on this model, but your anonymity and safety are preserved as long as it holds.
Now consider a black market like SR. There are a few key pieces of information involved in every transaction. The buyer's identity, the seller's identity, the product, and the amount of currency exchanged. Under the current model, we trust the market admins with our currency and the product listings. The buyer can (and should) encrypt their address, so only the seller knows that. Nobody knows the seller's identity, although their city or rough geographical location becomes known to the buyer when the product is received. So, theoretically the market admins know the products and currency involved, but don't know the buyer and seller identities. The buyers and sellers know each other's identities (to some extent) and the product involved, but not the details of the currency exchange. The information is distributed to some extent, but I think it can be improved.
I'm interested in the concept of a blind market, where the market admins don't know what products are being sold. (This reduces their liability as well.) They only know that buyer1 sent 3 BTC to vendor2 for product <hash3>, or something along those lines. If the market were compromised, LE would only see a series of anonymous transactions for unknown products. The only ones they could deduce would be transactions that they were personally involved in as buyers or sellers.
I'm stuck on how this could be implemented. You can't just hash every product listing, because you wouldn't be able to reverse the hash to present it to buyers. And even if you could, you would be able to log in as a buyer to connect the hashes to the products.
So I'm wondering if there's any research on blind markets, or if anyone knows about this. Maybe the concepts behind blind mixes would be useful here. Any ideas?
-
If accounts are readily available like on SR, the admins could create accounts and see every listing, like you said... I don't know if it could work, but it's a pretty good idea!
-
Yes, of course, but the idea is that they couldn't link a particular listing to a particular transaction in their database (unless they made the purchase themselves). The prices would probably have to be rounded to specific numbers, like 0.1 BTC, so that many listings are at a specific price.
I don't know if it's possible, really. I'm just throwing it out there. :)
-
Hm... I don't think it's possible the way you envision it -- I'll explain and we can see if my logic holds or not :)
In order to purchase a listing, the buyer must know what that listing is for. One way or another, that information has to be available. The only way to pull off your idea would be to make it conditionally available; conditional on what basis, I don't really know: blacking out any listing that you've bought wouldn't work -- in fact that would just make it even easier to tell what you've bought. A rotating schedule sort of thing would be... strange, and no good at all anyway, since it wouldn't hide the information all the time.
I'm afraid the only way I can think of to do this is to make it a sort of invite-only system, where you have to explicitly give access to a buyer before they can actually view what your listing(s) is for. But that limits the pool of potential customers so severely, it doesn't seem viable.
I don't know. The whole buyer needing to know what he's buying thing is the issue I think there may not be any decent way around. I suppose you could use a web of trust kind of model, to allow access only to those in your circle or "extended circle" (in the circle of a friend kind of thing). Something like that. The problem there is that you have to bootstrap it somehow, because initially nobody's gonna be in anybody's circle. I guess just prior customers trusting you would be enough to get it going, so long as the participants weren't *all* 100% unfamiliar to everybody.
I dunno. Just my thoughts :)
-
Yeah, I'm starting to think that the only way this could work is if you got rid of the market admins altogether and used a decentralized database like the block chain. The "network" maintains it. As long as there is at least one node, the market continues to exist. Sellers submit listings which are encrypted but provide key words. Buyers discover listings by searching for key words. Transactions are p2p, so only the buyer and seller know that a particular transaction involves them (although anyone who downloads the database knows that a transaction exists, just like bitcoin transactions today).
You lose escrow and resolution, but that could be provided by third party services for a fee. In fact, that's what the role of "market admin" in today's markets could evolve into. Before engaging in a transaction, buyer and seller could agree to use a third party escrow/resolution service. They inform the service, which initiates the transaction, but the service doesn't know the content of the transaction (the product and price) until a dispute happens and it is revealed to them. In order to complete the transaction, both parties must sign it, or something along those lines. So a dispute happens when at least one party refuses to sign. At that point, the transaction details are revealed to the escrow/resolution service, which decides on a result. Another way of putting it is, a third party initiates a special type of transaction, whose contents are not revealed to the third party as long as the first two parties (buyer and seller) sign it within some amount of time.
In this system, some transaction details would be revealed to outsiders, but all the transactions that conclude successfully remain private between buyer and seller. That's better than the current situation, where one entity knows about all transactions.
-
I should add that this system doesn't compete with SR or put it out of business. There would be a need for escrow services. Would you trust Atlantis Escrow Service or Silk Road Escrow Service? People trust DPR and gang. They would get plenty of business as escrow agents for the decentralized market. The difference is that they wouldn't know about the contents of the 95% of transactions that complete successfully (although they would still get paid for overseeing them), so the buyer and seller are safer. Even DPR would be safer, since he wouldn't be a central target storing all the info. He would only know about 5% of the transactions. The rest would exist anonymously in a database that anyone could inspect anyway.
-
And of course, if it's decentralized, there's no central point of failure, no central server to seize or DOS. It would be as hard to shut down as the bitcoin network.
The more I think about it, the more I like this idea. Tell me where I'm horribly wrong? :)
-
Ok, one thing I realized is that in order for escrow agents to get paid, the price of the transaction must be known to them. That's fine. The product could still be hidden. And there's no reason this market would have to be for drugs only, so there's always plausible deniability.
Then there's the issue of sellers deciding to trade based on buyer stats. I think this can be solved with multiple keys, like Tor relays use. Each buyer can keep a long term "identity" key and use "ephemeral" keys for each transaction. The ephemeral keys are cryptographically related to the identity key. If a seller requests to see a buyer's stats, he reveals the identity key, and the seller can interrogate the database to find other transactions signed by keys related to that key (this could be done automatically by the software). And since the price is public, they could compute # of transactions and total spent (same stats you get on SR today). Alternatively, buyers can maintain more privacy by refusing to reveal their identity keys and only buying from sellers who don't request it. At least it would be a choice.
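To make that concrete, here's a minimal sketch of one way the identity/ephemeral key relationship could work (Python with the PyNaCl library assumed; all names and the exact certification scheme are just illustrative, not a spec):
```python
# Sketch: a long-term identity key certifies one-time transaction keys.
# Revealing the identity (verify) key lets a seller link the buyer's past
# transactions; without it, each transaction shows only an unlinkable key.
from nacl.exceptions import BadSignatureError
from nacl.signing import SigningKey, VerifyKey

identity_key = SigningKey.generate()          # generated once, backed up by the buyer

def new_transaction_keypair(identity_key: SigningKey):
    """Make a one-time key and a certificate tying it to the identity key."""
    ephemeral = SigningKey.generate()
    cert = identity_key.sign(bytes(ephemeral.verify_key))
    return ephemeral, cert

def is_related(identity_verify_key: VerifyKey, cert) -> bool:
    """Check whether a transaction's one-time key was certified by this identity."""
    try:
        identity_verify_key.verify(cert)
        return True
    except BadSignatureError:
        return False

ephemeral, cert = new_transaction_keypair(identity_key)
signed_tx = ephemeral.sign(b"txid:11386 amount:0.1BTC")   # what goes in the database

# A seller who has been shown the identity key can link this (and other)
# transactions to the buyer and tally stats; everyone else sees a one-off key.
print(is_related(identity_key.verify_key, cert))          # True
```
The stats themselves (number of transactions, total spent) would then just be a walk over the database collecting transactions whose keys carry a valid certificate from that identity key.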
Also, transactions that are handled by escrow agents and that result in a resolution could be marked in a special way, so sellers could get stats on that too.
Just thinking out loud here.
-
I don't know man... I don't think that works -- the buyer has to know what he's buying, and even if you use a one way hash, anybody could just come along and try hashing all the products until they found one that matched. Then by using the key words, they could figure out what it was as well as the buyer who originally bought it. Am I missing part of what you're describing?
-
Ok, I see this going two ways now.
1. Revealing the price of the transaction could be optional, so it would only have to be revealed in cases where escrow services are used, although I imagine a large majority of transactions would involve escrow.
2. Prices of all transactions would be public. In this case, the entire market could be an extension of the bitcoin protocol, where the info about products, shipping addresses, and the key signing stuff I mentioned above becomes encrypted metadata associated with bitcoin transactions.
You might say, doesn't revealing the price of each transaction violate the privacy I'm trying to create with such a market? Not really. The bitcoin network already operates this way. All transactions are public, but you don't know what they are for. You can find transactions involving hundreds of thousands of dollars, and they could be money laundering, funding terrorism, large drug buys, etc. The extended protocol simply includes that (encrypted) metadata in the transaction to create a market. Well, it's not quite that simple; there would have to be an elaborate key signing / management mechanism to allow third parties (escrow agents) to view transactions when it's necessary, and do all the buyer/seller stats stuff.
Still, in case 2, it could be relatively easy to implement.
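Just to make case 2 concrete, here's a rough sketch of the "encrypted metadata" part (Python, PyNaCl assumed; the shared key, field names and txid are placeholders I made up):
```python
# Sketch: buyer and seller encrypt the order details under a key only they
# share, and the market/blockchain stores the opaque blob next to the payment.
from nacl.secret import SecretBox
from nacl.utils import random as nacl_random

shared_key = nacl_random(SecretBox.KEY_SIZE)     # stand-in for an ECDH-derived key
box = SecretBox(shared_key)

order = b"listing:<hash3> qty:1 escrow:SR-Escrow"
blob = box.encrypt(order)                        # nonce + ciphertext

record = {"txid": "<txid>", "metadata": bytes(blob).hex()}   # what the public sees

print(box.decrypt(blob))                         # only the two parties can do this
```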
-
I don't know man... I don't think that works -- the buyer has to know what he's buying, and even if you use a one way hash, anybody could just come along and try hashing all the products until they found one that matched. Then by using the key words, they could figure out what it was as well as the buyer who originally bought it. Am I missing part of what you're describing?
No, that's correct. It would have to be the case that anyone could discover all the listings, if any random buyer is expected to be able to find any particular listing. The key is that a third party wouldn't know which products the transactions involve, nor who the buyers or sellers are. Without the long term identity key, someone analyzing the block chain wouldn't know that Buyer X was involved in transactions 11386, 16882, and 23319. Not even the escrow agent would know that.
-
Perhaps the prices could be "fuzzed" within a range. ie, the seller puts up a product and says that it can sell for between 1.0 and 1.2 BTC. The buyer agrees to buy it and initiates the transaction (or gets an escrow agent to initiate it), and the system picks a uniformly random value between 1.0 and 1.2 BTC to create the transaction. With enough overlapping price ranges, you wouldn't know which transaction price links to which product.
Sellers who expect to get paid 1.0 BTC for their product could pick a range like 0.9 to 1.1 BTC, knowing they would get that, on average, over many transactions.
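Something like this trivial sketch (Python; just to show the idea):
```python
# Sketch: pick the public transaction amount uniformly from the seller's range,
# so the on-chain price can't be tied back to one specific listing.
import random

_rng = random.SystemRandom()     # OS entropy; the default PRNG would be guessable

def fuzz_price(low_btc: float, high_btc: float) -> float:
    return round(_rng.uniform(low_btc, high_btc), 8)   # 8 decimals = satoshi precision

# Seller expects ~1.0 BTC on average, so lists the range 0.9 - 1.1 BTC.
print(fuzz_price(0.9, 1.1))
```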
-
So, while discussing this with some people, I overlooked a trivial defense against linking prices to listings. A seller could list a legal item at the same price. If a buyer wants to evaluate a seller, he asks for the seller's identity key. With that, he knows how many transactions the seller was involved in, and the amount of BTC involved in each transaction. Let's say the seller has one item, a gram of cocaine at $98.50 (a unique price, which is dumb to begin with, but go with it). My idea with fuzzing prices was that the seller would accept random transactions between $88.50 and $108.50 (or some range), so the public price couldn't be linked to the listing. But if the seller simultaneously lists a legal item (or two or three) for $98.50, then there's no way to prove (within the requirements of the legal system) that the other $100K in previous transactions were cocaine.
If the buyer is LE, all he could do is present a list of transactions at the same price, even though other products exist at the same price. He couldn't prove how many of each product were sold.
I really need to bounce these ideas off kfm.
-
I also came up with an equivalent to vendor reviews. Since most vendor reviews are 5 or 1 in practice, we can reduce that to 1 or 0, or signed (endorsed) or not. Vendor reviews can be based on a web of trust of anonymous transaction signing. When transactions are successful, buyers can sign them, even though others don't have their keys. They can still see that many seller transactions were signed by many anonymous buyers. So, when the seller provides his identity key, you are exposed to the number of transactions, the prices involved (to protect against sellers making 100 purchases of $1), the number of resolution disputes (through a special mark on the transaction), and the number of endorsements (or anonymous key signings).
The key difference is that you can't prove what the products in any of the transactions were.
-
This is a situation where fully homomorphic encryption would be ideal.
-
This is a situation where fully homomorphic encryption would be ideal.
Hm... you know, you're right. It's so limited I tend to forget such a thing is even possible, but it would be very useful here if it were more robust :)
-
Using a centralized blind currency mixing scheme can help solve this some. Silk Road could implement 'Silk Road Coins' (SRC) based on one of the many papers on centralized blind tokens. Then Alice can exchange 100 Bitcoins for 100 SRC. When Alice places an order from Bob she would pay him with SRC, which he can redeem for bitcoins from the SR server, and because of the blinding algorithm the admin of SR will not be able to tell that Bob got the SRC from Alice, only from someone who has enough SRC to pay for the order. Since SRC could be reblinded and spent anonymously, they wouldn't even necessarily have to know who has how many SRC at any given time. A vendor could stockpile SRC for a while before exchanging them for Bitcoins, and then the admins won't even know the number of transactions the vendor made to get those SRC, since the SRC made from many deals would be mixed together into one arbitrarily sized pot. They could still collect their fee simply by charging a 5% fee for the exchange between SRC and BTC.
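For anyone who hasn't seen how the blinding works, here is a toy sketch of a Chaum-style blind signature, the primitive those SRC papers build on (Python; the tiny RSA parameters are hopelessly insecure and only for illustration):
```python
# Toy Chaum blind signature: the issuer signs a token serial without seeing it,
# so a redeemed SRC can't be linked to the deposit that created it.
import math
import secrets

p, q = 1009, 1013                          # toy primes; a real issuer needs >=2048-bit RSA
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))          # issuer's private signing exponent

# --- Alice blinds a fresh token serial before sending it to the issuer ---
serial = secrets.randbelow(n)
while True:
    r = secrets.randbelow(n)
    if r > 1 and math.gcd(r, n) == 1:
        break
blinded = (serial * pow(r, e, n)) % n      # the issuer only ever sees this

# --- Issuer signs blind (in exchange for Alice depositing 1 BTC) ---
blind_sig = pow(blinded, d, n)

# --- Alice unblinds; (serial, sig) is now an SRC the issuer can't trace ---
sig = (blind_sig * pow(r, -1, n)) % n

# --- Bob is paid with the token and redeems it; the issuer verifies it and
#     keeps a list of spent serials to stop double-spending ---
assert pow(sig, e, n) == serial
```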
As far as the product listings go, that is a little bit more tricky. It is pretty much going to be impossible to hide the products from the site admins while making them available to everybody else, because as was previously mentioned the admins could just make accounts themselves. It will also be hard to separate the listings from the vendor offering them without removing the ability to do reviews on vendors. You could break the review system apart into reviews on single product listings rather than reviews on vendors with many listings, but that will make scamming easier (also you would need open anonymous registration for all vendors and customers, so no more charging for vendor accounts. Perhaps you could charge a smaller fee for individual listings, payable with SRC). Many of the things you are mentioning could be done, but not with a centralized design. So with a centralized design I think you can have a 'taxed' market where transaction details (who is paying who for what) are hidden from the site admin, but the product listings (what is available) would need to be publicly available.
Maybe if you clarify the specific requirements of what type of system you are looking for, I will be able to think of a way to do it. Right now I am a little shaky on the details of exactly what criteria you want this market to meet.
edit: actually if anybody can anonymously exchange BTC for SRC, the site admin won't even need to know who has how many SRC at all, vendors or customers.
-
Yeah, I'm starting to think that the only way this could work is if you got rid of the market admins altogether and used a decentralized database like the block chain. The "network" maintains it. As long as there is at least one node, the market continues to exist. Sellers submit listings which are encrypted but provide key words. Buyers discover listings by searching for key words. Transactions are p2p, so only the buyer and seller know that a particular transaction involves them (although anyone who downloads the database knows that a transaction exists, just like bitcoin transactions today).
You lose escrow and resolution, but that could be provided by third party services for a fee. In fact, that's what the role of "market admin" in today's markets could evolve into. Before engaging in a transaction, buyer and seller could agree to use a third party escrow/resolution service. They inform the service, which initiates the transaction, but the service doesn't know the content of the transaction (the product and price) until a dispute happens and it is revealed to them. In order to complete the transaction, both parties must sign it, or something along those lines. So a dispute happens when at least one party refuses to sign. At that point, the transaction details are revealed to the escrow/resolution service, which decides on a result. Another way of putting it is, a third party initiates a special type of transaction, whose contents are not revealed to the third party as long as the first two parties (buyer and seller) sign it within some amount of time.
In this system, some transaction details would be revealed to outsiders, but all the transactions that conclude successfully remain private between buyer and seller. That's better than the current situation, where one entity knows about all transactions.
Sorry, for some reason I didn't see the posts past SS, lol. When you have a distributed system you can start to do what you are talking about pretty easily. One thing you could do is called 'Encrypted Keyword Search'.
Encrypted Keyword Search involves a set of servers holding encrypted documents. They are queried by clients with keywords (which the servers cannot see); they search the encrypted documents (whose plaintext they cannot actually see) for the keywords, and then they supply all of the resulting hits to the clients (without the servers being able to determine which of the documents they are supplying to the clients). This is a very advanced sort of cryptosystem that is actually built up by incorporating many different techniques.
-
See, that's why I was hoping for your input. I'm sure there are innovative crypto concepts that would be helpful here, which I simply don't know about. :)
-
I'm actually not that interested in hiding the listings. If every listing is an illegal item, then you can conclude that every person who engages in a trade in that market is doing something illegal, but if 5-10% of the listings are legal items, like on SR, and many sellers would put up dummy legal listings for their protection -- and there's no reason it couldn't be a general market anyway, I mean why do Amazon and eBay need to know what you are buying? -- then you get plausible deniability. The important feature is that the products involved in each trade are unknown to third parties. Of course, the escrow agents would find out about some of them, but if 95% of transactions complete successfully, the situation is much better than it is now.
And as I argued last night with some folks, in the *worst case scenario*, you get exactly what you have now: a trusted third-party escrow agent learns your entire transaction history.
-
Encrypted keyword search allows a server to perform a search over a set of encrypted documents on behalf of a client without learning the contents of the documents or the words being searched for. Designing a practical system is challenging because the privacy constraint thwarts standard indexing and ranking techniques. We present Mafdet, an encrypted keyword search system we have implemented. Our system makes the search practical even for large data sets. We evaluated Mafdet's performance on a set of queries and a large collection of documents. In these queries, Mafdet's accuracy is within 6% of Google Desktop, and the search time is on the order of seconds for document sets as large as 2.6 GB.
...
Controlled searching. The server cannot learn anything about the contents of documents, except when the client performs a search.
Hidden queries. The client can search for a set of documents containing a keyword without revealing the keyword to the server.
Query isolation. From the query result, the server learns nothing about the plain text other than the set of documents that match the query (and possibly the limited statistical information used to perform ranking).
Update isolation. The server learns nothing more from updates than it would if we maintained no additional metadata for the purpose of performing searches.
-
I'm actually not that interested in hiding the listings. If every listing is an illegal item, then you can conclude that every person who engages in a trade in that market is doing something illegal, but if 5-10% of the listings are legal items, like on SR, and many sellers would put up dummy legal listings for their protection -- and there's no reason it couldn't be a general market anyway, I mean why do Amazon and eBay need to know what you are buying? -- then you get plausible deniability. The important feature is that the products involved in each trade are unknown to third parties. Of course, the escrow agents would find out about some of them, but if 95% of transactions complete successfully, the situation is much better than it is now.
And as I argued last night with some folks, in the *worst case scenario*, you get exactly what you have now: a trusted third-party escrow agent learns your entire transaction history.
If you are not interested in hiding the listings you can solve the problem completely with blind mixing.
I don't really get the advantage of having escrow agents. If the customer can always request a refund, why not just have the vendor send product first and then the customer pays them? If they split 50-50 on disputes, why not have the customer send 50 percent up front? I don't see the point of having escrow.
-
So, here's how I envision it working.
You download a client, which is similar to SelfSovereignty's MetaSilk. In fact, MetaSilk might be a good starting point for such a client. There could be many different clients created by different people, like the various bitcoin clients, and you could choose who you trust.
The client launches a Tor instance and connects to other nodes in the network, and downloads/updates the database. Obviously, some nodes will have to run as hidden services, so there would be an option to create a hidden service for your client to let other clients reach you and download the database. In practice, a small percentage of nodes would run as hidden services, while most people would run their clients in non-hidden service mode, but once they are connected to other nodes, they can fully participate in the market.
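As a rough sketch of the hidden-service option (Python with the stem library assumed, a local Tor ControlPort on 9051, and made-up port numbers):
```python
# Sketch: publish the local market daemon as an ephemeral Tor hidden service
# so other clients can reach this node and sync the database.
from stem.control import Controller

with Controller.from_port(port=9051) as controller:
    controller.authenticate()
    service = controller.create_ephemeral_hidden_service(
        {80: 8333},               # onion port 80 -> local market daemon on port 8333
        await_publication=True,   # wait until the descriptor is published
    )
    print("peers can reach this node at %s.onion" % service.service_id)
    # keep the controller open while serving; the service disappears when it closes
```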
The client manages your keys. The first time you run it, it generates a long-term identity key, which you will want to back up so you can maintain your buyer/seller history. It also generates one-time transaction keys and calculates stats when you are given other people's identity keys.
The client also manages your funds. It's like a bitcoin-qt client in that way. It maintains a wallet.dat, or perhaps multiple independent wallets for added privacy. You load coins into your "buyer account" by sending them to the bitcoin addresses in your market client.
It has an interface for searching and viewing listings. When you are ready to buy, you click a button to submit the transaction, which is signed with a one-time key signed by your identity key.
You can also submit listings at zero cost.
The most important part is that there's an option to include an escrow agent. You enter the agent's identity key (the Silk Road Escrow Service identity key would be well known, for example), and your client contacts his client (which is a hidden service) to initiate the transaction. You would only have to do this once, and your client would store the escrow key for all future transactions. The escrow client could simply be a marketd, running on a server, with scripts that automate the escrow process until human interaction is needed during resolution.
The escrow client initiates a transaction with encrypted data that the operator can't see. The only thing he sees is the BTC value, from which he will receive a fee. Each escrow agent can set his own fees. You agree to the fee when you elect to use the escrow agent's key and send him a notice to initiate the transaction.
The escrowed transaction has a time limit. The buyer and seller must sign it again within a certain amount of time, or the contents of the transaction are revealed to the escrow agent. This is effectively the same as finalizing.
** I'm not sure exactly how this time-based decryption system would work. This is where I need your help kmf. **
If the transaction is disputed, the escrow agent resolves it and signs the transaction to either return the funds to the buyer, send them to the seller, or some kind of split.
If the transaction completes successfully, as presumably > 90% of transactions would, both parties sign it and the funds are sent to the seller.
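Roughly, the escrowed transaction would behave like this little state machine (Python; field names, the deadline and the "reveal" step are all just illustrative, not a worked-out protocol):
```python
# Sketch: both parties must sign before the deadline, or the encrypted details
# are revealed to the escrow agent, who then resolves the dispute.
import time
from dataclasses import dataclass, field

@dataclass
class EscrowedTx:
    amount_btc: float            # visible to the agent (this is what his fee is based on)
    encrypted_details: bytes     # product/address, opaque to the agent
    deadline: float              # both parties must sign before this unix time
    signatures: set = field(default_factory=set)
    state: str = "pending"

    def sign(self, party: str):
        self.signatures.add(party)
        if {"buyer", "seller"} <= self.signatures:
            self.state = "released"      # funds go to the seller, details stay secret

    def tick(self, now: float):
        if self.state == "pending" and now > self.deadline:
            self.state = "disputed"      # details get revealed to the agent, who
                                         # refunds the buyer, pays the seller, or splits

tx = EscrowedTx(0.5, b"<ciphertext>", time.time() + 14 * 86400)
tx.sign("buyer"); tx.sign("seller")
print(tx.state)                          # "released"
```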
There are other things that could be built on top of this system. For lazy people, those who don't want to run desktop clients, or those who don't care about their security, there could be hidden service web sites where they could create accounts, and which manage all of this stuff for them. In fact, SR could be an interface for this. It's analogous to people choosing between hosting their coins in an ewallet or with their own bitcoin-qt client, even though everyone knows it's safer to run your own bitcoin-qt client. If people choose to use a third party service, that's no worse for their privacy than the current situation. In fact, it's exactly the same.
The great thing is that the user has choice in every step of the process. You can choose to use a third party market service or not. You can choose to use an escrow service or not. You can choose to reveal more info or not. There is no default transaction history or buyer stats that someone must know in order for you to participate.
Now, who wants to build it? :)
-
If you are not interested in hiding the listings you can solve the problem completely with blind mixing.
I don't really get the advantage of having escrow agents. If the customer can always request a refund, why not just have the vendor send product first and then the customer pays them? If they split 50-50 on disputes, why not have the customer send 50 percent up front? I don't see the point of having escrow.
Because of disputes. You need a trusted and disinterested third party to resolve them.
-
Instead of a traditional escrow, how about using mutually assured destruction to create a Nash equilibrium?
Example:
Alice wants to sell her bra for 1 BTC and Bob is willing to buy it using this new type of escrow.
Now Alice has to deposit the price she asks for the bra with the Escrow, so she gives 1 BTC to the Escrow.
Bob has to Finalise Early - pay first.
If there's a dispute - Alice scams Bob by not sending the bra after Bob paid - Alice's 1 BTC held by the Escrow will be destroyed, so Alice scammed 1 BTC from Bob, but lost the same amount because there was a dispute.
Alice can't win by scamming Bob because she will lose the same amount she had to deposit with the Escrow.
Bob can still fuck with Alice by crying scam, whereby Alice will lose twice as much as Bob (she lost 1 BTC and her bra, while Bob just lost the 1 BTC he had to pay).
There are also variants of this same system that might be more applicable for our use (where both of them have to deposit 1 BTC each, so if Bob wants to fuck with Alice by pretending to be scammed, he will lose the same amount as Alice, so it costs him 2 BTC to destroy Alice's 2 BTC).
2-of-2 multisig transactions can be used by the Escrow to hold, release, or destroy the deposits. The Escrow makes a multisig address with Bob, and a separate one with Alice, so they will need the Escrow's signature to spend that 1 BTC. If there's a dispute, the Escrow will destroy his private key for that address, thereby becoming unable to sign those transactions, and both BTC are lost. If there's no dispute, the Escrow signs so Alice and Bob can have their 1 BTC back.
Another type of blind market could be one where the buyer doesn't know who he's buying from. He can choose the product, but the system will select the Vendor from a pool almost* randomly.
*Better feedback could mean a better chance of being selected.
This would solve the problem where LE can catch the Seller by ordering multiple times from the same Vendor, building a profile, and watering-hole attacking the post offices or drop boxes he used. (The stamps on the packages contain the exact post office that was used to ship it, at least in the EU.)
LE can place an order, then leave some LEOs at the post offices and tell clerks to look for a package that matches the profile they built up from their previous orders, and the address. They can also try to use the CCTV footage to find the same person who was at the nearby drop-off points at the time of every shipping, try to follow the person to his car or home in the archived footage, etc.
-
Those are some interesting ideas for escrow services, summer. There are trade offs. In the first example, Bob loses 1 BTC even though Alice breaks even on the scam, so a malicious person could scam just to hurt other people, at no personal benefit. On the other hand, people could elect to use such an escrow service because it never finds out their transaction details.
The great thing about a decentralized market is that all of these escrow services could exist. People could choose which ones they want to use. Sellers could include in their product listings the types of escrow services they are willing to use. A trusted escrow provider could offer different types of escrow: a "traditional" service where live humans get info and evaluate the situation, and various automated and blind services, perhaps at different percentage rates.
Let a thousand flowers bloom. This is how innovation happens, when you escape the grip of central control. :)
-
Yep, let there be many types of escrow to choose from.
The question is why Alice would destroy Bob's (the buyer's) BTC when she can't make a profit from the scam (she scammed 1 BTC from Bob, but lost the deposited BTC).
Neither party can have a direct financial gain from scamming the other; both of them will in fact lose if one of them decides to scam.
It's still safer for both parties than the usual escrow system where the buyer can scam the vendor claiming he did not get any product and get it for half price, or the vendor can ask the buyer to FE and get the full price.
-
Yep, let there be many types of escrow to choose from.
The question is why Alice would destroy Bob's (the buyer's) BTC when she can't make a profit from the scam (she scammed 1 BTC from Bob, but lost the deposited BTC).
Neither party can have a direct financial gain from scamming the other; both of them will in fact lose if one of them decides to scam.
It's still safer for both parties than the usual escrow system where the buyer can scam the vendor claiming he did not get any product and get it for half price, or the vendor can ask the buyer to FE and get the full price.
There are people in this world who gain pleasure from hurting others. This is an unfortunate fact of the human race. You have to expect it and compensate.
-
That's true. And it's also the case that no human escrow agent makes 100% correct decisions, so no system is perfect, which is why a market of escrow service types would be a good thing. The market can decide in aggregate what the best form of escrow is.
I was originally trying to include an escrow protocol in the market, but now I see that a decentralized market protocol should be agnostic to that. It should merely provide a protocol to contact escrow agents, while preserving the privacy of buyer and seller, and the transaction details, unless they opt for a service that requires that info.
-
i'm interested in the concept of a blind market, where the market admins don't know what products are being sold. (This reduces their liability as well.) They only know that buyer1 sent 3 BTC to vendor2 for product <hash3>, or something along those lines. If the market were compromised, LE would only see a series of anonymous transactions for unknown products. The only ones they could deduce would be transactions that they were personally involved in as buyers or sellers.
Now we have the exact opposite. Instead of hiding buyer stats from the admins, they are publicly exposed.
KMF, what is the ETA on that decentralized market?
-
Bump and sub to this thread because I love the "blind market" idea and that astor guy is just boss like that.
-
Fantastic discussion.... so very interesting.
-
i'm interested in the concept of a blind market, where the market admins don't know what products are being sold. (This reduces their liability as well.) They only know that buyer1 sent 3 BTC to vendor2 for product <hash3>, or something along those lines. If the market were compromised, LE would only see a series of anonymous transactions for unknown products. The only ones they could deduce would be transactions that they were personally involved in as buyers or sellers.
Now we have the exact opposite. Instead of hiding buyer stats from the admins, they are publicly exposed.
KMF, what is the ETA on that decentralized market?
Well, only two more really difficult parts are left to implement, and a bunch of trivial stuff after that. If it had someone working on it 8+ hours a day, it could probably be done in 3-5 months. If it were only for private messaging and not group communications or anonymous transactions and other fancy things, it could be done in 1 month, because it could use a simple PIR instead of a sophisticated EKS. If it used trusted blind mixing instead of trustless blind mixing it could probably be done in a little over a month. Using EKS allows it to scale to group communications while resisting social network analysis, and also allows it to be used for a lot more general purpose things than one-to-one messaging, including group messaging and blogs and possibly even basic hidden websites (more like Freenet Freesites than I2P or Tor hidden services) and small scale file sharing (documents and images, not movies and games).
The main problem right now is getting 8 man-hours a day on it for 3-5 months. The people making it are not being paid or planning to make money off it, and they have already put thousands of hours into coding what is done already; I think they are a bit burned out and not having as much time to donate to coding as they did at first. Figuring out how to implement EKS is going to be the biggest time sink; implementing fairly advanced cryptographic systems is time consuming and delicate work, and it could take 3 months of research just to understand a whitepaper well enough to attempt implementing it. Hopefully trustless blind mixing can borrow heavily from Zerocoin; even implementing one of the centralized blind mixing schemes could take nearly a month.
So for 1:1 messaging it could be done in a month with 8+ hours a day spent on it, for 1:1 messaging + Centralized Blind Bitcoin Mixing maybe two months, for 1:1 messaging + Trustless Blind Bitcoin Mixing 2-3 months depending on how easy it is to use the existing Zerocoin code for it or to base a similar system off the existing Zerocoin code and research, for a true "blind market" I wouldn't guess it would be done any sooner than 3 months but think it could be done by 5. It already has about 12 months of coding into it at about 8 hours a day.
These are just rough figures as well; it is really hard to estimate how long it will take to implement a given component. Implementing the anonymous routing algorithm for the mix network took several months by itself though, and I don't think EKS looks any easier -- actually it is probably harder -- but some of the components for it are already in available libraries, and there is already a really rough implementation available that can work as a framework.
-
Why not put the code on Github and maybe other people will be interested and help out? Somebody forked Bitwasp, so others are interested in the same concept, although their approach is different. If they don't know how to implement EKS, they could help with the GUI or whatever.
Why not accept donations? Of course, for that it would be nice to have a semi-working example to look at.
-
Original BitWasp: https://github.com/Bit-Wasp/BitWasp
The fork called Amped Market: https://github.com/ampedup/AmpedMarket
-
I have just read through the full thread here, and it is a lovely idea. Whether it could be successfully implemented would be another story. I do however think that people would be willing to make donations without a doubt, as it is for everyone's benefit. If you look at the spare coins thread on the forums, people are always happy to donate to others that they don't know and gain nothing from. Since they would be gaining another layer of security, and if worst came to the worst you would end up with no less than what we have now, I for one would be happy to make the first donation.
I think, astor, you may be on to something here. With kwf and SS, you three could seriously put some work into this and make it a reality, I reckon. It would of course require many "test runs", so to speak, to try it out and find/fix some holes in the system, but after some thorough prodding I'm sure it could be implemented without much failure/loss.
I also think it would be more of a system for people with slightly more knowledge of how things work, for instance, having to search for hashes etc, Can you imagine the posts in the forum with all the newbies asking about it? It would be chaos.
Summer your system actually seems very adequate.
I wish you all the best with this endeavour and will be watching with interest.
-
Decentralization is the future. Freedom Hosting and Tormail taught us that in a painful way. 80% of onionland is gone, and dozens of threads have been created, asking about Tormail alternatives. Centralized services are easy targets. SR is now the biggest target in onionland by lightyears. We must decentralize the darknet drug markets before they are gone too.
A good but imperfect example is Torchat. It's the only fully decentralized messaging system that runs over Tor by design. As long as you and your friends have Torchat clients, nobody can stop you from communicating anonymously. There is no server to seize. A small drawback is that Torchat runs a hidden service on your computer, which makes you vulnerable to certain attacks, but in my view it's not a problem for the vast majority of people, as long as they don't make their Torchat IDs public.
When I started this thread, I wanted to solve a different problem: how to keep my activities private from the operators of the service that I'm using. The conclusion I came to is that the only way to do it effectively is to use a decentralized market that has no admins. Now I see that that solution solves an even bigger problem: authoritarian censorship.
-
I also think it would be more of a system for people with slightly more knowledge of how things work, for instance, having to search for hashes etc, Can you imagine the posts in the forum with all the newbies asking about it? It would be chaos.
The users wouldn't search for hashes. Pretty much you and I do an ECDH key exchange and derive a shared secret, which is then hashed out three times:
54157144abd1905dc3112d50822bd6329f004f7c0dbe773b98b0dbe1febe5fcb
fe3f89e29d144fe1ab5865a1d1d3eb16834949970e8cacd72286523b7a322720
74cc5fec94a2b633da2c628aa0d23d850a3ad5ddaa31469c6219897de3cfd8e5
The first hash is our shared private contact string, the second hash is my first public contact string and the third hash is your first public contact string. When I send the first message to you, I tag it with: fe3f89e29d144fe1ab5865a1d1d3eb16834949970e8cacd72286523b7a322720 as the keyword. When I send future messages to you they are tagged with the hash of our private contact string and the previously used public contact string, so the second message I send to you is tagged with
sha256sum(54157144abd1905dc3112d50822bd6329f004f7c0dbe773b98b0dbe1febe5fcbfe3f89e29d144fe1ab5865a1d1d3eb16834949970e8cacd72286523b7a322720) = cb5a1f9e6af49dab36718c85259730bfaf54cb0bf524401682db840a3c0820de
Since we both have our private and public contact strings, we can always search for messages from each other. But it isn't like you need to actually type the hash in yourself, or even know about it. All of that happens transparently to the user: every now and then their client connects to an EKS server over Tor
Client <-> Tor <-> EKS
and sends a list of keywords to search for, the list being all of the future public contact strings they can expect from any of their contacts. The EKS server then returns any messages tagged for the user, and because of the properties of EKS it is incapable of determining the keywords the user's client searched for and incapable of determining which messages the user obtained.
To send messages to the EKS server clients send them through a variable latency network of hidden service mixes (this part is done already)
Client <-> Tor <-> Mix 1 <-> Tor <-> Mix 2 <-> Tor <-> EKS
so forward anonymity comes from the mix network + Tor, and receive anonymity comes from EKS. Additionally, all the message encryption and authentication etc (this part is done already) takes place transparently to the client; as far as they can see it is just like using a regular PM system, posting on a regular forum, or loading a blog page with a long random looking address. All of the technically advanced stuff happens behind the scenes.
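Here's a bare-bones sketch of the contact-string chaining in code (Python; the ECDH step is stubbed out with a placeholder secret, and treating the initial "hashed out three times" step as iterated SHA-256 over hex strings is an assumption on my part -- the chaining itself follows the sha256sum() rule above):
```python
# Sketch of the contact-string derivation. The ECDH exchange is stubbed out,
# and the initial derivation is assumed to be iterated SHA-256 over hex strings.
from hashlib import sha256

def h(s: str) -> str:
    return sha256(s.encode()).hexdigest()

shared_secret = "placeholder-ecdh-shared-secret"   # really derived via ECDH

private_contact = h(shared_secret)     # known only to the two parties
alice_public_0 = h(private_contact)    # tag on Alice's first message to Bob
bob_public_0 = h(alice_public_0)       # tag on Bob's first message to Alice

def next_tag(previous_public: str) -> str:
    """Tag for the next message: hash of private string + last public string."""
    return h(private_contact + previous_public)

print(next_tag(alice_public_0))        # what Alice's second message is tagged with
```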
The EKS servers themselves mirror each others content: EKS 1 <-> EKS 2 <-> EKS 3 <-> EKS 4
and users can connect to any of them. Might need to think of a way to make it use space more optimally than total mirroring, but that will open up anonymity problems probably. At first at least it will be total mirroring.
So there are three things left to do:
1. Implement EKS
2. Implement trustless bitcoin mixing
3. Make a GUI and take care of organization related things. Pretty much this part is "make a forum" , the crypto / anonymity / networking / database parts are what were worked on first, and making a forum is going to be extremely easy compared to them (especially since it is largely based on the database code already done, and especially since a cross platform GUI tool kit has already been wrapped up pretty nicely for easy GUI construction for the project).
-
Also, users won't even really be aware that they are doing an ECDH key exchange in the first place. Essentially you have an address generated that looks like
YourName@random-ass-but-very-small-compared-to-RSA-public-key-because-it-is-an-ecdh-public-key-plus-also-a-little-metadata-all-base64.Agora
you can load the person's contact address via the GUI and this loads everything you need to communicate with the person, including their ECDH/ECDSA key (it is secure to use the same ECC key for ECDH and ECDSA, so there is no need to use two). This allows your client to also generate your first set of contact strings for tagging messages to the contact, although they will not know how to search for it until they load your address as well. So after loading this you can send encrypted messages addressed to the contact entirely from the GUI, entirely transparent to the user that any encryption or anything advanced is taking place.
You can organize your contacts into groups and send messages to groups of people, or arbitrary selections of one or more person. In such cases the message is encrypted with a single random key and tagged with the contact strings for each of the people you are sending the message to, and an ephemeral ECDH key + 256 bit encryption of the payload key are sent with the message as well, in addition to a little bit of metadata to help keep things synched between all of the users.
-
But the good thing about using EKS is that the underlying system could be used for a lot more than this. We could have "blogs" that are tagged with a single string for keyword search, and let only the owner of the blog edit it (over the mix network) but anybody else gain access to it via EKS. We can also have arbitrary files uploaded via the mix network and downloaded via EKS with actual keyword searches, where the user types in the keyword to search for, more similar to P2P filesharing networks of the past (although it wouldn't be appropriate to use it for large files). The EKS also totally removes the need for a trusted nymserver like Pynchon Gate requires, which allows it to be used for group communications without having the risk of social network analysis, and also without being infeasible from a bandwidth perspective. So EKS is a huge advantage over PIR, but it is also hugely more difficult to implement (I could implement the PIR from pynchon gate in a day).
-
But the good thing about using EKS is that the underlying system could be used for a lot more than this. We could have "blogs" that are tagged with a single string for keyword search, and let only the owner of the blog edit it (over the mix network) but anybody else gain access to it via EKS. We can also have arbitrary files uploaded via the mix network and downloaded via EKS with actual keyword searches
Where are these blogs and files stored? On the EKS servers? And they don't know which specific files they are storing? In that case, it's like having Freenet on Tor?
-
And if the EKS servers store everything, how do they reduce their database size when they get too large? Throw the oldest stuff overboard first?
-
But the good thing about using EKS is that the underlying system could be used for a lot more than this. We could have "blogs" that are tagged with a single string for keyword search, and let only the owner of the blog edit it (over the mix network) but anybody else gain access to it via EKS. We can also have arbitrary files uploaded via the mix network and downloaded via EKS with actual keyword searches
Where are these blogs and files stored? On the EKS servers? And they don't know which specific files they are storing? In that case, it's like having Freenet on Tor?
It is different from Freenet in that Freenet aims to have plausible deniability (ie: your direct neighbors can tell the files you obtain via them, but not if you requested them), whereas in this case, it will aim to provide cryptographic anonymity (ie: the EKS servers are not capable of determining the files you obtain unless they can break the encryption algorithm the EKS scheme is based on). I think the future for receive anonymity is in cryptography, rather than in plausible deniability or having a large network of nodes. Freenet aims to make it so your neighbors cannot tell if you requested a file, but they can tell you obtained the file. Tor tries to make it so nobody knows you have obtained a file, but it can only do this by having such a large network that it is unlikely an attacker can monitor an arbitrary entry and exit node. This will make it so nobody can tell that you have obtained a file unless they can crack a strong encryption algorithm.
Also the EKS servers do know the files that they store, but they are encrypted with keys they might not know, and they cannot tell who downloads or attempts to download any specific files (or if anybody ever downloads any specific file at all, for that matter). There are distributed PIR solutions where no single server can even determine any of the content it holds (versus being able to determine the ciphertexts it holds but not being able to determine who if anybody downloads them), I imagine a similar system could possibly work for EKS although I don't know of any, but I would rather have single server than multi server anyway.
The main difference between PIR and EKS is that with PIR content is indexed by number or similar, and in EKS content is actually searchable for with keywords (so you can get the files that match the keyword 'dog' instead of the file at position 42).
-
And if the EKS servers store everything, how do they reduce their database size when they get too large? Throw the oldest stuff overboard first?
Throwing the oldest stuff overboard is the best solution.
-
If EKS servers can see the content that they store, then presumably
1. The servers and operators will become targets for storing massive amounts of CP or whatever illegal content
2. The servers could filter or remove certain kinds of content (obv 2 solves 1)
-
subbed
-
If EKS servers can see the content that they store, then presumably
1. The servers and operators will become targets for storing massive amounts of CP or whatever illegal content
2. The servers could filter or remove certain kinds of content (obv 2 solves 1)
Freenet nodes can see the content that they store as well, it is just encrypted with a key that they can claim not to know. However, Freenet has 20,000 or so users all storing content, and I imagine there will only be a dozen EKS servers at the most, so they will be more valuable targets than arbitrary Freenet nodes.
Goldberg (damn, is this guy the new Chaum or what? He has invented so many amazing algorithms and published so many kick ass papers at this point, he is pretty much a cypherpunk rockstar) created a distributed PIR design in which no individual server knows any of the content it holds, however this is PIR, not encrypted keyword search. I don't know of any papers for obtaining this same feature in EKS and I don't think I will be capable of designing such a system any time soon. I am pretty good at implementing things, even some relatively complicated cryptographic stuff... but I am not a professional cryptographer, only a hobbyist, and I cannot actually craft a system, only implement things from white papers.
www.cypherpunks.ca/~iang/pubs/robustpir.pdf
We then extend our protocol so that queries have information-theoretic protection if a limited number of servers collude, as before, but still retain computational protection if they all collude. We also extend the protocol to provide information-theoretic protection to the contents of the database against collusions of limited numbers of the database servers, at no additional communication cost or increase in the number of servers.
I think such a system probably would inherently be distributed as well.
-
loving the level of ..bump. in this thread yo
-
Decentralization is the future. Freedom Hosting and Tormail taught us that in a painful way. 80% of onionland is gone, and dozens of threads have been created, asking about Tormail alternatives. Centralized services are easy targets. SR is now the biggest target in onionland by lightyears. We must decentralize the darknet drug markets before they are gone too.
A good but imperfect example is Torchat. It's the only fully decentralized messaging system that runs over Tor by design. As long as you and your friends have Torchat clients, nobody can stop you from communicating anonymously. There is no server to seize. A small drawback is that Torchat runs a hidden service on your computer, which makes you vulnerable to certain attacks, but in my view it's not a problem for the vast majority of people, as long as they don't make their Torchat IDs public.
When I started this thread, I wanted to solve a different problem: how to keep my activities private from the operators of the service that I'm using. The conclusion I came to is that the only way to do it effectively is to use a decentralized market that has no admins. Now I see that that solution solves an even bigger problem: authoritarian censorship.
#realtalk
-
I really find PIR and the systems similar to it to be absolutely amazing. Who would have guessed you can get data from a database without the servers that run the database knowing what data you want or what data they send to you? How about without them knowing the data in the database to begin with!
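The simplest version is surprisingly small. Here is a toy two-server XOR PIR sketch in Python (nothing like Goldberg's scheme in efficiency or robustness, just the classic trick; as long as the two servers don't collude, neither one learns which record you fetched):
```python
# Toy 2-server information-theoretic PIR: the client sends each server a subset
# of indices to XOR together; the subsets differ only in the wanted index, so
# XORing the two answers recovers that record while neither server alone can
# tell which record was requested. Records are padded to equal length.
import secrets
from functools import reduce

DB = [b"record-0", b"record-1", b"record-2", b"record-3"]   # both servers hold this

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def server_answer(db, subset):
    picked = [db[i] for i in subset] or [b"\x00" * len(db[0])]
    return reduce(xor_bytes, picked)

def client_fetch(wanted: int, n: int) -> bytes:
    subset1 = {i for i in range(n) if secrets.randbelow(2)}   # uniformly random subset
    subset2 = subset1 ^ {wanted}                              # flip only the wanted index
    return xor_bytes(server_answer(DB, subset1),              # answer from server 1
                     server_answer(DB, subset2))              # answer from server 2

print(client_fetch(2, len(DB)))   # b"record-2"
```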
-
Ahh, I get you. That does make it a hell of a lot easier then.
Hmm, the one question that stands out in my mind is: if we managed to get it all sorted -- major support, donations, brilliant ideas -- how could we get someone to create the "site" without compromising security by some sort of backdoor being left, etc.? You could have lots of different cryptographers working on different sections so that the whole thing could not be compromised, but think of the damage one lone skilled cryptographer could do, should he decide to code it maliciously.
-
If EKS servers can see the content that they store, then presumably
1. The servers and operators will become targets for storing massive amounts of CP or whatever illegal content
2. The servers could filter or remove certain kinds of content (obv 2 solves 1)
Unless I'm misunderstanding, #1 has to be true. If servers can see the files they store, but don't have access to the key to those files, I don't see how they can exercise any content-level control over what they store or serve. An encrypted BLOB is an encrypted BLOB.
And it's probably going to throw the oldest content overboard first even if it's the most frequently accessed content. I'm guessing there can't even be a method to track when a file was last accessed (since, by definition, it has to be blind to what it's serving). If it could record the last access time at a file level, that opens up timing attacks against anonymous clients. Unless it's happening at a non-correlatable level (basically, at a storage unit level that's impossible to reverse back into what that storage actually comprises).
It probably falls into the usual anonymous storage/communication conundrum. If you design a perfect system that someone else can't censor or track, you can't censor or track it either. So when you support "good" causes for anonymity, you also support Appelbaum's Four Horsemen (CP, drugs, terrorism, and money laundering).
-
Goldberg (damn is this guy the new Chaum or what? He has invented so many amazing algorithms and published so many kick ass papers at this point, he is pretty much a cypherpunk rockstar)
The guy who invented OTR and/or wrote the Pidgin plugin and is currently working on multi-party OTR? If he gives us an mpOTR plugin, I would have 10,000 of his babies.
-
Unless I'm misunderstanding, #1 has to be true. If servers can see the files they store, but don't have access to the key to those files, I don't see how they can exercise any content-level control over what they store or serve. An encrypted BLOB is an encrypted BLOB.
If the content is uploaded encrypted and the operator doesn't know the key, then they don't know what they are storing, so they can't be blamed for what they're hosting any more than Dropbox can be blamed if someone dumps a Truecrypt file on their servers and LE finds out that file contains illegal content.
Of course, if that's true, they can't censor it either. I mean, Dropbox could reject all Truecrypt files but an EKS server would be designed to store encrypted content.
The thing about dumping old content, I consider that a feature. Look at this forum. Why store every thread since the beginning, when everyone asks the same questions over and over every week? The popular threads stay on the front page for weeks or months at a time, so they are safe. It would work just as well if it was designed like 4chan to roll old threads off the server, say after 3 months of inactivity. Hell, it even warns you not to dig old threads back up and to start a new thread! It is needlessly storing gigabytes of data.
It probably falls into the usual anonymous storage/communication conundrum. If you design a perfect system that someone else can't censor or track, you can't censor or track it either. So when you support "good" causes for anonymity, you also support Appelbaum's Four Horsemen (CP, drugs, terrorism, and money laundering).
Yep, that's what the Tor people keep saying.
-
Ahh, I get you. That does make it a hell of a lot easier, then.
Hmm, the one question that stands out in my mind is this: if we managed to get it all sorted (major support, donations, brilliant ideas), how could we get someone to create the "site" without compromising security through some sort of backdoor being left in, etc.? You could have lots of different cryptographers working on different sections so that the whole thing could not be compromised, but think of the damage one lone skilled cryptographer could do, should he decide to code it maliciously.
Because it is open source and will be audited by anyone who wants to audit it. Also because the people making it are not going to backdoor it (and if you don't believe me, read the source code!) :).
Unless I'm misunderstanding, #1 has to be true. If servers can see the files they store, but don't have access to the key to those files, I don't see how they can exercise any content-level control over what they store or serve. An encrypted BLOB is an encrypted BLOB.
They could always download something from themselves, because they know the strings it is indexed by, and then they can match the ciphertext they obtained to the ciphertext on their server.
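Roughly, the linking attack looks like this (Python sketch; names are illustrative, and the token derivation is just a stand-in for however the real scheme indexes things):

# An operator who also runs a client fetches a known item, then greps their
# own storage for the identical ciphertext. Once matched, they know which
# blob on disk corresponds to that content and can censor it.
import hashlib

def link_own_storage(keyword: str, fetch_as_client, local_store: dict):
    token = hashlib.sha256(keyword.encode()).hexdigest()  # operator knows the search string
    ciphertext = fetch_as_client(token)                   # obtained while acting as a client
    for location, blob in local_store.items():
        if blob == ciphertext:                            # byte-for-byte match
            return location                               # now linkable, hence censorable
    return None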
If the content is uploaded encrypted and the operator doesn't know the key, then they don't know what they are storing, so they can't be blamed for what they're hosting any more than Dropbox can be blamed if someone dumps a Truecrypt file on their servers and LE finds out that file contains illegal content.
Pretty much this. EKS servers have plausible deniability, kind of like Freenet nodes do. They cannot decrypt arbitrary files looking for illegal content. However, if they also store plaintext stuff at all, it would be a different story. Also, since all nodes have all files, they could easily be told by police that a certain ciphertext is illegal, etc. It would be better if the content of the database were hidden as well; I just don't know how to do this with EKS and have not read of any such systems, only for PIR.
The thing about dumping old content, I consider that a feature. Look at this forum. Why store every thread since the beginning, when everyone asks the same questions over and over every week? The popular threads stay on the front page for weeks or months at a time, so they are safe. It would work just as well if it were designed like 4chan to roll old threads off the server, say after 3 months of inactivity. Hell, this forum even warns you not to dig old threads back up and to start a new thread instead! It is needlessly storing gigabytes of data.
The EKS servers have no concept of a thread, only individual posts. The threading is done client side.
-
Astor actually brought up a good point, and I am struggling to find a perfect solution. Although the content on EKS servers is encrypted, we should assume that the people running the servers will be able to obtain keys for certain content; we should also assume that some of them are malicious and could try to censor information. There are distributed PIR schemes that hide the content of the database from the servers, but a client that queries the database still does so by index position. That means if an entity that runs a PIR server also runs a client, and the client knows a certain message is at index 42, the PIR server can then link the secret share at position 42 to the message downloaded by the client. So even Goldberg's PIR will not work to solve this problem. The problem is characterized as follows:
Given a server or cluster of servers hosting a database, how can we have it so that:
A. Clients can request specific files from the database (via position or keyword)
B. The servers hosting the database cannot determine the client's query (ie: servers cannot tell the position requested or keyword searched for)
C. The servers hosting the database cannot tell the files returned (ie: they do not know what they send back to the client)
D. The servers hosting the database cannot tell the files hosted (ie: they cannot ever see any content that the client eventually obtains, during storage or transfer)
E. An entity that owns a client and a server cannot download a known file from the database in order to be able to associate content on the database with the file (ie: the server cannot link data it hosts to content even if it downloads the content partially from itself while acting as a client)
A, B and C are solved via at least PIR, Oblivious Transfer and Private Stream Searching schemes (actually, Encrypted Keyword Search only solves A and B). D has been integrated into at least some distributed PIR schemes, the one I linked to from Goldberg for example. E, I do not know how to solve for, and it leads to a problem: servers can determine where a specific piece of content they host is located on their servers and censor it.
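To make the index-linking issue concrete, here is a toy two-server XOR-based PIR in Python (the textbook construction, not Goldberg's actual scheme). Each server sees only a random-looking subset of indices, so neither learns the position queried, but an operator who also runs the client and already knows "that message lives at index 42" can still point at record 42 on their own disk and censor it.

import secrets
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def client_query(n: int, i: int):
    s1 = {j for j in range(n) if secrets.randbits(1)}  # random subset of indices
    s2 = s1 ^ {i}                                      # symmetric difference flips index i
    return s1, s2                                      # each subset alone reveals nothing about i

def server_answer(database: list, subset: set) -> bytes:
    zero = bytes(len(database[0]))                     # records assumed equal length
    return reduce(xor_bytes, (database[j] for j in subset), zero)

def client_combine(a1: bytes, a2: bytes) -> bytes:
    return xor_bytes(a1, a2)                           # everything cancels except record i

# db = [b"rec0---", b"rec1---", b"rec2---"]            # replicated on both servers
# s1, s2 = client_query(len(db), 2)
# assert client_combine(server_answer(db, s1), server_answer(db, s2)) == db[2]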
I also should start saying PSS instead of EKS; I always confuse the two terms. EKS lets the client obtain encrypted files from a remote server without the remote server knowing the content of the files or the keywords searched for, but it still knows the ciphertexts returned. PSS lets the client obtain encrypted files from a remote server without the remote server knowing the content of the files, the keywords searched for, OR the ciphertexts returned to the client. So it is kind of like how PIR lets the client get a file from a database without the server hosting the database knowing which file the client got, while Oblivious Transfer does the same thing and also prevents the client from learning anything else about the database. Cryptography is full of rather subtle distinctions, but I should nip this in the bud, as EKS is actually not the correct terminology in this case (although I am the one who introduced the bad terminology; as I said, I frequently find that I mistakenly call Private Stream Searching Encrypted Keyword Searching).
-
Put another way: how can a user obtain an item partially from a server they operate, without being able to associate a piece of data on their server with the item they obtain? PIR via numeric index is clearly out, as the data at the numbered index on the server will be associated with the file the client obtains. I think that single-server is out as well; if the user has total access to the only client and server involved, I think this is pretty clearly impossible. I don't know if there is even a solution to this problem yet, or if it is even possible, but it would be a great way to get strong censorship resistance.
-
Put another way: how can a user obtain an item partially from a server they operate, without being able to associate a piece of data on their server with the item they obtain? PIR via numeric index is clearly out, as the data at the numbered index on the server will be associated with the file the client obtains. I think that single-server is out as well; if the user has total access to the only client and server involved, I think this is pretty clearly impossible. I don't know if there is even a solution to this problem yet, or if it is even possible, but it would be a great way to get strong censorship resistance.
You're right... I'd be floored if there was a way to solve that problem. If you own the server, and you're running a client connecting to it, you should be able to backdoor whatever mechanisms keep you from seeing the true storage location. Even if some amazing algorithm "solved" the problem through quantum physics, the server admin would just backdoor the mathematical functions in the server code until he was at a point where he could see the true storage location.
If that's an absolute showstopper (and I'm not sure it is), the only other method I can think of to prevent it would be to use multiple distributed servers that were seeding/peering, a la Freenet, where if a server owner deleted content, it could magically reappear via servers outside of his control.
Another exposure with this architecture is DoS. If you're blind to anything about the data except its original creation date, it's difficult to keep someone from simply flooding petabytes into the server (make it eat its tail until it eats its head), unless there's something like a proof-of-work concept prior to uploading data to effectively rate-limit that forced growth by a single actor.
-
Astor actually brought up a good point, and I am struggling to find a perfect solution. Although the content on EKS servers is encrypted, we should assume that the people running the servers will be able to obtain keys for certain content; we should also assume that some of them are malicious and could try to censor information. There are distributed PIR schemes that hide the content of the database from the servers, but a client that queries the database still does so by index position. That means if an entity that runs a PIR server also runs a client, and the client knows a certain message is at index 42, the PIR server can then link the secret share at position 42 to the message downloaded by the client. So even Goldberg's PIR will not work to solve this problem. The problem is characterized as follows:
Given a server or cluster of servers hosting a database, how can we have it so that:
A. Clients can request specific files from the database (via position or keyword)
B. The servers hosting the database cannot determine the client's query (ie: servers cannot tell the position requested or keyword searched for)
C. The servers hosting the database cannot tell the files returned (ie: they do not know what they send back to the client)
D. The servers hosting the database cannot tell the files hosted (ie: they cannot ever see any content that the client eventually obtains, during storage or transfer)
E. An entity that owns a client and a server cannot download a known file from the database in order to be able to associate content on the database with the file (ie: the server cannot link data it hosts to content even if it downloads the content partially from itself while acting as a client)
I think this problem is solved by querying multiple servers. Unless an adversary runs all or almost all nodes, you will be able to retrieve the content. Comparing the results of multiple queries from different nodes will allow clients to determine which ones are censoring, and even what kinds of content they are censoring, and the clients could blacklist those nodes.
In the end, censorship is simply impractical if the network of nodes is big enough.
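As a sketch of how a client could do that comparison (Python, with fetch() and the node list as placeholders, and a simple majority rule standing in for whatever policy you actually want):

from collections import Counter

def detect_censors(nodes: list, token: str, fetch):
    answers = {node: fetch(node, token) for node in nodes}   # None means "not found"
    tally = Counter(blob for blob in answers.values() if blob is not None)
    if not tally:
        return None, []                                      # nobody returned the item
    majority_blob, _ = tally.most_common(1)[0]
    suspects = [n for n, blob in answers.items() if blob != majority_blob]
    return majority_blob, suspects                           # suspects are blacklist candidates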
-
Another exposure with this architecture is DoS. If you're blind to anything about the data except its original creation date, it's difficult to keep someone from simply flooding petabytes into the server (make it eat its tail until it eats its head), unless there's something like a proof-of-work concept prior to uploading data to effectively rate-limit that forced growth by a single actor.
Proof of work will stop most potential attackers (people with only a few computers under their control) from doing massive flooding, but it isn't the ideal solution and powerful attackers (people with botnets) can easily get around it.
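For reference, a hashcash-style proof of work is about this simple (Python sketch; the difficulty value is illustrative, and as noted a botnet can just pay the cost many times over):

import hashlib
from itertools import count

DIFFICULTY_BITS = 20  # ~1M hash attempts on average; purely illustrative

def leading_zero_bits(digest: bytes) -> int:
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        return bits + (8 - byte.bit_length())
    return bits

def mint_pow(blob: bytes) -> int:
    for nonce in count():                                  # expensive for the uploader
        digest = hashlib.sha256(blob + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= DIFFICULTY_BITS:
            return nonce                                   # attached to the upload

def verify_pow(blob: bytes, nonce: int) -> bool:           # cheap for the server to check
    digest = hashlib.sha256(blob + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= DIFFICULTY_BITS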
I think this problem is solved by querying multiple servers. Unless an adversary runs all or almost all nodes, you will be able to retrieve the content. Comparing the results of multiple queries from different nodes will allow clients to determine which ones are censoring, and even what kinds of content they are censoring, and the clients could blacklist those nodes.
In the end, censorship is simply impractical if the network of nodes is big enough.
That is a possible solution as well; however, we need to think about the anonymity implications of a server being able to censor some of the information it holds. It would be better to have some cryptographic way of solving this problem rather than relying on sheer brute content availability.
This paper looks interesting: https://www.usenix.org/conference/foci12/one-way-indexing-plausible-deniability-censorship-resistant-storage
Abstract
The fundamental requirement for censorship resistance is content discoverability — it should be easy for users to find and access documents, but not to discover what they store locally, to preserve plausible deniability. We describe a design for “one-way indexing” to provide plausibly-deniable content search and storage in a censorship resistant network without requiring out-of-band communication, making a file store searchable and yet self-contained. Our design supports publisher-independent replication, content-oblivious replica maintenance, and automated garbage collection.
1 Introduction
Censorship resistant systems allow users to find and access content even if an external entity is trying to prevent this, either by attempting to block specific content (e.g. by keyword), classes of content (e.g. video files), classes of websites and services (e.g. social networks), or block the use of the communication system itself (e.g. shutting down the Internet). Prior real-world experience demonstrates that nation-state-level adversaries are willing to engage in all these tactics [5, 18, 31]. Numerous potential solutions have been proposed [4, 7, 27], but the problem of plausibly-deniable search and robust storage remains elusive due to its seemingly contradictory set of requirements — how does a system maintain a searchable index of content for users and yet hide it from intermediate/relay nodes and volunteers who store content?
Any useful censorship resistant system must provide plausibly-deniable in-band search and content privacy on the wire. Protection for storers as well as intermediaries is vital, since we expect that any user’s computer may be seized and examined by a powerful adversary [22], so the owner must be able to plausibly disavow knowledge of stored content. That same user must be able to search and find content in the network which may already be on his or her computer, but should not discover that it is stored locally. Prior work has partially addressed this by encrypting files and requiring out-of-band discovery of decryption keys, which makes reconstruction of content difficult. We describe a design for plausibly deniable search and robust storage for a censorship resistant network that supports natural keyword search while retaining deniability. Our design is self-contained — no out-of-band communication is required to find content nor obtain decryption keys to decode files. This promotes usability and reduces users’ real-world risks.
One-way indexing. To solve the problem we propose “one-way indexing,” such that a user can search by keyword, but someone storing parts of the file cannot determine the content of the file or query. To publish file F with keyword kw, Alice partitions it into three logical portions — the content, consisting of encrypted blocks b1, ..., bk, each indexed under ID hash1(bi); the content manifest, containing a list of all block hashes (allowing retrieval of the file) and indexed as hash2(kw); and the key manifest, containing the file decryption key, indexed as hash3(kw). To retrieve a file, a user will search for hash2(kw) and hash3(kw), but any node not storing both manifests must invert the keyword hash in order to retrieve the other manifest and reconstruct the file, even if all file blocks are stored locally.
Robust storage. Censorship resistance requires perpetual and robust storage. We use both erasure coding and replication at publication time to achieve initial robustness, and maintain it without publisher intervention. Once the file has been stored as described above, nodes who store the file’s content manifest lazily verify that a file is sufficiently replicated, freeing the original publisher from responsibility and providing added deniability. To mitigate adversaries overwhelming the system with useless data, we incorporate lazy garbage collection, randomly selecting unused contents for deletion.
So after reading the abstract, it sounds like this might be what we want; I need to read the rest of the paper though. It looks like it has all the functionality of encrypted keyword search (search by keyword, servers cannot determine the keyword searched for), private stream searching (same as EKS, plus servers cannot tell the content returned), and the censorship resistance we were looking for (servers can obtain content from the network and still not be able to determine that they host the content as well). So maybe I need to start saying OWI (one-way indexing) instead of PSS instead of EKS, heh.
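For what it's worth, here is how I read the publish step from the abstract, as a loose Python sketch (my interpretation, not the paper's code; hash1/hash2/hash3 are approximated with domain-separated SHA-256, and Fernet stands in for the file encryption):

import hashlib
from cryptography.fernet import Fernet

def h(domain: bytes, data: bytes) -> str:
    return hashlib.sha256(domain + data).hexdigest()    # hash1/2/3 = SHA-256 with distinct prefixes

def publish(file_bytes: bytes, keyword: str, block_size: int = 4096) -> dict:
    key = Fernet.generate_key()
    f = Fernet(key)
    store = {}                                           # index ID -> stored value
    block_ids = []
    for i in range(0, len(file_bytes), block_size):
        block = f.encrypt(file_bytes[i:i + block_size])  # content: encrypted blocks
        block_id = h(b"hash1:", block)                   # indexed under hash1(b_i)
        store[block_id] = block
        block_ids.append(block_id)
    kw = keyword.encode()
    store[h(b"hash2:", kw)] = "\n".join(block_ids)       # content manifest at hash2(kw)
    store[h(b"hash3:", kw)] = key                        # key manifest at hash3(kw)
    return store                                         # a storer sees only hashes and ciphertext

# A reader who knows the keyword recomputes hash2(kw) and hash3(kw) to get the
# manifests; a node holding only blocks (or only one manifest) would have to
# invert the keyword hash first, which is the one-way property described above.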