Silk Road forums
Discussion => Security => Topic started by: kmfkewm on August 08, 2012, 04:09 am
-
Alice and Bob both have a copy of a database that holds the following items:
10101 - 0
11100 - 1
00010 - 2
01101 - 3
Carol wants to obtain item number three without Bob or Alice learning which of the items she is interested in. First she generates a random bit string equal in length to the number of items in the database.
0010
next she generates a bit string that xors to 0 with the first at every position except the database item she wants
0011
she sends one of these strings to Alice and the other to Bob. They xor messages from the database together if their position is marked by 1 in the string the receive from Carol. Alice gets 0010 so she simply replies with the message at position 2: 00010, bob receives 0011 so he replies with the xor of message 2 and 3:
00010 xor 01101 = 01111
Alice and Bob send their replies to Carol who xor them together
01111 xor 00010 = 01101, which is item at position three in the database
neither Alice nor Bob can determine which of the items was requested. I think this is pretty neat, especially for databases with hundreds of thousands of entries mirrored over a dozen different non-cooperating servers :D.
-
it is also a simple to explain PIR protocol that gets much better bandwidth scaling than everybody gets everything (after all, Carol could have just gotten the entire database from Bob or Alice, but that would cost her 5 bits per item instead of 1 bit per item...particularly nice for databases holding entries 1 megabyte or more each).
-
This would get unrealistic pretty quick as the item count goes up. You aren't going to XOR a hundred thousand items per database request.
The idea can be applied to any finite set, its just basic property of parity. Alice could request a set of 16 items including the one she cares about from Party A, and the same 16 without the one she cares about from Party B. Obviously this suffers the same problem as you have above, if you both requests you know which item was being requested.
-
Sounds familiar, like the Cocaine Auction Protocol whitepaper you posted a while back.
Also it's simpler, it still blows my mind that this works:
x = x XOR y
y = x XOR y
x = x XOR y
WTF, that's just unnatural.
-
This would get unrealistic pretty quick as the item count goes up. You aren't going to XOR a hundred thousand items per database request.
The idea can be applied to any finite set, its just basic property of parity. Alice could request a set of 16 items including the one she cares about from Party A, and the same 16 without the one she cares about from Party B. Obviously this suffers the same problem as you have above, if you both requests you know which item was being requested.
why not? I have run this as code and it had no problem with a database holding a hundred thousand items.
-
This would get unrealistic pretty quick as the item count goes up. You aren't going to XOR a hundred thousand items per database request.
The idea can be applied to any finite set, its just basic property of parity. Alice could request a set of 16 items including the one she cares about from Party A, and the same 16 without the one she cares about from Party B. Obviously this suffers the same problem as you have above, if you both requests you know which item was being requested.
That would not give the same privacy benefits, if Alice gets 16 items from a database of a hundred thousand items, once from each of two servers, both of the servers know that there is a 50% chance that she wants one of the 16 items they sent her, and then they know that there is a 1/32 chance that she wants any one of the 16 items they sent her. After only a few requests an individual server could quickly determine which sort of items Alice is interested in. With this, they have absolutely no idea which of the items she wants, unless they get together and share her request strings with each other. She could want any one of the hundred thousand items in the database.
Also the way you say it can be done, would require each of the servers to send Alice 16 megabytes of data per request, if the database items are 1 mb each. The way I described in the OP they would each only need to send her 1mb item, so it scales a lot better and has much better privacy guarantees.
disclaimer: I did not invent this
-
This would get unrealistic pretty quick as the item count goes up. You aren't going to XOR a hundred thousand items per database request.
The idea can be applied to any finite set, its just basic property of parity. Alice could request a set of 16 items including the one she cares about from Party A, and the same 16 without the one she cares about from Party B. Obviously this suffers the same problem as you have above, if you both requests you know which item was being requested.
That would not give the same privacy benefits, if Alice gets 16 items from a database of a hundred thousand items, once from each of two servers, both of the servers know that there is a 50% chance that she wants one of the 16 items they sent her, and then they know that there is a 1/32 chance that she wants any one of the 16 items items they sent her. With this, they have absolutely no idea which of the items she wants, unless they get together and share her request strings with each other. She could want any one of the hundred thousand items in the database.
Also the way you say it can be done, would require each of the servers to send Alice 16 megabytes of data per request, if the database items are 1 mb each. The way I described in the OP they would each only need to send her 1mb item, so it scales a lot better and has much better privacy guarantees.
disclaimer: I did not invent this
I was just suggesting instead of using a bit-string of the number of items in the database, you just use an explicit finite set of items. You'd still XOR the sets together so it'd still be the size of the largest item, and 16 is just some arbitrary number if you want to obscure it more use more. Also something to consider, is if multiple operations are ever done regarding this item if the requested bit-string or set was truly random intersection attacks would start making the list smaller, and at some-point reveal the item of interest. You'd want to use the same set of sampled items over multiple requests, or at least ensure a certain percentage over overlap.
As far as XOR'ing nearly 100 gig of data per request being unrealistic, well...
-
yes at 100GB it would be unrealistic, but if every entry is only 100kb it would be only 10 gigabytes, and as the bits set to one are random each server would only need to XOR 5 gigabytes of data per request with a database of 100,000 items. If entries are limited to 10kb, each server would need to XOR about 500 megabytes per request with a database of 100,000 items.