Show Posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.


Messages - kmfkewm

Pages: 1 ... 43 44 [45] 46 47 ... 249
661
Security / Re: anonymous membership query
« on: August 01, 2013, 10:07 am »
I would like something that could replace Tormail this year, and that relies on no central server. That is Torchat's greatest strength. Nobody can take it down or overwhelm the network / server, like what appears to happen to Tormail every time there's a database error. The trade off is that Torchat is a live messaging system. Both parties must be online at the same time. If it stored messages like BitMessage, that would be perfect.

Freenet sounds like what you want.

662
Security / Re: anonymous membership query
« on: August 01, 2013, 09:56 am »
You know, a quote suddenly comes to mind.  I find there to be tremendous wisdom in it: "premature optimization is the root of all evil."

Be sure you're solving a problem that you're actually faced with -- do you actually have all these users chomping at the bit, or will it be several years at least before your system (if it takes off) is under that kind of strain?  Are you taking into account the increased computational power that will be available to you at the time you need to actually solve this issue?

Just some thoughts that may or may not be of use... :)

That is a good point. I have been trying to make something that will scale to 500,000 users making 10 posts a day, but so far I have no users and they are making no posts a day. Even assuming ten thousand users making ten posts a day, things become much more realistic in the bandwidth department. Even if we assume all messages are 10KB, and the metadata associated with a message is 1KB:

10,000 users making 10 posts a day brings metadata to about 98 megabytes per day * 10,000 users = about 960 gigabytes in metadata bandwidth per day (98 megabytes per user per day).

10,000 users * 5,000 messages per day * 10 KB per message = 480 gigabytes in message data for receive

assuming Pynchon Gate PIR, there are 100,000 messages per day means each query is 100,000 bits about 12.2 KB * 5,000 posts * 10,000 users = 572 gigabytes in request data

960 + 572 + 480 = 2,012 / 1,024 = 1.96 terabytes of bandwidth required per day, which is realistically affordable with a high end server package.

On the other hand, assuming 500,000 users making 10 posts a day brings metadata to 4.8 gigabytes * 500,000 = 2343 terabytes in metadata a day, which is completely unrealistic.

But I would much rather make something that can scale to 500,000 users than something that only scales to 10,000 users.

663
Security / Re: anonymous membership query
« on: August 01, 2013, 09:32 am »
Also, if they are anonymous users connecting over Tor, then why does it matter? Why not ask for the message directly?

Because I don't trust the anonymity of Tor and I am trying to make a system that can resist the NSA :). If I simply wanted to make a distributed encrypted forum that uses Tor I would have been done last year. The anonymity part is what really makes things fucking hard.

664
Security / Re: anonymous membership query
« on: August 01, 2013, 09:29 am »
How can you ask a server for a block of data without the server knowing which block it is?

The answer to that question is in several papers on Computational Single Server PIR (CPIR). Please let me know if you find one that will work for what I am trying to do, I will read it after you suggest it to me :D.

Okay, you know more than me about this stuff, but... I thought it could be proven that this isn't even possible, meaning that you can either have a single server that's not computationally-theoretically robust but can be inefficient and information-theoretically robust, or multiple servers that are both (proportional to the number of parties) computationally-theoretically robust and information-theoretically robust?  Or is it... oh fuck it man, I can barely even remember the words let alone the concepts.  I forgot stuff like this the week after the tests ::)

It is proven that a single server information theoretically secure PIR algorithm must be everybody gets everything (ie: the entire database must be sent to have information theoretically secure single server PIR), but single server computationally secure PIR is possible (just like a one time pad is information theoretically secure, but RSA is computationally secure. Single server PIR works if it has security comparable to RSA, but not comparable to a one time pad).

665
Security / Re: anonymous membership query
« on: August 01, 2013, 09:27 am »
How can you ask a server for a block of data without the server knowing which block it is?

The answer to that question is in several papers on Computational Single Server PIR (CPIR). Please let me know if you find one that will work for what I am trying to do, I will read it after you suggest it to me :D.

LOL. I thought the explanation with Alice and Carol was in response to my question, and I thought, that allows Carol to determine a message for her exists, but still, how does she ask the server for that message? I suppose she could ask for the whole database or a block of it that happens to include the message, so there's some plausible deniability.

Also, I want to know what Alice and Carol's secret is!

How she asks the server for the message after she determines it exists is part two hehe. I suppose she could do some private stream searching algorithm, but as far as I know all of these sort of assume that each message is padded to the same size, gah. Alice knowing that the server has a message for her is good, it would be even better if she could determine where in the database the message is. The simplest solution would be for the server to keep track of every contact string and send to every client the list of contact strings, the size of the message they point to, and where in the database they start. That is bandwidth prohibitive when things scale up to hundreds of thousands of clients. Assuming every message has 1 kb of associated data, and there are 500,000 clients getting 10 messages a day, that means there is about 4.8 gigabytes of metadata a day. Sending that to 500,000 clients is not going to fly, so the clients need to be able to query for the data that is relevant to them without the server being able to determine that the data is relevant. I think with a bloom filter and PIR I can hopefully maybe get it so the clients can determine if they have a message waiting for them with low bandwidth, but then the problem is how do the clients find out where the message to them is in the database, and how do they get it without the server knowing it is the message they got.

The biggest issue I am running into is that all of these systems seem to really like the idea of related data being blocked together into sequential buckets that are all padded to the same size instead of thrown up all over the database with 1kb here and 4kb there. Perhaps I could have every message broken down into like 1KB packets, with each packet tagged with the contact string and obtained with private stream searching. But the problem is that the server cannot learn the total number of packets that an individual message consists of. I want to have messages of various different sizes and let clients get them from random spots in the database, and then padding to a certain size, without the server being able to tell the number of individual packets that are returned for a single keyword search. Gah it is really hard to even think about this shit for me honestly, it is a level or two beyond what I know about cryptography and math, and I am struggling to think of how to do it. It is like I am right on the verge of a break through but I just cannot quite get all the way there. I wish I just knew some papers that will work to do what I want when I implement them, I have a much easier time being told what to implement, getting the paper and studying my ass off until I can do it, than I do trying to figure out exactly how to describe what I want and then reading fifty papers trying to find if any of them sound similar hehe.

666
Security / Re: anonymous membership query
« on: August 01, 2013, 09:12 am »
Quote
So basically the database server never knows who it has a message for, only that said message exists. The more "Bobs" there are, the less likely the knowledge of the recipient of such a message is, which would be further obscured by increasing string length?

Yep. In Pynchon Gate the server knows who it has a message for, but not who gets the message. I want a system where the server doesn't know who it gets a message for and doesn't know who gets the message, to protect from network analysis attacks when I scale things to group communications.

Because I don't want the server to know "I got a message for Alice and Bob and Carol" because then it knows Alice and Bob and Carol communicate with some third party as well possibly. I want the server to know "I got a message for random one time use string A and random one time use string B and random one time use string C" because then it cannot link Alice and Bob and Carol, or one entity to another. Otherwise we could end up with a system where the server cannot tell which IP belongs to Alice and which IP belongs to Bob and which IP belongs to Carol, but it can tell that Alice and Bob and Carol communicate with each other. I want to avoid the ability of the server to tell who is communicating with who even if it cannot tie an IP address to any of the who.

667
Security / Re: anonymous membership query
« on: August 01, 2013, 09:02 am »
Quote
So basically the database server never knows who it has a message for, only that said message exists. The more "Bobs" there are, the less likely the knowledge of the recipient of such a message is, which would be further obscured by increasing string length?

Yep. In Pynchon Gate the server knows who it has a message for, but not who gets the message. I want a system where the server doesn't know who it gets a message for and doesn't know who gets the message, to protect from network analysis attacks when I scale things to group communications.

668
Security / Re: anonymous membership query
« on: August 01, 2013, 09:00 am »
How can you ask a server for a block of data without the server knowing which block it is?

The answer to that question is in several papers on Computational Single Server PIR (CPIR). Please let me know if you find one that will work for what I am trying to do, I will read it after you suggest it to me :D.

669
Security / Re: anonymous membership query
« on: August 01, 2013, 08:52 am »
Thanks for suggestion of Percy++ SS, it looks like it has a very nicely written paper and I can probably figure out if it will work fairly quickly. It actually looks quite promising right now, after a brief read of the whitepaper, but I need to be sober and spend some time reading it before I conclude it will work. It is already implemented in C++ though, and has a really nice and well explained whitepaper, so it shouldn't take me too long to see if it will work for me. The biggest problem I see with it immediately is that it is multi-server, I would prefer computational single server PIR, but distributed database isn't the end of the world it just adds some to the complexity.

I note from the paper that it was actually designed with the intention of improving on the pynchon gate PIR, so in either case I will probably use this for Pynchon Gate if I end up not being able to do what I want.

Quote
of PIR, an “item” is often thought of as a single bit out of
an n-bit database, but it could also be a “block” of size b
bits. In the latter case, the n-bit database is considered to be
composed of n/b blocks, each of size b bits.

I really need a PIR system where it is a single bit out of an n-bit database, pynchon gate style PIR works great for blocks but when it comes to single bits it is worse than everybody gets everything.

Quote
A number of
applications have been proposed for PIR, including patent
and pharmaceutical databases [1], online census informa-
tion [17], and real-time stock quotes [17]. The Pynchon
Gate [11] shows how to use PIR for an arguably more real-
istic purpose: retrieving pseudonymously addressed email;
it argues that PIR is a more suitable primitive for this appli-
cation than previous proposals.
    A trivial solution to the PIR problem is simply to ask
the server for the whole database and look up the desired
bit or block yourself. To make things more interesting (not
to mention practical), we analyze the communication cost
of the protocol—the total number of bits transmitted—and
insist that it be sublinear; that is, less than n.
    There are two main types of PIR: information-theoretic
and computational. In information-theoretic PIR, the server
is unable to determine any information about your query
even with unbounded computing power. In computational
PIR (CPIR) [3, 8], the privacy of the query need only
be guaranteed against servers restricted to polynomial-time
computations. Note that in the information-theoretic case
the unbounded power is only to be used to try to compro-
mise your privacy; in either case we still insist that you and
the servers use only polynomial-time computations in order
to perform the protocol.
    It is an unsurprising fact that information-theoretic sub-
linear PIR is impossible with a single server. However, it is
possible when there are l servers, each with a copy of the
database—assuming that the servers do not collude in order
to determine your query. A t-private l-server PIR is a PIR
system in which the privacy of the query is information-
theoretically protected, even if up to t of the l servers col-
lude. (Of course, it must be the case that t < l.)
    Beimel and Stahl [2] investigate the case where servers
can fail to respond. In this event, it is important that the

670
Security / Re: anonymous membership query
« on: August 01, 2013, 08:34 am »
Okay here is a simple and theoretically sound demonstration that unfortunately will not work in practice. Alice is the user and she talks to Carol, and Bob is the server.

Bob has a bloom filter. Okay, I am keeping this small because I need to write the entire thing out for demonstrative purpose. So Bob's blank bloom filter looks like this

00000

Alice sends a message through Bob to Carol tagged with "I am some secret between Alice and Carol". Bob uses a public algorithm to translate this tag into a series of bit positions, let's say position 0, 2 and 4. So now Bob sets the bits at those positions to be 1 in his bloom filter, to show (probabilistically) that a message with the tag "I am some secret between Alice and Carol" has been added to the database. This gives us a bloom filter that looks like this:

10101

Now Carol wants to know if Bob has a message to her that came from Alice. Since Carol knows "I am some secret between Alice and Carol" and she knows the public algorithm Bob used to translate this secret into a series of bits to set to 1 in his bloom filter, now Carol can determine probabilistically if Bob has that message in his database, by doing PIR to get the bit values at positions 0, 2 and 4 in Bob's bloom filter.

This is where I need to diverge significantly, because I only know one PIR system very well, and it does not work well for this, and it assumes distributed copies of the database. So now let's assume that there are 2 Bob's , and neither cooperates with the other, and they both have the same bloom filter. So the Bob's databases look like this:

10101

and Carol needs to get position 0, 2 and 4. In Pynchon Gate PIR Alice would do this by first generating a random string with one bit for each message in the database. In this case each message in the database IS a bit, so it kind of sucks, but for example:

00101

And now Carol needs to make a second random *looking* string that causes the first string to Xor to 0 at all positions except for the position she wants. So first Carol wants position 0:

00101

0 XOR 1 = 1
0 XOR 0 = 0
1 XOR 1 = 0
0 XOR 0 = 0
1 XOR 1 = 0

so Carol now has the second string 10101. So now Carol has the two strings

00101 and 10101 and she sends one to the first Bob and one to the second Bob. The Bob's then XOR together messages at positions marked by 1 in the string the receive and send the result back to Alice. So the database is 10101:

Bob1 gets 00101

1 XOR 1 = 0

Bob2 gets 10101

1 XOR 1 = 0 XOR 1 = 1

Carol gets 0 and 1 in response, 0 XOR 1 = 1, so the first bit in the bloom filter is set to one. I am not going to write this out all the way, but Carol does this for each of the bit positions associated with the message, and she finds that the result is

111

which means that probabilistically the message from Alice has been added to the list of messages.


671
Security / Re: anonymous membership query
« on: August 01, 2013, 07:54 am »
Pretty much here is where I am at. I have been working for about two years now coding a system for secure communications, and have tens of thousands of lines of source code already done. I have been sort of designing this thing as I go. At first I tried to make entire design by myself. That was a bad plan. Then I focused on the academic literature and implemented other peoples ideas. That was a much better plan. I have Sphinx cryptographic packet format completely implemented in C (as well as LIONESS block cipher that it uses). I have greatly wrapped OpenSSL crypto library, and can do ECDH key exchange, AES counter encryption, MAC, and all of the other stuff, with a simple call. I have a NIST password entropy estimator implemented, as well as a bloom filter, as well as wrapped database libraries for easy database management, secure wipe, tons of shit. I also have Alpha mixing mix network system almost fully implemented, and distributed directory servers almost fully implemented, tons of networking code, and a lot of other shit I am sure I cannot remember right now (I have taken a multi month break on working on this because of various reasons). I also have the PIR of Pynchon Gate implemented. So I am close to having the entire Pynchon Gate whitepaper implemented, using Sphinx and Alpha mixing for forward messages, with automatic ECDH between communicating parties, etc. I also have Tor wrapped up and integrated, as all mixes are run as hidden services and all clients use Tor.

But I want to diverge from Pynchon Gate at this point. I don't like that Pynchon gate has a semi-trusted nymserver that gathers all messages to me and then puts them in a bucket that is distributed for PIR. I also don't like that Pynchon gate assumes person to person communications rather than group communications. I want to remove the nymserver by having contact strings between communicating parties. Essentially the communicating parties do ECDH key exchange and have now shared secrets which can be iteratively hashed out to tag to messages for identification. Now I do not need a trusted nymserver that gets messages for Bob and puts them in Bobs bucket, because each individual message is tagged with an unlinkable string that Bob can identify is for him. Okay that is great, except for two things. One thing is that Pynchon gate has buckets that are all padded to the same size, so every PIR cycle Bob always gets his entire bucket of messages and then sorts out the individual messages. This is made possible because of the Nymserver, which bunches all messages to Bob together for retrieval with block PIR. So instead I want Bob to have these contact strings for a thousand people he is communicating with, and every cycle I want Bob to query the PIR server to find out which of his contacts have sent him a message. This is what I am talking about in my original post, how Alice can ask Bob if he has a message for her without Bob knowing which message Alice is asking for. After Alice finds that Bob does have a message for her she then needs to retrieve it. I am a bit shakey on this part as well. If messages are not all the same size then using block PIR to get messages will be a waste of time since the message sizing leaks the message. So all things must be padded to the same size before they are retrieved. But I would rather for the user to be able to do non-sequential PIR to get say 6 KB of messaging data tagged with one contact string in one spot of the database and 2 KB of messaging data tagged with another contact string in another part of the database, and then 2 KB of padding, instead of the user having all of their total message data put in a sequential bucket with 2 KB of padding. So essentially I am working on figuring out

1. How can Alice ask Bob if Bob has a message for Alice without Bob knowing which Message Alice is interested in?
2. Once Alice knows she has a message, how can she get the message without leaking what the message is? 

without having to individually pad each message to the same size (bandwidth prohibitive) and without a trusted nymserver being able to group messages to Alice together so she can get them (padded to the same bucket size) with sequential byte PIR.

If I can find good answers to question 1 and 2 then I will have a very very secure and anonymous encrypted messaging system that can scale to group communications. If that fails I will probably just fall back to finishing Pynchon Gate, but honestly it is not exactly what I want. The hardest part is making shit scale to group communications, doing person to person is soooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo much easier than group communications.

672
Security / Re: anonymous membership query
« on: August 01, 2013, 07:31 am »
So essentially I need a PIR that meets the following requirements:

1. Single bit, non-sequential

I do not want to get a block of bytes from a position in a database, I want to get single bits that are at pseudo-random locations throughout the database. This puts the Pynchon Gate PIR out the window because it has query sizes that grow by one bit for each item in the database, which works fine if the items in the database are 1 megabyte but if they are 1 bit then it is is obviously worse than just doing everybody gets everything PIR.

2. Scalable, not computationally bound to theoretical papers

I envision hundreds of thousands of clients doing thousands of queries a day. They are all for small bit strings though.

3. Computational PIR. I want a single server PIR, so the database doesn't need to be distributed.

I don't know if any PIR schemes meet these requirements. I understand Pynchon Gate PIR inside and out because it is fairly simple, but many of the advanced PIR papers look to me as if somebody vomited up math symbols onto a paper. I want to spend time to implement something and I am willing to spend much time figuring out how to do it and learning the associated math, but I don't want to spend four weeks to get to the point of understanding that I realize this PIR will not work for me, thirty times before I find the one that will.

673
Security / Re: anonymous membership query
« on: August 01, 2013, 07:23 am »
I want Alice to be able to ask Bob if he has a message for her, without Bob knowing which messages Alice is looking for. The easiest solution is for Bob to tell Alice all of the messages he has. This is great unless there are 500,000 users sending 10 messages a day each tagged with 128 bit string. Now the meta data alone is about 76 megabytes, and there are 500,000 users so Bob now takes about 36 terabytes of bandwidth a day just letting everybody know the metadata of which messages are available. If everybody checks for 1,000 messages a day and the PIR cost is even 512 bits per query, 500,000 * 1,000 * 512 = 29.8 gigabytes of bandwidth for all clients to check for so many messages. But the results are dependent on the cost of the PIR, and the only PIR I know would not work well for this at all (only theoretically would it work, in practice it would be absolutely stupid).

Essentially you and I do ECDH key exchange and come to a shared secret. We communicate with each other through Bob by tagging messages with our shared secret prior to uploading them to Bob anonymously. Now you know the message tags that indicate that a message is waiting for you, but you cannot ask Bob "Hey do you have this message tag?" because then Bob knows what message will be for you. So you need a way to ask Bob if he has something without him being able to tell what you are asking if he has. The solution could be for Bob to keep a bloom filter. Bloom filter is essentially a block of bits , a fresh bloom filter all bits are set to 0. When Bob gets an item he takes its hash value, translates the hash value into a series of numbers that map to a bit position in the grid, and Bob sets the bit positions in the grid to be 1. If all bit positions mapping to an inserted item are set to 1 then you can query the bloom filter and see that the message has probabilistically been added. So now we have Bob with his bloom filter, and he gets these message tags when people send messages to him for others to download and he puts the tags in his bloom filter. Now I can ask Bob if he has the message by doing PIR to get the bits at the positions in his bloom filter (Since I know the message tag I know the bits that will be set to 1 if the message has been added, since I also have the public parameters of Bob's bloom filter). If after I do PIR and get all of the bits they are all 1, then I know Bob has my message, if any of them are 0 then I know Bob doesn't have my message, but in either case Bob cannot tell which messages I asked him if he has. And now Bob doesn't need to send me and 500,000 other people 76 megabytes a day.

674
Security / Re: Best VPN???
« on: August 01, 2013, 07:06 am »
Care to tell me your set of Tor entry guards? I mean it isn't like I will see anything other than people using Tor if I own your entry node right? Plenty of people using Tor these days, so might as well tell me your entry guards. Or I could just wait a while since everybody is using tails, I imagine it wont take very long before you use my entry guard anyway and then I can use fingerprinting attack and correlation attack to tie you to your forum nym. It is sure nice of the forum to keep timestamps of when these series of bytes came from your computer to it :), I wonder how many people sent message the same size as yours through Tor at the same time you sent that post to the forum? Markov modeling will help me filter them away ;).

675
Security / Re: Best VPN???
« on: August 01, 2013, 07:03 am »
So you use Tor and tell me you use BobVPN. So now that is great because I can go to BobVPN and see who all is using Tor. Okay there are twenty people using BobVPN and connecting to Tor, you are probably one of them. Now if I can watch traffic over BobVPN I can quickly pinpoint your traffic with fingerprinting attacks, since I already see the posts you make here and the size of the posts. Only one of the twenty people using BobVPN to connect to Tor is likely to have traffic pattern correlating with the posts I can see you making on the forum here. Also many VPN providers keep logs of who is connected to the VPN when, even if they don't keep traffic logs. So now I can use these logs and see who of the twenty are connected to the VPN when I see traffic from you on this site. Over a small period of time I can use intersection attack to deduce who you are. You could say well my VPN keeps no logs blah blah, and I say fine that is great but you still have essentially changed your security model from Tor to your VPN provider, and that makes life easier for me if I am trying to trace you.

Pages: 1 ... 43 44 [45] 46 47 ... 249