encrypted VoIP traffic: Alejandra y Roberto or Alice and Bob is the paper I was originally thinking of, and indeed I remembered it incorrectly: I thought they were focusing on information leaked through interpacket timing characteristics rather than variable packet sizes. However, interpacket timing characteristics can leak information as well. I did a little searching and found this research paper, for example: etd.ohiolink.edu/send-pdf.cgi/Lu%20Yuanchao.pdf?csu1260222271 which shows how interpacket timing characteristics can be analyzed with traffic classifiers in an attempt to identify the speakers of an encrypted voice chat (provided the attacker has encrypted VoIP samples from a pool of potential suspects and wants to later be able to link encrypted VoIP streams to them). They also claim to be able to identify encrypted speech from the target so long as they have a reference sample of encrypted speech plus its plaintext to compare it to (note that the same speech encrypted twice would produce different ciphertexts if any sane cryptosystem is being used, so the match cannot come from the ciphertext bytes themselves; the data is leaking via interpacket timing).

It is VERY likely that more information than just the speaker and previously sampled ciphertext:plaintext phrases will leak via interpacket timing characteristics. I have done a (very) little google-fu looking for research papers to back this up, but so far this is all I have been able to find. However, we can extrapolate from the research done on encrypted website fingerprinting that both variable packet size AND interpacket timing characteristics will leak information about the encrypted payload, and the most successful classifiers will likely take both data points into consideration. I would be surprised if the language spoken, and possibly even non-sampled phrases, could not be identified with interpacket timing fingerprinting alone.
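
To make the idea of combining both data points a bit more concrete, here is a rough Python sketch of my own. It is not the method from either paper, just a toy: it assumes an observer has already captured `(timestamp, size)` pairs for each encrypted flow, reduces every flow to four summary numbers (mean and spread of the interpacket gaps, mean and spread of the packet sizes), and picks the closest labeled reference flow. The function names, the feature choice, and the traces are all made up for illustration. A real classifier would use far richer features and would normalize them properly.

```python
from math import dist
from statistics import mean, pstdev


def timing_features(timestamps):
    """Mean and spread of interpacket gaps, in milliseconds."""
    gaps_ms = [(b - a) * 1000.0 for a, b in zip(timestamps, timestamps[1:])]
    return (mean(gaps_ms), pstdev(gaps_ms))


def size_features(sizes):
    """Mean and spread of packet sizes, in bytes."""
    return (mean(sizes), pstdev(sizes))


def fingerprint(packets):
    """Combine timing and size features for one flow.

    packets: list of (timestamp_seconds, size_bytes) tuples. Only metadata
    visible to a passive observer is used; the payload is never inspected.
    """
    timestamps = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    return timing_features(timestamps) + size_features(sizes)


def nearest_label(sample, references):
    """Return the label of the reference flow closest to the sample.

    Plain Euclidean distance on unnormalized features: a simplification
    chosen for brevity, not something taken from the cited research.
    """
    fp = fingerprint(sample)
    return min(references, key=lambda label: dist(fp, fingerprint(references[label])))


# Made-up traces, purely illustrative.
references = {
    "suspect_a": [(0.00, 60), (0.02, 60), (0.04, 60), (0.06, 60)],
    "suspect_b": [(0.00, 40), (0.02, 80), (0.06, 40), (0.08, 80)],
}
unknown = [(1.00, 60), (1.02, 62), (1.04, 60), (1.06, 58)]
print(nearest_label(unknown, references))
```

Dropping `size_features` from `fingerprint` gives a timing-only classifier, which is the situation I am speculating about above: whether the gaps alone carry enough signal is exactly the open question.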