Quote from: Guru on August 04, 2012, 08:40 am

Stylometry is really overrated. I seem to remember a few years back that the author of a book entitled "Primary Colors" was revealed to be none other than journalist Joe Klein. Apparently, literary analysis by an academic had earlier fingered Mr. Klein, but he denied authorship.

What needs to be remembered is that for such analysis to be carried out, you need two things:
1) A large volume of material written by the ostensible author; and
2) A large volume of material written by the anonymous would-be author.

As a journalist, Joe Klein had a huge amount of material published under his own name. The problem with stylometry is that it requires a large amount of material written under your real identity to compare the anonymously/pseudonymously authored material to. If one is careful not to post under their real name, there is nothing identifiable to compare the anonymously/pseudonymously authored material with.

Guru

Right now? Yes, sure thing. I mean, the whole concept has only come into its own with computation sorta recently, as far as I can tell.

I think there have been some decent advances in stylometry though. I've been keeping an eye on the Drexel Group's work on stylometry evasion, and here are some numbers on where it's at these days:

If you have 50 or fewer suspects, ~6500 words from a source of known authorship as a benchmark, and only 500 words from an anonymous source, you can identify who that person is with a very high degree of accuracy.

Now, I'm certainly not claiming you can just obtain our posts here on SRF and then compare them to every tweet and forum on the web to get results. But... certainly very powerful organizations are working very hard indeed to obtain exactly those kinds of powers.
Brennan says the results can be good intelligence for a set containing 100s, even 1000s of suspects.

My real concern is that our data is pretty much stored forever, and therefore improvements in technology, software and hardware could eventually turn it into a genuine threat, esp. when combined with other forms of intelligence. The fundamental principle of counting is the ultimate enemy. Hence: paranoid pine.
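
For anyone wondering what "comparing" a known sample to an anonymous one even means, here's a rough toy sketch. To be clear, this is NOT the Drexel group's method and nowhere near what a real system does; it just counts how often a handful of common function words show up and ranks suspects by cosine similarity. The word list, the function names (`profile`, `cosine`, `rank_suspects`) and the sample texts are all made up by me for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative feature set (an assumption): real systems use
# hundreds of features, not twenty function words.
FUNCTION_WORDS = [
    "the", "of", "and", "to", "a", "in", "that", "is", "it", "for",
    "with", "as", "but", "on", "not", "be", "this", "i", "you", "which",
]

def profile(text):
    """Return the relative frequency of each function word in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_suspects(known_samples, anonymous_text):
    """Rank suspects by similarity of their known writing to the anonymous text.

    known_samples: dict mapping a suspect's name to text of known authorship.
    Returns a list of (name, score) pairs, most similar first.
    """
    target = profile(anonymous_text)
    scores = {name: cosine(profile(text), target)
              for name, text in known_samples.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    # Hypothetical, deliberately exaggerated samples.
    known = {
        "alice": "the cat and the dog and the bird " * 50,
        "bob": "it is not that i think you should be on it " * 50,
    }
    anon = "the fox and the hen and the cow " * 10
    for name, score in rank_suspects(known, anon):
        print(name, round(score, 3))
```

The point of the sketch is just that the whole thing boils down to counting: the more text of known authorship a suspect has out there, the more stable their profile gets, and the less anonymous text is needed to match against it.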