NSA Domestic Spying and Data Mining

There has been a lot of talk about NSA domestic spying. The fact that AT&T, Sprint, and others have been sending internet traffic and phone calls to the NSA is not in doubt. The issue is what the NSA is doing with that data and those phone calls.

Data mining is problematic, and probably always will be. The problem lies in reducing the content and in decisively targeting the probable result sets. In business, data mining focuses on business questions that can be, more or less, narrowed down. Discovering the percentage of people with blue eyes who purchase your product might be interesting, but it is probably not useful unless you can link it to evidence that blue-eyed individuals purchase more of your product. The word “terror” takes on a subtly different meaning depending on whether the context is “the grandchildren terrorizing their grandparents” or “we will be terrorizing the Capitol building”. Both phrases will trigger an automated scan, whether of voice or of digital text, but neither will be useful until it is connected with pertinent data and reduced to fit the context of the intended target.
What percentage of current conversations and text uses the word “terror”? How much reduction does it take to get to something useful? And does your filtering remove valuable information?
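To make the keyword problem concrete, here is a minimal sketch in Python of a context-free keyword scan. The pattern, the function name `keyword_hits`, and the sample messages are all made up for illustration; nothing here is meant to describe how any real scanning system works.

```python
import re

# Hypothetical trigger: any word starting with "terror", case-insensitive.
KEYWORD = re.compile(r"\bterror\w*", re.IGNORECASE)

def keyword_hits(messages):
    """Return every message containing the keyword, ignoring context."""
    return [m for m in messages if KEYWORD.search(m)]

# Illustrative messages only.
messages = [
    "The grandchildren are terrorizing their grandparents again",
    "We will be terrorizing the Capitol building",
    "Dinner is at six",
]

hits = keyword_hits(messages)
for hit in hits:
    print(hit)
```

Both of the first two messages come back as hits, and the scan has no way to tell them apart. Everything that separates the two has to come from further reduction against other data.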

You could simplify this by considering one of the general problems of the IT world: the distilling of facts into information.

I used to work in the telecom world, and we had a cell phone fraud unit that, in order to catch fraud, established a phone pattern recognition system, not unlike the NSA’s own requirement. But unlike the NSA’s, its target was focused on fraud, a requirement the telco treated as part of good customer service. And unlike with the NSA, privacy was not an issue. The design of the data mining system dissociated personal identification from phone numbers. The system only established a cell phone calling pattern for each phone number in the system. There was nothing to connect the user of the phone to the number at all. What was being detected was change.

For everyone, there can be an associated pattern: a group of familiar phone numbers, a common event.
Calling your wife on your way home from work is a pattern, and over time it establishes a rhythm to the phone usage. Should that phone suddenly start calling Brazil, the call no longer fits the established pattern and trips a warning, which is forwarded to a fraud investigator.
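The idea of detecting change against an established pattern can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the telco's actual system: the class name `CallPatternMonitor`, the `min_history` threshold, and the shortcut of using the first three characters of the dialed number as a stand-in for its destination are all hypothetical. Note that the monitor is keyed by phone number alone and stores nothing about who owns the phone.

```python
from collections import defaultdict

class CallPatternMonitor:
    """Tracks, per phone number, which destinations it usually calls."""

    def __init__(self, min_history=20):
        # Assumption: a pattern counts as "established" after this many calls.
        self.min_history = min_history
        # calling number -> destination prefix -> number of calls seen
        self.history = defaultdict(lambda: defaultdict(int))

    @staticmethod
    def destination(number):
        # Crude stand-in for a real destination lookup: first 3 characters.
        return number[:3]

    def observe(self, caller, callee):
        """Record a call; return True if it breaks the caller's pattern."""
        seen = self.history[caller]
        dest = self.destination(callee)
        total = sum(seen.values())
        unusual = total >= self.min_history and dest not in seen
        seen[dest] += 1  # the pattern keeps adapting as calls arrive
        return unusual

monitor = CallPatternMonitor(min_history=5)
for _ in range(5):
    monitor.observe("+15550100", "+15550123")   # the usual call home
if monitor.observe("+15550100", "+5511955501234"):
    print("warning: forward to fraud investigator")
```

The numbers above are made up. The point of the design is that a warning depends only on how a number's behavior changes, not on the content of any call or the identity of the caller.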

This is merely an example, and evidence that the NSA did not need to acquire the amount and types of data that it requested in order to look for terrorists. They were spying and listening in on Americans and, for that matter, everyone else on the internet.

Time to dig out that old PGP encryption key.