Artificial intelligence system can predict data theft by scanning email
- 02 October, 2014 23:37
Workers who may be tempted to sell confidential corporate data should think twice about what they write in an email -- an AI-based monitoring system could be watching.
Tokyo-based data analysis company UBIC has developed an artificial intelligence system that scans messages for signs of potential plans to purloin data.
A risk prediction function is being added to an existing product from the company that audits email for signs of activity such as price fixing. The Lit i View Email Auditor has been used in electronic discovery procedures in U.S. lawsuits.
The artificial intelligence system, dubbed Virtual Data Scientist, can sift through messages and identify senders whose writing suggests they are in financial straits or disgruntled about how their employer treats them.
Such a situation would be classified as a "developing" problem, while messages about data access that's out of the ordinary, for instance, would get a "preparation" classification.
"Cases such as information leaks do not occur all of a sudden," a UBIC spokeswoman wrote in an email.
"The Risk Prediction function can detect which risk phase the company is facing and alerts in advance so that the company can make the crisis prevention before the incident takes place," the spokeswoman wrote.
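The phase labels described above can be illustrated with a minimal sketch. It is not UBIC's method, which the company has not published in detail. The cue phrases, the `PHASE_CUES` table and the `classify_phase` function are all hypothetical, and a plain substring match stands in for whatever language analysis the real product performs.

```python
from typing import Optional

# Hypothetical cue phrases for each risk phase, listed from earliest to latest.
# These are illustrative assumptions, not features used by UBIC's product.
PHASE_CUES = {
    "developing": ["in debt", "can't pay", "underpaid", "unfair", "fed up"],
    "preparation": ["copy the database", "export the customer", "usb drive"],
}

def classify_phase(message: str) -> Optional[str]:
    """Return the most advanced phase whose cues appear in the message."""
    text = message.lower()
    # Check later phases first so that a message showing both kinds of
    # cue is reported under the more urgent label.
    for phase in reversed(list(PHASE_CUES)):
        if any(cue in text for cue in PHASE_CUES[phase]):
            return phase
    return None  # no cue matched: nothing to flag
```

A message such as "I'm fed up and underpaid" would come back as "developing" under these assumed cues, while one asking to "copy the database to a USB drive" would come back as "preparation".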
The system seems a bit like a tool from the science fiction movie "Minority Report," designed to intercept would-be criminals before a crime takes place, but it's built on established human expertise. The Virtual Data Scientist trains itself by studying and emulating the techniques of professional auditors.
It can then bring those techniques to bear by scanning massive volumes of email. UBIC says it's more efficient than traditional manual keyword searches and that even subtle indications of fraud can be detected.
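Learning from examples that auditors have already judged, rather than from a fixed keyword list, can be sketched with a small text classifier. The code below is an assumption-laden illustration using a naive Bayes model with add-one smoothing. The class name `TinyNaiveBayes`, the labels "risk" and "normal" and the sample messages are invented for the example and do not describe UBIC's patented approach.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

class TinyNaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training messages
        self.vocab = set()

    def fit(self, examples):
        # examples: iterable of (message, label) pairs judged by a human auditor
        for text, label in examples:
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for tok in tokenize(text):
                counts[tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokenize(text):
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical auditor-labeled messages.
labeled = [
    ("I need money fast, my loans are overdue", "risk"),
    ("management treats us unfairly and I am done with it", "risk"),
    ("please find the meeting agenda attached", "normal"),
    ("the quarterly report is ready for review", "normal"),
]
model = TinyNaiveBayes().fit(labeled)
```

Once trained, `model.predict(...)` can be applied to any number of unseen messages, which is the sense in which such a system scales beyond manual keyword searches. A production system would need far more training data and careful evaluation.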
The Japan Patent Office recently decided to issue UBIC a patent for "predictive coding" that identifies behavior that could lead to future misconduct.
The approach links machine learning with analysis of big data and behavioral sciences such as psychology and criminology. The emerging field is known as behavior informatics, and it has its own IEEE task force and other research groups.
UBIC's system currently works in Japanese only, but support for English and other languages is being added, the spokeswoman wrote.
The feature follows the arrest in July of an engineer who allegedly stole personal data on up to 20.7 million customers of Benesse, the parent company of Berlitz language schools in Japan, in order to sell it for profit. The incident was one of Japan's largest data leaks.