Research Summaries

Detecting Attempts to Deceive Machine Learning on Big Data Collections

Fiscal Year 2019
Division Graduate School of Operational & Information Sciences
Department Computer Science
Investigator(s) Rowe, Neil C.
Sponsor Department of Defense Space (DoD)
Summary This work supports the technical domain of interest of “Science of the Artificial”, the subtopic of “Adversarial Machine Learning”, and the strategic objective of “Perform Disruptive Research – Discover and Develop New Technologies”. It will investigate methods by which an adversary could supply data that misleads a machine-learning algorithm, and will test ways to detect such misleading data and compensate for it. This topic is important because the artificial-neural-network methods that are currently popular tend to focus on repeated patterns, and such patterns can in many cases be created artificially by an adversary. Since neural networks use normative methods for which frequencies of occurrence are important, an adversary can fool them by masking activity with artificially high frequencies of uninteresting data or by creating decoys of fake interesting data. However, variations in the parameters of neural networks may reduce their susceptibility to manipulated data. Also, other methods of machine learning besides neural networks can be less susceptible to adversary manipulation, such as partitioning methods like support-vector machines, logical-modeling methods like decision-tree forests and set covering, and instance-based modeling that measures similarity to individual cases seen before. We propose to conduct experiments with real data that we deliberately manipulate in various ways, then compare how well a set of machine-learning methods detects the manipulation. The experimental data will be aircraft and ship records that we have studied previously for anomaly detection. We will explore manipulating this data using such principles as adding large numbers of uninteresting exercises while making a few unusual movements in support of a military operation, or deliberately sending aircraft on odd tracks to confuse analysis.
We will then attempt to infer patterns of activity that we can use to predict future activity. The benefit of this research will be to quantify the degree of susceptibility of several important machine-learning algorithms to manipulation of data by an adversary, which can then provide guidance for best practices in using those algorithms. No researchers have follow-on assignments at the sponsor, but we teach the sponsor’s employees in our distance-learning data-mining certificate program.
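The two manipulation principles named in the summary (masking through artificially high frequencies, and decoys of fake interesting data) can be illustrated with a toy sketch. Everything here is hypothetical and chosen only for illustration: the route labels, the record counts, the 0.05 threshold, and the rare_routes function are assumptions, and the frequency-based detector is a stand-in, not the project's actual data or method.

```python
from collections import Counter


def rare_routes(records, threshold=0.05):
    """Return the route labels whose relative frequency is below threshold.

    A deliberately simple frequency-based anomaly detector (hypothetical).
    """
    counts = Counter(records)
    total = len(records)
    return {route for route, n in counts.items() if n / total < threshold}


# Hypothetical clean data: two routine routes and one unusual movement "X".
clean = ["A"] * 60 + ["B"] * 39 + ["X"]

# Manipulation 1 (masking): the adversary repeats the unusual movement in
# "exercises" so that it no longer looks rare.
masked = clean + ["X"] * 9

# Manipulation 2 (decoys): the adversary adds one-off fake tracks so that
# the real unusual movement is buried among false alarms.
decoyed = clean + [f"D{i}" for i in range(5)]

flagged_clean = rare_routes(clean)
flagged_masked = rare_routes(masked)
flagged_decoyed = rare_routes(decoyed)

print(sorted(flagged_clean))
print(sorted(flagged_masked))
print(sorted(flagged_decoyed))
```

With these made-up counts, the detector flags only X in the clean data, flags nothing after the masking manipulation, and flags X together with all five decoys after the decoy manipulation. Comparing the flagged sets before and after manipulation is one simple way to express a detector's susceptibility as a number.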
Keywords big data; deception; evaluation; countermeasures; machine learning