Data Science Research

Research on Big Data ranges from cybersecurity to marketing analytics, from environmental remote-sensing analysis to health care, from recognizing fraud to the interface between law and software engineering. UM houses the Statistics and Applied Math Core to support student, faculty, and community research and projects in big-data analytics. The University’s Cyber Innovation Laboratory provides hardware and software to support cyber-related projects in data analytics, assurance, and network forensics.
Michael Hofmann is a Research Assistant Professor at the University of Montana. He received his M.S. degree in Geology from the University of Erlangen-Nürnberg (Germany), and his Ph.D. in Geology from the University of Montana. After receiving his Ph.D., he joined ConocoPhillips as a research geologist and later senior research geologist. In 2012, he co-founded AIM GeoAnalytics, LLC, which furnishes a diverse array of geologic consulting and analytical services to the oil and gas industry. His current research focuses on understanding the sedimentary processes, stratigraphy, and diagenetic alterations that pertain to clastic and mudrock sedimentary systems.
He uses 'big data' commonly in the form of large 3D seismic data volumes, extensive geophysical log suites, point cloud data from 3D outcrop analysis (e.g., Lidar laser scanning), and high-resolution photo panoramas (e.g., Gigapan).

Observations fuel science. Observations on the human body, for instance, allow scientists to discover the causes and cures of diseases. Observations frequently require instruments, and many of these instruments detect signal-based data. A mass spectrometer, for instance, uses chemical and physical properties to make observations of the types and counts of certain molecules in different samples, such as human blood. This is a difficult problem, as the data sets are very large and very hard to interpret. Current technology can interpret only the largest, least relevant signals from mass spectrometers and other instruments. Dr. Smith is using computer science to develop techniques to identify far greater numbers of molecules at much higher accuracy, providing researchers with a vastly expanded set of observations. Dr. Smith is working to apply this technology to a broad range of applications, from quickly developing tests for new disease epidemics to discovering the elemental composition of remote planets.


Laurie and Jason are assistant professors at the University of Montana who are working together to analyze Twitter data on African American perspectives on recent Supreme Court cases concerning university admissions, retention, and graduation. Given the large number of tweets on this topic, this research lies at the intersection of data analytics, social theories, and legal analysis. Both quantitative and qualitative methods are used to analyze the Twitter data through specific theoretical lenses.

Joe Eaton, Journalism

Joe Eaton, an assistant professor in the School of Journalism, uses big data in his work as an investigative health care reporter. Prior to joining UM in 2013, Eaton and his colleagues at the Center for Public Integrity in Washington, D.C., acquired eight years of Medicare billing data from hospitals and physicians. Working together with the Silicon Valley software and data mining company Palantir Technologies, Eaton wrote a series of investigative stories that showed the public health care insurer for the elderly overpaid hospitals and doctors by at least $11 billion for highly questionable charges. The series won first place in the 2013 National Institute for Health Care Management awards. Eaton is currently teaching investigative journalism and data reporting at the university. In 2014, his students will begin a project using billing data to investigate Medicare fraud.

Mark Grimes, Biological Sciences

To understand tyrosine kinase signaling mechanisms, we undertook a large-scale study of phosphorylated proteins (phosphoproteomics) in neuroblastoma cell lines. We developed new methods to analyze these data with help from collaborators in the fields of pattern recognition, computational biology, and bioinformatics, including Gary Bader (University of Toronto), Paul Shannon (Fred Hutchinson Cancer Research Center), and Wan-Jui Lee and Laurens van der Maaten (Delft University of Technology). These methods are described in a recently published paper (Grimes et al., 2013). The picture emerging from detailed analysis of neuroblastoma phosphoproteomic data is that of adaptable and ambulatory protein complexes, which, for simplicity, we refer to as the mobile networks hypothesis. We use the term mobile networks to refer to dynamic multiprotein signaling complexes that assemble on or move into different membrane compartments. The model is that transient networks of multiprotein complexes, whose assembly is governed by interactions between phosphorylated proteins and phospho-specific protein binding domains, convey information that changes cell fate. These complexes assemble at distinct intracellular locations, and contain different components, in response to activation of different receptor tyrosine kinases (RTKs). A surprising finding was that more than half of the known RTKs in the human genome were detected in neuroblastoma cell lines, and in most cases several RTKs appear to be active in the same cell line. We are currently investigating mechanisms of signal integration when two or more receptors are simultaneously activated.

Joel Henry, Computer Science

With a unique blend of qualifications, including a PhD in Computer Science and a JD, Joel works in the areas of e-discovery and early case assessment using smart technology that meets client needs. Joel maintains a small legal practice focused on IT law, particularly e-discovery and early case assessment using technology. His client base also includes high-tech startups working their way through funding arrangements, contracts, licensing, intellectual property, and business decisions involving legal issues. The combination of a Computer Science PhD, specializing in software engineering, and a JD provides a broad and deep range of knowledge suitable for negotiation, mediation, and arbitration.

Joel also started Agile Legal Technology, which has commercialized technology-assisted review research done at UM. Agile Legal Technology does not use predictive coding (which is too slow) but has instead developed tools that produce final results in the time it takes to build one seed set. This solution to one aspect of Big Data, textual content analysis, guides users to information retrieval using case documents and natural language inquiries. It is easy to understand, easy to use, and easy to explain to non-techies.