Data Science Research
Observations fuel science. Observations of the human body, for instance, allow scientists to discover the causes of and cures for diseases. Observations frequently require instruments, and many of these instruments detect signal-based data. A mass spectrometer, for example, uses chemical and physical properties to measure the types and counts of certain molecules in samples such as human blood. Interpreting that output is difficult: the data sets are very large and the signals are hard to disentangle. Current technology can interpret only the largest, least relevant signals from mass spectrometers and other instruments. Dr. Smith is using computer science to develop techniques that identify far greater numbers of molecules at much higher accuracy, providing researchers with a vastly expanded set of observations. Dr. Smith is working to apply this technology to a broad range of applications, from quickly developing tests for new disease epidemics to discovering the elemental composition of remote planets.
Laurie and Jason are assistant professors at the University of Montana who are working together to analyze Twitter data on African American perspectives on recent Supreme Court cases concerning university admissions, retention, and graduation. Given the large number of tweets on this topic, the research lies at the intersection of data analytics, social theory, and legal analysis. Both quantitative and qualitative methods are used to examine the Twitter data through specific theoretical lenses.
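The quantitative side of such an analysis might start with something as simple as keyword-frequency counting over a tweet corpus. The sketch below is a minimal illustration under that assumption; the sample tweets and keyword set are hypothetical placeholders, not data or code from the actual study.

```python
import re
from collections import Counter

# Hypothetical sample tweets (illustrative only; not from the study's corpus).
tweets = [
    "The Supreme Court ruling on admissions will reshape university access",
    "Retention and graduation rates matter as much as admissions",
    "University admissions policy is back at the Supreme Court",
]

def keyword_counts(texts, keywords):
    """Count how often each keyword appears across a collection of texts."""
    counts = Counter()
    for text in texts:
        # Lowercase and split on non-letter characters to get rough tokens.
        for token in re.findall(r"[a-z']+", text.lower()):
            if token in keywords:
                counts[token] += 1
    return counts

counts = keyword_counts(tweets, {"admissions", "retention", "graduation"})
print(counts)  # "admissions" appears once in each of the three sample tweets
```

A real study would layer theory-driven coding and qualitative reading on top of counts like these; this only shows the mechanical first step.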
Joe Eaton, an assistant professor in the School of Journalism, uses big data in his work as an investigative health care reporter. Prior to joining UM in 2013, Eaton and his colleagues at the Center for Public Integrity in Washington, D.C., acquired eight years of Medicare billing data from hospitals and physicians. Working with the Silicon Valley software and data-mining company Palantir Technologies, Eaton wrote a series of investigative stories showing that the public health care insurer for the elderly overpaid hospitals and doctors by at least $11 billion for highly questionable charges. The series won first place in the 2013 National Institute for Health Care Management awards. Eaton currently teaches investigative journalism and data reporting at the university. In 2014, his students will begin a project using billing data to investigate Medicare fraud.
To understand tyrosine kinase signaling mechanisms, we undertook a large-scale study of phosphorylated proteins (phosphoproteomics) in neuroblastoma cell lines. We developed new methods to analyze these data with help from collaborators in the fields of pattern recognition, computational biology, and bioinformatics, including Gary Bader (University of Toronto), Paul Shannon (Fred Hutchinson Cancer Research Center), and Wan-Jui Lee and Laurens van der Maaten (Delft University of Technology). These methods are described in a recently published paper (Grimes et al., 2013). The picture emerging from detailed analysis of neuroblastoma phosphoproteomic data is that of adaptable and ambulatory protein complexes that, for simplicity, we refer to as the mobile networks hypothesis. We use the term mobile networks to refer to dynamic multiprotein signaling complexes that assemble on or move into different membrane compartments. The model is that transient networks of multiprotein complexes, whose assembly is governed by interactions between phosphorylated proteins and phospho-specific protein binding domains, convey information that changes cell fate. These complexes assemble at distinct intracellular locations, and contain different components, in response to activation of different receptor tyrosine kinases (RTKs). A surprising finding was that more than half of the known RTKs in the human genome were detected in neuroblastoma cell lines, and in most cases several RTKs appear to be active in the same cell line. We are currently investigating mechanisms of signal integration when two or more receptors are simultaneously activated.
With a unique blend of qualifications, including a PhD in Computer Science and a JD, Joel works in the areas of e-discovery and early case assessment using smart technology that meets client needs. He maintains a small but focused legal practice centered on IT law, including e-discovery and early case assessment. His client base also includes high-tech startups working their way through funding arrangements, contracts, licensing, intellectual property, and business decisions involving legal issues. The combination of a Computer Science PhD, specializing in software engineering, and a JD provides a broad and deep range of knowledge suitable for negotiation, mediation, and arbitration.
Joel also founded Agile Legal Technology, which has commercialized technology-assisted review research done at UM. Rather than using predictive coding (which is too slow), Agile Legal Technology has developed tools that produce final results in the time it takes to build a single seed set. This solution to one aspect of Big Data, textual content analysis, guides users to relevant information using case documents and natural-language inquiries. It is easy to understand, easy to use, and easy to explain to non-techies.
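To give a flavor of what text-based retrieval over case documents looks like, the sketch below ranks documents by how many query terms they contain. This is a generic term-overlap illustration under assumed sample data, not Agile Legal Technology's actual method; the document texts and query are hypothetical.

```python
import re

# Hypothetical case documents (illustrative only).
documents = {
    "doc1": "Email thread discussing the software licensing contract dispute",
    "doc2": "Meeting notes about quarterly sales targets",
    "doc3": "Draft licensing agreement for the disputed software",
}

def tokenize(text):
    """Lowercase and split on non-letter characters."""
    return re.findall(r"[a-z]+", text.lower())

def rank(documents, query):
    """Rank document names by the number of distinct query terms each contains."""
    terms = set(tokenize(query))
    scores = {
        name: sum(1 for token in set(tokenize(text)) if token in terms)
        for name, text in documents.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

ranking = rank(documents, "software licensing dispute")
print(ranking)  # doc1 matches all three query terms, doc2 matches none
```

Real e-discovery tools use far more sophisticated scoring, but the core idea of matching a natural-language inquiry against document text is the same.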