Big Understanding
February 17-18, 2016
(Opening reception/dinner Feb 16)
Austin, Texas

Texas Advanced Computing Center (TACC) and University of Texas
Feb 19, 2016

Library Selection
The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World (Basic Books, 2015)
By Pedro Domingos


Tuesday, February 16
6:00 PM
First-Timers Reception — San Jacinto Terrace
6:30 PM
Welcome Reception — Ballroom Foyer
7:00 PM
Dinner — San Jacinto Ballroom

Wednesday, February 17
7:30 AM

Breakfast — Ballroom Foyer

8:30 AM
Len Kleinrock, TTI/Vanguard Advisory Board
Conference Welcome
8:50 AM
Doug Lenat, TTI/Vanguard Advisory Board
50 Shades of Understanding: Why The Veneer of Intelligence is Not Enough
Sometimes a Vanguard conference introduction explores the interactions amongst the upcoming talks, or surveys them, or provides a frame, context, or dimensions by which to better situate them; this time, the introduction is more complementary than complimentary. Most of our Big Understanding meeting explores and celebrates some exciting successes which statistical machine learning is enjoying, today, when applied to big data. But amidst these heady triumphs, let’s not lose sight of various other powerful ways in which we—and artificial intelligences—can reason. Our superpower as human beings is that we can harness both the power of right-brain “thinking fast” pattern-finding and left-brain “thinking slow” logical inference. The resulting synergy sets us apart from other animals, enabling us not just to survive but to understand and master the complex natural world and infosphere in which we live. As our AI systems become ever more autonomous and ubiquitous, we can and should impart to them that ability to reflect and rationalize and responsibly reconsider their decisions.
9:50 AM
Paul Hofmann, Chief Technology Officer, Space-Time Insight
Augmented Intelligence—Machine Learning on Sparse Graphs
In order to make fast and reliable decisions in the face of uncertainty, we combine business data (CRM, ERP, data warehouses, asset management systems) with real-time operational data, IoT data, and external data (e.g., weather, traffic, social media), correlating them across space and time. Two important application areas—though not the only ones—are anomaly detection and decision making under uncertainty. We’ll look at examples from utilities, banking, and logistics and show a typical state-of-the-art use case for each of the three machine learning categories: supervised, unsupervised, and reinforcement learning.
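The unsupervised case—anomaly detection on a stream of sensor readings—can be sketched in a few lines. This is only an illustrative toy (a z-score rule on a single signal, with invented data and an invented function name); real systems correlate many signals across space and time:

```python
# Minimal sketch of unsupervised anomaly detection on sensor readings,
# using a simple z-score rule. A lone large spike inflates the standard
# deviation, so the threshold here is deliberately modest.

from statistics import mean, stdev

def find_anomalies(readings, threshold=2.0):
    """Flag readings more than `threshold` standard deviations from the mean."""
    mu = mean(readings)
    sigma = stdev(readings)
    return [x for x in readings if abs(x - mu) > threshold * sigma]

# Example: a transformer temperature series with one suspicious spike.
temps = [61.2, 60.8, 61.5, 60.9, 61.1, 95.0, 61.0, 60.7]
print(find_anomalies(temps))  # → [95.0]
```

A production system would replace the z-score with a learned model of normal operation, but the shape of the computation—score each observation against a baseline, flag the outliers—is the same.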
10:35 AM
Coffee Break
11:05 AM
Jessica Richman, Co-founder, uBiome
Understanding Our Microbiome
The microbiome is a new area of medicine—essentially, a new organ in the human body. By learning about the microbiome—the trillions of bacteria that live on and in the human body—we will better understand autoimmune disorders, antibiotics and probiotics, as well as the effect of diet on human health. The NIH Human Microbiome Project has been a great start, but it needs to be scaled up to thousands of samples in an automated way. uBiome uses citizen science to better understand the microbiome. With the largest human microbiome dataset in the world, it seeks to develop novel clinical tests based on the microbiome and also to partner with leading organizations (such as the Centers for Disease Control) to develop therapeutics and diagnostics. 
11:45 AM
Dafna Shahaf, Lecturer, Hebrew University of Jerusalem
Information Cartography
Large-scale data has potential to transform almost every aspect of our world, from science to business; for this potential to be realized, we must turn data into insight. Two projects exemplify efforts to address this problem computationally. The first, Metro Maps of Information, aims to help people understand the underlying structure of complex topics, such as news stories or research areas. Metro Maps are structured summaries that can help us understand the information landscape, connect the dots between pieces of information, and see the big picture. The second project proposes a framework for automatic discovery of insightful connections in data, for example, identifying gaps in medical knowledge: Our system recommends directions of research that are both novel and promising. Both problems can be formulated mathematically with efficient, scalable methods for solving them. User studies on real-world datasets demonstrate that users can acquire insight efficiently across multiple domains.
12:25 PM
Members’ Working Lunch — San Jacinto Ballroom
1:45 PM
Greg Dobler, Research Scientist, NYU Center for Urban Science and Progress
Urban Informatics: Better Cities Through Imaging
Big cities have big problems—traffic, water delivery, waste disposal, the efficient use of energy, and more—making them marvelous laboratories for data analytics. Ultimately, however, real-world solutions can only be gained through a genuine understanding of how people live, the study of which cuts across all such problem boundaries and often requires us to infer human behavior through weak signals (such as infrared heat maps as proxies for energy use) and machine learning. The emerging field of “urban informatics” is already making real progress in New York, Amsterdam, and many other cities around the world.
2:25 PM
Manuel Aparicio, Chief Technology Officer, Saffron Technology, Intel 
Big Experience: The Human Source of Knowledge, Captured and Shared by Cognitive Computing
Humans are raised without rules and are always learning—why can’t robots? Human associative memory, in particular, is an effective model to which more traditional machine learning methods can be applied. Specific cases to be discussed include Boeing (product lifecycle intelligence, predictive maintenance); Accenture (software asset management); USAA (personalized recommendation engine, fraud detection).
3:05 PM
Coffee Break
3:35 PM
Pietro Michelucci, Director, Human Computation Institute
Crowd AI  
So-called wicked problems are intractable societal problems, such as climate change, pandemic disease, and geopolitical conflict, the solutions to which exceed the reach of individual human cognitive abilities. They are multifaceted, involving multiple systems such that a solution that benefits one system (e.g., Earth’s ecosystem) may harm another (e.g., the global economy). Furthermore, viable solutions tend to be dynamic, adaptive, and ongoing. Human computation—which encompasses methods such as crowdsourcing, citizen science, and distributed knowledge collection—offers new promise for addressing wicked problems, by enabling participatory sensing, group intelligence, and collective action at unprecedented scales. Ironically, many of the wicked problems we face today have resulted from unintended manifestations of collective behavior (e.g., car pollution). Harnessing and improving this crowd-powered capability is thus a fitting remedy.
4:15 PM
Kalyan Veeramachaneni, Research Scientist, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology
To Teach a Computer to Be a Data Scientist 
Can we develop general-purpose tools and utilities for use in data science? How can we make the process more systematic and structured while still considering the maximum amount of complexity? What’s needed is a platform that structures an otherwise ad-hoc process. The Data Science Machine is an end-to-end automated system that can turn raw data into predictive models with minimal human input. It did better than a majority of human competitors in three competitions held at premier machine learning conferences (beating 615 of 906 other teams). Its use can, hopefully, change and broaden our relationship to data, allowing us to become feature selectors instead of feature creators.
The ultimate achievement would be to supply different interfaces to different people (economists, analysts, educators) in order to bring them closer to their goals—and to make it easier, more effective, and more enjoyable for people to work closely with data.
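The core move that turns raw data into model-ready features can be sketched mechanically: group raw rows by an entity and apply a battery of aggregates, rather than hand-crafting each feature. This is a hedged illustration of the idea, not the Data Science Machine's actual implementation; all field names and data are invented:

```python
# Sketch of automated feature synthesis: given raw transactions keyed
# by customer, mechanically generate aggregate features (count, sum,
# mean, max) for each customer instead of hand-writing them.

from collections import defaultdict

AGGREGATES = {
    "count": len,
    "sum": sum,
    "mean": lambda xs: sum(xs) / len(xs),
    "max": max,
}

def synthesize_features(rows, key, value):
    """Group raw rows by `key` and apply every aggregate to `value`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {
        k: {f"{value}_{name}": fn(vals) for name, fn in AGGREGATES.items()}
        for k, vals in groups.items()
    }

transactions = [
    {"customer": "a", "amount": 10.0},
    {"customer": "a", "amount": 30.0},
    {"customer": "b", "amount": 5.0},
]
features = synthesize_features(transactions, "customer", "amount")
print(features["a"])
# → {'amount_count': 2, 'amount_sum': 40.0, 'amount_mean': 20.0, 'amount_max': 30.0}
```

The human's job then shifts from inventing features to selecting which of the synthesized ones matter—the "feature selectors instead of feature creators" shift the abstract describes.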
5:00 PM
Close of First Day
6:00 PM
6:30 PM

Reception & Dinner — County Line on the Lake

Thursday, February 18
7:30 AM

Breakfast — San Jacinto Ballroom

8:30 AM
Scott Clark, Co-founder and Chief Executive Officer, SigOpt
Using Optimization to Build Systems with Less Trial and Error 
Too much expert time is wasted tuning and tweaking complicated systems like production machine learning models. After domain expertise is applied to build a model, the task of finding the best variation of that model is often accomplished via costly trial and error. This resource-intensive step can quickly become intractable for even relatively simple models. Recently, in academia and the startup world, there have been great strides in tackling this problem in an automated and mathematically principled way. I'll go over these techniques and several examples of where this problem arises, from applications in bioinformatics and airplane design to cases where properly tuning a model can allow you to beat Vegas and Wall Street.
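The replacement of manual trial and error with an automated tuning loop can be sketched as follows. SigOpt's own approach is Bayesian optimization; the random search below merely stands in for the idea that a loop, not a human, explores the configuration space. The objective function here is synthetic:

```python
# Minimal sketch of an automated hyperparameter search loop. A toy
# quadratic stands in for "train the model, measure held-out loss";
# parameter names (lr, reg) and ranges are illustrative.

import random

def validation_loss(lr, reg):
    """Toy stand-in for training a model and measuring validation loss."""
    return (lr - 0.1) ** 2 + (reg - 0.01) ** 2

def random_search(objective, n_trials=200, seed=0):
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {"lr": rng.uniform(0.0, 1.0), "reg": rng.uniform(0.0, 0.1)}
        loss = objective(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

params, loss = random_search(validation_loss)
print(params, loss)  # best configuration found lands near lr=0.1, reg=0.01
```

A Bayesian optimizer improves on this by using past evaluations to decide where to sample next, which matters when each evaluation is an expensive training run.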
9:10 AM
D. Sculley, Software Engineer, Google
Machine Learning: The High Interest Credit-Card of Technical Debt
Machine learning offers a fantastically powerful toolkit for building complex systems quickly, but it is dangerous to think of these quick wins as coming for free. Using the framework of “technical debt,” we note that it is all too easy to incur massive ongoing maintenance costs at the system level when applying machine learning. Specific risk factors and design patterns include boundary erosion, entanglement, hidden feedback loops, undeclared consumers, data dependencies, changes in the external world, and a variety of system-level anti-patterns. We'll discuss how these crop up and how to avoid them when possible.
9:50 AM
Barbara Han, Disease Ecologist, Cary Institute
The Algorithm That's Hunting Ebola
In April 2014, a team of ecologists, veterinarians, and an anthropologist traveled to a Guinean village named Meliandou to find how patient zero, a 2-year-old boy named Emile, contracted the Ebola virus. By the end of the year, they had published a paper that hypothesized that little Emile had contracted Ebola from a colony of insect-eating bats near where the local children played. But this sort of investigation is a rearguard maneuver against a brutal opponent. Is there a way to go on the offense against Ebola and other zoonotic diseases? Can we predict outbreaks before they occur? Yes. Using computer modeling and machine learning, we can predict which wild species are capable of causing future outbreaks. The models create caricatures of likely reservoirs, while the algorithms sort through hundreds or thousands of species that have never been checked for zoonotic diseases, and calculate the probability that any given species is a disease reservoir based on its similarity to that caricature.
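The caricature-matching idea can be sketched compactly: average the trait profiles of known reservoirs into one "caricature," then rank unchecked species by similarity to it. Everything below—species, traits, values (normalized to 0–1), and function names—is invented for illustration; the real models use many more traits and far more sophisticated classifiers:

```python
# Sketch of reservoir prediction by caricature matching: build a trait
# profile from known reservoir species, then score candidate species by
# closeness to that profile.

def caricature(known_reservoirs):
    """Average the trait vectors of known reservoirs into one profile."""
    traits = known_reservoirs[0].keys()
    n = len(known_reservoirs)
    return {t: sum(sp[t] for sp in known_reservoirs) / n for t in traits}

def similarity(species, profile):
    """Closeness to the profile (1 = identical), via mean absolute error."""
    errs = [abs(species[t] - profile[t]) for t in profile]
    return 1.0 - sum(errs) / len(errs)

known = [
    {"litter_size": 0.8, "range_overlap": 0.9, "body_mass": 0.2},
    {"litter_size": 0.7, "range_overlap": 0.8, "body_mass": 0.3},
]
profile = caricature(known)

candidates = {
    "species_x": {"litter_size": 0.75, "range_overlap": 0.85, "body_mass": 0.25},
    "species_y": {"litter_size": 0.10, "range_overlap": 0.20, "body_mass": 0.90},
}
ranked = sorted(candidates, key=lambda s: similarity(candidates[s], profile),
                reverse=True)
print(ranked)  # → ['species_x', 'species_y']
```

The payoff is triage: with thousands of never-tested species, a ranked list tells field teams where to look first.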
10:30 AM
Coffee Break
11:00 AM
Erin Dolan, Executive Director, Texas Institute for Discovery Education in Science, University of Texas, Austin
Big Undergraduate Research: The Freshman Research Initiative
More than 900 freshmen (~50% women and ~40% underrepresented groups) are recruited each year to participate in UT Austin’s Freshman Research Initiative. The program spans three semesters of integrated coursework and laboratory research in cohorts of about 20-40 under the guidance of post-docs who have the title of Research Educators. Students emerge with experience with experimental techniques and lab work, a deep understanding of the scientific process, and sometimes publications. For the 6,000+ students who have passed through the program since 2005, graduation rates within STEM and overall are demonstrably higher. They have higher GPAs than their peers and there is a three-fold increase in the number who go on to pursue graduate degrees.
11:35 AM
Scott Niekum, Assistant Professor, University of Texas, Austin
From Robot Learning to Embodied Understanding 
Future co-robots in the home and workplace will require the ability to quickly characterize new tasks and environments without the intervention of expert engineers. While robotic learning algorithms have produced impressive results in recent years, they have typically done so in laboratory settings and in a manner that falls short of the depth of understanding required for robust generalization in unstructured environments.  Recent interactive approaches to structure discovery, in which robots learn models that have interpretability and explanatory power, may lead to a deeper, embodied understanding of the world that they operate in. An examination of why these elements are conspicuously missing from many robot learning systems will reveal fundamental weaknesses of some traditional machine learning paradigms in robotics and will shed light on new possible paths toward understanding.
12:10 PM
Sumant Kawale, Senior Director of Business Development, SparkCognition
Leveraging Cognitive Analytics for Asset Management and Security
Medical diagnosis is probably the best-known application of IBM Watson, but the technology is equally suited to security event and incident management. For example, we can prevent attacks on things like electric grids and railroads by analyzing data from equipment sensors as well as external threat information. Security teams can avoid and solve problems, whether by flagging a railway car about to break down or restricting server access on the basis of the host country of an IP address. We can spot anomalies against known patterns of operation and usage. When exposed to data, algorithms can automatically build a model of the underlying mechanism.
12:45 PM
Members' Working Lunch — San Jacinto Ballroom
2:00 PM
Josh Stuart, Professor of Biomolecular Engineering, University of California, Santa Cruz
Predicting Driver Mutations Across Cancers Using Pathway Logic
Accurately modeling molecular and cellular networks will greatly improve our ability to predict the consequences of genetic states on human health. The Stuart lab uses data-driven approaches to identify and characterize genetic networks, investigate how they've evolved, and then use them to simulate and predict cellular behavior. By integrating high-throughput molecular biology datasets, we can develop computational models and algorithms to predict cellular-level and organism-level phenotypes. In particular, we can elucidate altered signaling pathways in cancer cells that initiate and drive tumorigenesis and develop models to predict both the impact of mutations in human tissue and a patient's response to treatment.
2:40 PM
Erik Mueller, Founder and Chief Executive Officer, Symbolic AI
Building Systems that Reason and Explain Like People
Using machine learning, we’re making great strides: image recognition, language translation, credit card processing—the list is long and getting longer. But when these systems make mistakes, it's hard to understand why, and it's hard to fix the problem. The best we can do is retrain the system with better data, or modify the training algorithm. Computing systems should instead be transparent. They should be able to reason like people and explain their reasoning to people. Transparency promotes understanding, is educational, makes it easier to fix problems, improves customer satisfaction, and builds trust. We’ll look at several methods for designing and implementing transparent computers, ranging from declarative methods to neural networks.
3:20 PM
Bob Lucky, TTI/Vanguard Advisory Board
Conference Reflections
4:00 PM
Close of Conference

Friday, February 19


Dell, the Texas Advanced Computing Center (TACC), and University of Texas, Austin
