Please join Stanford Libraries, the Center for Spatial & Textual Analysis, and Stanford Data Science for a two-day conference and data challenge on how data science, digital humanities, and libraries address sources, methods, and meaning in their data practices.
- Anne Burdick - Founding Director of the Knowledge Design Lab in the School of Design at the University of Technology Sydney, where she is a Research Professor of Visual Communication Design.
- Jo Guldi - Associate professor of History at Southern Methodist University and external faculty at the Stevanovich Institute for the Formation of Knowledge at the University of Chicago.
- Mark Hansen - Faculty at Columbia Journalism School since July of 2012 and inaugural director of the East Coast branch of the Brown Institute for Media Innovation.
Researchers in every field of study have access to far more data than could have been imagined even a decade ago. The scale and speed of data creation have opened up exciting new paths of inquiry while, at the same time, introducing new kinds of data bias and new challenges to the reproducibility and reliability of results. All data are representations. Their collection, handling, reduction, and transformation influence what we can and cannot learn from data. And yet our traditional methods of data management, rooted in provenance, context, and careful documentation, struggle to keep pace with big data.
Data science has made significant advances in computational and mathematical tools to organize and analyze data, recognize patterns, and make predictions at scale. Digital humanities are also concerned with scale, but approach the problem from a different angle: How do we reduce rich, layered, and uneven historical and literary sources into data representations that are ripe for computation? Libraries and archives, which produce metadata, define collections, and trace provenance, all to provide meaningful context, need more powerful tools to support the production of new knowledge.
This state of things provides an opportunity, perhaps an imperative, to learn from each other across theoretical and methodological divides, and to address the social, ethical, and political implications of the boundlessness of data today. In response to this need, the Stanford Data Science Institute, CESTA (Center for Spatial and Textual Analysis), and Stanford Libraries are hosting two complementary trans-disciplinary events to foster communication and shared understanding, beginning with this online conference. See http://datapractices.stanford.edu for more about the speakers and the talks.