Presenting Trading Consequences at CHESS’13

On June 1st, we presented the Trading Consequences project as part of CHESS’13, the Canadian History & Environment Summer School, which took place in Nanaimo on Vancouver Island, Canada, from May 31 to June 2, 2013.

CHESS provided us with a unique opportunity to present our progress on Trading Consequences to a wider audience of environmental historians, to gain feedback on our current prototype, and to engage in a broader discussion of our general approach of combining text mining and information visualization to support research in environmental history.

As part of CHESS, we ran a half-day workshop. We first presented the goals of Trading Consequences and introduced the idea of leveraging computational methods (in our case, text mining and information visualization) to support history research and research in the humanities in general.

We then introduced our current visualization prototype to the CHESS participants (all environmental historians). We explained the visualizations’ core functionalities and how the underlying document corpus can be explored along geographical, temporal, and topical (i.e., commodity terms) dimensions.

For the rest of the workshop, historians freely interacted with the visualization prototype in groups of 2-3 people. We gave them a few pointers on what to focus their exploration on. For instance, we asked them to explore the commodities “cinchona” and “cheese” and to zoom into locations that seemed of interest in the context of these commodities. Explorations were always followed by brief discussions with the entire group.

As part of their exploration, some historians immediately started to focus on Vancouver Island, the geographic location where CHESS took place, and verified the mention of commodities there that had been discussed in other workshop presentations. Others experimented with commodities and locations related to their own research and, from there, tried to assess the capabilities of the visualization and the underlying data.

Workshop discussions centered mostly on three themes: (1) the general functionality of the prototype and what the visualizations actually represent, (2) the underlying dataset and, closely connected to this, what kind of insights can be drawn from the visualizations, and (3) the potential of our approach in general.

Comments about the Visualizations: The historians quickly understood the general purpose and functionality of the visualizations. The basic visualization components (the geographic map, the temporal bar chart, the commodity tag cloud, and the commodity graphs) were easily understood at a high level. There was some confusion, however, about lower-level details represented in the visualizations. For instance, the meaning of the size and number of clusters in the map was unclear (e.g., do they represent the number of documents, the number of occurrences of a particular commodity, or the number of commodity mentions?). Some historians tried to drill down further into the visualizations and watch the changes to make sense of these questions; sometimes this strategy clarified things, sometimes it added to the confusion. We gathered all comments and suggestions regarding the visualization design and are currently working on improving the prototype. One important part will be the addition of tooltips and legends to clarify the meaning of the representations.

Insights Gathered from the Visualizations: A large part of the discussions focused on what kind of insights can be gathered from the visualizations and from the data set that we are generating in Trading Consequences. Some historians made the point that what the visualizations really represent is the rhetoric around commodity trading in the 19th century: what is shown is where and when a dialogue about particular commodities took place; the visualizations do not necessarily provide information about the occurrence of commodities in certain locations or the amounts that were traded from one location to another. This raises the question of how we can clarify exactly what the visualizations represent and what kind of data they are based on (e.g., by adding more elaborate legends). One perceived strength of the visualizations that was mentioned is that they provide an overview of the documents from a meta-level, at a scale beyond human reading capacity.

Reactions to our Approach: The historians at CHESS were generally positive about our approach of combining text mining and visualization to help research processes in environmental history, and they clearly saw the potential. There was some skepticism about whether a tool like this can actually produce profound outcomes (e.g., because of the noise in the data), and the stability and performance of the visualization prototype have to be improved to support a fluid “dialogue” with the data. Some historians appreciated the use of visualizations as a visual search engine that can help to identify relevant documents in the corpus. Others suggested adding visualizations that can help to analyze particular patterns in the data (e.g., relations between different commodity terms and how these have changed over time). We are currently working on visualization prototypes that focus on this latter aspect.



Progress to date on Trading Consequences Visualizations

Up here in St Andrews we are in the process of exploring several routes to visualize the vast amount of commodity data that have been extracted from the historical archives by our colleagues from the University of Edinburgh.

Research in environmental history can be an open-ended process where research questions are formed and refined as part of working with the available data (i.e., historic documents). Our goal is therefore the development of visualization concepts that will reveal a range of temporal, geographic and content-related perspectives on the commodity data, and that will highlight different conceptual angles and relations within the data. Such “interlinked” visualization perspectives can provide an overview of the entire dataset and, at the same time, act as probes to explore certain aspects of the commodity data in more detail. Using this approach, we aim to support more open-ended explorations of the commodity data as well as to provide easy access to specific documents of interest.

Our design process so far has been driven by discussions with Jim and Colin, paper sketches to iterate on certain visualization ideas, and some literature research on information visualization and digital humanities.

Discussions with Jim and Colin revealed that the temporal and geographic aspects of the data are central to their research but always in close combination with commodity types and their relations to each other. This resulted in several paper sketches, as you can see below, to explore how these particular aspects could be visually expressed and augmented with interactive features.

We also created (static) computational sketches (shown below) based on samples from the actual database. At the same time, our collaborators from EDINA created an interface to the database that allowed interrogating the data through textual queries and list views.

Both of these approaches allowed us to explore the character of the data and the potential visualization challenges that it introduces.

The implementation of a web-based visualization prototype that combines the ideas from our early design explorations is currently in full swing. This prototype is based on the popular visualization library d3.js. We are closely collaborating with the teams from Toronto and Edinburgh on iterating on its design and implementation.

Moving from the questions and interests of researchers in environmental history to interactive visualizations that support digging into data with fluid, commodity-oriented inquiries is a process of continual refinement and of exploring small and large interaction research questions.