Mining Twitter to monitor invasive alien species—An analytical framework and sample information topologies


Online social media are increasingly emerging as important informal information sources that can contribute to the detection of trends and early warnings in critical fields such as public health monitoring or emergency management. In the face of global environmental challenges, the utilization of this information in ecological monitoring contexts has been called for, but examples remain sparse. This can be attributed to the significant technical challenges in processing these data and to concerns about the quality, reliability and applicability of information mined from social media to the ecological domain. Here the strengths and weaknesses of social media mining for ecological monitoring are assessed using the micro-blogging service Twitter and invasive alien species (IAS) monitoring as an example. The assessment is based on a manual analysis of 2842 Tweets sampled from Twitter data with potential direct or descriptive references to IAS impacting forest ecosystems, which was collected over a period of nearly three years. The results are presented as information topologies for Twitter messages of observational and non-observational character for three IAS with distinctive characteristics (Oak Processionary Moth, Emerald Ash Borer, Eastern Grey Squirrel). The results show that the social media channel Twitter is a rich source of primary and secondary observational biodiversity information. It also provides useful insights into the topical landscape of public communications on IAS as well as the public perception of IAS and IAS management. The analysis suggests broad application opportunities in IAS monitoring and management, and points to applications for related environmental questions. The results highlight that social media mining for ecological monitoring needs to be approached with the same best practices as ecological monitoring in general, requiring a good understanding of the monitored subjects and specific monitoring questions.
The challenges in utilizing this information for operational systems are technical rather than conceptual in nature and include extending the degree of automation, especially with regard to image recognition and the automatic provisioning of location information.