Polly: A Tool for Rapid Data Integration and Analysis in Support of Agricultural Research and Education

Published Date
March 01, 2020
Type
Journal Article
Authors:
Waqar Muhammad, Flavio Esposito, Maitiniyazi Maimaitijiang, Vasit Sagan, Enrico Bonaiuti

Data analysis and modeling are complex and demanding tasks. While a variety of software tools exist to cope with this problem and tame big data operations, most of them are either not free or, when they are free, require a large amount of configuration and have a steep learning curve. Moreover, they provide limited functionality. In this paper we propose Polly, an open-source online tool for data analysis and modeling that is intuitive to use and requires minimal or no configuration. Users can use Polly to rapidly integrate and analyze their data, and to prototype and test novel methodologies. Polly can also be used as an educational tool. Users can upload or connect to their structured data sources, load the required data into the system, and perform various data processing tasks. Examples of such operations include data cleaning, data pre-processing, attribute encoding, and regression and classification analysis. Aside from modeling, users can download their results as graphs in several standard visualization formats. While in this paper we focus on analyzing datasets for smart farming, the tool is suited to a more general audience. To justify our backend design and implementation choices, we also present a performance comparison between backend virtualization technologies (containers and serverless computing), showing both expected and surprising results.

Citation:
Waqar Muhammad, Flavio Esposito, Maitiniyazi Maimaitijiang, Vasit Sagan, Enrico Bonaiuti. (March 1, 2020). Polly: A Tool for Rapid Data Integration and Analysis in Support of Agricultural Research and Education. Internet of Things, 9.
Keywords:
uav
smart farming
serverless computing
network virtualization
machine learning