Polly: A Tool for Rapid Data Integration and Analysis in Support of Agricultural Research and Education
Data analysis and modeling are complex and demanding tasks. While a variety of software tools exist to cope with this problem and tame big data operations, most of them are either not free or, when they are, require a large amount of configuration and have a steep learning curve. Moreover, they provide limited functionality. In this paper we propose Polly, an open-source online tool for data analysis and modeling that is intuitive to use and requires minimal or no configuration. Users can use Polly to rapidly integrate and analyze their data, and to prototype and test novel methodologies. Polly can also be used as an educational tool. Users can upload their structured data or connect to their structured data sources, load the required data into our system, and perform various data processing tasks. Examples of such operations include data cleaning, data pre-processing, attribute encoding, and regression and classification analysis. Aside from modeling, users can download their results in the form of graphs in several standard visualization formats. While in this paper we focus on analyzing datasets for smart farming, the tool is suited to a more general audience. To justify our backend design and implementation choices, we also present a performance comparison of backend virtualization technologies (containers and serverless computing), showing both expected and surprising results.