SeaBee Data Platform – Satisfying sense of progress

The data storage and sharing component of the SeaBee infrastructure (the SeaBee Data Platform) is where we design and implement cloud-based data storage and data-sharing applications (based on GIS visualisation tools).

It gives scientists and other experts in the SeaBee team, or working with it, one place to upload, process and share data collected from drone missions. All datasets on GeoNode, one part of the data platform, are publicly available and exposed as Web Map Services (WMS), so others can add them to their own maps (for example, in GIS software).
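
For readers who have not used a WMS before, the sketch below shows roughly what a client does when it adds such a layer to a map: it builds a GetMap request URL from the standard WMS 1.3.0 parameters. The endpoint URL, layer name and bounding box are hypothetical placeholders, not the platform's real values; the actual ones would come from the GeoNode page of each dataset.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width=800, height=600, crs="EPSG:3857"):
    """Build a WMS 1.3.0 GetMap URL for a single layer.

    bbox is (minx, miny, maxx, maxy) in the units of `crs`.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value asks the server for the default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
        "TRANSPARENT": "TRUE",
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, layer name and extent, for illustration only.
url = build_getmap_url(
    "https://geonode.example.org/geoserver/ows",
    "geonode:example_mission",
    (1180000, 8370000, 1190000, 8380000),
)
print(url)
```

In practice most GIS applications only need the service's base URL; they request the list of available layers themselves and build URLs like this one behind the scenes.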

“We’re building an Open Source, centralised platform for efficiently processing data from many types of drones. We’re also making a big effort to build upon existing research infrastructure (NIRD) and to keep everything as open as possible.”

– James, leader of the team developing this component. 

The SeaBee team working on this overcame several challenges, from the pandemic to adapting new technologies to meet user needs. All the basic components of the SeaBee Data Platform are now in place, which is crucial to the success of this phase.

“Getting this to work is one of the biggest challenges, but we are working towards a bigger contribution to the Norwegian research community by pushing what’s possible using national infrastructure.”

– James, leader of the team developing this component. 

SeaBee Data Platform components
  • Data storage (using MinIO to store the data collected) 
  • Upload interfaces (integrated with Drone Logbook to extract metadata for drone missions) 
  • Four state-of-the-art workflows: 
    • Orthorectification (using Open Drone Map) 
    • Annotation (using ArcGIS Pro/Image analyst) 
    • Machine learning (with Norges Regnesentral, using Convolutional Neural Networks and PyTorch) 
    • Publishing (using GeoServer and visualised by GeoNode) 
  • Documentation (stored on GitHub) 
Simplified workflow showing the main parts of the SeaBee Data Platform (hosted by Sigma2, the national e-infrastructure for data science in Norway).
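
As a rough illustration of how the four workflows relate to the tools named above, they can be written down as a small data structure. This is only a sketch: the class and function names are hypothetical, and treating the workflows as numbered steps in the order listed is an assumption, not a description of the platform's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    name: str
    tool: str


# Workflows and tools as named in the component list; the ordering is assumed.
WORKFLOWS = (
    Workflow("Orthorectification", "Open Drone Map"),
    Workflow("Annotation", "ArcGIS Pro/Image analyst"),
    Workflow("Machine learning", "PyTorch (Convolutional Neural Networks)"),
    Workflow("Publishing", "GeoServer, visualised by GeoNode"),
)


def describe(workflows):
    """Return one numbered summary line per workflow."""
    return [f"{i}. {w.name}: {w.tool}" for i, w in enumerate(workflows, start=1)]


for line in describe(WORKFLOWS):
    print(line)
```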

All the code is Open Source and available on GitHub. The next steps will focus on standardising workflows and processing the backlog of data, as well as increasing automation (towards a production environment).  

James Sample (NIVA) has recently stepped up to lead the development of this component, taking over from Kristoffer Kalbekken (NIVA). Speaking about recent developments on the SeaBee Data Platform, James says:

“Over the past two months the SeaBee Team, supported by Sigma2, has worked really hard to get all the core components of the platform in place. It’s exciting to see what can be achieved using the national e-infrastructure and it’s great being able to process mission data more efficiently… The sense of progress is satisfying”.

Screenshot of the GeoNode section of the SeaBee Data Platform (5th March 2023), showing the toggle swipe functionality on data from a recent SeaBee mission on the coast of Norway.

The SeaBee Data Platform is hosted by Sigma2 on NIRD (the National Infrastructure for Research Data). It currently uses the following resources, which are shared between the different platform components:

  • 64 Central Processing Units (CPUs) 
  • 2 Graphics Processing Units (GPUs; NVIDIA Tesla V100-SXM2-16GB)  
  • 200 GB memory  
  • 10 TB storage (~5 TB used)  