Global Flood Evacuation Big Data System: From Local to Worldwide Planning
Project Overview
We developed a scalable evacuation planning system for global flood scenarios. The project evolved through three phases: it began with a local focus on the Netherlands, scaled up to worldwide coverage, and then added real-time updates, using big data technologies for processing and analysis throughout.
Key Technologies
- Apache Spark: For distributed data processing and analysis
- Amazon Web Services (AWS): For scalable cloud computing resources
- Apache Kafka: For real-time data streaming and processing
- H3 Geospatial Indexing: For efficient geospatial calculations
- OpenStreetMap Data: For geographical and population information
- Parquet and ORC File Formats: For efficient big data storage and retrieval
Development Process
- Phase 1: Netherlands Evacuation Planning
  - Implemented initial evacuation algorithm using Apache Spark
  - Processed OpenStreetMap data for the Netherlands
  - Utilized H3 indexing for geospatial calculations
  - Achieved efficient data processing for a single country
- Phase 2: Scaling to Global Evacuation Planning
  - Migrated to AWS for increased computational power
  - Optimized data processing for planetary-scale datasets
  - Implemented precomputation techniques for elevation data
  - Achieved significant performance improvements, reducing processing time from hours to minutes
- Phase 3: Real-time Evacuation Updates
  - Integrated Apache Kafka for real-time data streaming
  - Developed a stateful stream processing application
  - Implemented time-windowed aggregations for dynamic updates
  - Created a system capable of handling continuous, global-scale updates
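
The Phase 2 precomputation idea can be illustrated with a minimal sketch. This is not the project's Spark code: it is plain Python, the `cell_id` function is a hypothetical square-grid stand-in for a real H3 index, and the sample points and the `flooded_cells` helper are made up for illustration. It only shows the general pattern of averaging elevation per cell once, so that later flood queries become cheap lookups.

```python
import math
from collections import defaultdict


def cell_id(lat, lon, cell_size_deg=1.0):
    """Hypothetical stand-in for an H3 index: a square lat/lon grid cell."""
    return (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))


def precompute_mean_elevation(samples, cell_size_deg=1.0):
    """Average (lat, lon, elevation_m) samples per cell, once, up front."""
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum of elevations, count]
    for lat, lon, elevation_m in samples:
        acc = totals[cell_id(lat, lon, cell_size_deg)]
        acc[0] += elevation_m
        acc[1] += 1
    return {cell: total / count for cell, (total, count) in totals.items()}


def flooded_cells(mean_elevation, water_level_m):
    """Cells whose precomputed mean elevation is at or below the water level."""
    return {cell for cell, elev in mean_elevation.items() if elev <= water_level_m}


# Made-up sample points, for illustration only.
samples = [
    (52.37, 4.90, -2.0),
    (52.10, 4.30, 1.0),
    (50.85, 5.69, 50.0),
]
table = precompute_mean_elevation(samples)
below_water = flooded_cells(table, water_level_m=0.0)
```

Once `table` exists, each flood scenario only scans the precomputed averages instead of reprocessing raw elevation data, which is the kind of saving the precomputation step is meant to provide.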
Key Innovations
Planetary Elevation Precomputation: Significantly reduced processing time by precomputing average elevations for H3 indices globally.
Scalable Cluster Configuration: Optimized AWS cluster configurations for both precomputed and non-precomputed scenarios, balancing performance and cost.
Real-time Update System: Developed a Kafka-based system capable of processing global-scale evacuation updates in real time, with configurable time windows.
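
As a rough illustration of time-windowed aggregation with a configurable window, the sketch below groups update events into fixed, non-overlapping (tumbling) windows. It is plain Python rather than Kafka code, and the tumbling-window choice, the event shape `(timestamp_s, region, delta)`, and all names are assumptions made for this example, not details taken from the project.

```python
from collections import defaultdict


def window_start(timestamp_s, window_s):
    """Start of the tumbling window that contains timestamp_s."""
    return timestamp_s - (timestamp_s % window_s)


def aggregate_by_window(events, window_s=60):
    """Sum deltas per (window start, region).

    events: iterable of (timestamp_s, region, delta) tuples (assumed shape).
    The dictionary plays the role of the per-key state that a stateful
    stream processor would keep between events.
    """
    totals = defaultdict(int)
    for timestamp_s, region, delta in events:
        totals[(window_start(timestamp_s, window_s), region)] += delta
    return dict(totals)


# Made-up events, for illustration only.
events = [
    (0, "NL", 100),
    (30, "NL", 50),
    (59, "BE", 20),
    (60, "NL", 10),
    (125, "NL", 5),
]
result = aggregate_by_window(events, window_s=60)
```

Changing `window_s` changes how coarsely updates are batched: a shorter window gives fresher aggregates at the cost of more state entries, while a longer one smooths out bursts.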
Performance Highlights
- Reduced global evacuation plan computation from hours to under 5 minutes
- Achieved real-time updates for global refugee movements
- Optimized cost-efficiency, reducing AWS expenses while maintaining performance
Future Work
Potential areas for future development include:
- Integration with real-time weather and flood prediction models
- Development of a user-facing application for public evacuation guidance
- Further optimization of data processing pipelines for even faster global-scale computations
Detailed Reports
For in-depth technical information and project progression, please refer to our series of reports: