Scalable Big Data Processing for Intelligent Fruit Classification

This project develops a Big Data processing pipeline for "Fruits!", an AgriTech startup building intelligent fruit-picking robots. The startup's first goal is to launch a mobile application that lets users photograph a fruit and receive detailed information about it, raising awareness of fruit biodiversity. The app also serves as a first step toward a fruit image classification engine. The pipeline runs in a cloud-based Big Data environment, using AWS EMR and PySpark to process fruit images and their labels so that it can scale as data volumes grow. The focus is on an efficient data processing architecture, GDPR compliance, and controlled cloud costs.
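
As a rough illustration of the "images and labels" step, the sketch below derives a label from an image path. It assumes a hypothetical layout in which each image sits in a folder named after its fruit (for example `.../Apple/img_001.jpg`); the bucket name, folder names, and the `label_from_path` helper are illustrative and not taken from the project.

```python
from pathlib import PurePosixPath


def label_from_path(path: str) -> str:
    """Return the parent folder name of an image path, used here as its label.

    Assumption: images are stored as <root>/<label>/<file>, which the
    project description does not confirm.
    """
    return PurePosixPath(path).parent.name


# Hypothetical object paths, for illustration only.
paths = [
    "s3://example-bucket/fruits/Apple/img_001.jpg",
    "s3://example-bucket/fruits/Banana/img_002.jpg",
]
labels = [label_from_path(p) for p in paths]
```

In a PySpark job, a function like this could be wrapped with `pyspark.sql.functions.udf` and applied to the `path` column of a DataFrame loaded with `spark.read.format("binaryFile")`, so that label extraction is distributed across the cluster along with the image data.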