Quickstart: Apache Spark on Kubernetes

Use the Apache Spark Operator on Kubernetes to streamline your Big Data workflows with a cloud-native approach, without relying on a Hadoop cluster.

ReclameAQUI Data Lake

A containerized Data Lake running on GCP, using Kubernetes (GKE) to orchestrate Apache ecosystem components, with GCS for data storage and BigQuery as the analytical interface. Governance and security are fully implemented using existing Google Suite groups and users through LDAP, giving stakeholders full autonomy to consume data from the Lake (with auditing).
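
To make the operator-based approach a little more concrete, here is a minimal sketch of what a Spark job looks like when it is described as a Kubernetes resource instead of being submitted to a Hadoop cluster. It builds a `SparkApplication` manifest (the custom resource the Spark Operator watches) as a plain Python dict and prints it as JSON. The helper name `build_spark_application`, the image name, the `spark` service account, the namespace, and the resource sizes are all assumptions chosen for illustration; they are not taken from the setup described above.

```python
import json


def build_spark_application(name, image, main_file, namespace="default", executors=2):
    """Return a SparkApplication manifest as a dict (hypothetical helper)."""
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "Python",
            "mode": "cluster",
            # Assumed image name; any image that bundles Spark and the job would do.
            "image": image,
            "mainApplicationFile": main_file,
            "sparkVersion": "3.1.1",
            "restartPolicy": {"type": "Never"},
            # Assumed service account with permission to create executor pods.
            "driver": {"cores": 1, "memory": "512m", "serviceAccount": "spark"},
            "executor": {"cores": 1, "instances": executors, "memory": "512m"},
        },
    }


manifest = build_spark_application(
    name="pyspark-pi",
    image="example.com/spark-py:3.1.1",  # placeholder, not a real registry path
    main_file="local:///opt/spark/examples/src/main/python/pi.py",
)

# JSON is accepted wherever a YAML manifest is, so this output can be saved to a file.
print(json.dumps(manifest, indent=2))
```

Assuming the operator is already installed in the cluster, the printed JSON could be saved to a file and applied with `kubectl apply -f`, after which the operator creates the driver and executor pods. A job stored in GCS would point `mainApplicationFile` at a `gs://` path instead, which additionally assumes the image ships the GCS connector.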