Big Data & Data Science

About Us

The aim of this project is to make AArch64 a first-class citizen in the Big Data, Analytics and Data Science community (e.g., Hadoop and Spark). Big Data and Data Science technologies are vital and have matured, with various production implementations. Linaro drives engineering activities and ARMv8 builds for Apache Ambari, Bigtop, Spark and Hadoop.




Assignee, ARM
Member Engineer, ARM

IRC Channel:  #linaro-bigdata

Mailing List: 

Meetings & Calendar

This calendar is displayed in the UTC time zone, with no DST offset.

Current Plan

The following items are on the project backlog but are not currently planned. If you are interested in contributing to any of them, please state your intention on the project's mailing list (found above).

Work in Progress

Coming soon…

Plan of Record

Coming soon…

Health Checks

Coming soon…




Big Data Components

  1. Apache Bigtop

  2. Big Data Core Components

  3. Big Data Operations

  4. Big Data Streaming Tools

  5. Big Data Data warehousing and Database Tools

  6. Big Data Data Governance and Security

    • Apache Ranger

    • Apache Knox

    • Apache Atlas

    • Apache Sentry

  7. Big Data File Formats

    • Apache Parquet

    • Apache Avro

  8. Big Data Data Science Notebooks

    • Jupyter

    • Apache Zeppelin

  9. Big Data Analytics

  10. Big Data ML - Machine Learning

  11. Big Data component dependencies



Build and Port

Machine Learning




State of Big Data on AArch64 - Apache Bigtop

Big Data benchmarking

Himalayan Odyssey

Big Data Roadmap

Strategic Engineering

Big Data and OpenJDK Strategic Engineering - 2018

Big Data and OpenJDK Strategic Engineering - 2017
