Big Data & Data Science

About Us

The aim of this project is to make AArch64 a first-class citizen in the Big Data, Analytics and Data Science community (e.g., Hadoop and Spark). Big Data and Data Science technologies are vital and have matured, with various production implementations. Linaro drives engineering activities and ARMv8 builds for Apache Ambari, Bigtop, Spark and Hadoop.

Roadmap

 

Engineers

Assignee, ARM
Member Engineer, ARM



IRC Channel:  #linaro-bigdata

Mailing List: leg-bigdata@linaro.org 

Meetings & Calendar

This calendar is displayed in UTC, with no DST offsets.



Current Plan

The following items are on the project backlog but not currently planned. If you are interested in contributing to any of these items, please state your intention on the project's mailing list (listed above).

Work in Progress

Coming soon…

Plan of Record

Coming soon…

Health Checks

Coming soon…

Portal


Documentation

ERP 

Big Data Components

  1. Apache Bigtop

  2. Big Data Core Components

  3. Big Data Operations

  4. Big Data Streaming Tools

  5. Big Data Data warehousing and Database Tools

  6. Big Data Data Governance and Security

    • Apache Ranger

    • Apache Knox

    • Apache Atlas

    • Apache Sentry

  7. Big Data File Formats

    • Apache Parquet

    • Apache Avro

  8. Big Data Datascience Notebooks

    • Jupyter

    • Apache Zeppelin

  9. Big Data Analytics

  10. Big Data ML - Machine Learning

  11. Big Data component dependencies

Tests 

Benchmarking

Build and Port

Machine Learning

Misc

Bigtop


Blogs/Presentations

State of Big Data on Aarch64 - Apache Bigtop

Big Data benchmarking

Himalayan Odyssey


Big Data Roadmap


Strategic Engineering

Big Data and OpenJDK Strategic Engineering - 2018

Big Data and OpenJDK Strategic Engineering - 2017





