Awesome Cassandra 

Cassandra is a free and open-source, distributed, wide column store,
NoSQL database management system designed to handle large amounts of
data across many commodity servers, providing high availability with no
single point of failure. Cassandra is supported by the Apache Software
Foundation and is also known as Apache Cassandra.
This is a curated list of awesome Cassandra packages and
resources. Maintained by Rahul Singh of Anant. Feel free to contact me if you’d like to
collaborate on this and other awesome lists: Awesome Cassandra,
Awesome Solr, Awesome Lucene. This list
powers the Resources section of Cassandra.Link, a rich collection of
blog feeds and curated links presented as a searchable knowledge base.
Contents
General
Cassandra
Cassandra History
Cassandra Use Cases
Cassandra Distributions
Cassandra Compliant Databases on JVM
- DataStax Enterprise - Most
widely used commercial distribution of Cassandra, integrated with Apache
Spark (for SparkSQL, analytics), Apache Solr (for secondary index),
Apache TinkerPop based Graph stored in Cassandra, and OpsCenter.
- DDAC/Luna - DataStax
Distribution of Cassandra, a production-ready distribution with a bulk
loader supported by DataStax. DDAC is now deprecated, but DataStax is
still supporting Cassandra with its new Luna service.
Cassandra Compliant Databases on C++
- ScyllaDB - NoSQL
data store built on the Seastar framework, compatible with Cassandra.
- YugaByte
Database - YugaByteDB is a transactional, high-performance database
for building distributed cloud services. It supports
Cassandra-compatible and Redis-compatible APIs, with PostgreSQL in
Beta.
Cassandra as a Service / Managed Cassandra Based on Open Source Cassandra
- DataStax Astra - Cassandra
as a Service from DataStax, running on the Kubernetes operator for
Cassandra and available on AWS and GCP.
- IBM Cloud
Databases for DataStax - IBM Cloud Managed Service for DataStax
Enterprise.
- Instaclustr
Managed Cassandra as a Service - Instaclustr provides a fully
managed, SOC 2 certified hosted service for Cassandra®
on AWS, Azure, GCP and IBM Cloud.
- Aiven for Cassandra -
Aiven for Cassandra is a managed and hosted distributed NoSQL database
providing scalability, high availability, and excellent fault tolerance.
Cassandra as a Service is available on Google Cloud Platform, Amazon Web
Services, Microsoft Azure, DigitalOcean, and UpCloud.
- Microsoft
Azure Managed Instance for Cassandra - Azure Managed Instance for
Cassandra provides automated deployment and scaling operations for
managed open-source Cassandra datacenters. It accelerates hybrid
scenarios and reduces ongoing maintenance.
Cassandra as a Service / Managed Cassandra Based on Proprietary Technology
Using Cassandra
Cassandra from Relational
Cassandra Data Modeling
Cassandra Architecture
Cassandra Monitoring
Cassandra Maintenance
Cassandra Security
Cassandra Deployment
Cassandra Deployment on Docker / Containerized Cassandra
Cassandra Deployment on Kubernetes / Kubernetized Cassandra
- K8ssandra.io - Kubernetes +
Cassandra - K8ssandra provides a production-ready platform for
running Cassandra on Kubernetes. This includes automation for
operational tasks such as repairs, backups, and monitoring.
- DataStax -
Cassandra Kubernetes Operator - DataStax’s Cassandra Kubernetes
Operator, which supports DataStax as well as open source Cassandra
containers on Kubernetes.
- Instaclustr -
Kubernetes Operator for Cassandra - The Cassandra operator manages
Cassandra clusters deployed to Kubernetes and automates tasks related to
operating a Cassandra cluster.
- Sky UK -
Cassandra Kubernetes Operator - Kubernetes operator that manages
Cassandra clusters inside Kubernetes. Well designed and organized.
- CassKop
- Cassandra operator for Kubernetes - Kubernetes operator that automates
Cassandra operations such as deploying a new rack-aware cluster,
adding/removing nodes, configuring Cassandra and JVM parameters, and
upgrading JVM and Cassandra versions. Written in Go.
- Strapdata
- Elassandra Operator for Kubernetes - The Elassandra Kubernetes
Operator automates the deployment and management of Elassandra clusters
deployed in multiple Kubernetes clusters.
- Rook.io -
Cassandra on Kubernetes - Rook is an open source cloud-native
storage orchestrator, providing the platform, framework, and support for
a diverse set of storage solutions to natively integrate with
cloud-native environments. They have a special operator for Cassandra
amongst other providers.
- KUDO
Cassandra Operator - The KUDO Cassandra Operator makes it easy to
deploy and manage Cassandra on Kubernetes.
Integrating with Cassandra
.NET and Cassandra
Spark
- DataStax
Spark Cassandra Connector - Library that lets you expose Cassandra
tables as Spark RDDs, write Spark RDDs to Cassandra tables, and execute
arbitrary CQL queries in your Spark applications.
- sample
Spark Job Server Cassandra - Simple sample job illustrating the use
of Spark Jobserver to execute Apache Spark analytics with
Cassandra.
- fluxcapacitor/pipeline
- End-to-End, Real-time, Advanced Analytics Big Data Reference Pipeline
using Spark, Spark SQL, Spark ML, GraphX, Spark Streaming, Kafka, NiFi,
Cassandra, ElasticSearch, Redis, Tachyon, HDFS, Zeppelin,
iPython/Jupyter Notebook, Tableau, Twitter Algebird.
- Spark +
Cassandra Best Practices - Outlines general use cases and best
practices of Spark & Cassandra together.
Search / Secondary Indexes
Databases
Timeseries Databases
Monitoring / Metrics
- cortexproject/cortex
- Horizontally scalable, highly available, multi-tenant, long term
Prometheus storage.
- filodb/FiloDB -
Distributed Prometheus time-series database compatible with Prometheus
queries.
- cybem/cyanite-iow
- Cassandra backed Carbon daemon and metric web service. IPONWEB
repository, compatible with Carbon.
Custom Time Series
Graph
Miscellaneous
Packages
Libraries
- express-cassandra
- Cassandra ORM/ODM/OGM for Node.js with optional support for Elassandra
& JanusGraph.
- DataStax Java
Driver - Java client driver for Cassandra.
- DataStax C++
Driver - Modern, feature-rich, and highly tunable C/C++ client
library for Cassandra (1.2+) and DataStax Enterprise (3.1+) using
exclusively Cassandra’s native protocol and Cassandra Query Language
v3.
- DataStax Python
Driver - Modern, feature-rich and highly tunable Python client
library for Cassandra (2.1+) using exclusively Cassandra’s binary
protocol and Cassandra Query Language v3.
- DataStax Ruby
Driver - Ruby client driver for Cassandra. This driver works
exclusively with the Cassandra Query Language version 3 (CQL3) and
Cassandra’s native protocol.
- DataStax Node.js
Driver - Modern, feature-rich and highly tunable Node.js client
library for Cassandra (1.2+) and DataStax Enterprise (3.1+) using
exclusively Cassandra’s binary protocol and Cassandra Query Language
v3.
- DataStax C#
Driver - Modern, feature-rich and highly tunable C# client library
for Cassandra (1.2+) and DataStax Enterprise (3.1+) using exclusively
Cassandra’s binary protocol and Cassandra Query Language v3.
- DataStax PHP
Driver - DataStax PHP Driver for Cassandra.
- Achilles -
Achilles is an open source persistence manager for Cassandra, with
features such as advanced bean mapping (compound primary key, composite
partition key, timeUUID, etc.), native collections and map support, and
more.
- phpcassa - PHP
client library for Cassandra.
- Caffinitas
- Caffinitas is an advanced object mapper for Cassandra which has been
especially designed to work with Datastax Java Driver 2.1+ against
Cassandra 2.1, 2.0 or 1.2.
- Spring
Data for Cassandra - Spring Data for Cassandra offers a familiar
interface to those who have used other Spring Data modules in the
past.
- gocql - Package gocql
implements a fast and robust Cassandra client for the Go programming
language.
- Hackolade - Visual data modeling
tool for NoSQL databases and structures like Cassandra, ElasticSearch,
Graph DBs, JSON, APIs.
- JetBrains DataGrip DB
IDE - The Cross-Platform IDE for Databases & SQL by JetBrains,
with support for Cassandra.
- DataStax
- Management API for Cassandra - The Management API is a sidecar
service layer that attempts to build a well supported set of operational
actions on Cassandra® nodes that can be administered centrally.
- DataStax
OpsCenter - Simplified management for DataStax Enterprise and
Cassandra database clusters.
- CassandraCAS -
Compare-and-swap tool for Cassandra created by Datomic.
- Peloton - Unified
resource scheduler created by Uber. This tool can handle many nodes and
clusters through resource management and scalability.
- Ansible-Galaxy:
Cassandra GitHub - Collection called cassandra that aims to
provide the Ansible modules needed to interact with Cassandra.
- Ansible-Galaxy:
Cassandra - Documentation for Ansible-Galaxy: Cassandra.
- Ansible-dse
- Set of Ansible playbooks that build a DataStax Enterprise
cluster.
- dseansible -
DSE Installation and Upgrade Ansible Playbooks/Roles for Ubuntu
Linux.
- DbSchema -
Cassandra Designer - DbSchema: diagram designer & GUI
admin tool that supports Cassandra among other databases.
- DBeaver - Free Universal Database
Tool - Third party tool for dealing with all sorts of databases
including Cassandra.
- RazorSQL - Multi DB Manager Tool
- Multi-db tool for Linux, Mac, and Windows that works with
Cassandra.
- Cassandra Reaper -
Automated repairs for Cassandra. Supports all versions.
- cstar perf -
Cassandra performance testing platform.
- Spark
Cassandra Stress - Tool for testing the DataStax Spark Connector
against Cassandra or DSE.
- cqlmigrate -
Cassandra CQL migration tool. cqlmigrate is a library for performing
schema migrations on a Cassandra cluster.
- cassandra-migration-tool-java
- Cassandra migration tool for Java is a lightweight tool used to
execute schema and data migrations on a Cassandra database.
- Cassalog -
Cassalog is a schema change management library and tool for Cassandra
that can be used with applications running on the JVM.
- cdeploy -
Cdeploy is a simple tool to manage your Cassandra schema migrations in
the style of dbdeploy.
- Web: Cassandra
Calculator - Simple calculator to see how size / replication factor
affect the system’s consistency.
- Cassandra-web -
Web interface for Cassandra.
- CassandraRestfulAPI
- The CassandraRestfulAPI project exposes Cassandra data tables through
a RESTful API.
- Netflix: Staash -
Language-agnostic as well as storage-agnostic web interface for storing
data into persistent storage systems. The metadata layer abstracts a lot
of storage details, and the pattern automation APIs take care of
automating common data access patterns.
- cql-vim - Cassandra
CQL Syntax Highlighter for Vim.
- Presto - Distributed SQL Query
Engine for Big Data. Presto allows querying data where it lives,
including Hive, Cassandra, relational databases or even proprietary data
stores.
- SSTable
Tools - Toolkit for parsing, creating and doing other fun stuff with
Cassandra 3.x SSTables.
- Cassandra-Exporter
- Simple Tool to Export / Import Cassandra Tables into JSON.
- Cassandra
SSTable Tools - Multiple different tools combined into one that
helps admins get summaries, metadata, partition info, cell info.
- Cassandra-Client
- Simple GUI tool for browsing tables and data in Cassandra.
- CQL Data
Modeler - Very useful tool to test out a CQL schema and visualize
what the partition would look like in relation to the columns and
rows.
- Cassandra
Snapshot Backup - Quick and easy way to snapshot files in a
Cassandra database and back them up using Ansible.
- Slothsandra
- Integration for Cassandra with the Slack app, which stores old
messages that Slack no longer does itself.
- sandraREST -
Cassandra manager with a web UI for RESTful APIs.
- Cassandra
Leadership - Library to help elect leaders using Cassandra. Uses
Paxos to build a leadership election module.
- Terraform
Cassandra - Terraform module that creates a Cassandra cluster.
- Datadog
- Third party tool that allows monitoring and metrics for Cassandra
nodes and clusters.
- tlp-cluster -
Provisioning tool for Cassandra designed for developers looking to
benchmark and test Cassandra. It assists with builds and starting
instances on AWS.
- Helenos - Free web-based
environment that simplifies data exploration & schema
management with a Cassandra database.
- ValuStor -
ValuStor is a key-value pair database solution.
- Cassandra-Migration
- Cassandra / DataStax Enterprise database migration (schema evolution)
library.
- JanusGraph-Utils -
Tool to develop a graph database app.
- Scylla-Migrator -
Migrates data extracted using Spark to Scylla, normally from
Cassandra.
- Cassandra
CA Manager - Create and sign Java keystores.
- Zipkin -
Distributed tracing system.
- Instaclustr
Kerberos plugin - GSSAPI authentication provider for Cassandra.
- Instaclustr
Java Driver for Kerberos - GSSAPI authentication provider for the
Cassandra Java driver.
- Instaclustr
Minotaur - Command line tool for consistent rebuilding of a
Cassandra cluster.
- Instaclustr
TTL Remover - Command line tool for rewriting SSTables to remove
TTLs.
- Instaclustr
SSTable Generator - CLI tool for programmatic generation of
Cassandra SSTables.
- Instaclustr
Exporter - Java agent that exports Cassandra metrics to
Prometheus.
- Instaclustr
Go Client for Instaclustr Icarus - Go client for Instaclustr Icarus
sidecar.
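
The Cassandra Calculator entry above relates cluster size and replication factor to consistency. As a rough illustration of the rule such calculators are usually built on (a read is guaranteed to see the latest write when the number of replicas written plus the number of replicas read exceeds the replication factor), here is a minimal Python sketch. The function names are hypothetical and are not taken from any tool in this list.

```python
# Minimal sketch of the replication-factor arithmetic; all names here are
# illustrative assumptions, not part of any Cassandra tool or driver.

def quorum(replication_factor: int) -> int:
    """Number of replicas that make up a QUORUM for a given replication factor."""
    return replication_factor // 2 + 1


def is_strongly_consistent(replication_factor: int,
                           write_replicas: int,
                           read_replicas: int) -> bool:
    """True when every read set must overlap the latest write set (W + R > RF)."""
    return write_replicas + read_replicas > replication_factor


def tolerated_replica_failures(replication_factor: int,
                               replicas_required: int) -> int:
    """How many replicas can be down while a request needing
    `replicas_required` acknowledgements can still succeed."""
    return replication_factor - replicas_required


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    print("quorum:", q)
    print("QUORUM/QUORUM strong:", is_strongly_consistent(rf, q, q))
    print("ONE/ONE strong:", is_strongly_consistent(rf, 1, 1))
    print("failures tolerated at QUORUM:", tolerated_replica_failures(rf, q))
```

With a replication factor of 3, reading and writing at QUORUM overlaps on at least one replica, while reading and writing at ONE does not.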
Open Source Applications
- Twissandra -
Twissandra is an example project, created to learn and demonstrate how
to use Cassandra. Running the project will present a website that has
similar functionality to Twitter.
- ChronoServer -
Test server for sampling how long it takes mobile & web clients to
make various types of requests to a server doing common request
patterns.
- Cassandra
Cluster Admin - Cassandra Cluster Admin is a GUI tool to help people
administrate their Cassandra cluster.
- Cassandra-Tools
- Python Fabric scripts to help automate the launching and managing of
cluster testing on AWS.
- Cassandra
Opstools - Generic scripts to review and monitor Cassandra, from
Spotify.
- CCM (Cassandra Cluster
Manager) - Script/library to create, launch and remove a Cassandra
cluster on localhost.
- Netflix-Priam -
Co-Process for backup/recovery, Token Management, and Centralized
Configuration management for Cassandra.
- CStar - Cassandra
cluster orchestration tool for the command line.
- CMB - Highly available,
horizontally scalable queuing and notification service compatible with
AWS SQS and SNS.
- CassieQ -
Distributed queue built off of Cassandra.
- Cherami - Distributed,
scalable, durable, and highly available message queue system.
- Scheduler -
Scala library for scheduling arbitrary code to run at an arbitrary
time.
Logging / Metrics
Resources
Documentation
Books
Courses
Communities
Blogs
- DataStax - DataStax,
Inc. is a data management company that provides commercial support,
software, and cloud database-as-a-service based on Cassandra.
- Codecentric:
Cassandra - Codecentric is an IT consulting company; these are its
blog posts on Cassandra.
- Pythian:
Cassandra - Pythian provides data and cloud-related services. The
company provides services for Oracle, SQL Server, MySQL, Hadoop,
Cassandra and other databases and their supporting infrastructure.
- Instaclustr -
Managed and supported open source solutions for Cassandra, Kafka,
Elasticsearch & Redis.
- OpenCredo:Cassandra -
OpenCredo is a consulting company that helps clients make informed
decisions around cloud native and open source technologies, as well as
public cloud services.
- DOAN DuyHai’s Blog:
Cassandra - DuyHai Doan is a freelance big data and cloud architect
who values sharing knowledge and contributing to the technology
community.
- Amy Tobert - Amy Tobert is a
full-stack engineer & leader with a passion for sustainable systems
and people-centered leadership. Her blog details different Cassandra
deployments among other topics.
- Christopher Batey:
Cassandra - Christopher Batey has been a software engineer for over 15
years and is a primary contributor to Akka and an occasional contributor to
Cassandra.
- Distributed
Bytes: Cassandra - Tim Ojo is the creator of Distributed Bytes and a
software engineer at Capital One. This is a collection of his posts
on Cassandra.
- The Netflix Tech
Blog - Learn about Netflix’s world class engineering efforts,
company culture, product developments and more.
- Spotify
R&D / Engineering Blog : Cassandra - Cassandra related posts on
Spotify’s official technology blog.
- Ryan Svilha -
Ryan Svilha is a principal engineer at DataStax. His blog posts cover
topics surrounding Cassandra and associated tools.
- Anant - Anant builds and
manages business platforms that connect customer experiences
and information systems with real-time data platforms.
Videos
- Best Practices
for Running Cassandra on AWS - Joint webinar between Amazon Web
Services (AWS) and Stackdriver, an AWS Technology Partner, covering best
practices for storing, analyzing and managing queries that
equate to over 1 billion measurements a day.
- Monitoring
Cassandra: Don’t Miss a Thing (Alain Rodriguez, The Last Pickle) | C*
Summit 2016 - Talk given by Alain Rodriguez, Consultant at The Last
Pickle, discussing what to monitor in Cassandra, how, and why.
- Tuning
the Spark Cassandra Connector - Great talk by Russell Spitzer,
maintainer of the Spark Cassandra Connector.
- Cassandra.Lunch -
Collection of all past Cassandra.Lunch webinars including videos,
slides, and blog posts on all things Cassandra.
- Working with
.NET and Cassandra/DataStax Enterprise - Getting started with a C# .NET Core
application that works against a Cassandra or DSE database.
Slides