Updating conversion, creating readmes

This commit is contained in:
Jonas Zeunert
2024-04-19 23:37:46 +02:00
parent 3619ac710a
commit 08e75b0f0a
635 changed files with 30878 additions and 37344 deletions


@@ -1,4 +1,4 @@
 Awesome Data Engineering !Awesome (https://awesome.re/badge-flat2.svg) (https://github.com/sindresorhus/awesome)
A curated list of awesome things related to Data Engineering.
@@ -35,8 +35,7 @@
 - RQLite (https://github.com/rqlite/rqlite) - Replicated SQLite using the Raft consensus protocol.
 - MySQL (https://www.mysql.com/) - The world's most popular open source database.
- **TiDB** (https://github.com/pingcap/tidb) - TiDB is a distributed NewSQL database compatible with MySQL protocol. 
- **Percona XtraBackup** (https://www.percona.com/software/mysql-database/percona-xtrabackup) - Percona XtraBackup is a free, open source, complete online backup solution for all versions of Percona Server, MySQL® and MariaDB®.
- **mysql_utils** (https://github.com/pinterest/mysql_utils) - Pinterest MySQL Management Tools. 
 - MariaDB (https://mariadb.org/) - An enhanced, drop-in replacement for MySQL.
 - PostgreSQL (https://www.postgresql.org/) - The world's most advanced open source database.
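Tools like Percona XtraBackup above take "hot" backups: they copy a running database without blocking writers. As a minimal sketch of that idea (not XtraBackup itself, which works at MySQL's physical-file level), Python's stdlib exposes SQLite's online backup API:

```python
import sqlite3

# Online ("hot") backup: snapshot a live database while it stays open for use.
# Percona XtraBackup does this for MySQL; SQLite's stdlib backup API shows the
# same idea in miniature.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
src.execute("INSERT INTO t (v) VALUES ('hello')")
src.commit()

dest = sqlite3.connect(":memory:")
src.backup(dest)  # pages are copied incrementally; src remains writable

src.execute("INSERT INTO t (v) VALUES ('after-backup')")  # not in the copy
backup_rows = dest.execute("SELECT v FROM t ORDER BY id").fetchall()
# backup_rows == [('hello',)] -- the snapshot predates the later insert
```

The key property, shared with XtraBackup, is that the source stays fully usable while the copy is taken.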
@@ -52,20 +51,18 @@
 - IonDB (https://github.com/iondbproject/iondb) - A key-value store for microcontroller and IoT applications.
- Column
 - Cassandra (https://cassandra.apache.org/) - The right choice when you need scalability and high availability without compromising performance.
- **Cassandra Calculator** (https://www.ecyrd.com/cassandracalculator/) - This simple form allows you to try out different values for your Apache Cassandra cluster and see what the impact is for your application.
- **CCM** (https://github.com/pcmanus/ccm) - A script to easily create and destroy an Apache Cassandra cluster on localhost. 
- **ScyllaDB** (https://github.com/scylladb/scylla) - NoSQL data store using the seastar framework, compatible with Apache Cassandra. 
 - HBase (https://hbase.apache.org/) - The Hadoop database, a distributed, scalable, big data store.
 - AWS Redshift (https://aws.amazon.com/redshift/) - A fast, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools.
 - FiloDB (https://github.com/filodb/FiloDB) - Distributed. Columnar. Versioned. Streaming. SQL.
 - Vertica (https://www.vertica.com) - Distributed, MPP columnar database with extensive analytics SQL.
 - ClickHouse (https://clickhouse.tech) - Distributed columnar DBMS for OLAP. SQL.
- Document
 - MongoDB (https://www.mongodb.com) - An open-source, document database designed for ease of development and scaling.
- **Percona Server for MongoDB** (https://www.percona.com/software/mongo-database/percona-server-for-mongodb) - Percona Server for MongoDB® is a free, enhanced, fully compatible, open source, drop-in replacement for the MongoDB® Community Edition that includes enterprise-grade features and functionality.
- **MemDB** (https://github.com/rain1017/memdb) - Distributed Transactional In-Memory Database (based on MongoDB). 
 - Elasticsearch (https://www.elastic.co/) - Search & Analyze Data in Real Time.
 - Couchbase (https://www.couchbase.com/) - The highest performing NoSQL distributed database.
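The arithmetic behind tools like the Cassandra Calculator is small: for a replication factor RF, a QUORUM read or write must reach a majority of replicas, which also bounds how many replicas can be down. A sketch of that calculation:

```python
# For a given replication factor (RF), Cassandra's QUORUM consistency level
# requires a majority of replicas: floor(RF / 2) + 1.
def quorum(replication_factor: int) -> int:
    return replication_factor // 2 + 1

def tolerated_failures(replication_factor: int) -> int:
    # Replicas that can be down while QUORUM operations still succeed.
    return replication_factor - quorum(replication_factor)

for rf in (1, 3, 5):
    print(rf, quorum(rf), tolerated_failures(rf))
# RF=3 -> quorum 2, tolerates 1 replica down; RF=5 -> quorum 3, tolerates 2
```

This is why RF=3 is the common default: it is the smallest setting where QUORUM survives a single replica failure.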
@@ -89,25 +86,23 @@
 - Heroic (https://github.com/spotify/heroic) - A scalable time series database based on Cassandra and Elasticsearch, by Spotify.
 - Druid (https://github.com/apache/incubator-druid) - Column oriented distributed data store ideal for powering interactive applications.
 - Riak-TS (https://basho.com/products/riak-ts/) - Riak TS is the only enterprise-grade NoSQL time series database optimized specifically for IoT and Time Series data.
 - Akumuli (https://github.com/akumuli/Akumuli) - Akumuli is a numeric time-series database. It can be used to capture, store and process time-series data in real-time. The word "akumuli" can be translated from Esperanto as "accumulate".
 - Rhombus (https://github.com/Pardot/Rhombus) - A time-series object store for Cassandra that handles all the complexity of building wide row indexes.
 - Dalmatiner DB (https://github.com/dalmatinerdb/dalmatinerdb) - Fast distributed metrics database.
 - Blueflood (https://github.com/rackerlabs/blueflood) - A distributed system designed to ingest and process time series data.
 - Timely (https://github.com/NationalSecurityAgency/timely) - Timely is a time series database application that provides secure access to time series data based on Accumulo and Grafana.
- Other
 - Tarantool (https://github.com/tarantool/tarantool/) - Tarantool is an in-memory database and application server.
 - GreenPlum (https://github.com/greenplum-db/gpdb) - The Greenplum Database (GPDB) - An advanced, fully featured, open source data warehouse. It provides powerful and rapid analytics on petabyte scale data volumes.
 - cayley (https://github.com/cayleygraph/cayley) - An open-source graph database, by Google.
 - Snappydata (https://github.com/SnappyDataInc/snappydata) - SnappyData: OLTP + OLAP Database built on Apache Spark.
 - TimescaleDB (https://www.timescale.com/) - Built as an extension on top of PostgreSQL, TimescaleDB is a time-series SQL database providing fast analytics, scalability, with automated data management on a proven storage engine.
Data Comparison
- datacompy (https://github.com/capitalone/datacompy) - DataComPy is a Python library that facilitates the comparison of two DataFrames in pandas, Polars, Spark and more. The library goes beyond basic equality checks by providing detailed insights into discrepancies at both row and column levels.
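To make the datacompy entry concrete, here is a toy version of the kind of report it produces: join two tables on a key and classify rows as missing on one side or mismatched. This is plain Python over lists of dicts (not datacompy's actual API, which operates on DataFrames):

```python
# Toy data comparison in the spirit of DataComPy: join on a key column and
# report rows only on one side, plus keys whose values differ.
def compare(base, compare_to, key):
    base_idx = {row[key]: row for row in base}
    comp_idx = {row[key]: row for row in compare_to}
    return {
        "only_base": sorted(set(base_idx) - set(comp_idx)),
        "only_compare": sorted(set(comp_idx) - set(base_idx)),
        "mismatched": sorted(
            k for k in set(base_idx) & set(comp_idx) if base_idx[k] != comp_idx[k]
        ),
    }

a = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
b = [{"id": 2, "amount": 25}, {"id": 3, "amount": 30}]
report = compare(a, b, key="id")
# report == {"only_base": [1], "only_compare": [3], "mismatched": [2]}
```

Real libraries add column-level detail and tolerances on top of exactly this row-level skeleton.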
Data Ingestion
@@ -149,16 +144,15 @@
- SnackFS (https://github.com/tuplejump/snackfs-release) - SnackFS is our bite-sized, lightweight HDFS compatible FileSystem built over Cassandra.
- GlusterFS (https://www.gluster.org/) - Gluster Filesystem.
- XtreemFS (https://www.xtreemfs.org/) - Fault-tolerant distributed file system for all storage needs.
- SeaweedFS (https://github.com/chrislusf/seaweedfs) - Seaweed-FS is a simple and highly scalable distributed file system. There are two objectives: to store billions of files and to serve them fast. Instead of supporting full POSIX file system semantics, Seaweed-FS chooses to implement only a key~file mapping. Similar to the word "NoSQL", you can call it "NoFS".
- S3QL (https://github.com/s3ql/s3ql/) - S3QL is a file system that stores all its data online using storage services like Google Storage, Amazon S3, or OpenStack.
- LizardFS (https://lizardfs.com/) - LizardFS Software Defined Storage is a distributed, parallel, scalable, fault-tolerant, Geo-Redundant and highly available file system.
Serialization format
- Apache Avro (https://avro.apache.org) - Apache Avro™ is a data serialization system.
- Apache Parquet (https://parquet.apache.org) - Apache Parquet is a columnar storage format available to any project in the Hadoop ecosystem, regardless of the choice of data processing framework, data model or programming language.
 - Snappy (https://github.com/google/snappy) - A fast compressor/decompressor. Used with Parquet.
 - PigZ (https://zlib.net/pigz/) - A parallel implementation of gzip for modern multi-processor, multi-core machines.
- Apache ORC (https://orc.apache.org/) - The smallest, fastest columnar storage for Hadoop workloads.
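Formats like Parquet and ORC lean on block compressors such as Snappy, which trade compression ratio for speed on repetitive columnar data. As a stand-in sketch (zlib from the stdlib rather than Snappy, which needs a third-party binding), the compress/decompress round trip looks like:

```python
import zlib

# Column-oriented data is highly repetitive, which is why lightweight block
# compression (Snappy in practice; zlib at a low level here as a stand-in)
# pays off so well in formats like Parquet and ORC.
payload = b"value,value,value," * 1000  # repetitive, column-like bytes

compressed = zlib.compress(payload, level=1)  # low level ~ favor speed
restored = zlib.decompress(compressed)

assert restored == payload
ratio = len(compressed) / len(payload)
# ratio is a small fraction of 1.0 for data this repetitive
```

The same round-trip shape applies whichever codec a file format plugs in; only the speed/ratio trade-off changes.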
@@ -187,8 +181,8 @@
Batch Processing
- Hadoop MapReduce (https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html) - Hadoop MapReduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte data-sets) in parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner.
- Spark (https://spark.apache.org/) - A multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.
 - Spark Packages (https://spark-packages.org) - A community index of packages for Apache Spark.
 - Deep Spark (https://github.com/Stratio/deep-spark) - Connecting Apache Spark with different data stores. Deprecated.
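The MapReduce model the entry above describes fits in a few lines when run over an in-memory list instead of a cluster: a map step emits (key, 1) pairs, a shuffle groups by key, and a reduce step sums each group. A word-count sketch (the canonical example, not Hadoop's Java API):

```python
from collections import Counter
from itertools import chain

# MapReduce in miniature: map emits (word, 1) per record, the shuffle groups
# by key, and reduce sums each group. Hadoop distributes this same pattern
# across thousands of nodes.
records = ["big data", "fast data", "big fast big"]

def map_phase(record):
    return [(word, 1) for word in record.split()]

mapped = chain.from_iterable(map_phase(r) for r in records)

counts = Counter()          # shuffle + reduce: group by key, sum counts
for word, n in mapped:
    counts[word] += n
# counts == {'big': 3, 'fast': 2, 'data': 2}
```

What the framework adds is partitioning the shuffle across machines and re-running failed tasks; the per-key logic is exactly this.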
@@ -198,14 +192,14 @@
- AWS EMR (https://aws.amazon.com/emr/) - A web service that makes it easy to quickly and cost-effectively process vast amounts of data.
- Data Mechanics (https://www.datamechanics.co) - A cloud-based platform deployed on Kubernetes making Apache Spark more developer-friendly and cost-effective.
- Tez (https://tez.apache.org/) - An application framework which allows for a complex directed-acyclic-graph of tasks for processing data.
- Bistro (https://github.com/asavinov/bistro) - A light-weight engine for general-purpose data processing including both batch and stream analytics. It is based on a novel unique data model, which represents data via _functions_ and processes data via _column operations_ as opposed to having only set operations in conventional approaches like MapReduce or SQL.
- Batch ML
 - H2O (https://www.h2o.ai/) - Fast scalable machine learning API for smarter applications.
 - Mahout (https://mahout.apache.org/) - An environment for quickly creating scalable performant machine learning applications.
 - Spark MLlib (https://spark.apache.org/docs/latest/ml-guide.html) - Spark's scalable machine learning library consisting of common learning algorithms and utilities, including classification, regression, clustering, collaborative filtering, dimensionality reduction, as well as underlying optimization primitives.
- Batch Graph
 - GraphLab Create (https://turi.com/products/create/docs/) - A machine learning platform that enables data scientists and app developers to easily create intelligent apps at scale.
 - Giraph (https://giraph.apache.org/) - An iterative graph processing system built for high scalability.
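Batch graph systems like Giraph run vertex-centric iterations (PageRank being the classic workload) over graphs too large for one machine. The same computation on a three-node toy graph, as an illustration of the model rather than Giraph's actual API:

```python
# Power-iteration PageRank on a toy graph: each vertex repeatedly splits its
# rank across its out-links and collects incoming shares, with damping.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}  # vertex -> out-links
damping = 0.85
ranks = {v: 1.0 / len(graph) for v in graph}

for _ in range(50):  # iterate to (near) convergence
    incoming = {v: 0.0 for v in graph}
    for v, outs in graph.items():
        share = ranks[v] / len(outs)
        for w in outs:
            incoming[w] += share
    ranks = {v: (1 - damping) / len(graph) + damping * incoming[v]
             for v in graph}

total = sum(ranks.values())
# total stays ~1.0 (ranks form a distribution); "c" ranks highest because it
# receives links from both "a" and "b"
```

Giraph's contribution is running each such superstep in parallel across partitions of a billion-edge graph, with messages standing in for the `incoming` dict.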
@@ -238,26 +232,23 @@
 - CronQ (https://github.com/seatgeek/cronq) - An application cron-like system. Used (https://chairnerd.seatgeek.com/building-out-the-seatgeek-data-pipeline/) w/Luigi. Deprecated.
- Cascading (https://www.cascading.org/) - Java based application development platform.
- Airflow (https://github.com/apache/airflow) - Airflow is a system to programmatically author, schedule and monitor data pipelines.
- Azkaban (https://azkaban.github.io/) - Azkaban is a batch workflow job scheduler created at LinkedIn to run Hadoop jobs. Azkaban resolves the ordering through job dependencies and provides an easy to use web user interface to maintain and track your workflows.
- Oozie (https://oozie.apache.org/) - Oozie is a workflow scheduler system to manage Apache Hadoop jobs.
- Pinball (https://github.com/pinterest/pinball) - DAG based workflow manager. Job flows are defined programmatically in Python. Supports output passing between jobs.
- Dagster (https://github.com/dagster-io/dagster) - Dagster is an open-source Python library for building data applications.
- Kedro (https://kedro.readthedocs.io/en/latest/) - Kedro is a framework that makes it easy to build robust and scalable data pipelines by providing uniform project templates, data abstraction, configuration and pipeline assembly.
- Dataform (https://dataform.co/) - An open-source framework and web based IDE to manage datasets and their dependencies. SQLX extends your existing SQL warehouse dialect to add features that support dependency management, testing, documentation and more.
- Census (https://getcensus.com/) - A reverse-ETL tool that lets you sync data from your cloud data warehouse to SaaS applications like Salesforce, Marketo, HubSpot, Zendesk, etc. No engineering favors required—just SQL.
- dbt (https://getdbt.com/) - A command line tool that enables data analysts and engineers to transform data in their warehouses more effectively.
- RudderStack (https://github.com/rudderlabs/rudder-server) - A warehouse-first Customer Data Platform that enables you to collect data from every application, website and SaaS platform, and then activate it in your warehouse and business tools.
- PACE (https://github.com/getstrm/pace) - An open source framework that allows you to enforce agreements on how data should be accessed, used, and transformed, regardless of the data platform (Snowflake, BigQuery, DataBricks, etc.)
- Prefect (https://prefect.io/) - Prefect is an orchestration and observability platform. With it, developers can rapidly build and scale resilient code, and triage disruptions effortlessly.
- Multiwoven (https://github.com/Multiwoven/multiwoven) - The open-source reverse ETL, data activation platform for modern data teams.
- SuprSend (https://www.suprsend.com/products/workflows) - Create automated workflows and logic using APIs for your notification service. Add templates, batching, preferences, in-app inbox with workflows to trigger notifications directly from your data warehouse.
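What Airflow, Azkaban, and Oozie share at their core is resolving job order from declared dependencies and running each task once its upstreams finish. A minimal sketch with the stdlib's topological sorter (the task names are made up for illustration):

```python
from graphlib import TopologicalSorter

# A DAG as a mapping of task -> set of upstream tasks it depends on, the same
# shape a workflow scheduler builds from its pipeline definition.
dag = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load", "transform"},
}

order = list(TopologicalSorter(dag).static_order())
# order == ['extract', 'transform', 'load', 'report'] -- the only valid
# ordering for this DAG; schedulers run independent tasks in parallel
```

Schedulers layer retries, backfills, and parallel execution of independent branches on top of exactly this dependency resolution.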
Data Lake Management
@@ -296,8 +287,7 @@
- GitHub Archive (https://www.gharchive.org/) - GitHub's public timeline since 2011, updated every hour.
- Common Crawl (https://commoncrawl.org/) - Open source repository of web crawl data.
- Wikipedia (https://dumps.wikimedia.org/enwiki/latest/) - Wikipedia's complete copy of all wikis, in the form of wikitext source and metadata embedded in XML. A number of raw database tables in SQL form are also available.
Monitoring
@@ -314,8 +304,8 @@
Testing
- Grai (https://github.com/grai-io/grai-core/) - A data catalog tool that integrates into your CI system exposing downstream impact testing of data changes. These tests prevent data changes which might break data pipelines or BI dashboards from making it to production.
- DQOps (https://github.com/dqops/dqo) - An open-source data quality platform for the whole data platform lifecycle from profiling new data sources to applying full automation of data quality monitoring.
Community
@@ -332,5 +322,5 @@
Podcasts
- Data Engineering Podcast (https://www.dataengineeringpodcast.com/) - The show about modern data infrastructure.
- The Data Stack Show (https://datastackshow.com/) - A show where they talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.