NoSQL Types

  1. KV DB: Redis, Dynamo
  2. Document DB: CouchDB, MongoDB
  3. Wide-Column DB: column families are containers for rows
  4. Graph DB: stores relationships between entities

Storage

SQL: schema on write, fixed schema; the schema can be changed, but the change touches the whole database and may require taking the DB offline

NoSQL: dynamic schema; columns can be added on the fly
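
A minimal sketch of the difference, assuming SQLite (from the Python standard library) as the fixed-schema side and a plain list of dicts as a stand-in for a document collection. The `users` table and its fields are hypothetical, chosen only for illustration.

```python
import sqlite3

# Relational side: the table's columns are fixed when it is created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ann')")

try:
    # Writing a column the schema does not know about is rejected.
    conn.execute(
        "INSERT INTO users (id, name, email) VALUES (2, 'Bob', 'bob@example.com')"
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# The schema has to be changed explicitly before the new column can be written.
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")
conn.execute(
    "INSERT INTO users (id, name, email) VALUES (2, 'Bob', 'bob@example.com')"
)
sql_rows = conn.execute("SELECT id, name, email FROM users ORDER BY id").fetchall()

# Document-style side: a plain list of dicts stands in for a collection.
collection = []
collection.append({"id": 1, "name": "Ann"})
# A new field appears on the fly; earlier documents are left untouched.
collection.append({"id": 2, "name": "Bob", "email": "bob@example.com"})
```

After the `ALTER TABLE`, every existing row gains the new column (as NULL), while in the dict-based collection only the documents that need the field carry it.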

Query

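SQL: DDL (data definition), DML (data manipulation)

NoSQL: no single standard query language; each database exposes its own query API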

Scalability

SQL: vertically (a bigger machine)

NoSQL: horizontally (more machines)
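
One way to picture horizontal scaling is hash-based sharding: each key is routed to one of several servers. The sketch below is only an illustration, with each "server" modeled as an in-memory dict; `shard_for`, `put`, and `get` are hypothetical helper names.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Four hypothetical servers, each modeled as a dict.
shards = [{} for _ in range(4)]

def put(key: str, value) -> None:
    """Store the value on the shard that owns the key."""
    shards[shard_for(key, len(shards))][key] = value

def get(key: str):
    """Read the value from the shard that owns the key (None if absent)."""
    return shards[shard_for(key, len(shards))].get(key)

for i in range(1000):
    put(f"user:{i}", {"id": i})
```

Plain modulo placement relocates most keys whenever the number of shards changes, which is why distributed stores tend to use consistent hashing or a similar scheme instead.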

Most of the NoSQL solutions sacrifice ACID compliance for performance and scalability.

Trade-offs

SQL:

  1. transactions and ACID: for many e-commerce and financial applications, an ACID-compliant database remains the preferred option.
  2. data is structured and unchanging, the business is not growing massively, or the workload needs consistent data only

NoSQL:

  1. large volumes of data with little to no structure; no limit on the types of data, and new types can be added when needed.
  2. cloud-based storage: cost-saving, but data must spread easily across servers; commodity HW on-site saves the hassle of additional SW, and NoSQL DBs like Cassandra are designed to scale out across many servers
  3. rapid development: update the data structure without downtime.

  4. structured or unstructured data
  5. query pattern
  6. scale to handle

Caching system

  1. key -> value storage
  2. Redis, Memcached
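
A common way to put a key -> value cache in front of a database is the cache-aside pattern: check the cache first, fall back to the DB on a miss, then store the result. The sketch below uses an in-process dict with expiry instead of a real Redis or Memcached client; `TTLCache`, `slow_db_lookup`, and `get_user` are hypothetical names.

```python
import time

class TTLCache:
    """A tiny in-process key -> value cache with per-entry expiry."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._data = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, time.monotonic() + self.ttl)

db_calls = 0

def slow_db_lookup(user_id):
    """Hypothetical stand-in for a query against the primary database."""
    global db_calls
    db_calls += 1
    return {"id": user_id, "name": f"user-{user_id}"}

cache = TTLCache(ttl_seconds=60)

def get_user(user_id):
    """Cache-aside read: try the cache, fall back to the DB, then fill the cache."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = slow_db_lookup(user_id)
    cache.set(key, value)
    return value

first = get_user(7)   # miss: goes to the DB
second = get_user(7)  # hit: served from the cache
```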

File Storage System

  1. Amazon: product images/videos; Netflix: videos
  2. blob storage is not a DB: a DB is used to query, but files are just served as they are
  3. cheap one: S3
  4. CDN: distributes the blob objects across geographic locations

Text Search Engine, not DB

  1. spelling correction: fuzzy search (Airport <- Airprot)
  2. edit distance
  3. Elasticsearch (ES), Solr, Lucene
  4. no guarantee against data loss, so it cannot be used as the source of truth
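
Edit distance can be sketched with the classic Levenshtein dynamic program. This is an illustration only, not how ES, Solr, or Lucene implement fuzzy matching; `suggest` and the small vocabulary are hypothetical. Note that plain Levenshtein counts a swapped pair of letters as two edits, whereas the Damerau-Levenshtein variant would count it as one.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]

def suggest(word: str, vocabulary: list[str], max_distance: int = 2) -> list[str]:
    """Return vocabulary entries within max_distance edits, closest first."""
    scored = [(edit_distance(word.lower(), v.lower()), v) for v in vocabulary]
    return [v for d, v in sorted(scored) if d <= max_distance]

vocabulary = ["Airport", "Report", "Airplane", "Passport"]
matches = suggest("Airprot", vocabulary)  # only "Airport" is within 2 edits
```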

Metrics Tracking System

  1. throughput, CPU utilization, latency
  2. Grafana, Prometheus
  3. TSDB (time-series DB): roughly an extension of a relational DB
  4. Input pattern: no need for random updates/access -> sequential writes in append-only mode, T1 -> T2 -> T3
  5. Query pattern: bulk reads over a time range, not random reads
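
The input and query patterns above can be sketched as an append-only list kept in time order, with a binary search to find the bounds of a time range. This is a toy model, not how Prometheus or any particular TSDB stores data; `TimeSeries` is a hypothetical class.

```python
from bisect import bisect_left, bisect_right

class TimeSeries:
    """Append-only series of (timestamp, value) points kept in time order."""

    def __init__(self):
        self._timestamps = []
        self._values = []

    def append(self, ts: int, value: float) -> None:
        # Only accept points at or after the latest timestamp: T1 -> T2 -> T3.
        if self._timestamps and ts < self._timestamps[-1]:
            raise ValueError("out-of-order write rejected")
        self._timestamps.append(ts)
        self._values.append(value)

    def range(self, start: int, end: int):
        """Return all points with start <= ts <= end as one contiguous slice."""
        lo = bisect_left(self._timestamps, start)
        hi = bisect_right(self._timestamps, end)
        return list(zip(self._timestamps[lo:hi], self._values[lo:hi]))

cpu = TimeSeries()
for ts, value in [(100, 0.31), (110, 0.35), (120, 0.90), (130, 0.42)]:
    cpu.append(ts, value)

window = cpu.range(105, 125)  # bulk read of one time range
```

Because writes only ever land at the end and reads take a contiguous slice, neither operation needs random access into the middle of the data.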

Analytics Requirement

  1. analyze all transactions of the whole company: revenue, etc.
  2. -> reports; not for transaction processing, but for offline reporting

  3. Structured or not
    1. yes -> RDBMS
      1. ACID:
        1. payment system:
          1. transaction boundary for two transactions.
          2. consistency
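
A minimal sketch of a transaction boundary in a payment flow, assuming SQLite from the Python standard library. The `accounts` table, the `transfer` helper, and the balances are hypothetical. The credit and the debit either both commit or both roll back, and a CHECK constraint keeps balances consistent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Credit dst and debit src atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative; the credit is rolled back too.
        return False

def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # violates the CHECK, nothing changes
final = balances(conn)
```

The credit is deliberately applied before the debit so that the failed transfer shows the rollback undoing an update that had already been made.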

Catalog System

  1. different products have totally different attributes -> JSON data
  2. e-commerce platform
  3. querying on JSON/arbitrary attributes is not handled well by a traditional relational DB

wide variety of data types and wide variety of queries -> Document DB

  1. MongoDB, CouchDB
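
To illustrate querying on arbitrary attributes, the sketch below filters a list of heterogeneous product documents by any combination of fields. It is a toy linear scan over Python dicts, not the MongoDB or CouchDB API; `find` and the sample products are hypothetical.

```python
products = [
    {"sku": "TSHIRT-1", "type": "shirt", "size": "L", "color": "blue"},
    {"sku": "FRIDGE-9", "type": "refrigerator", "volume_liters": 300, "energy_rating": "A"},
    {"sku": "PHONE-4", "type": "phone", "ram_gb": 8, "color": "blue"},
]

def find(collection, **criteria):
    """Return documents whose fields equal every given criterion."""
    missing = object()  # sentinel so a stored None is not confused with "absent"
    return [
        doc for doc in collection
        if all(doc.get(field, missing) == value for field, value in criteria.items())
    ]

# Documents that lack a queried field simply do not match.
blue_items = find(products, color="blue")
big_fridges = find(products, type="refrigerator", volume_liters=300)
```

A real document DB would typically answer such queries through indexes on the queried fields rather than scanning every document.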

ever-increasing data + finite set of queries

  1. data grows faster than linearly
  2. Cassandra, HBase

Hybrid method: in an order placement system, keep an order in the RDBMS until it is complete; afterwards it can move to a columnar DB

Mix in a Document DB to support search