NoSQL Types
- KV DB: Redis, DynamoDB
- Document DB: CouchDB, MongoDB
- Wide-column DB: column families act as containers for rows
- Graph DB: stores entities and the relationships between them
Storage
SQL: schema on write, fixed schema. The schema can be changed, but that may mean modifying the whole database and taking it offline.
NoSQL: dynamic schema, columns (fields) can be added on the fly.
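A minimal sketch of the difference, using the standard-library `sqlite3` module for the SQL side and plain dicts standing in for documents. The table and field names are made up for illustration. An in-memory SQLite table is tiny, so it does not show the migration cost a large production database would pay.

```python
import sqlite3

# Schema on write: the table shape is declared up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")

# Writing an undeclared column is rejected until the schema is altered.
try:
    conn.execute("INSERT INTO users (id, name, nickname) VALUES (2, 'bob', 'bobby')")
    rejected = False
except sqlite3.OperationalError:
    rejected = True

conn.execute("ALTER TABLE users ADD COLUMN nickname TEXT")
conn.execute("INSERT INTO users (id, name, nickname) VALUES (2, 'bob', 'bobby')")

# Dynamic schema: each document carries its own fields, no migration step.
documents = [
    {"_id": 1, "name": "alice"},
    {"_id": 2, "name": "bob", "nickname": "bobby"},
]
```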
Query
SQL: DDL, DML
NoSQL: no shared standard, each database exposes its own query API
Scalability
SQL: vertically
NoSQL: horizontally
Most of the NoSQL solutions sacrifice ACID compliance for performance and scalability.
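One common way horizontal scaling is done is to spread keys over nodes by hash. The sketch below assumes three hypothetical shard names and simple modulo placement. It is an illustration only, not how any particular database does it.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical node names


def shard_for(key: str, shards=SHARDS) -> str:
    """Map a key to a shard using a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]
```

Modulo placement moves most keys when the number of shards changes. Consistent hashing is the usual refinement for that.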
Trade-offs
SQL:
- transactions and ACID: for many e-commerce and financial applications, an ACID-compliant database remains the preferred option.
- data is structured and unchanging, the business is not growing massively, or the workload only deals with consistent data
NoSQL:
- large volumes of data with little or no structure; no limit on the types of data, new types can be added when needed
- cloud-based storage: cost-saving, data spread across servers; commodity hardware on site can be used, saving the hassle of additional software (e.g. NoSQL DBs like Cassandra)
- rapid development: update the data structure without downtime
- structured or unstructured data
- query pattern
- scale to handle
Caching system
- key -> value storage
- Redis, Memcached
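A rough cache-aside sketch of the key -> value idea: read from the cache first, fall back to the database on a miss, then populate the cache. `TTLCache`, `get_user` and `load_from_db` are hypothetical names, and the dict-backed cache only stands in for a real store such as Redis or Memcached.

```python
import time


class TTLCache:
    """Tiny in-process key -> value cache with per-entry expiry."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._data = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value):
        self._data[key] = (time.monotonic() + self.ttl, value)


def get_user(user_id, cache, load_from_db):
    """Cache-aside read: try the cache, fall back to the DB, then populate."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached
    value = load_from_db(user_id)
    cache.set(user_id, value)
    return value
```

The second call for the same id is served from the cache, so the loader runs only once until the entry expires.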
File Storage System
- Amazon: product images/videos; Netflix: videos
- blob storage is not a DB: a DB is used for querying, but files are just served as they are
- a cheap option: S3
- CDN: distributes the blob objects geographically
Text Search Engine, not DB
- spelling correction: fuzzy search (Airport <- Airprot)
- edit distance
- Elasticsearch (ES), Solr, Lucene
- no guarantee that data is not lost, so it cannot be used as the source of truth
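A small sketch of fuzzy matching by edit distance. It uses plain Levenshtein distance (insert, delete, substitute) and a linear scan over a made-up vocabulary. Real engines use index structures instead of scanning, so treat `suggest` as illustrative only.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep on a match
            ))
        prev = curr
    return prev[-1]


def suggest(query: str, vocabulary, max_distance: int = 2):
    """Return vocabulary terms within max_distance of the query, closest first."""
    scored = [(edit_distance(query.lower(), term.lower()), term) for term in vocabulary]
    return [term for dist, term in sorted(scored) if dist <= max_distance]
```

With plain Levenshtein, the swapped letters in "Airprot" count as two edits, which is why the default threshold here is 2.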
Metrics Tracking System
- throughput, CPU utilization, latency
- Grafana, Prometheus
- TSDB (time-series DB): roughly an extension of a relational DB
- input pattern: no random updates/access needed -> sequential writes in append-only mode, T1 -> T2 -> T3
- query pattern: bulk reads over a time range, not random reads
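The two access patterns above can be sketched with a sorted, append-only list and a binary search for the time range. `TimeSeries` is a hypothetical in-memory stand-in, not the storage layout of any real TSDB.

```python
import bisect


class TimeSeries:
    """Append-only series; timestamps must be non-decreasing (T1 -> T2 -> T3)."""

    def __init__(self):
        self._timestamps = []
        self._values = []

    def append(self, timestamp: float, value: float) -> None:
        if self._timestamps and timestamp < self._timestamps[-1]:
            raise ValueError("out-of-order write: this sketch only appends")
        self._timestamps.append(timestamp)
        self._values.append(value)

    def range(self, start: float, end: float):
        """Bulk read of all points with start <= timestamp < end."""
        lo = bisect.bisect_left(self._timestamps, start)
        hi = bisect.bisect_left(self._timestamps, end)
        return list(zip(self._timestamps[lo:hi], self._values[lo:hi]))
```

Because writes only append, the list stays sorted for free and a range read is two binary searches plus one contiguous slice.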
Analytics Requirement
- analyze all transactions of the whole company, revenue, etc.
- used for reports and offline reporting, not for transaction processing
- Structured data or not
- yes -> RDBMS is a candidate
- ACID needed?
- e.g. a payment system: one transaction boundary around two operations, plus consistency
- yes -> RDBMS
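A minimal sketch of the payment case, using the standard-library `sqlite3` module: the debit and the credit sit inside one transaction, so a failed debit also undoes the credit. The `accounts` table, the `CHECK` constraint and the `transfer` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 20)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Credit and debit inside one transaction: both apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was rolled back too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts"))
```

The credit is deliberately applied first, so the rollback of an overdraft attempt is visible: without the transaction boundary, `bob` would keep money that `alice` never paid.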
Catalog system:
- different products have totally different attributes -> JSON data
- e-commerce platform
- querying on JSON/arbitrary attributes is hard to do well in a relational DB
- a wide variety of data types and a wide variety of queries -> Document DB
- MongoDB, CouchDB
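To make the catalog case concrete, here is a sketch with products as dicts that share no fixed set of attributes, plus a filter on whichever fields the caller names. The catalog data and the `find` helper are hypothetical. A real document DB would add indexes on such fields rather than scan.

```python
catalog = [
    {"sku": "tv-1", "type": "tv", "screen_inches": 55, "panel": "OLED"},
    {"sku": "shirt-1", "type": "shirt", "size": "M", "color": "blue"},
    {"sku": "shirt-2", "type": "shirt", "size": "L", "color": "blue"},
]


def find(documents, **criteria):
    """Return documents whose fields equal every given criterion.

    A document that lacks one of the fields simply does not match.
    """
    missing = object()  # sentinel so a stored None is not confused with "absent"
    return [
        doc
        for doc in documents
        if all(doc.get(field, missing) == wanted for field, wanted in criteria.items())
    ]
```

For example, `find(catalog, color="blue", size="M")` matches only the medium blue shirt, and the TV is skipped because it has no `color` field at all.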
Ever-increasing data + finite queries
- data grows faster than linearly
- columnar (wide-column) DB: Cassandra, HBase
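A toy sketch of why this combination suits a wide-column layout, assuming the finite queries are all lookups by a known partition key. `WideColumnTable` and the driver-location rows are made up for illustration and do not mirror the actual Cassandra or HBase APIs.

```python
from collections import defaultdict


class WideColumnTable:
    """Rows grouped by partition key; each row may carry different columns."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def insert(self, partition_key, **columns):
        # A write appends to exactly one partition, so the data set can keep
        # growing while each write stays local to that partition.
        self._partitions[partition_key].append(columns)

    def by_partition(self, partition_key):
        # The one supported query: everything stored under a known key.
        return list(self._partitions.get(partition_key, []))


table = WideColumnTable()
table.insert("driver-1", ts=1, lat=12.97, lon=77.59)
table.insert("driver-1", ts=2, lat=12.98, lon=77.60, speed_kmh=40)
table.insert("driver-2", ts=1, lat=28.61, lon=77.21)
```

Queries that do not start from the partition key would need a full scan here, which is the price paid for cheap, ever-growing writes.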
Hybrid method: order placement system. Before an order is completed, keep it in an RDBMS; afterwards it can move to a columnar DB.
Mix in a Document DB to support search.