Document store NoSQL databases

NoSQL (Not only Structured Query Language) databases are databases that are used to store data in non-relational databases i.e. graphical, document store, column-oriented, key-value, and object-oriented databases (Sadalage & Fowler, 2012; Services, 2015). NoSQL databases have benefits as they provide a data model for applications that require a little code, less debugging, run on clusters, handle large scale data and evolve with time (Sadalage & Fowler, 2012). Document store NoSQL databases, use a key/value pair that is the file/file itself, and it could be in JSON, BSON, or XML (Sadalage & Fowler, 2012; Services, 2015).  These document files are hierarchical trees (Sadalage & Fowler, 2012).

Parts of the documents could be updated in real-time this type of NoSQL database allows for easy creation and storage of dynamic data like website page views, unique views, or new metrics (Sadalage & Fowler, 2012).  To help speed up the search of a document store NoSQL database like content in multiple web pages, or store log file, indexes can be created (Services, 2012). These indexes could be stored as attributes, such as a “state,” “city,” “zip-code,” etc. attributes, which can have the same, different, or null values in the NoSQL database and it each of these is allowed (Sadalage & Fowler, 2012).

If you want to insert, update, or delete (also known as a transaction) data in a NoSQL database, it will either succeed or fail, it won’t have the ability as traditional databases to either commit or rollback (Sadalage & Fowler, 2012). Only two of the three features can exist according to CAP Theory (Consistency, Availability, and Partition Tolerance), and document store databases primarily focus on availability through replicating data in different nodes (Hurst, 2010; Sadalage & Fowler, 2012).  Some key players in the document store database realm are CouchDB, MongoDB, OrientDB, RavenDB, and Terrastore (Sadalage & Fowler, 2012).  This discussion will focus on both CouchDB and MongoDB; which are open-sourced code that allows having scalability features (CouchDB, n.d.; MongoDB, n.d.; Sadalage & Fowler, 2012).

CouchDB is an Apache code available for Windows, Linux, and Mac OS X and it is also:

  • AP database system (Hurst, 2010)
  • AP systems can achieve consistency if data can be replicated and verified (Hurst, 2010)
  • Globally distributed server cluster to allow for accessing data and implementing projects anywhere through a data replication protocol (CouchDB, n.d.)
  • Data can be stored on a single or clustered server, via locally on the company’s servers, virtual machines, Raspberry Pi servers, or on a cloud provider (CouchDB, n.d.)
  • Allows for offline end user experience (CouchDB, n.d.)
  • Can use MapReduce for deriving insights from the data (CouchDB, n.d.)
  • Uses HTTP protocol and JSON data (CouchDB, n.d.)
  • Only allowing for appending data helps create a crash-resistant data structure (CouchDB, n.d.)

MongoDB is code available for Windows, Linux, Mac OS X, Solaris, etc.:

  • CP database system (Hurst, 2010)
  • CP systems have issues keeping data available across all nodes through their replication system (Hurst, 2010; Sadalage & Fowler, 2012)
  • Used by companies like Expedia, Forbes, Bosch, AstraZeneca, MetLife, Facebook, Urban Outfitters, sprinklr, the guardian, Comcast, etc., such that 33% of the Fortune 100 are using it (MongoDB, n.d.)
  • Has an expressive query language and secondary indexes out of the box to help access and understand data stored within its database, which is easier to use and requires fewer lines of code (MongoDB, n.d.; Sadalage & Fowler, 2012)
  • Allows for a flexible data model that evolves with time as the data stored in it evolves (MongoDB, n.d.)
  • Allows for integration of silo, internet of things, mobile, catalog data to help provide real-time analytics (MongoDB, n.d.)

References

  • CouchDB (n.d.). CouchDB, relax. Apache. Retrieved from http://couchdb.apache.org/
  • Hurst, N. (2010). Visual guide to NoSQL systems. Retrieved from http://blog.nahurst.com/visual-guide-to-nosql-systems
  • MongoDB (n.d.). MongoDB, for giant ideas. Retrieved from https://www.mongodb.com/
  • Sadalage, P. J., Fowler, M. (2012). NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence, 1st Edition. [Bookshelf Online].
  • Services, E. E. (2015). Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data, 1st Edition. [Bookshelf Online].
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s