Graph NoSQL Database

A graph database is a NoSQL database that organizes data as nodes, which are like records in a relational database, and relationships, which represent connections between nodes. Because the graph system stores the relationships between nodes directly, it can support richer representations of how data is connected. Relationships are the key concept in graph databases: they are stored as first-class elements of the data model, whereas an RDBMS and most other NoSQL databases do not implement them directly. Graph databases are primarily applied in systems whose data is highly interconnected, such as social networks, reservation systems, fraud detection, or customer relationship management systems. In these settings they address significant limitations of existing relational database management systems (RDBMS).

Graph databases, by design, allow simple and fast retrieval of complex hierarchical structures that are difficult to model with an RDBMS. They support simple queries that return the nearest neighboring nodes, as well as complex queries that explore vast networks of connections and quickly find patterns in them. Their flexible structure enables graph databases to accommodate complex data that doesn’t conform to the rigid data models required for RDBMS implementations.

Graph databases contain four types of data constructs (nodes, relationships, properties, and labels):

•  Nodes: Objects that represent data entities or instances such as people, businesses, accounts, products or any other item to be tracked. They are roughly the equivalent of the record or row in a relational database, or the document in a document-store database. Each node contains several pieces of information that go together. For example, a single node might include a product name, description, price and product code. Another might have information about a customer, such as name and account number.
•  Relationships: Objects that describe how the nodes relate to each other. Relationships represent the connections, edges, or lines between nodes. A relationship connects two nodes and enables users to find related nodes. A relationship always has a source node and a target node, which together give it a direction. Meaningful patterns can emerge when examining the connections and interconnections of nodes.
•  Properties: Additional attributes of both nodes and relationships, represented as key-value pairs. Properties store relevant data about the node or relationship they describe. Examples of properties for a node with a label of person include name, age, address, and date of birth. Relationships usually have properties such as time, distance, cost, rating, or weight, which are also stored as key-value pairs.
•  Labels: Named graph constructs that are used to group nodes into sets; all nodes with the same label belong to the same set. Many database queries can work with these sets instead of the whole graph, making queries easier to write and more efficient to execute. A node may be labeled with any number of labels, including none, making labels an optional addition to the graph.
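
The four constructs above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not how any particular graph database implements them: the class names `Node`, `Relationship`, and `Graph`, the method names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    labels: set = field(default_factory=set)        # zero or more labels
    properties: dict = field(default_factory=dict)  # key-value pairs


@dataclass
class Relationship:
    rel_type: str
    source: int   # id of the source node
    target: int   # id of the target node
    properties: dict = field(default_factory=dict)  # key-value pairs


class Graph:
    """Hypothetical in-memory property graph, for illustration only."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id, labels=(), **properties):
        node = Node(node_id, set(labels), dict(properties))
        self.nodes[node_id] = node
        return node

    def add_relationship(self, rel_type, source, target, **properties):
        # A relationship always connects two existing nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist")
        rel = Relationship(rel_type, source, target, dict(properties))
        self.relationships.append(rel)
        return rel

    def nodes_with_label(self, label):
        # Labels let a query work on a subset instead of the whole graph.
        return [n for n in self.nodes.values() if label in n.labels]


graph = Graph()
graph.add_node(1, labels=["Person"], name="Alice", age=34)
graph.add_node(2, labels=["Person", "Customer"], name="Bob")
graph.add_node(3, labels=["Product"], name="Widget", price=9.99)
graph.add_node(4)  # labels are optional
graph.add_relationship("PURCHASED", 2, 3, rating=5)

people = sorted(n.properties["name"] for n in graph.nodes_with_label("Person"))
```

Here `people` collects the names of every node carrying the `Person` label, and the `PURCHASED` relationship carries its own `rating` property, separate from the properties of the two nodes it connects.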

Each node in the graph database model directly and physically contains a list of relationships that represent its connections to other nodes. Unlike a traditional RDBMS, graph databases do not rely on foreign keys or join operations to connect records. Instead, all relationships are natively stored with the nodes they connect.
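
The idea of following stored relationships instead of joining on keys can be sketched as follows. This is a simplified illustration, assuming each node keeps a plain list of its outgoing relationships; the names `Node`, `connect`, `neighbors`, and `reachable_within` are hypothetical, and real graph databases use far more elaborate storage layouts.

```python
from collections import deque


class Node:
    def __init__(self, name):
        self.name = name
        self.outgoing = []  # list of (relationship_type, target Node)

    def connect(self, rel_type, target):
        self.outgoing.append((rel_type, target))


def neighbors(node, rel_type=None):
    """Nearest neighbors: read the node's own relationship list, no join."""
    return [t for r, t in node.outgoing if rel_type is None or r == rel_type]


def reachable_within(start, max_hops):
    """Breadth-first traversal; returns {node name: hop count}."""
    seen = {start.name: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        depth = seen[current.name]
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for target in neighbors(current):
            if target.name not in seen:
                seen[target.name] = depth + 1
                queue.append(target)
    del seen[start.name]
    return seen


alice, bob, carol, dave, acme = (Node(n) for n in
                                 ["alice", "bob", "carol", "dave", "acme"])
alice.connect("FOLLOWS", bob)
bob.connect("FOLLOWS", carol)
carol.connect("FOLLOWS", dave)
alice.connect("WORKS_AT", acme)

followed = [n.name for n in neighbors(alice, "FOLLOWS")]
within_two = reachable_within(alice, 2)
```

In this sketch, each hop costs only a walk over the current node's own list, which is why the cost of a traversal depends on the part of the graph actually visited rather than on the total number of records.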

Graph databases are purpose-built for the analysis of interconnections and relationships among data entities. This design is well suited to the analysis of data retrieved from social media, web, and mobile applications. Graph databases are also useful for working with data in business disciplines that involve analyzing complex relationships and dynamic schema, such as supply chain management, customer relationship management, law enforcement intelligence, and fraud detection.

What is NoSQL?

Definition of NoSQL

Recently, NoSQL databases have been developed that provide a high-performance and scalable alternative to more traditional relational database management systems (RDBMS), especially when dealing with large amounts of unstructured or semi-structured data. NoSQL, which stands for “Not Only SQL” (https://en.wikipedia.org/wiki/NoSQL), differs from an RDBMS in that it is designed for processing large collections of distributed data that don’t fit well into strict rows and columns. This makes NoSQL databases a natural fit for Big Data initiatives, and the substantial increase in the volume, velocity, and variety of Big Data in recent years has greatly increased the need for NoSQL deployments. While traditional RDBMS are very useful for processing highly structured data, NoSQL databases typically accommodate semi-structured data, fully unstructured data, documents, graphs, or dynamic schema. NoSQL databases are now widely recognized for their ease of development, functionality, and performance at scale.

The term NoSQL can be applied to some databases that were available before traditional RDBMS, but more often the term refers to databases developed in the mid to late 2000s for the purpose of large-scale database processing within web and mobile applications. Within these emerging applications, requirements for performance and scalability outweighed the conventional requirement for the rigid data consistency that existing RDBMS provided to transactional applications. Consequently, NoSQL databases for web applications have tended to focus on very specific characteristics of data management. The ability to process very large volumes of data and quickly distribute that data across computing processors and clusters has been very desirable in large-scale web application design. There has also been a greater need for flexible data schema, or no schema at all, in order to better implement rapid changes to applications.

An advantage of NoSQL databases over traditional RDBMS is that they store and manage data in ways that allow for high operational speed and great flexibility on the part of system developers. Data can be stored in a schema-less or free-form fashion, so any data can be stored in any record. Unlike traditional RDBMS, many NoSQL databases can also be scaled horizontally across hundreds or thousands of commodity servers, and they typically use less system memory than an RDBMS. Together, these properties can allow NoSQL databases to achieve much higher performance than a traditional RDBMS on the workloads they are designed for.
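
Schema-less storage and horizontal scaling can be illustrated with a toy example. The sketch below assumes simple hash partitioning with a modulo over a fixed number of shards, where each `dict` stands in for one server; the names `shard_for`, `put`, and `get` are hypothetical, and production systems commonly use more sophisticated schemes such as consistent hashing or range partitioning.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Each dict stands in for one commodity server.
shards = [dict() for _ in range(4)]


def put(key, record):
    shards[shard_for(key, len(shards))][key] = record


def get(key):
    return shards[shard_for(key, len(shards))].get(key)


# Schema-less: records under different keys need not share any fields.
put("user:1", {"name": "Alice", "email": "alice@example.com"})
put("user:2", {"name": "Bob", "tags": ["admin"], "last_login": "2018-03-01"})
put("log:17", {"level": "WARN", "message": "disk usage high"})

stored_total = sum(len(shard) for shard in shards)
```

Because the shard is computed from the key alone, any client can locate a record without consulting a central index, which is what makes it practical to spread data over many machines.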

 

NoSQL Databases Typically Contain the Following Types of Data:

•  Semi-structured Data:  CSV, Word, Excel, PowerPoint, Documents, PDFs, Logs, XML, JSON
•  Unstructured Data:  Emails, Text, Messages, Blog Entries, Twitter
•  Binary Data:  Graphics, Images, Audio, Video

 

NoSQL Database Types

•  Key-Value Stores: A simple data storage system that pairs a unique key with an associated value.  Typical uses include: dictionaries, image stores, document/file stores, query cache, lookup tables.
•  Document Stores:  Data stores that pair each key with a complex data structure known as a document.  Documents are typically semi-structured either in XML or JSON formats.  Typical uses include: MS Word documents, MS Excel documents, spreadsheets, presentations, PDF files, sales orders, invoices, product descriptions, web pages, forms.
•  Graph Stores:  Data stores that organize data as nodes, which are like records in a relational database, and edges, which represent connections between nodes.  Typical uses include: social networks, fraud detection, pattern matching, relationship-heavy data.
•  Wide Column / Column Family Stores:  Data stores that have the ability to hold very large numbers of dynamic columns. But unlike a relational database, the names and format of the columns can vary from row to row in the same table.  Typical uses include: web crawling, large sparsely populated tables, highly-adaptive systems, high-variance systems.
•  Native XML Databases:  Data stores that allow data to be stored in the extensible markup language (XML) format, a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. XML databases are a sub-category of document stores.
•  Search Engines:  Information retrieval systems designed to help find information stored on a computer system.
•  Multi-Model Databases:  Data stores that combine aspects of multiple types of NoSQL database within one product.
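
The difference between the first few store types can be sketched with plain Python dictionaries standing in for the databases. This is only a shape comparison under that assumption; the keys, field names, and values are hypothetical, and no real product's API is shown.

```python
# Key-value store: an opaque value looked up by a unique key.
key_value_store = {"session:42": b"\x00\x01opaque-blob"}

# Document store: each key maps to a nested, semi-structured document.
document_store = {
    "order:1001": {
        "customer": {"name": "Alice"},
        "items": [{"sku": "W-1", "qty": 2}, {"sku": "G-7", "qty": 1}],
    }
}

# Wide-column store: rows in the same table may have different columns.
wide_column_table = {
    "page:a": {"title": "Home", "lang": "en"},
    "page:b": {"title": "About", "crawled_at": "2018-01-01", "links": "12"},
}

# A document store can look inside the value; a key-value store cannot.
total_qty = sum(item["qty"] for item in document_store["order:1001"]["items"])

# Column names vary from row to row in the wide-column table.
columns_per_row = {row: sorted(cols) for row, cols in wide_column_table.items()}
```

The key-value entry is retrievable only by its key, the document can be queried by its inner fields, and the two wide-column rows share a table while holding different column sets.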

 

NoSQL Database Products (2018)

 
