
Sanjay Tanwani1 and Amit Kanojia2

1 School of Computer Science & IT, Indore, India

2 Department of Computer Science, M.J. Govt. Girls PG College, Indore, India

 

Abstract- With the rapid growth in the volume, complexity, variety and velocity of data in organizations, the need to handle unstructured data is increasing continuously. NoSQL databases are well suited to big data applications. The enormous amount of data generated on the web is highly unstructured in nature. Relational databases are designed to manage structured data and are not well suited to managing unstructured data at high volume. This paper presents a comparative analysis of Oracle Database and a NoSQL document-oriented database management system, MongoDB. The comparison covers key features, theoretical differences and restrictions, and focuses on basic CRUD operations in MongoDB.

 

Key Words- Big data, NoSQL, MongoDB, RDBMS, CRUD

 

I. Introduction

The term NoSQL was first introduced by Carlo Strozzi in 1998. NoSQL stands for "Not Only SQL". The rapid growth of data, and the massive amount of data produced every day by web and business applications, has become hard for an RDBMS to handle. This has increased interest in alternatives to the RDBMS. NoSQL databases are defined as distributed, horizontally scalable and open source [5].

 

Relational database management systems define a fixed schema, and data is inserted strictly according to that schema. NoSQL databases are built to allow the insertion of data without a predefined schema, which makes it easy to make significant application changes in real time and speeds up development. NoSQL databases are high-performance, scalable systems [1]. It is difficult to handle both the size of the data and concurrent actions on the data within a standard RDBMS. Some of the reasons to employ NoSQL techniques are scalability, high availability, distributed architecture support, flexible schema, varied data structures, fault tolerance and consistency.

 

MongoDB is an open source project maintained by the company 10gen. It is a document-oriented, schema-less database that stores data in BSON (Binary JSON) format. Unlike an RDBMS, MongoDB can deal with structured, semi-structured and unstructured data. MongoDB documents can vary in structure, and fields can vary from document to document. Similar documents are stored in collections. Here, a collection corresponds to a table and a document corresponds to a record. MongoDB can add, remove or change a field of a document without affecting other documents in the same collection. This avoids expensive ALTER TABLE operations, which can lead to redesigning the entire set of schemas and migrating the existing database to the new schema.
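
The contrast between per-document fields and a table-wide schema change can be sketched as follows. This is only an illustration under stated assumptions: plain Python dictionaries stand in for MongoDB documents, the standard-library sqlite3 module stands in for a relational database, and the accounts collection and its field names are hypothetical.

```python
import sqlite3

# Document model (assumption: dicts stand in for BSON documents).
# Documents in the same collection may carry different fields.
accounts = [
    {"_id": 1, "name": "abc", "age": 26},
    {"_id": 2, "name": "xyz", "address": "indore"},
]

# Adding a field to one document leaves the other documents untouched.
accounts[0]["email"] = "abc@example.com"

# Relational model (assumption: SQLite stands in for an RDBMS).
# Adding the same field requires a schema change that affects every row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 'abc', 26)")
conn.execute("INSERT INTO accounts VALUES (2, 'xyz', NULL)")
conn.execute("ALTER TABLE accounts ADD COLUMN email TEXT")
conn.execute("UPDATE accounts SET email = 'abc@example.com' WHERE id = 1")

# Column names after the change, and the new column's value for each row.
columns = [row[1] for row in conn.execute("PRAGMA table_info(accounts)")]
rows = conn.execute("SELECT id, email FROM accounts ORDER BY id").fetchall()
```

In the relational version every row gains the new column, with NULL where no value exists, whereas in the document version only the document that needs the field carries it.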

 

MongoDB documents hold all data for a given record in a single document, in contrast to relational databases, where data for a single record is spread across different tables. Data in MongoDB is therefore more localized, which reduces the need to JOIN separate tables [3]. Joins are avoided in MongoDB by embedding documents within a document. The result is increased performance and scalability, as a single read from the database can retrieve the entire document. MongoDB also provides horizontal scalability through a technique called auto-sharding, which, together with replication, limits the impact of any single node failure. Several research studies report that MongoDB is considerably faster than MS SQL in writing (inserts/updates) and reading (retrieval) [1].
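
A minimal sketch of the difference between joining tables and reading one embedded document is given below. It rests on assumptions: SQLite (via the standard-library sqlite3 module) stands in for the relational database, a nested Python dictionary stands in for a MongoDB document, and the customers and orders names are hypothetical.

```python
import sqlite3

# Relational layout: one record is spread over two tables and
# reassembled with a JOIN at read time.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'abc');
    INSERT INTO orders VALUES (10, 1, 'book'), (11, 1, 'pen');
""")
joined = conn.execute(
    "SELECT c.name, o.item FROM customers c "
    "JOIN orders o ON o.customer_id = c.id "
    "WHERE c.id = ? ORDER BY o.id",
    (1,),
).fetchall()

# Document layout: the orders are embedded in the customer document,
# so a single lookup returns the whole record.
customer_doc = {
    "_id": 1,
    "name": "abc",
    "orders": [{"item": "book"}, {"item": "pen"}],
}
embedded = [(customer_doc["name"], order["item"]) for order in customer_doc["orders"]]
```

Both approaches yield the same pairs of customer name and item; the embedded form simply stores them together so that no second table has to be consulted.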

 

II. NoSQL Databases (Classification)

 

NoSQL databases are classified as [6]:

i. Document-oriented store

ii. Key-value store

iii. Column-oriented store

iv. Graph-oriented store

 

A. Document-Oriented

Document-oriented stores are similar to key-value stores, with the distinction that values are visible and can be queried. Data formats such as JSON or XML are used to store document-oriented datasets. Document stores provide a flexible schema, so documents are not required to hold the same information or follow the same structure. Unlike a key-value store, a document store offers indexing and querying based on values. These databases store their data in the form of documents, each identified by a unique key, much as entries are in key-value databases. Document store databases are schema-free and variable in nature [6][14].

 

Other characteristics of document-oriented stores are horizontal scalability and sharding across the cluster nodes. Examples of document-oriented stores are MongoDB, Amazon DynamoDB, CouchDB, Couchbase, MarkLogic, OrientDB, RethinkDB, Cloudant, RavenDB and Microsoft Azure DocumentDB [6].

 

B. Key-Value

A key-value store, as the name suggests, is a combination of two entities: keys and values. It is one of the traditional database types from which the other NoSQL databases have developed. It has a concrete application programming interface (API) and allows its users to store data in a schemaless manner. The stored data has two parts. The key is a unique identifier for a particular data entry; once a key has been used it must not be repeated. The value is the data that the key points to [14].

 

The key-value store is the least complex storage paradigm among NoSQL databases. Key-value stores provide the best performance on basic CRUD (Create, Read, Update and Delete) operations. They also provide scalability and sharding across cluster nodes. Sharding is a horizontal partitioning technique used to partition a large amount of data into smaller, easily manageable parts (shards). However, key-value databases are less flexible for querying and indexing complex and connected data. Queries in this category are usually based on keys rather than values. Examples of key-value stores are Redis, Memcached, Riak KV, Hazelcast, Ehcache, OrientDB, Aerospike and Amazon SimpleDB [6].
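
Key-based access and sharding can be illustrated with a small in-memory sketch. It is not how any particular product implements sharding: the ShardedKeyValueStore class and the shard_for helper are hypothetical names, and hashing the key modulo the shard count is just one simple partitioning strategy assumed for the example.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index with a stable hash (assumed strategy)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedKeyValueStore:
    """Toy key-value store whose entries are partitioned across shards."""

    def __init__(self, shard_count: int):
        # Each shard is an independent dictionary, as if it lived on its own node.
        self.shards = [{} for _ in range(shard_count)]

    def _shard(self, key: str) -> dict:
        return self.shards[shard_for(key, len(self.shards))]

    def put(self, key: str, value) -> None:
        # Create and update are the same operation: the key is unique.
        self._shard(key)[key] = value

    def get(self, key: str, default=None):
        # Reads are addressed by key only; values are opaque to the store.
        return self._shard(key).get(key, default)

    def delete(self, key: str) -> None:
        self._shard(key).pop(key, None)


store = ShardedKeyValueStore(shard_count=4)
for user_id in range(100):
    store.put(f"user:{user_id}", {"visits": user_id})
store.put("user:7", {"visits": 700})  # overwrites the existing entry
store.delete("user:99")
```

Every operation touches exactly one shard, which is why such stores scale out easily, and also why queries on values rather than keys are awkward: a value lookup would have to scan every shard.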

 

C. Column-Oriented

Column-oriented databases are also referred to as column family databases. Column-oriented stores are suitable when there is a need to handle sparse data in large amounts. Column stores in NoSQL are essentially hybrid row/column stores, unlike pure relational column databases. Although they make use of columnar extensions, they store data in an extensively distributed architecture rather than in tables. Columns are grouped according to the relationship of the data. In column stores, each key is associated with one or more attributes (columns). A column-oriented data store stores its data in such a way that it can be aggregated rapidly with less I/O activity. It focuses on high scalability in data storage. The data is stored in the sorted sequence of the column family.
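
Why a columnar layout helps aggregation can be seen in a small sketch. The records and field names below are hypothetical, and Python lists stand in for on-disk storage, so the example shows only the access pattern, not real I/O behaviour.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"user_id": 1, "region": "north", "amount": 120},
    {"user_id": 2, "region": "south", "amount": 80},
    {"user_id": 3, "region": "north", "amount": 200},
]

# Column-oriented layout: the values of each attribute are stored together.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Summing over rows visits every record, including fields that are not needed.
total_from_rows = sum(row["amount"] for row in rows)

# Summing over the column layout reads only the one attribute being aggregated.
total_from_columns = sum(columns["amount"])
```

Both totals are identical; the difference is that the columnar version never has to load the user_id or region values to compute the sum.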

 

Compared with row-oriented databases, column-oriented databases have better capabilities for managing data and storage space. Horizontal scalability is one of their notable characteristics. Prominent use cases for column-oriented databases include blogging and event logging. Examples of column-oriented stores are HBase, Accumulo, Hypertable, Google Cloud Bigtable, Sqrrl, ScyllaDB and MapR-DB [6][14].

 

D. Graph-Oriented

Graph databases evolved from graph theory and are designed to represent entities and their relationships as nodes and edges respectively. The graph consists of nodes and edges, where nodes act as the objects and edges act as the relationships between the objects. Graph databases replace relational tables with structured relational graphs of interconnected key-value pairings. The graph also holds properties related to nodes. It uses a technique called index-free adjacency, i.e. every node holds a direct pointer to its adjacent nodes. Millions of records can be traversed using this technique. In a graph database, the focus is on the relations established between data using pointers. Graph databases provide schema-less and efficient storage of semi-structured data. Queries are expressed as traversals, which can make graph databases faster than relational databases for connected data. They are easy to scale and whiteboard friendly. Graph databases support the ACID properties and support rollback [14]. Because graphs have expressive power and strong modeling characteristics, many real-world scenarios can be represented as graphs and therefore modeled in a graph database. Graph data can be queried more efficiently because intensive joins are not necessarily required in graph query languages [6].
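
Index-free adjacency and traversal-style queries can be sketched in a few lines. The Node class, the reachable function and the FOLLOWS relationship are hypothetical names chosen for illustration; real graph databases expose their own query languages and storage formats.

```python
from collections import deque


class Node:
    """A graph node that holds direct references to its neighbours."""

    def __init__(self, name, **properties):
        self.name = name
        self.properties = properties  # properties attached to the node
        self.edges = []               # list of (relationship, Node) pairs

    def connect(self, relationship, other):
        self.edges.append((relationship, other))


def reachable(start, relationship):
    """Breadth-first traversal that follows one relationship type.

    Each hop follows a pointer stored on the node itself, so no
    global index lookup or join is needed to find the neighbours.
    """
    seen = {start.name}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for rel, neighbour in node.edges:
            if rel == relationship and neighbour.name not in seen:
                seen.add(neighbour.name)
                order.append(neighbour.name)
                queue.append(neighbour)
    return order


alice = Node("alice", city="indore")
bob = Node("bob")
carol = Node("carol")
dave = Node("dave")
alice.connect("FOLLOWS", bob)
bob.connect("FOLLOWS", carol)
carol.connect("FOLLOWS", alice)  # a cycle, handled by the seen set
dave.connect("FOLLOWS", alice)

followed_from_alice = reachable(alice, "FOLLOWS")
```

Starting from alice the traversal visits bob and then carol; dave is never visited because no edge leads to him, which shows that the cost of a traversal depends on the part of the graph actually explored rather than on the total number of records.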

 

Fig. 1 NoSQL database types

III. Comparison - Oracle and MongoDB

 

MongoDB is a NoSQL database management system released in 2009. It stores data as JSON-like documents with dynamic schemas (the format is called BSON). NoSQL is a class of database management systems that differs from traditional relational databases in that data is not stored using fixed table schemas. Its main purpose is to serve as a database system for huge web-scale applications, where such systems outperform traditional relational databases.

 

MongoDB focuses on four factors: flexibility, power, speed and ease of use. It supports indexing and offers drivers for multiple programming languages. The database model of MongoDB is schemaless and document-oriented, whereas Oracle Database supports the relational model. Oracle Database has a standard query language, SQL, while MongoDB is queried through API calls.

 

MongoDB has aggregation functions. A built-in map-reduce function can be used to aggregate large amounts of data. MongoDB accepts larger values: Oracle Database supports a maximum value size of 4 KB, whereas MongoDB has a maximum value size of 16 MB. The integrity model used by Oracle Database is ACID, while MongoDB uses BASE. MongoDB offers consistency, durability and conditional atomicity. Oracle Database offers integrity features that MongoDB does not, such as isolation, transactions, referential integrity and revision control. In terms of distribution, both MongoDB and Oracle Database are horizontally scalable and support data replication. While MongoDB offers sharding support, Oracle Database does not. Both MongoDB and Oracle Database are cross-platform database management systems. Oracle Database was written in C++, C and Java, while MongoDB was written in C++. MongoDB is free to use, while a license is needed to use Oracle Database [17].
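
The atomicity part of the ACID model mentioned above can be illustrated with a short sketch. It makes several assumptions: SQLite (through the standard-library sqlite3 module) stands in for a relational database such as Oracle, and the accounts table, the CHECK constraint and the transfer function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move money between two accounts as one all-or-nothing transaction."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back every statement in it if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone too.
        return False


def balances(conn):
    return conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()


failed = transfer(conn, source=1, target=2, amount=500)
after_failed = balances(conn)
succeeded = transfer(conn, source=1, target=2, amount=30)
after_success = balances(conn)
```

In the failed transfer the credit to the target account is applied first and then rolled back when the debit is rejected, so no partial update is ever visible. Stores that follow the BASE model typically relax such guarantees in exchange for availability and scale.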

 

A. Features of MongoDB

•       MongoDB provides high performance.

•       Has a rich query language, supports all major CRUD operations, and provides aggregation features.

•       MongoDB provides high availability through its automatic replication feature. Data is restored from a backup (replica) in case of server failure.

•       Provides an automatic failover mechanism.

•       Sharding is a major feature that makes horizontal scalability possible.

•       A record in MongoDB is a document.

•       Holds collections of documents.

B. Advantages of MongoDB

•       MongoDB is simple and very easy to install and set up.

•       MongoDB is a schema-less database.

•       The document query language supported by MongoDB plays a vital role in supporting dynamic queries.

•       Very easy to scale.

•       In MongoDB no complex joins are needed, because data is stored in BSON format as key-value pairs.

•       It uses internal memory for storage of data, which makes faster access to the data possible.

•       In MongoDB performance can be enhanced easily compared with relational databases.

•       No need to map the application objects to the data objects.

•       MongoDB supports sharding, which results in horizontal scaling. Relational databases support vertical scaling.

 

Table 2 Comparison of MongoDB and Oracle [14]

Key Feature | Oracle | MongoDB
Data Model | Data stores in form of tables. Follow fixed schema structure. | Follow document based model for representing the data. It is schema less and can handle unstructured data efficiently
Scalability | Providing both vertical as well as horizontal scalability | Provide an effective horizontal scalability
Transaction reliability | Follow ACID rule hence are more reliable | Follow BASE rule
Complexity | More complex | Less complex
Security | Very secure mechanism | Less secure
Crash Recovery | Ensure crash recovery through its ACID properties | Depends on replication as back up to recover from crash.
Cloud | Not suitable for cloud applications | Suitable for cloud applications
Big Data Handling | Unable to handle big data problem | Designed to deal with the Big Data problem effectively.

 

IV. CRUD Operations

 

This section focuses on the basic CRUD operations. Two databases, one in Oracle and one in MongoDB, are created to compare how data is created, selected, inserted and deleted in each [21]. MongoDB is a fast-responding database management system and is a good choice when a simple database with very fast responses is required. MongoDB supports all major CRUD operations and provides aggregation features. The major CRUD operations are as follows:

 

Table 3 CRUD Operations

Operation | Oracle | MongoDB
Create table | CREATE TABLE accounts (id NUMBER, first_name VARCHAR2(64) NULL, last_name VARCHAR2(45) NULL, PRIMARY KEY (id)); | db.accounts.insert({name: "abc", age: 26, address: "indore"})
Delete a table | DROP TABLE accounts; | db.accounts.drop()
Insert | INSERT INTO accounts (name, age, address) VALUES ('abc', 26, 'indore') | db.accounts.insert({name: "abc", age: 26, address: "indore"})
Select | SELECT * FROM accounts | db.accounts.find()
Select fields | SELECT first_name, last_name FROM accounts | db.accounts.find({}, {first_name: 1, last_name: 1})
Conditional select | SELECT * FROM accounts WHERE dep_wid = 'D' AND balance > 5000 | db.accounts.find({dep_wid: "D", balance: {$gt: 5000}})
Ordered select ascending | SELECT * FROM accounts ORDER BY user_id ASC | db.accounts.find({}).sort({user_id: 1})
Ordered select descending | SELECT * FROM accounts ORDER BY user_id DESC | db.accounts.find({}).sort({user_id: -1})
Select with count | SELECT COUNT(*) FROM accounts | db.accounts.count()
Update | UPDATE student SET section = 'F' WHERE marks < 30; | db.student.update({marks: {$lt: 30}}, {$set: {section: "F"}}, {multi: true})
Delete | DELETE FROM student | db.student.remove({})
Delete with condition | DELETE FROM student WHERE section = 'a' | db.student.remove({section: "a"})

V. Related Work

Several database technologies have been developed to handle the present explosive growth of data. Many NoSQL databases, such as MongoDB, Cassandra, HBase and Couchbase, have evolved over time to deal with huge volumes of unstructured data. The study in [1] analyzes the deployment of MongoDB, a popular NoSQL database, in different industrial application areas in order to better understand its scope and to explore the reasons for employing it. For unstructured big data web or mobile applications that require horizontal scaling and fast, rich querying capabilities, MongoDB is the most preferred NoSQL database [1].

The work in [2] examines how the execution times of different document databases diverge as the number of records increases. It asks which NoSQL document database performs better for data retrieval, update, creation and deletion at different record counts. Relational databases have so far been used to store application data, but there is now a need to store and manage amounts of data that relational databases cannot hold, and NoSQL technology overcomes this problem. The operations are performed to compare two NoSQL databases, MongoDB and CouchDB. The reported results indicate that CouchDB is more powerful than MongoDB at loading and processing big data, and processes it considerably faster [2].

NoSQL systems are relatively new and most of them implement their own query language or interface. Developers need to learn to use these constructs.
If a company needs to train its employees in a new technology, this also adds to the cost of the database system. Eventually a common query language for NoSQL data stores may emerge. One should carefully research whether a NoSQL database is reasonable to use in a given application scenario. However, there is no sign of NoSQL databases disappearing. In any case these systems need to be monitored carefully, as they will become more mature and may surpass traditional relational database systems in even more domains. Because of the vast number of available NoSQL data stores, there will eventually be some consolidation in the market [4][13].

Developers have to evaluate their data in order to identify a suitable data model and avoid unnecessary complexity due to transformation or mapping tasks. The queries that the database should support have to be considered at the same time, because these requirements massively influence the design of the data model. Since no common query language is available, every store differs in its supported query feature set. Developers then have to trade off between high performance through partitioning and load-balanced replica servers, high availability supported by asynchronous replication, and strict consistency. If partitioning is required, the choice of partitioning strategy depends on the supported queries and cluster complexity. Besides these requirements, durability mechanisms, community support and useful features such as versioning also influence the database selection.
In general, key-value stores should be used for very fast and simple operations, document stores offer a flexible data model with great query possibilities, column family stores are suitable for very large datasets that have to be scaled to a large size, and graph databases should be used in domains where entities are as important as the relationships between them [8].

NoSQL databases are database management systems that use few or no SQL commands to query, store and delete data. They are used in situations for which traditional relational database management systems were not designed, such as horizontal scaling and storing large amounts of complex objects that are difficult to store in tables. NoSQL has some advantages when used for large amounts of data, and may be a good option for applications that handle large transactions to persist complex data objects [7].

NoSQL databases differ from traditional databases in many aspects, such as structured schema, transaction methodology, complexity, crash recovery and the storage of big data. These features have led to the use of NoSQL in cloud computing and possibly in data warehouses. NoSQL falls short in security, mainly because its designers focused on purposes other than security and because NoSQL database solutions are still new and have not yet reached full maturity. As a result, many security vulnerabilities can be found in them [12][15].

VI. Conclusion

This paper explores NoSQL databases, their types, key features and the need for them. It compares them with relational databases and lists various advantages and features of NoSQL databases. A comparative study of Oracle Database and the NoSQL database MongoDB has also been presented. Basic CRUD operations in MongoDB and Oracle have been analyzed.

VII. Future Work

MongoDB is well suited to big data applications and satisfies the needs of this digital world, but it still lacks maturity compared with relational databases. Relational databases have a standard development process.
NoSQL lacks a standard development methodology. In the future there is an urgent need to investigate development methodologies for NoSQL databases as well.

References

1. Abraham, Sunu Mary. "Comparative Analysis of MongoDB Deployments in Diverse Application Areas." International Journal of Engineering and Management Research (IJEMR) 6.1 (2016): 21-24.
2. Chauhan, Ashutosh Singh, Anjali Kedawat, and Pooja Parnami. "An Approach to Implement Map Reduce with NoSQL Databases."
3. Das, T. K., and P. Mohan Kumar. "Big data analytics: A framework for unstructured data analysis." International Journal of Engineering Science & Technology 5.1 (2013): 153.
4. Eckerstorfer, Florian. "Performance of NoSQL Databases." (2011).
5. Faraj, Azhi, Bilal Rashid, and Twana Shareef. "Comparative study of relational and non-relations database performances using Oracle and MongoDB systems." Journal Impact Factor 5.11 (2014): 11-22.
6. Farooq, Hina, Azka Mahmood, and Javed Ferzund. "Do NoSQL Databases Cope with Current Data Challenges."
7. Franco, M., and M. Nogueira. "Using NoSQL Database to Persist Complex Data Objects." Instituto de Informatica, Universidade Federal de Goias (UFG), VIII Seminário de Pós-Graduação da UFG-Mestrado (2011).
8. Hecht, Robin, and Stefan Jablonski. "NoSQL evaluation: A use case oriented survey." Cloud and Service Computing (CSC), 2011 International Conference on. IEEE, 2011.
9. Heripracoyo, Sulistyo, and Roni Kurniawan. "Big Data Analysis with MongoDB for Decision Support System." TELKOMNIKA (Telecommunication Computing Electronics and Control) 14.3 (2016): 1083-1089.
10. Li, Yishan, and Sathiamoorthy Manoharan. "A performance comparison of SQL and NoSQL databases." Communications, computers and signal processing (PACRIM), 2013 IEEE pacific rim conference on. IEEE, 2013.
11. Mapanga, Innocent, and Prudence Kadebu. "Database management systems: A nosql analysis." International Journal of Modern Communication Technologies & Research (IJMCTR) 1 (2013): 12-18.
12. Mohamed, Mohamed A., Obay G. Altrafi, and Mohammed O. Ismail. "Relational vs. nosql databases: A survey." International Journal of Computer and Information Technology 3.03 (2014): 598-601.
13. Nayak, Ameya, Anil Poriya, and Dikshay Poojary. "Type of NOSQL databases and its comparison with relational databases." International Journal of Applied Information Systems 5.4 (2013): 16-19.
14. Swaroop, Pankhudi, and K. R. S. S. N. R. Vijit Gupta. "NoSQL Paradigm and Performance Evaluation." Scientific Society of Advanced Research and Social Change 3 (2016).
15. Zaki, Asadulla Khan. "NoSQL databases: new millennium database for big data, big users, cloud computing and its security challenges." International Journal of Research in Engineering and Technology (IJRET) 3.15 (2014): 403-409.
16. Zvarevashe, Kudakwashe, and Tatenda Trust Gotora. "A Random Walk through the Dark Side of NoSQL Databases in Big Data Analytics." International Journal of Science and Research 3.6 (2014): 506-509.
17. Boicea, Alexandru, Florin Radulescu, and Laura Ioana Agapin. "MongoDB vs Oracle--database comparison." Emerging Intelligent Data and Web Technologies (EIDWT), 2012 Third International Conference on. IEEE, 2012.
18. Priyanka, AmitPal. "A Review of NoSQL Databases, Types and Comparison with Relational Database." International Journal of Engineering Science 4963 (2016).
19. Győrödi, Cornelia, et al. "A comparative study: MongoDB vs. MySQL." Engineering of Modern Electric Systems (EMES), 2015 13th International Conference on. IEEE, 2015.
20. Simanjuntak, Humasak TA, et al. "QUERY RESPONSE TIME COMPARISON NOSQLDB MONGODB WITH SQLDB ORACLE." JUTI: Jurnal Ilmiah Teknologi Informasi 13.1 (2015): 95-105.
21. Truică, Ciprian Octavian, Alexandru Boicea, and Ionut Trifan. "CRUD operations in MongoDB." Proceedings of the 2013 international Conference on Advanced Computer Science and Electronics Information, Ed. Atlantis Press. 2013.

Authors Profile