Hadoop vs RDBMS

SQL Server 2012 can integrate with Hadoop. Hadoop is not a relational database system, nor is it a replacement or substitute for SQL Server. It is designed mainly to handle unstructured and semi-structured data. SQL Server can store such data as XML or as FILESTREAM data, but it runs into size limits and needs a lot of processing power to do so. Hadoop came into existence less than a decade ago and quickly became popular in the field of big data. It was inspired by work at Google, where similar technology was used to index textual information.
Facebook (2010) reported running one of the largest Hadoop clusters, storing more than 20 petabytes of data. Hadoop is written in Java and runs on large clusters of servers. Adding servers to a cluster or removing them is easy, which makes Hadoop scalable: the more servers, the more processing power. Hadoop's developers have also been making the platform independent of the operating system, so it can now run on Windows as well.
Relational databases came into existence in the 1970s through the work of Edgar F. Codd at IBM, not long before the arrival of home computers. The relational concept was later adopted by Oracle, Informix and others. At about the same time, SQL was developed by Donald D. Chamberlin and Raymond F. Boyce, and it became the standard language for querying and analysing stored data. Hadoop, by contrast, came into existence in the second half of the first decade of the 21st century. It was first released as an Apache project and was later adopted by vendors such as Cloudera and Hortonworks. The differences between the two approaches are shown in the table below:
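
The scaling claim above can be illustrated with a small sketch. It is not Hadoop code and does not reproduce how Hadoop distributes data; it only assumes that records are spread over a number of hypothetical servers by a stable hash, so that each server handles a smaller share as servers are added. The names `assign_node`, `partition` and `cluster_count` are made up for this example.

```python
import zlib
from collections import Counter


def assign_node(key: str, num_nodes: int) -> int:
    """Pick a node for a key using a stable hash (crc32 gives the same value on every run)."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def partition(records, num_nodes):
    """Spread records over num_nodes buckets, one bucket per hypothetical server."""
    buckets = [[] for _ in range(num_nodes)]
    for record in records:
        buckets[assign_node(record, num_nodes)].append(record)
    return buckets


def cluster_count(records, num_nodes):
    """Each node counts its own bucket; the partial counts are then merged."""
    total = Counter()
    for bucket in partition(records, num_nodes):
        total.update(Counter(bucket))  # the work a single node would do locally
    return total


# Hypothetical input: 10,000 records spread over 50 distinct keys.
records = [f"user{i % 50}" for i in range(10_000)]

for n in (1, 4, 16):
    largest = max(len(bucket) for bucket in partition(records, n))
    print(f"{n:>2} nodes: largest bucket holds {largest} records")
```

The merged result is the same however many nodes are used; only the amount of work per node changes, which is the sense in which more servers mean more processing power.
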

| No. | Criterion | RDBMS | Hadoop |
|---|---|---|---|
| 1 | Technology | An RDBMS is a database system used for storing data. | Hadoop is a framework used to handle large volumes of data. |
| 2 | Type of data used | Uses structured data and provides simple storage and analysis of it. Not used for semi-structured or unstructured data. | Uses semi-structured or unstructured data from a variety of sources such as e-mail, videos, photos and social media posts. It can also join, aggregate and analyse such data easily. |
| 3 | Storage | Stores data in tables, and the data is defined by a schema. These are static in nature. | Stores data as files in HDFS and processes it as key-value pairs. These are dynamic in nature. |
| 4 | Scalability | Suited to a steady workload. If scaling is required, more horsepower (CPU and RAM) is added to the single machine that holds the dataset. | A good fit for companies and businesses whose data volumes vary over time. Hadoop needs more CPU and RAM in total than an RDBMS, but spreads the work over many low-powered machines working in parallel. |
| 5 | Querying | Uses the SQL query language. | Uses MapReduce programs. Tools built on top of MapReduce, such as Hive, accept SQL-like commands. |
| 6 | Size of dataset | Typically handles gigabytes of data. | Used for large datasets of up to a few petabytes. |
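
The querying difference in the table can be made concrete with a word count done both ways. This is a minimal sketch, not real Hadoop code: the relational side uses Python's built-in `sqlite3` module as a stand-in for an RDBMS, and the MapReduce side imitates the map, shuffle and reduce steps in plain Python. The sample `lines` and the function names are assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical input: two short lines of text.
lines = ["big data big clusters", "big data small schema"]

# RDBMS style: load the words into a table with a fixed schema, then query with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO words (word) VALUES (?)",
    [(word,) for line in lines for word in line.split()],
)
sql_counts = dict(conn.execute("SELECT word, COUNT(*) FROM words GROUP BY word"))
conn.close()


# MapReduce style: the same count expressed as map, shuffle and reduce steps.
def map_phase(line):
    """Emit a (word, 1) key-value pair for every word in a line."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single total."""
    return key, sum(values)


mapped = [pair for line in lines for pair in map_phase(line)]
mr_counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())

print(sql_counts == mr_counts)
```

In the SQL version the grouping is declared in one statement and the database decides how to run it. In the MapReduce version the programmer writes the map and reduce steps, and the framework can run them on many machines at once.
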

The RDBMS approach follows the ACID properties, where ACID is an acronym for Atomicity, Consistency, Isolation and Durability; this makes it well suited to transactions. Hadoop, in contrast, is generally associated with the BASE approach: Basically Available, Soft state and Eventually consistent. The trade-off behind BASE is described by the CAP theorem (Consistency, Availability, Partition tolerance), which says that a distributed system cannot guarantee all three properties at once and which underlies the NoSQL approach. MySQL is a relational database rather than a NoSQL system; Hadoop has a tool called Hive, which offers a SQL-like query language and so performs a task similar to that of MySQL.
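
Atomicity, the first of the ACID properties, can be seen in a short sketch. It uses Python's built-in `sqlite3` module as a stand-in for an RDBMS; the `accounts` table, the balances and the `transfer` helper are invented for illustration. A transfer that would overdraw an account is rolled back as a whole, so the credit to the other account never becomes visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates are committed together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit was undone too.
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


failed = transfer(conn, "alice", "bob", 500)   # alice cannot cover this amount
after_failed = balances(conn)
ok = transfer(conn, "alice", "bob", 30)        # this one fits
after_ok = balances(conn)
print(failed, after_failed, ok, after_ok)
```

The failed transfer credits bob first and only then fails on alice, yet bob's balance is unchanged afterwards. That all-or-nothing behaviour is what makes the relational approach a natural fit for transactions.
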
