1. What is Hadoop and why is it a big deal?
Hadoop is an open source software framework for storing and processing very large data sets across clusters of computers. Because it is open source, users can write their own application code and tailor it to their big data analytics needs. Hadoop is an Apache project developed by a worldwide community of developers. It is based on earlier work published by Google on its MapReduce programming model, which splits a job into pieces that run in parallel across a cluster of computers. Hadoop is a big deal because it is an open source project that lets businesses write custom code to analyze complex data sets that would otherwise be hard to make sense of using standard data tables. Many large companies already use Hadoop, but most businesses are not ready to use it just yet because of the high level of analytic expertise and training it requires.
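To make the MapReduce idea a little more concrete, here is a minimal sketch of the classic word count example in plain Python. It runs on a single machine and does not use Hadoop at all; the function names (map_phase, shuffle, reduce_phase, word_count) are my own and only illustrate the three steps that a real cluster would spread across many computers.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # On a real cluster, each document (or file split) would be mapped
    # on a different machine; here everything runs in one process.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data is big", "hadoop handles big data"]
    print(word_count(docs))
```

The point of the split is that the map and reduce steps only ever look at one document or one key at a time, which is what lets the work be divided among many machines.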
2. Who are Cloudera?
Cloudera is a company that specializes in Apache Hadoop software and enterprise-level support services around it. It also contributes to Apache projects related to Hadoop. Cloudera offers two products: Cloudera Enterprise and Cloudera's Distribution including Apache Hadoop.
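3. What is Pig?
Pig is a high-level data flow platform used in conjunction with Hadoop. Its language is called Pig Latin. Pig Latin is its own scripting language rather than a form of Java; Pig itself is written in Java and compiles Pig Latin scripts into MapReduce jobs that run on Hadoop, which allows for fast ad hoc analysis of data sets. Users can also create their own functions, for example in Java, for special-purpose data processing.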
4. What is Hive?
Hive functions as a data warehouse that allows for query-based analysis of large data sets. It uses a SQL-like language (HiveQL) for its queries. It works alongside the Hadoop file system, and just like Hadoop it is open source and developed under Apache.
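As a rough illustration of what "SQL-like" means here, the sketch below runs a simple grouped count of the kind an analyst might write in a Hive query. It uses Python's built-in sqlite3 module as a stand-in, not Hive itself, and the page_views table and its columns are made up for the example. The difference with Hive is that a similar-looking query would be run over files stored in Hadoop instead of a small local database.

```python
import sqlite3


def top_pages(rows):
    """Count views per page with a SQL aggregate query.

    `rows` is a list of (user_name, page) tuples. The table and column
    names are hypothetical and exist only for this example.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE page_views (user_name TEXT, page TEXT)")
        conn.executemany("INSERT INTO page_views VALUES (?, ?)", rows)
        query = (
            "SELECT page, COUNT(*) AS views "
            "FROM page_views "
            "GROUP BY page "
            "ORDER BY views DESC, page"
        )
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [("alice", "/home"), ("bob", "/home"), ("alice", "/about")]
    for page, views in top_pages(sample):
        print(page, views)
```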
5. What is Cassandra?
Cassandra is an Apache open source database management system. It can handle large volumes of data spread across many different servers. It started out at Facebook as a way to power the inbox search feature. It is a NoSQL database, a design chosen because traditional SQL-based databases can be slow when dealing with very large data sets.
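One way to picture how data ends up spread across many servers is a hash ring: each row key is hashed to a number, and the number decides which server stores the row. The sketch below is a simplified version of that idea in plain Python. It is not Cassandra's actual code, it leaves out replication and other details, and the server names and the HashRing class are invented for the example.

```python
import bisect
import hashlib


def token(value):
    """Hash a string to an integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy hash ring: each node owns the tokens up to its own position."""

    def __init__(self, nodes):
        self._ring = sorted((token(node), node) for node in nodes)
        self._tokens = [position for position, _ in self._ring]

    def node_for(self, key):
        """Return the node responsible for storing the given row key."""
        # Find the first node at or after the key's token, wrapping
        # around to the start of the ring when the key hashes past the end.
        index = bisect.bisect_left(self._tokens, token(key)) % len(self._ring)
        return self._ring[index][1]


if __name__ == "__main__":
    ring = HashRing(["server-a", "server-b", "server-c"])  # hypothetical names
    for user in ["alice", "bob", "carol", "dave"]:
        print(user, "->", ring.node_for(user))
```

Because the placement depends only on the hash of the key, any server can work out where a row lives without asking a central coordinator.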
6. What is Mahout?
Mahout is a suite of machine learning libraries designed to be scalable and robust. It is another Apache open source project, designed to work with Hadoop. Hadoop, whose logo is an elephant, is associated with big data, and "mahout" is the word for a person who drives an elephant. The elephant is Hadoop, and Mahout wants to be the driving force behind it, but not to lead the development of Hadoop.
Movie Review
Ip Man
Ip Man is an overly dramatized biographical movie about Yip “Ip” Man, a grandmaster of the Wing Chun fighting style. The movie follows Ip Man from the days just before Japan invaded China in the 1930s through his struggle to provide for his family. The movie is also about standing up to oppression and about equality between nations and races. It is packed with well-executed martial arts sequences by Donnie Yen as Master Ip.
Reasons I like the movie
1. Fast-paced execution of Wing Chun kung fu
2. It’s based on the life of the master and mentor of the great Bruce Lee
3. It has a great message about equality and pride, even though it might come across as pro-Chinese to some people