Hadoop Beyond MapReduce: Introducing Kitten

Started by CrackSmokeRepublican, June 29, 2012, 02:34:15 AM


CrackSmokeRepublican

Keep an eye open... --CSR

Quote: Hadoop Beyond MapReduce, Part 1: Introducing Kitten

    by Josh Wills
    June 26, 2012


This week, a team of researchers at Google will be presenting a paper describing a system they developed that can learn to identify objects, including the faces of humans and cats, from an extremely large corpus of unlabeled training data. It is a remarkable accomplishment, both in terms of the system's performance (a 70% improvement over the prior state-of-the-art) and its scale: the system runs on over 16,000 CPU cores and was trained on 10 million 200×200 pixel images extracted from YouTube videos.

Doug Cutting has described Apache Hadoop as "the kernel of a distributed operating system." Until recently, Hadoop has been an operating system that was optimized for running a certain class of applications: the ones that could be structured as a short sequence of MapReduce jobs. Although MapReduce is the workhorse programming framework for distributed data processing, there are many difficult and interesting problems (including combinatorial optimization problems, large-scale graph computations, and machine learning models that identify pictures of cats) that can benefit from a more flexible execution environment.