The best-known technology used for Big Data is Hadoop. It is a large-scale batch data processing system.
Published by Michael Lloyd. Modified over 4 years ago.
Top 3 listed below.
Data Services: projects that store, process, and access data in many ways.
Pig: Scripting language for Hadoop (Pig Latin) for analyzing large data sets.
The infrastructure layer consists of a compiler that produces MapReduce jobs and executes them on the Hadoop cluster. Appeals to developers more familiar with scripting languages and SQL than Java.
HCatalog: Metadata and table management. Enables data sharing among other tools such as Pig, MapReduce, and Hive.
Commonly used for predictions and recommendations (intelligent applications).
Flume: Collects log files and events. Primary use case is to move web log files into Hadoop.
Operational Services: projects for operations and management.
Ambari: Management and monitoring. Makes clusters easy to operate and simplifies provisioning.
Oozie: Workflow and scheduling. Coordinates jobs written in multiple languages and allows the order of jobs and the dependencies between them to be specified.
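The idea of specifying order and dependencies between jobs can be sketched in a few lines of Python. This is only an illustration of dependency-ordered scheduling, not Oozie's API (Oozie defines its workflows in XML). The job names and the `execution_order` helper are hypothetical, made up for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each job maps to the set of jobs that must finish first.
workflow = {
    "ingest_logs": set(),
    "clean_logs": {"ingest_logs"},
    "pig_sessionize": {"clean_logs"},
    "hive_report": {"clean_logs", "pig_sessionize"},
    "export_report": {"hive_report"},
}


def execution_order(deps):
    """Return one valid run order in which every job follows its prerequisites.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


order = execution_order(workflow)
print(order)
```

A real scheduler would also handle retries, timing, and running independent jobs in parallel. The sketch only shows how declared dependencies determine a valid run order.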
Many additional projects exist, either Apache or company-specific. An example of the HDP (Hortonworks Data Platform) framework is below.