Parallel R: Data Analysis in the Distributed World

Author:

Q Ethan McCallum

Publisher:

Shroff/O'Reilly

Rs275

Availability: Available

Shipping-Time: Usually Ships 5-9 Days

Publisher: Shroff/O'Reilly
Publication Year: 2012
ISBN-13: 9789350236802
ISBN-10: 9789350236802
Binding: Paperback
Number of Pages: 140
Language: English
Dimensions (cm): 24 x 18 x 1
Weight (g): 500

It's tough to argue with R as a high-quality, cross-platform, open source statistical software product, unless you're in the business of crunching Big Data. This concise book introduces you to several strategies for using R to analyze large datasets, including three chapters on using R and Hadoop together. You'll learn the basics of Snow, Multicore, Parallel, Segue, RHIPE and Hadoop Streaming, including how to find them, how to use them, when they work well and when they don't.

With these packages, you can overcome R's single-threaded nature by spreading work across multiple CPUs, or offloading work to multiple machines to address R's memory barrier.

Snow: works well in a traditional cluster environment
Multicore: popular for multiprocessor and multicore computers
Parallel: part of the upcoming R 2.14.0 release
R+Hadoop: provides low-level access to a popular form of cluster computing
RHIPE: uses Hadoop's power with R's language and interactive shell
Segue: lets you use Elastic MapReduce as a backend for lapply-style operations

Q Ethan McCallum

Q Ethan McCallum is a consultant, writer and technology enthusiast, though perhaps not in that order. His work has appeared online on The O’Reilly Network and Java.net and also in print publications such as C/C++ Users Journal, Doctor Dobb’s Journal and Linux Magazine. In his professional roles, he helps companies to make smart decisions about data and technology.