Integrating R and Hadoop for Big Data Analysis. University Publication.

Big data requires large clusters with hundreds or even thousands of computing nodes. Official statistics is increasingly considering big data for deriving new statistics, because big data sources could produce more relevant and timely statistics than traditional sources.

 
One of the software tools widely and successfully used for storing and processing big data sets on clusters of commodity hardware is Hadoop.
 
The Hadoop framework contains libraries, a distributed file system, and a resource-management platform, and it implements a version of the MapReduce programming model for large-scale data processing.
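
To make the MapReduce programming model more concrete, the following is a minimal single-process sketch of a word count in Python. It does not use Hadoop or any Hadoop API; the function names (map_phase, shuffle, reduce_phase, run_job) are hypothetical and chosen only for illustration. On a real cluster the framework would distribute the map and reduce calls across nodes and perform the grouping step itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(record: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group emitted values by key (done by the framework on a cluster)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(records: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence inside one process."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    lines = ["big data needs big clusters", "Hadoop stores big data"]
    print(run_job(lines))
```

The map and reduce functions touch only one record or one key at a time. That independence is what lets a framework such as Hadoop run them in parallel on many nodes.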
 
In this paper we investigate the possibilities of integrating Hadoop with R, which is a popular software environment for statistical computing and data visualization.