The R language is most commonly used for data analysis and statistical applications. To provide more support for the R programming language on the Google Cloud Platform (GCP), Google announced the launch of the SparkR beta on Cloud Dataproc. According to Google, the rise of cloud computing has opened up new opportunities for R.
Using GCP for R removes the infrastructure barriers that limit the understanding of data, such as having to downsample a dataset because computational or data-size limitations prevent it from being processed in full. "With GCP, you can build large models to analyze large and small data sets that previously required a lot of upfront investment in high-performance computing infrastructure," Dataproc and Hadoop product manager Christopher Crosbie and machine learning specialist Mikhail Chrestkha wrote in a blog post.
Cloud Dataproc is a managed cloud service for Apache Spark and Apache Hadoop clusters on GCP, and SparkR is a lightweight package that provides a front end for using Apache Spark from R, the company explained.
"This integration allows developers using R to perform dplyr-like operations on data sets of any size stored in cloud storage. SparkR also supports distributed machine learning using MLlib. You can use this integration to handle large cloud storage data sets or perform compute-intensive jobs," Crosbie and Chrestkha wrote.
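To illustrate what those dplyr-like operations look like, here is a minimal SparkR sketch. The bucket path and column names (`category`, `value`) are assumptions for illustration, not from the announcement:

```r
library(SparkR)

# Start a Spark session; on Cloud Dataproc this connects to the cluster.
sparkR.session()

# Read a CSV from Cloud Storage (hypothetical bucket and schema).
df <- read.df("gs://my-bucket/data.csv", source = "csv", header = "true")

# dplyr-like grouped aggregation, distributed across the cluster by Spark.
result <- agg(groupBy(df, df$category), avg_value = avg(df$value))
head(result)
```

The same `SparkDataFrame` can also be passed to MLlib wrappers such as `spark.glm()` for distributed model fitting.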
For more ways for developers to use R on GCP, please click here.
In addition, Google has announced new improvements to App Engine, Cloud Spanner, and Python 3.7.
App Engine now offers a second-generation Python 3.7 runtime on GCP. According to the company, developers can now use dependencies from the Python Package Index or from private repositories. Cloud Scheduler and Cloud Tasks have also been decoupled from App Engine, so developers can use these features across all GCP services.
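For context, opting into the second-generation runtime is a configuration change. A minimal sketch of an `app.yaml` for the Python 3.7 runtime (the sample dependency is an assumption for illustration):

```yaml
# app.yaml — minimal App Engine config for the second-generation runtime
runtime: python37

# Dependencies are declared in a requirements.txt file alongside app.yaml,
# pulled from the Python Package Index or a private repository, e.g.:
#   flask==1.0.2
```

App Engine installs everything listed in `requirements.txt` when the app is deployed.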