R: A Predictive Analytics and Statistical Data Mining Tool

  • Jayalakshmi R, Assistant Professor, Department of Computer Science, Sri Vidya Mandir Arts and Science College, Katteri, Uthangarai, Tamilnadu, India.
Keywords: R Tool, Big Data, Data Mining, Clustering, Classification, K-Means, C5.0


Big data is an emerging technology concerned with storing and processing unstructured data whose size exceeds terabytes. Such volumes are too large to be processed by traditional database and software techniques. With the growth of big data analytics, many research fields and industries need efficient data mining tools to extract relevant information from diverse databases, so big data, data mining, and machine learning algorithms are closely linked. Big data is complex in nature, which makes mining it difficult and creates the need for an efficient data mining tool. R has become one of the most popular languages for data science and is an essential tool in finance and analytics. It offers a wide range of facilities for the statistical analysis of data sets. This paper explores R and RStudio, gives an overview of big data and data mining, and demonstrates the implementation of the k-means clustering and decision tree classifier algorithms.