Cloud Computing is a technology aimed at processing and storing very large amounts of data, often referred to as Big Data. One of the most important challenges in Cloud Computing is how to process Big Data. By the end of 2012, the total data generated was 2.8 Zettabytes (ZB), or 2.8 trillion Gigabytes. One of the areas that contributes to the analysis of Big Data is Data Science. This new study area, called Big Data Science (BDS), has recently become a very important topic in organizations because of the value it can generate, both for themselves and for their customers.
One of the challenges in implementing BDS is the current lack of information to help in understanding this new study area. In response, UnderCloud uses the DIPAR framework, which proposes a means to implement BDS in organizations and defines its requirements and elements. The framework consists of five stages (Define, Ingest, Preprocess, Analyze, and Report) and is based on the ISO 15939 Systems and software engineering – Measurement process standard, whose purpose is to collect, analyze, and report data relating to products to be developed.
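To make the five-stage structure concrete, the following is a minimal Python sketch of how the stages might be chained. Only the stage names (Define, Ingest, Preprocess, Analyze, Report) come from the text. The function bodies, the `response_ms` metric, the sample records, and the shared `ctx` dictionary are hypothetical choices made for illustration and are not part of the DIPAR framework or of ISO 15939.

```python
from statistics import mean

# Stage names are taken from the text; everything else below is an assumption.
STAGES = ("Define", "Ingest", "Preprocess", "Analyze", "Report")


def define(ctx):
    # Define: state the question to answer and the measure that answers it.
    ctx["question"] = "What is the average response time?"
    ctx["metric"] = "response_ms"  # hypothetical metric name
    return ctx


def ingest(ctx):
    # Ingest: collect raw records (hard-coded here in place of a real source).
    ctx["raw"] = [
        {"response_ms": "120"},
        {"response_ms": "80"},
        {"response_ms": None},  # a missing value to be handled later
        {"response_ms": "100"},
    ]
    return ctx


def preprocess(ctx):
    # Preprocess: drop missing values and convert the rest to numbers.
    metric = ctx["metric"]
    ctx["clean"] = [
        float(row[metric]) for row in ctx["raw"] if row.get(metric) is not None
    ]
    return ctx


def analyze(ctx):
    # Analyze: compute the measure chosen in the Define stage.
    ctx["result"] = mean(ctx["clean"])
    return ctx


def report(ctx):
    # Report: present the result alongside the original question.
    ctx["report"] = f"{ctx['question']} -> {ctx['result']:.1f}"
    return ctx


# Map each stage name to its (hypothetical) implementation, in order.
PIPELINE = dict(zip(STAGES, (define, ingest, preprocess, analyze, report)))


def run_pipeline():
    # Run the stages in sequence, passing a shared context between them.
    ctx = {}
    for name in STAGES:
        ctx = PIPELINE[name](ctx)
    return ctx


if __name__ == "__main__":
    print(run_pipeline()["report"])
```

Passing one context dictionary through every stage keeps the sketch short. A real implementation would likely give each stage its own inputs, outputs, and storage.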