Modern Big Data Processing with Hadoop
上QQ阅读APP看书,第一时间看更新

Data transformation

This is also a very important phase, where we enforce the standards defined by the enterprise to define the final data output. For example, an organization can suggest that all the country codes should be in ISO 3166-1 alpha-2 format. In order to adhere to this standard, we might have to transform the input data, which can contain countries with their full names. So a mapping and transformation has to be done.

Many other transformations can be performed on the input data to make the final data consumable by anyone in the organization in a well-defined form and as per the organizations standards.

This step also gives some importance to having an enterprise level standard to improve collaboration.