Data Analysis Example: Handling Large data volumes

Tuesday, 25 June 2013

Handling Large data volumes - PIG basics and advantages

Apache Pig, which runs on Hadoop, is very efficient at handling large volumes of data. Joining files in Pig instead of in the database can save a lot of CPU cycles on the Teradata server.

Pig operates in two modes: local and MapReduce.

To start Pig in local mode, run:

bash$> pig -x local

MapReduce is the default mode, so running Pig without the -x option starts it in MapReduce mode:

bash$> pig

A Pig script can be run either directly from the command line:

bash$> pig TestScript.pig

or from within the Grunt shell, after starting Pig:

grunt> exec TestScript.pig
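The post does not show what TestScript.pig contains. As a minimal sketch of the kind of file join mentioned earlier, the snippet below writes two small sample files and a script that joins them. The file names (customers.csv, orders.csv), the column names, and the sample rows are all assumptions made up for illustration. They are not taken from the original setup.

```shell
#!/bin/sh
# Hypothetical sample inputs: file names, columns, and rows are illustrative only.
cat > customers.csv <<'EOF'
1,Alice
2,Bob
EOF

cat > orders.csv <<'EOF'
101,1,250.00
102,2,75.50
103,1,20.00
EOF

# A hypothetical TestScript.pig that joins the two files on the customer id.
cat > TestScript.pig <<'EOF'
customers = LOAD 'customers.csv' USING PigStorage(',') AS (cust_id:int, name:chararray);
orders    = LOAD 'orders.csv'    USING PigStorage(',') AS (order_id:int, cust_id:int, amount:double);
joined    = JOIN customers BY cust_id, orders BY cust_id;
result    = FOREACH joined GENERATE customers::name, orders::order_id, orders::amount;
DUMP result;
EOF

# Run the script in local mode, but only if Pig is installed on this machine.
if command -v pig >/dev/null 2>&1; then
    pig -x local TestScript.pig
else
    echo "pig not found; wrote TestScript.pig only"
fi
```

Local mode is convenient for trying a script like this against small files. The same script could then be run in MapReduce mode against files stored in HDFS.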

A follow-up post will cover Pig commands and how Pig can be combined with Teradata to improve performance.
