Showing posts with label Big Data. Show all posts

Thursday, February 12, 2015

R Programming

R is a language and environment for statistical computing and graphics.

For more information please refer to http://r-project.org

For Big Data Analytics

Who Uses R?
- Spreadsheet Users
- Programmers
- Statisticians
- Data Scientists


RStudio for R



Will update more information as I learn R...

Tuesday, March 26, 2013

Monday, October 8, 2012

Preview Apache™ Hadoop™-based Services for Windows Azure


In this blog post I will document my preview of "Apache™ Hadoop™-based Services for Windows Azure".

After my first look, I feel the following are some of the best things Microsoft has done for admins and developers in this Hadoop offering on Windows Azure:
1. Metro-style user experience for creating a Hadoop cluster on Windows Azure.
2. Very easy to configure and manage the Hadoop cluster and its nodes.
3. Job history management and viewing the job execution history are seamless.
4. Deploying a Map/Reduce job is simple (though perhaps only for basic job routines?).
5. Map/Reduce code can be developed in your favorite language, JavaScript.
6. Web console to execute and manage Map/Reduce jobs.
7. Map/Reduce job results can be viewed in the web console, as graph output, etc.
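
To give a feel for point 5, here is a minimal sketch of a word count written as a map function and a reduce function in JavaScript. It runs locally in Node.js and is not taken from the service itself: `runJob` and the `context` object are hypothetical stand-ins I made up for illustration, and the real console API may look different.

```javascript
// Word count expressed as map and reduce functions, run locally in Node.js.
// NOTE: runJob and the context objects are hypothetical stand-ins for the
// cluster runtime; they are NOT the Hadoop on Azure API.

// Map step: emit (word, 1) for every word in one line of input.
function map(key, value, context) {
  const words = value.toLowerCase().split(/[^a-z]+/);
  for (const word of words) {
    if (word !== "") {
      context.write(word, 1);
    }
  }
}

// Reduce step: sum all the counts emitted for one word.
function reduce(key, values, context) {
  let sum = 0;
  for (const v of values) {
    sum += v;
  }
  context.write(key, sum);
}

// Tiny in-memory driver that imitates map -> group by key -> reduce.
function runJob(lines, mapFn, reduceFn) {
  const grouped = new Map();
  const mapContext = {
    write(k, v) {
      if (!grouped.has(k)) grouped.set(k, []);
      grouped.get(k).push(v);
    },
  };
  lines.forEach((line, i) => mapFn(i, line, mapContext));

  const results = {};
  const reduceContext = {
    write(k, v) {
      results[k] = v;
    },
  };
  for (const [k, vs] of grouped) {
    reduceFn(k, vs, reduceContext);
  }
  return results;
}

const counts = runJob(["Hadoop on Windows Azure", "Hadoop on Azure"], map, reduce);
console.log(counts);
```

The point of the sketch is only the shape of the two functions: map emits key/value pairs, and reduce receives all values for one key. On a real cluster the grouping and distribution are done by Hadoop, not by a helper like `runJob`.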

There is so much to document and explore in this new Big Data approach from Microsoft...

OK, let's get started...

You can start from https://www.hadooponazure.com/

Once you have logged in using your Windows Live credentials, you will see the "Request a new Hadoop Cluster" screen on the Home page.

Provide all the required information and then click the "Request Cluster" button in the right navigation.




Hadoop Cluster allocation in progress...



Hadoop Cluster - Allocation in progress







Hadoop Cluster Nodes - Allocation in progress....


Hadoop - Manage Cluster screen

Log in over Remote Desktop (MSTSC) to view the node created in the Hadoop cluster...



Hadoop Cluster - Map/Reduce job administration



Summary screen of the Hadoop cluster that was just created...

Hadoop Cluster - Configure Ports 
- FTP
- ODBC Server

By default, these ports are closed.



Hadoop Cluster - Job execution History

You can download the Client Utilities for Microsoft's Apache Hadoop-based services, which help in querying data in Hive using the ODBC driver or Excel.


Hadoop Cluster - Summary Screen (Metro-Style web interface)



Hadoop - Release Cluster




Hadoop Cluster - Release in progress...




The test Hadoop cluster I created is now released...

Thanks again to Microsoft for enabling developer communities to preview Hadoop on Windows Azure.

If you are also interested in this and need more information, refer to this link:
How-To and FAQ Guide for Apache™ Hadoop™-based Services for Windows Azure