Monday, October 8, 2012

Preview Apache™ Hadoop™-based Services for Windows Azure


In this blog post I will document my preview of "Apache™ Hadoop™-based Services for Windows Azure".

After my first look, I feel the following are some of the best things Microsoft has done for admins and developers in this Hadoop offering on Windows Azure:
1. Metro-style user experience for creating a Hadoop cluster on Windows Azure.
2. Very easy to configure and manage the Hadoop cluster and its nodes.
3. Job history management and viewing the job execution history are seamless.
4. Deploying a Map/Reduce job is simple (perhaps only for basic job routines?).
5. Map/Reduce code can be developed in JavaScript.
6. Web console to execute and manage Map/Reduce jobs.
7. View Map/Reduce job results in the web console, as graph output, etc.

There is so much to document and explore in this new Big Data approach from Microsoft.

OK, let's get started.

You can start from https://www.hadooponazure.com/

Once you have logged in using Windows Live credentials, you will see the "Request a new Hadoop Cluster" screen on the home page.

Provide all the required information, then click the "Request Cluster" button in the right-hand navigation.




Hadoop Cluster allocation in progress...






Hadoop Cluster Nodes - Allocation in progress....


Hadoop - Manage Cluster screen

Log in with Remote Desktop (MSTSC) to view the newly created node in the Hadoop cluster.



Hadoop Cluster - Map/Reduce job administration



Summary screen of the Hadoop cluster created just now.

Hadoop Cluster - Configure Ports 
- FTP
- ODBC Server

By default, these ports are closed.



Hadoop Cluster - Job execution History

You can download the client utilities for Microsoft's Apache Hadoop-based services, which help in querying Hive data through the ODBC driver or Excel.


Hadoop Cluster - Summary Screen (Metro-Style web interface)



Hadoop - Release Cluster




Hadoop Cluster - Release in progress...




The test Hadoop cluster I created is now released.

Thanks again to Microsoft for enabling the developer community to preview Hadoop on Windows Azure.

If you are also interested and need more information, refer to this link:
How-To and FAQ Guide for Apache™ Hadoop™-based Services for Windows Azure

Monday, September 17, 2012

PowerShell - Get a list of files in a folder path recursively and export the output to a CSV file



In this blog post I will document some PowerShell scripts that will help you:
  1. Get a list of files in a folder path recursively and export the output to a CSV file, including file metadata (Directory, Name, Length, CreationTime, LastWriteTime, LastAccessTime, FullName, Extension).
  2. Get a distinct count of file extensions inside a folder path recursively.

Get a list of files recursively and export it to a CSV file with the file metadata

Open PowerShell
%SystemRoot%\system32\WindowsPowerShell\v1.0\powershell.exe

Get-Childitem "C:\Bharath\Files" -Recurse | where { -not $_.PSIsContainer } | Select Directory, Name, Length, CreationTime, LastWriteTime, LastAccessTime, FullName, Extension | Export-Csv "C:\Users\E317351\Desktop\FileswithSize.txt"
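The one-liner above can also be wrapped in a small reusable function. This is only a sketch: the function name Export-FileInventory and its parameters are my own illustration, not part of any module. It adds -NoTypeInformation, because Windows PowerShell otherwise writes a "#TYPE ..." line at the top of the file, which gets in the way when the CSV is opened in Excel.

```powershell
# Hypothetical helper: the name and parameters are illustrative only.
# Lists every file (not folder) below $Path and writes its metadata to $OutputCsv.
function Export-FileInventory {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [Parameter(Mandatory = $true)][string]$OutputCsv
    )

    # PSIsContainer is $true for folders, so the filter keeps files only.
    # -NoTypeInformation suppresses the "#TYPE ..." header line in the CSV.
    Get-ChildItem -Path $Path -Recurse |
        Where-Object { -not $_.PSIsContainer } |
        Select-Object Directory, Name, Length, CreationTime, LastWriteTime,
                      LastAccessTime, FullName, Extension |
        Export-Csv -Path $OutputCsv -NoTypeInformation
}
```

A call would look like this (both paths are placeholders): Export-FileInventory -Path "C:\Bharath\Files" -OutputCsv "C:\Temp\FileswithSize.csv". The result can be read back with Import-Csv to check it.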


Get Distinct COUNT of File extensions inside a folder path recursively.

Open PowerShell
%SystemRoot%\system32\WindowsPowerShell\v1.0\powershell.exe

Get-Childitem "C:\Bharath\Desktop\Some Folders" -Recurse | where { -not $_.PSIsContainer } | group Extension -NoElement | sort count -desc
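Files without any extension end up in a group whose name is empty, which is easy to overlook in the output. The sketch below labels that group "(none)". The function name Get-ExtensionCount and the "(none)" label are my own assumptions for illustration.

```powershell
# Hypothetical helper: the name and the '(none)' label are illustrative only.
# Counts files per extension below $Path, most frequent extension first.
function Get-ExtensionCount {
    param(
        [Parameter(Mandatory = $true)][string]$Path
    )

    # Group files by extension, sort by count, then replace the empty
    # group name (files with no extension) by a readable label.
    Get-ChildItem -Path $Path -Recurse |
        Where-Object { -not $_.PSIsContainer } |
        Group-Object -Property Extension -NoElement |
        Sort-Object -Property Count -Descending |
        Select-Object Count, @{
            Name       = 'Extension'
            Expression = { if ($_.Name) { $_.Name } else { '(none)' } }
        }
}
```

Example call (the path is a placeholder): Get-ExtensionCount -Path "C:\Bharath\Desktop\Some Folders"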

To exclude a folder, use the -Exclude parameter as below:

Get-Childitem "C:\Bharath\Desktop\Some Folders" -exclude TestFolderName -Recurse | where { -not $_.PSIsContainer } | group Extension -NoElement | sort count -desc
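One caveat: as far as I know, -Exclude matches item names only, so when combined with -Recurse the files below an excluded folder can still show up in the count. A more defensive sketch filters on the full path of each file instead. The function name Get-ExtensionCountExcludingFolder and its parameters are hypothetical, chosen only for this example.

```powershell
# Hypothetical helper: the name and parameters are illustrative only.
# Counts files per extension below $Path, skipping every file that has
# a folder called $ExcludeFolderName anywhere in its full path.
function Get-ExtensionCountExcludingFolder {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [Parameter(Mandatory = $true)][string]$ExcludeFolderName
    )

    # Build a regex that matches the folder name as a whole path segment,
    # e.g. "\TestFolderName\" on Windows.
    $sep = [regex]::Escape([string][System.IO.Path]::DirectorySeparatorChar)
    $pattern = $sep + [regex]::Escape($ExcludeFolderName) + $sep

    Get-ChildItem -Path $Path -Recurse |
        Where-Object { -not $_.PSIsContainer -and $_.FullName -notmatch $pattern } |
        Group-Object -Property Extension -NoElement |
        Sort-Object -Property Count -Descending
}
```

Example call (the path is a placeholder): Get-ExtensionCountExcludingFolder -Path "C:\Bharath\Desktop\Some Folders" -ExcludeFolderName "TestFolderName"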