Tuesday, November 5, 2013

App Pool vs Process ID listing

Scenarios

- Every app pool creates a W3WP.EXE process. When you debug through Visual Studio against an IIS server that hosts many websites and app pools, it is hard to find the right W3WP.EXE process to attach to, so that you debug the code running in the intended app pool.


In IIS 6 (Windows 2003), run iisapp.vbs from %windir%\system32\

c:\Windows\System32> iisapp.vbs

In IIS 7 (Windows 2008) and IIS 7.5 (Windows 2008 R2)

- List the worker processes with their process IDs and app pools
c:\Windows\System32\inetsrv>appcmd list wp

- List the web sites on your IIS server
c:\Windows\System32\inetsrv>appcmd list sites 

- List the application pools on your IIS server

c:\Windows\System32\inetsrv>appcmd list appPools
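
On a server with many app pools, the worker process listing can be parsed rather than read by eye. The following is a minimal Python sketch, assuming each line printed by appcmd list wp has the form WP "1234" (applicationPool:DefaultAppPool). The function name parse_worker_processes and the sample text are hypothetical and only for illustration.

```python
import re

# Assumed line format of `appcmd list wp`:
#   WP "1234" (applicationPool:DefaultAppPool)
WP_LINE = re.compile(r'WP "(\d+)" \(applicationPool:(.+)\)')

def parse_worker_processes(output):
    """Return a dict mapping app pool name -> list of W3WP.EXE process IDs."""
    pools = {}
    for line in output.splitlines():
        match = WP_LINE.match(line.strip())
        if match:
            pid, pool = int(match.group(1)), match.group(2)
            pools.setdefault(pool, []).append(pid)
    return pools

# Hypothetical sample output, not taken from a real server
sample = '''WP "4312" (applicationPool:DefaultAppPool)
WP "5120" (applicationPool:IntranetPool)'''

print(parse_worker_processes(sample))
```

The process ID found for the app pool is the one to pick in the Visual Studio "Attach to Process" dialog. On the server itself, the input text would come from capturing the standard output of the command instead of the sample string.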

Thursday, August 29, 2013

MapReduce Data Processing Pattern

MapReduce is
  • A data processing approach for highly parallelizable datasets.
  • Implemented on a cluster, with many nodes working in parallel on different parts of the data.
Data is divided into small chunks that are then distributed across all data nodes in the cluster for parallel processing.

MapReduce requires writing two functions:
  • a mapper
  • a reducer
MapReduce functions can be implemented in Java, C++, C#, Python, JavaScript (to script Pig), etc.

These functions accept data as input and return transformed data as output. They are called repeatedly, each time with a subset of the data; the output of the mapper is aggregated and then sent to the reducer. The JobTracker coordinates jobs across the cluster.
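
The mapper and reducer flow described above can be sketched with a word count, the usual introductory example. This is a single-process Python illustration of the pattern, not Hadoop code: the names mapper, reducer and run_job are hypothetical, and the grouping step stands in for the aggregation that the cluster performs between the two phases.

```python
from collections import defaultdict

def mapper(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    for word in chunk.lower().split():
        yield word, 1

def reducer(word, counts):
    """Combine all the counts collected for one word."""
    return word, sum(counts)

def run_job(chunks):
    """Run the map phase, group mapper output by key, then run the reduce phase."""
    grouped = defaultdict(list)
    # Map phase: on a cluster, each chunk would be handled by a different node
    for chunk in chunks:
        for key, value in mapper(chunk):
            grouped[key].append(value)
    # Reduce phase: one reducer call per distinct key
    return dict(reducer(key, values) for key, values in grouped.items())

# Two small chunks standing in for the pieces distributed across data nodes
chunks = ["big data big cluster", "data node data"]
result = run_job(chunks)
print(result)
```

Because each mapper call only sees its own chunk and each reducer call only sees the values for one key, both phases can run in parallel on separate nodes.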

The limiting factor for MapReduce is the size of the cluster.

Hadoop 

Implements MapReduce as a batch-processing system.
Optimized for flexible and efficient processing of huge amounts of data, not for response time.

The Hadoop ecosystem includes higher-level abstractions beyond MapReduce.

Hive provides a SQL-like query language, HiveQL. When we submit a HiveQL query, Hive generates MapReduce functions and runs them behind the scenes to carry out the requested query.

Pig is another query abstraction, with a data flow language known as Pig Latin. Pig also generates MapReduce functions and runs them behind the scenes to implement the higher-level operations described in Pig Latin.

Mahout is a machine learning abstraction.

Sqoop is a relational database connector.

Use Hadoop to get the right data subset and shape it into the desired form, then use BI tools to finish the analytical processing.