Thursday, May 15, 2014

Hadoop Hive External vs Internal Table

Hive tables can be created as EXTERNAL or INTERNAL. This is a choice that affects how data is loaded, controlled, and managed.

Use EXTERNAL tables when:

  • The data is also used outside of Hive. For example, the data files are read and processed by an existing program that doesn't lock the files.
  • Data needs to remain in the underlying location even after a DROP TABLE. This can apply if you are pointing multiple schemas (tables or views) at a single data set or if you are iterating through various possible schemas.
  • You want to use a custom location such as ASV (Azure Storage Vault, i.e. Azure Blob storage).
  • Hive should not own the data or control settings, directories, etc., because another program or process will do those things.
  • You are not creating the table from an existing table (CREATE TABLE ... AS SELECT).

Use INTERNAL tables when:

  • The data is temporary.
  • You want Hive to completely manage the lifecycle of the table and data.
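
As a rough illustration of how the choice shows up in the table definition, the Python sketch below builds the two kinds of CREATE TABLE statement as plain strings. The helper name create_table_ddl, the table names, the columns, and the /data/weblogs path are all made up for this example; only the CREATE TABLE, CREATE EXTERNAL TABLE, and LOCATION keywords come from HiveQL.

```python
def create_table_ddl(name, columns, external=False, location=None):
    """Build a HiveQL CREATE TABLE statement as a string (hypothetical helper)."""
    # EXTERNAL tells Hive not to delete the underlying files on DROP TABLE.
    kind = "CREATE EXTERNAL TABLE" if external else "CREATE TABLE"
    cols = ", ".join(f"{col} {typ}" for col, typ in columns)
    ddl = f"{kind} {name} ({cols})"
    # LOCATION points the table at a directory that already holds the data.
    if location is not None:
        ddl += f" LOCATION '{location}'"
    return ddl


# Example columns and names are assumptions, not taken from a real schema.
columns = [("ts", "STRING"), ("msg", "STRING")]

# External: data stays in /data/weblogs even after DROP TABLE.
external_ddl = create_table_ddl(
    "weblogs_ext", columns, external=True, location="/data/weblogs"
)

# Internal (managed): Hive owns the data and removes it on DROP TABLE.
managed_ddl = create_table_ddl("weblogs_tmp", columns)

print(external_ddl)
print(managed_ddl)
```

The generated statements would still need to be run through a Hive client; the sketch only shows where the EXTERNAL keyword and the LOCATION clause go.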

Wednesday, May 14, 2014

Hadoop Quick Info

Apache Hadoop– Hadoop is an open source software framework which allows you to cheaply store and process vast amounts of structured and unstructured data.

Flume– A service for collecting, aggregating, and moving large amounts of log and event data into Hadoop.

HBase- A scalable, distributed, column-oriented data store that runs on top of HDFS.

HDFS– short for "Hadoop Distributed File System", the file system Hadoop uses to store data across the nodes of a cluster.

Hive- A data warehouse infrastructure built on top of Hadoop for providing data summarization, query, and analysis. It allows you to query data using a SQL-like language called HiveQL (HQL).

HiveQL (HQL)- A SQL-like query language for Hive. Queries are compiled into MapReduce jobs that run over data stored in HDFS.

JobTracker– the service within Hadoop which distributes MapReduce tasks to specific nodes in the cluster.

NameNode– the core of the HDFS file system. The NameNode maintains a record of all files stored on the Hadoop cluster.

Oozie - workflow scheduler system to manage Apache Hadoop jobs.

Pig– a high-level programming language for creating MapReduce programs used within Hadoop.

Sqoop– a tool for transferring data between Hadoop and relational databases.

YARN– a resource manager for Hadoop 2. YARN is short for "Yet Another Resource Negotiator".

ZooKeeper - Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.

Thursday, April 3, 2014

Visual Studio and Browser Link Feature

Announced today at Build 2014.

You can edit the styles and JavaScript loaded in the browser using the browser's developer tools (launched by pressing F12).

As you edit, the changes are written back to the code base in Visual Studio, and other connected browsers (say, Chrome) refresh automatically to show the changes made in the first browser's developer tools.

Azure Web Jobs

Announced today at Build 2014.

We can now run Web Jobs as part of Windows Azure Websites, where they run as background jobs (processing queues, etc.).

There is no need to spin up a new VM or a new cloud service to run background jobs; instead, create Web Jobs as a part of Windows Azure Websites.

Wednesday, April 2, 2014

Rethink - 2 in 1 User Experience

I liked this video, “Rethink Navigation for 2 in 1s”.

- It explains the easy, OK, and hard-to-hit areas on mobiles, tablets, and touch-screen laptops.



Research Analysis


Positioning of the main Navigational elements of your App on these device variations





Tuesday, February 18, 2014

Commerce Server 10.1 Setup on Windows Server 2012

In this blog post I am documenting the steps required to install and configure Commerce Server 10.1 on a Windows Server 2012 machine.

List of downloads I got from http://commerceserver.net/


Launch the CommerceServer-10.1.67.0 setup file (note: the version may vary in future).

Splash screen of Commerce Server 10.1 installation




Accept the License Agreement


Installation in progress



Commerce Server Configuration Wizard. Click Next





MSCS_Admin DB configuration


Add a new user "CS_RunTime"


Configure Staging Service to run with CS_RunTime user



Commerce Server Configuration summary



Commerce Server Configuration - In Progress


Commerce Server Configuration - Completed


Commerce Server Setup - Completed



After installing Commerce Server 10.1 on Windows Server 2012, we can see that the following apps have been added to the Metro UI.



Wednesday, February 5, 2014

Docker Overview

In this post, I will document the learning I had with Docker.

Container History

- Linux Containers

  • OpenVZ
    • Uses a single patched Linux kernel, so containers run on the architecture and kernel version of the host system.
  • Linux-VServer
    • Virtual Private Server implementation, created to add OS-level virtualization to the Linux kernel itself.

What's a Container?

  • OS-level virtualization.
  • The OS kernel allows multiple isolated user-space instances instead of just one.
  • Each instance looks like a real server from the point of view of its operator.

What is Docker?

Docker is an open-source project to easily create lightweight, portable, self-sufficient containers from any application. The same container that a developer builds and tests on a laptop can run at scale, in production, on VMs, bare metal, OpenStack clusters, public clouds and more.

Technology behind Docker 

  • Developed using Google's Go language
  • Built on Linux kernel features: cgroups and namespaces
  • AUFS union filesystem
  • LXC (Linux Containers)

More to be typed as I learn...

Sunday, January 19, 2014

Android Layouts

In this blog post, I am documenting at a high level what I learned about the Android layout types.
  • Linear Layout (aligns its child elements in a single direction, one after another)
    • Horizontal Layout
    • Vertical Layout
  • Relative Layout
    • Aligns child elements relative to a sibling or the parent. Helps to avoid nested linear layouts.
  • Frame Layout (allows child views to be placed on top of one another)
  • Table Layout (aligns elements in rows and columns)

More to be typed...

Friday, January 3, 2014

PowerShell commands

In this post I am documenting some quick-reference PowerShell commands.

Get Members of a command
c:> get-service | gm

Get-help CMD-NAME
Example:
c:> get-help start-service

Get-process -name "Name"
Example:
c:> get-process -name "outlook"

Get Process - physical directory path
PS C:\Windows\system32> get-process outlook | dir

Directory: C:\Program Files (x86)\Microsoft Office\Office14

Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---         3/31/2011   4:08 PM   15933792 OUTLOOK.EXE

Select with a calculated property
c:> get-service | select -Property name, @{name='ComputerName';expression={$_.name}}