Security Hardening of admin pages in Sitecore

  • Scenario:
    • Sitecore admin pages (sitecore/admin, sitecore/login, etc.) should not be reachable through the site's public (external) URL, but should remain accessible through the internal URL.
  • Solutions:
    • There are many methods that can be used for security hardening of Sitecore admin pages, but URL Rewrite is one of the easiest to implement.
    • First of all, the URL Rewrite module must be installed on IIS.
    • Now, in the "rewrite" section of your web.config file, add the rules below.
      • The rule defines a match pattern: if the URL contains sitecore/admin, sitecore/login, sitecore/diag, sitecore/debug or sitecore/shell, then redirect to the CD login page when the condition matches, i.e. the host matches the EXTERNAL website URL. If the EXTERNAL URL condition does not match, the user can still open these admin pages using the INTERNAL URL.
      • <rewrite>
          <rules>
            <rule name="Disable Admin tools" stopProcessing="true">
              <match url="^sitecore/(admin|login|diag|debug|shell).*$" />
              <conditions>
                <add input="{HTTP_HOST}" pattern="Your_Website_ExternalDNSName" />
              </conditions>
              <action type="Redirect" url="https://{HTTP_HOST}/Login/Login.aspx" appendQueryString="false" />
            </rule>
          </rules>
        </rewrite>
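As a quick sanity check of the match pattern outside IIS, you can exercise the regular expression with grep -E (the sample request paths below are made up for illustration):

```shell
# Approximate the rewrite rule's match pattern with grep -E.
pattern='^sitecore/(admin|login|diag|debug|shell)'

# An admin path should match (and would be redirected)...
echo "sitecore/admin/cache.aspx" | grep -qE "$pattern" && echo "redirected"

# ...while an ordinary content path should not.
echo "sitecore/content/home" | grep -qE "$pattern" || echo "served"
```

Note that grep -E is not the same regex engine IIS uses (ECMAScript), but for a simple alternation like this the behaviour is identical.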




Dockers 101 – Series 9 of N – How to setup Apache and NGinx server using docker

  • Requirement:
    • Setup Apache and NGinx server using docker
  • Strategy:
    • First of all, create source folders and HTML files for Apache and NGinx on the host drive so that they can be used as mounts/volumes for the Apache and NGinx containers respectively
    • Use docker run command with -v and -p options to specify the volume and port mapping
    • Test the containers
  • Solution:
    • If you have not installed Docker on your system, then do that first
    • Create source folders and html files for Apache and NGinx on the host
      • # host paths (the folder names below are illustrative; adjust them to your layout)
        rootPath=$(pwd)
        hostNginxPath=$rootPath/nginx/html
        hostNginxHtml=$hostNginxPath/index.html
        hostHttpdPath=$rootPath/httpd/html
        hostHttpdHtml=$hostHttpdPath/index.html

        test -d $hostNginxPath && echo "$hostNginxPath exists, continuing..." || { echo "$hostNginxPath does not exist, creating..."; mkdir -p $hostNginxPath; }
        echo "
        <title>Custom test page - My NGinx Server... </title>
        <h1>This is a custom test page from My NGinx Server...</h1>
        " > $hostNginxHtml
        cat $hostNginxHtml

        test -d $hostHttpdPath && echo "$hostHttpdPath exists, continuing..." || { echo "$hostHttpdPath does not exist, creating..."; mkdir -p $hostHttpdPath; }
        echo "
        <title>Custom test page - My Httpd Server... </title>
        <h1>This is a custom test page from My Httpd Server...</h1>
        " > $hostHttpdHtml
        cat $hostHttpdHtml
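A note on the directory-creation one-liner: shell && / || chains are left-associative, so in "test -d $p && echo exists || echo creating && mkdir -p $p" the final mkdir runs in every case, and the echoes can mislead. Since mkdir -p is idempotent, it is safe to rely on it alone; a minimal sketch (the nested path is made up):

```shell
# mkdir -p creates all missing parent folders and is safe to repeat.
dir=$(mktemp -d)/nginx/html

mkdir -p "$dir"   # creates nginx/ and nginx/html/
mkdir -p "$dir"   # re-running is a no-op, not an error

test -d "$dir" && echo "exists"
```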

    • Use docker run command with -v and -p options
      • # optional clean-up: stop and remove any existing containers and images
        docker stop $(docker ps -a -q)
        docker rm $(docker ps -a -q)
        docker rmi $(docker images -q)

        # default content folders inside the official images
        containerNginxPath=/usr/share/nginx/html
        containerHttpdPath=/usr/local/apache2/htdocs

        docker run -d --name mynginxserver -p 8080:80 -v $hostNginxPath:$containerNginxPath nginx:latest
        docker exec mynginxserver cat $containerNginxPath/index.html
        docker run -d --name myhttpdserver -p 8090:80 -v $hostHttpdPath:$containerHttpdPath httpd:latest
        docker exec myhttpdserver cat $containerHttpdPath/index.html

    • Test the containers by browsing to http://localhost:8080 (NGinx) and http://localhost:8090 (Apache), or with curl
      • curl http://localhost:8080   # should return the custom NGinx page
        curl http://localhost:8090   # should return the custom Httpd page

AWS 101 – Series 1 of N – Monitoring and Maintenance – How to check if an Amazon instance is a valid approved golden image




  • Problem Detection 
    • You can use the CLI as well as the Console to perform this action
    • Using CLI
      •  Save the below script as
      • Change the mode: chmod u+x
      • The AWS command 'aws ec2 describe-instances' returns all instances in a specific region; the --query expression extracts each instance's ImageId
      • The array arrImages holds all of the ImageIds
      • Run a for loop with 'aws ec2 describe-images' to get the image owner
      • If the image owner alias is not "self", it is not a valid approved image built from your own customized base image

        #!/bin/bash

        # regions to check
        arrRegions=("us-east-1" "us-east-2");

        for regionId in "${arrRegions[@]}"; do
          echo "Instances for $regionId region:"

          # image list
          arrImages=$(aws ec2 describe-instances --region $regionId --output text --query 'Reservations[*].Instances[*].ImageId');

          # get the image owner for each ImageId
          for imgId in ${arrImages}; do
            aws ec2 describe-images --region $regionId --output text --image-ids $imgId --query 'Images[*].ImageOwnerAlias'
          done
        done
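One bash detail in the script above is worth calling out: referencing an array as ${arrRegions} expands to its first element only, so iterating over all regions requires the ${arrRegions[@]} form (and do/done around each loop body). A small sketch:

```shell
arrRegions=("us-east-1" "us-east-2")

echo ${arrRegions}        # first element only: us-east-1
echo "${arrRegions[@]}"   # every element: us-east-1 us-east-2

for r in "${arrRegions[@]}"; do
  echo "region: $r"
done
```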

    • Using Console
      • Go to EC2 Dashboard, Select Instances tab and then select a specific instance
      • In the Description tab, click on the AMI ID link and copy the AMI ID from the pop-up.
      • Now go to the AMIs tab in the Images section of the EC2 Dashboard
      • Select "Owned by me" from the drop-down and paste the copied AMI ID into the filter.
      • If no rows are returned, it means the image is either from the Marketplace or Amazon and is not a "self"-owned, customized, approved valid image.


  • Problem Remedy – to be continued

Adding new Rendering to Presentation details using PowerShell

  • Adding/removing/updating the presentation details of a page is very easy to implement using a PowerShell script
  • First of all, get the list of pages where you want to add the new rendering using Get-Item and/or Get-ChildItem
  • Then, for each such page, use the "Get-Rendering" cmdlet to check whether the rendering you want to add is already present.
  • If it is not, add the new rendering using the "Add-Rendering" cmdlet.
  • Please see the example below
  • cd "master:/sitecore"
    $db = "master:"
    $newRenderingInstance = gi -Path "master:/sitecore/layout/Renderings/MyNewHeaderRendering" | New-Rendering

    function GetAllPages() {
        $homePage = Get-Item -Path ($db + "/Sitecore/Content/Home") -Language *
        $pages = Get-ChildItem -Path ($db + "/Sitecore/Content/Home") -Recurse -Language *
        $pages += $homePage
        $pages | ? { $_ -ne $null } `
        | % {
            AddMyNewHeaderRendering $_
        }
    }

    function AddMyNewHeaderRendering ($page) {
        $renderings = $page | Get-Rendering -ErrorAction SilentlyContinue
        if ($renderings -ne $null -and $renderings.length -gt 1 -and $newRenderingInstance.Id -notin $renderings.ItemID) {
            Add-Rendering -Item $page -Rendering $newRenderingInstance -Index 1 -Placeholder "header"
        }
    }

    # kick off the update
    GetAllPages



DevOps 101 – Series 1 of N – DevOps Refresher

What is DevOps?

  • What is SDLC?
    • SDLC (Software Development Life Cycle) is an integral part of the development, testing, deployment and maintenance of any software.
    • There are many development models for making a project a success (I will not list all of them, just two as examples)
      • Waterfall model:
        • The project is divided into large chunks of phases, e.g. Phase 1, 2, 3, etc.
        • Success or failure of the current phase impacts future phases
        • It is a rigid approach: unless Phase 1 succeeds we can't start Phase 2, as there are interdependencies
        • Releases and rollbacks are a fearsome process and greatly impact the team's success or failure
      • Agile model:
        • The project is divided into small sprints and stories (weekly, bi-weekly or even shorter)
        • There are daily or weekly scrum or status meetings to check the status of the project, bottlenecks, etc.
        • Releases and rollbacks are a common day-to-day practice, helping the project run smoothly.
    • There are two major groups involved in the life cycle of any project
      • Developers (includes developers, testers, QAs, BAs)
      • Operations (includes system/server admins or webmasters, release managers)
  • Who are Developers?
    • Developers are responsible for change (functionality) in the state of a project
    • They love to bring in as many changes as the business needs.
    • Their productivity is driven by the number of changes they bring to the state of the project.
  • Who are Operations?
    • Operations are responsible for the smooth functioning, releases and deployments of the project
    • They love to keep the system stable, and prefer not to bring too many changes to the state of the project in order to keep a working system stable.
    • Their productivity is driven by fewer hours of downtime, keeping most of the servers up and running most of the time.
  • So what is wrong with this approach or model?
    • Developers' objective: bring as much change as possible without caring much about stability.
    • Operations' objective: bring stability to the system with as few changes as possible.
    • The two objectives work against each other; thus the probability of the project running smoothly decreases, bottlenecks increase, and the project is susceptible to failure.
  • What is DevOps?
    • DevOps is rather a culture to resolve the above problems.
    • A DevOps culture works in a more aggressive agile model with small sprints, but all changes to the system are driven by automated builds, deployments, testing and rollbacks.
    • Deployments can happen even on an hourly basis, as the smallest change to the system goes through automated build, deployment and testing; thus the chance of a change breaking the system is minimized. If a change breaks the automated build or deployment, or fails the automated testing, it is immediately rolled back to the previous running state.
    • Tools and techniques are used to give developers an environment similar to production via automated setup, so that they can't claim that something which works on their local systems does not work in staging or production.
    • Automated build, deployment and testing tools give operations the confidence to take in as many changes as possible, because they no longer have an excuse regarding the stability of the system.
    • Thus DevOps = Dev (Developers) + Ops (Operations).
      • It is a culture or set of best practices
      • Smaller development cycles
      • More frequent deployments
      • Better collaboration between developers and operations, with both now working toward the same objective:
        • bring quick and stable changes to the system
      • DevOps is NOT a standard, tool, or job title.
        • It uses tools to automate the process as much as possible, and
        • the tools can vary based on languages, platforms and business needs, as
        • there is no standard saying that a tool which works for company A will also work for company B.
        • It is more a set of best practices based on the needs.
    • Salient Features of a DevOps culture:
      1. Build Automation:
        • It is the process of building the code to make it run using an automated tool or script
        • It is independent of the IDE
        • Benefits of build automation –
          • fast (no or few manual tasks),
          • repeatable (runs the same any time),
          • consistent (produces similar results any time),
          • reliable (will alert about build errors, and will perform tasks based on a predefined set of commands in the same way) and
          • portable (will run on any similar environment the same way)
        • Tools:
          • Tools are based on the programming language:
            • Java – Maven, Ant
            • JavaScript – npm
            • Make – Unix based
            • Visual Studio/MSBuild – C#, .NET, etc.
      2. Continuous Integration:
        • It is a continuous process of merging the developers' code into the master (the deployable, release-ready branch)
        • It uses automated test cases to pass or fail a developer's change
        • It uses a CI server which detects any change, runs the automated test cases against the new build, and passes or fails the change.
        • If a developer's code "breaks the build", they are alerted first and the change is rolled back for the developer to fix.
        • Benefits:
          • Continuous testing (changes are tested continuously)
          • Early detection of problems (due to continuous changes, problems are detected early)
          • No rush for deployments (developers don't rush to push their code for release)
          • Frequent releases (due to continuous changes there are small but frequent releases)
        • Tools:
          • Jenkins – open source, widely used, easy integration
          • TravisCI – open source, GitHub integration
          • Bamboo – an enterprise product with strong JIRA integration
      3. Continuous Delivery and Deployment:
        • Continuous Delivery is the process of keeping the code always in a deployable state
        • Continuous Deployment is the actual process of deploying the code
        • The two are not interchangeable terms
        • There is no standard for how often to deploy; it depends on the company's needs
        • Benefits:
          • Faster time to market (due to fewer problems in the whole process)
          • Less risk (due to increased reliability)
          • Reliable deployments and rollbacks (due to a consistent process)
        • Tools:
          • Jenkins pipelines, Spinnaker, AWS CodeDeploy, etc.
      4. Infrastructure as a code:
        • The process of provisioning and managing resources or infrastructure using code
        • Benefits:
          • Reusable (we can execute it as many times as needed)
          • Scalable (we can execute it on as many servers as needed)
          • Consistent (it runs the same way wherever it runs)
          • Trackable/documenting (any infrastructure changes are well documented, as these changes can be committed just like normal code)
        • Tools:
          • Terraform, AWS CloudFormation, Ansible, etc.
        • Please see the code below as an example of infrastructure as code (the below Dockerfile pulls an official Python 2.7 image from Docker Hub, sets up a working directory, copies content into it and then runs the specified Python file to run the game):
        • # Use an official Python runtime as a parent image
          FROM python:2.7-slim
          # Set the working directory to /app
          WORKDIR /app
          # Copy the contents to /app container folder
          ADD . /app
          # Run when the container launches
          CMD ["python", ""]
      5. Configuration Management:
        • It is the process of managing or maintaining the state of infrastructure changes in a consistent, stable and maintainable way
        • We use infrastructure as code to ensure efficient configuration management
        • Benefits:
          • Time saving (the infrastructure code can be executed any number of times on any number of servers)
          • Consistent (the same changes are made wherever the code is executed)
          • Maintainable (it is maintainable because it is well documented)
          • Less configuration drift (since the same code is executed there is less configuration drift, and what drift there is gets well documented)
        • Tools:
          • Ansible – open source, uses YAML config files, does not need a server-and-agent model, uses declarative configuration
          • Puppet – open source, needs a server-and-agent model, uses declarative configuration
          • Chef – open source, needs a server-and-agent model, uses procedural configuration
          • Salt – needs a server-and-agent model, uses declarative configuration, uses YAML config files
      6. Orchestration:
        • It is the process of using a builder tool to automate the whole workflow or process.
        • E.g. docker-compose, Kubernetes, etc.
        • Benefits:
          • Scalability (the orchestration tool can be used to execute the changes on any number of servers)
          • Stable (the changes are stable as they are always executed in the same fashion)
          • Self-servicing (it enables auto-healing)
          • Granularity (there is full control over the whole process, as each step is well defined)
          • Time saving (automation leads to quick turnaround time)
        • Tools:
          • Kubernetes – biggest hit these days
          • Docker-compose
      7. Monitoring:
        • It is the process of monitoring the state of a system, alerting on any change in that state, and presenting the state change in a meaningful manner
        • You can monitor the system resources like CPU, memory, I/O, Network, Logging etc
        • Benefits:
          • Fast recovery from failures(automated alerts help in recovering from failures by provisioning resources based on the alert )
          • Automated alerting and response(the alerts give the impulse for auto healing systems)
          • Root cause and visibility(helps in root cause analysis based on the information tracked and monitored)
          • Auto healing(With proper health checks and alerting system, we can configure the system to auto heal in the event of failure of a few servers by provisioning more servers )
        • Tools:
          • AppDynamics
          • Newrelic
    • Here is a website with links and details for most of the DevOps tools; you can click on the links to see the details of each. That is why, although I listed quite a few tools, I did not give much detail about them: you can see the basic details in the periodic table itself, and some of the tools I will cover in detail later –

Dockers 101 – Series 8 of N – Stateful containers and Importing and Exporting containers

  • Stateless vs Stateful containers
    • Stateless – they don’t need to maintain the state of an application
      • e.g. the TicTacToe game container we created is a simple game. We just want the game to run when the container image is downloaded; we are not maintaining any users, their scores or anything like that.
    • Stateful – they need the application state to be maintained on some storage volume e.g in a database we are storing the users, scores, history of the games etc.
  • Approaches for Stateful containers
    • -v <host-dir>:<container-dir> parameter option
      • The -v host-dir:container-dir option instructs Docker to map a host directory to a container directory. It can be a good option for some scenarios but is not an effective general solution: what if the container is run on another Docker host where the host directory does not exist?
    • Using Data Containers
      • they are responsible for storing data
      • but they don’t run like other containers
      • they hold the data/volume and are referenced by other containers that want to use this volume
  • Data containers in action
    • Let's use Busybox (one of the smallest Linux distributions); we will use this container to hold our data and to be referenced by other containers
    • We will use the docker create command to create a new container, passing the -v parameter to create a container folder
    • We will then copy the configuration file from a host folder to the container folder
    • Now, with the new data container created, we will mount/reference this container from an Ubuntu container using the --volumes-from option
    • We will see how, in the Ubuntu container, since our data container is mounted as a volume, we can see the config file there.
    • This data container can be exported and imported too.
    • # create a config file
      echo "test=true" >> config.conf

      # create a container by a specific name, with the -v option to create a folder in the container
      # (busybox is a very small image)
      docker create -v /config --name naeemsDataContainer busybox

      # copy data from the local folder to the container
      docker cp config.conf naeemsDataContainer:/config/

      # run an ubuntu container, referencing the container naeemsDataContainer using --volumes-from
      docker run --volumes-from naeemsDataContainer ubuntu ls /config

      # export the container
      docker export naeemsDataContainer > naeemsDataContainer.tar

      # import the container
      docker import naeemsDataContainer.tar

      # check the docker images and see the imported image (you will see naeemsDataContainer – a data container)
      docker images

      # check docker containers and see the running containers
      # (you will not see naeemsDataContainer, as it does not actually run; it is just a mount volume for other containers)
      docker ps -a


Apache Spark – A Deep Dive – series 9 of N – Analysis of most popular movies – using SparkSQL


  • Analyse the most popular movie in a more optimized way:
    • Spark Core has efficient mapper, reducer and event functions to analyse complex data, BUT
      • to get the output we used a lot of logic to create key-value pairs,
      • a lot of lambda operations to aggregate the data, etc.
      • we were not using data in a structured format; with structure, queries can be optimized, and exporting or importing to and from other databases gets a lot easier


  • In addition to Spark Core we will use SparkSQL
    • to give a structure to the data we use SparkSQL
    • We will use two terms a lot – DataFrames and Datasets
    • DataFrame
      • a schema view of an RDD.
      • In an RDD, each row is a key-value pair
      • In a DataFrame, each row is a Row object
    • DataSet
      • an object (OOP) view of an RDD.
      • In a DataSet, each row is a named Row object
      • meaning a Dataset is a DataFrame typed as a named object
  • Advantages of using Spark SQL
    • abstracts the internal intricacies of an RDD by exposing APIs to handle the data
    • can be extended by using user-defined functions
    • If each line is a Row object, you can use the power of SQL-like querying to process data across a cluster as if it were a single database
    • export/import data using JDBC, JSON, etc.


  • Explanation of the code
    • Row Object: See how, instead of returning a key-value pair, it returns a Row object whose column name is movieID. So this RDD will hold one column, storing movie IDs
      • # python function to return a Ratings Row Object
        def processRatings(line):
            fields = line.split()
            mvID = int(fields[1])
            return Row(movieID = mvID)

    • DataFrame: See how a Row based RDD is converted to a DataFrame
      • ratingsDataset = session.createDataFrame(ratings)

    • Processing the DataFrame: see how, in one line, we apply SQL-like logic to process the data using functions like groupBy, count, orderBy, etc.
      • topMostMovieIDs = ratingsDataset.groupBy("movieID").count().orderBy("count", ascending=False).cache()

    • Spark SQL-like statements (note the temp view is created on the DataFrame):
      • ratingsDataset.createOrReplaceTempView("tblRatings")

      • session.sql("SELECT movieID, count(movieID) AS cnt FROM tblRatings GROUP BY movieID ORDER BY cnt DESC LIMIT 5")
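The aggregation that both the DataFrame chain and the SQL query express (group the ratings by movieID, count each group, sort descending, keep the top few) can be mimicked in plain shell on a tiny made-up sample of movie IDs, which may help make the logic concrete:

```shell
# One made-up movieID per line; count the duplicates, then sort by count descending.
printf '50\n172\n50\n181\n50\n172\n' |
  sort | uniq -c | sort -rn | head -5
```

This prints movieID 50 with count 3 first, then 172 with count 2, then 181 with count 1, the same shape of result the Spark query produces over the real ratings file.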

  • Please download the code from either of these locations:
    • wget
    • wget
    • OR
    • git clone
    • OR
    • # import SparkSession, Row and functions from the pyspark.sql module
      from pyspark.sql import SparkSession
      from pyspark.sql import Row
      from pyspark.sql import functions

      # python function to return a Movie Dictionary
      def processMovies():
          movies = {}
          with open("/home/user/bigdata/datasets/ml-100k/u.item") as mfile:
              for line in mfile:
                  fields = line.split("|")
                  movieID = int(fields[0])
                  movieName = fields[1]
                  movies[movieID] = movieName
          return movies

      # python function to return a Ratings Row Object
      def processRatings(line):
          fields = line.split()
          mvID = int(fields[1])
          return Row(movieID = mvID)

      # python function to print results
      def printResults(results):
          for result in results:
              print("\n%s:\t%d " % (moviesDictionary[result[0]], result[1]))

      # create a SparkSession
      session = SparkSession.builder.appName("MostPopularMovies").getOrCreate()

      # load the movies
      moviesDictionary = processMovies()

      # load the ratings data
      rawData = session.sparkContext.textFile("/home/user/bigdata/datasets/ml-100k/")

      # convert the ratings to an RDD of Row objects
      ratings = rawData.map(processRatings)

      # convert the ratings Row objects into a DataFrame
      ratingsDataset = session.createDataFrame(ratings)

      # process the DataFrame
      topMostMovieIDs = ratingsDataset.groupBy("movieID").count().orderBy("count", ascending=False).cache()

      # show all topMostMovieIDs
      topMostMovieIDs.show()

      # collect the results for the topmost 5 movies
      topMost5MovieIDs = topMostMovieIDs.take(5)

      # print the movie names with the ratings count
      printResults(topMost5MovieIDs)

      # close the spark session
      session.stop()

The Output:
