How to handle deadlocks in the Sitecore EventQueue, History and PublishQueue tables

What was the issue?

  • All of a sudden on a Sunday morning our content editor team started reporting:
    • On the CM servers, the Sitecore items (the beauty products) were not getting published to the web database.
    • The same products were not getting updated in the Solr master, and therefore not in the Solr slave.
    • On the CD servers, searches were not showing the newly updated products.

 

What is the overall architecture?

  • Our Sitecore implementation receives product notifications through a RabbitMQ notification service, where the messages are polled by the product service (the repository of all products).
  • In Sitecore, we have written a processor that listens to these notifications and updates the product features and attributes.
  • Once the updated product feature or attribute is saved in the master database, the publishing service is called to push it to the web database.
  • Finally, re-indexing is executed to update it in the Solr master.
  • The Solr master updates the Solr slave through the replication process.
  • While the Solr master is used as the indexing destination, the Solr slave is used for searches.
  • All of this is done through an automated process called the “Notification Service”, which is simply a listening processor over the pipeline.

 

What was done to resolve the issue after initial investigation?

  • We checked for publishing error logs – there were none.
  • We checked for notification service error logs – there were none.
  • We also checked the Solr logs – there were no errors.
  • We then restarted the publishing service, recycled the app pools on the CM and CD servers, and restarted the Solr master and slave.

 

Did this resolve the issue?

  • Yes and No.
  • Yes, because when we restart the CM and CD servers, the initialization pipeline processor updates all the products from the product service into the master and web databases, and then into the Solr master and slave after auto re-indexing. So the new products which were not showing up earlier were now visible.
  • No, because when we tried to update any new products, they were again not showing up in the web database, Solr master or Solr slave.
  • This was a very peculiar scenario: on a server reset the products were getting synched the first time, but when we tried to update any other product it would not sync.

 

What was the root cause then?

  • The products from the Sitecore master database were not being saved to the web database because of SQL deadlocking.
  • The deadlock was killing the save event, so the product was not getting saved and therefore was also not reflected in Solr.
  • The tables which were deadlocking in the core, master as well as web databases were –
    • EventQueue table,
    • History table and
    • PublishQueue table
  • Sitecore uses these tables to keep track of events, history and publishing, and thus knows which items to publish based on them.
  • If these tables contain hundreds of thousands of records, Sitecore fails to retrieve the info in a timely fashion; CPU and memory pressure grows and the queries start deadlocking (a quick row-count check is sketched just below).
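  • To confirm that table size is the problem, you can check the row counts first. This is a minimal check, using the same placeholder database names as the cleanup statements further below; repeat it for the Core and Web databases:

        SELECT COUNT(*) AS EventQueueRows   FROM [TheDatabase_Master].[dbo].[EventQueue];
        SELECT COUNT(*) AS HistoryRows      FROM [TheDatabase_Master].[dbo].[History];
        SELECT COUNT(*) AS PublishQueueRows FROM [TheDatabase_Master].[dbo].[PublishQueue];
        -- counts in the hundreds of thousands indicate the tables need pruning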

 

So what was the final solution?

  • The solution is:
    • Prune the data in these 3 tables (say, anything older than 12 hours or so).
    • You can archive any data older than 12 hours into an archive table so that you can reclaim it in case of disaster recovery (a minimal archive-then-delete sketch is shown at the end of this section).
    • You can also create a job which regularly cleans up data from these 3 tables, archives it into another table, and later cleans the archive table too, say on a weekly basis.
    • One of the solutions is shown below.
    • Commands to clean up the old data (EventQueue older than 4 hours; History and PublishQueue older than 12 hours):
      • Delete data from EventQueue table
        delete FROM [TheDatabase_Master].[dbo].[EventQueue] where [Created] < DATEADD(HOUR, -4, GETDATE())
        delete FROM [TheDatabase_Core].[dbo].[EventQueue] where [Created] < DATEADD(HOUR, -4, GETDATE())
        delete FROM [TheDatabase_Web].[dbo].[EventQueue] where [Created] < DATEADD(HOUR, -4, GETDATE())

        Delete data from History table
        delete FROM [TheDatabase_Core].[dbo].[History] where Created < DATEADD(HOUR, -12, GETDATE())
        delete FROM [TheDatabase_Master].[dbo].[History] where Created < DATEADD(HOUR, -12, GETDATE())
        delete FROM [TheDatabase_Web].[dbo].[History] where Created < DATEADD(HOUR, -12, GETDATE())

        Delete data from PublishQueue table
        delete FROM [TheDatabase_Core].[dbo].[PublishQueue] where Date < DATEADD(HOUR, -12, GETDATE());   
        delete FROM [TheDatabase_Master].[dbo].[PublishQueue] where Date < DATEADD(HOUR, -12, GETDATE());
        delete FROM [TheDatabase_Web].[dbo].[PublishQueue] where Date < DATEADD(HOUR, -12, GETDATE());

  • After cleaning the databases, we saw that the deadlocks were gone and the data was saving into the master and web databases as well as into the Solr master and slave. This resolved our issue.
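  • If you want to keep the archive mentioned above, here is a minimal T-SQL sketch of the archive-then-delete step for one table. It assumes an archive table (here named EventQueueArchive, a hypothetical name) with the same columns as EventQueue already exists; the same pattern applies to the History and PublishQueue tables, and the whole script can be scheduled, for example as a SQL Server Agent job.

        BEGIN TRANSACTION;

        -- copy old rows into the archive table (assumed to have the same schema)
        INSERT INTO [TheDatabase_Master].[dbo].[EventQueueArchive]
        SELECT * FROM [TheDatabase_Master].[dbo].[EventQueue]
        WHERE [Created] < DATEADD(HOUR, -12, GETDATE());

        -- then remove them from the live table
        DELETE FROM [TheDatabase_Master].[dbo].[EventQueue]
        WHERE [Created] < DATEADD(HOUR, -12, GETDATE());

        COMMIT TRANSACTION;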

How to setup and automate Sitecore new Publishing service

Why?

  • Sitecore has released a new Publishing Service module which is a vast improvement over its old “Sitecore Publish” option.
  • It is now an independent service running in IIS with its own security credentials.
  • It helps:
    • publish items faster, especially bulk publishing
    • repair publishes
    • with a better user interface that shows which publishes are complete or in the queue, who ran the publish and what items are being published
  • It consists of:
    • Publishing Host – an independent service under IIS for publishing the items
    • Publishing Module – the user interface

How to set up or automate?

  • It consists of steps which you can run manually or automate using PowerShell or any other language. I am using an automated process with a PowerShell file.
  • The major steps in the setup or automation:
    • Installing the .NET Core module
    • Installing the Publishing Host
    • Installing the Publishing Module
    • Setting up database user permissions
    • Updating the database schema
  • Please download this file and save it on your local server – https://testbucket786786.s3.amazonaws.com/sitecore/Publishing%20Service%20Setup/PublishingServiceSetup.ps1
  • Execute by typing .\PublishingServiceSetup.ps1
  • Now run the publishing service from the Sitecore control panel
  • You can see the new publishing service UI now
    • (screenshot of the new Publishing Service UI)
  • The code and its explanation (inline in the code) –
    • #requires -version 4.0
      #requires -RunAsAdministrator
      import-module webadministration

      # This script must be executed after you have installed a working Sitecore Instance

      & {iisreset}

      ##########################################################################################################

      # Step 1 – Installation of the .NET Core hosting module – installs the module and its prerequisites
      # the required prerequisites and the .NET Core packages are at – https://testbucket786786.s3.amazonaws.com/sitecore/Publishing%20Service%20Setup/Softs/
      $theSoftwareSource = 'https://testbucket786786.s3.amazonaws.com/sitecore/Publishing%20Service%20Setup/Softs'

      # website names
      $sitecoreInstanceName = "TheSitecoreWebsiteName"    # change "TheSitecoreWebsiteName" to the actual Sitecore website name
      $publishingWebsiteName = "ThePublishWebsiteName"    # change "ThePublishWebsiteName" to the actual publishing website name

      # website paths
      $sitecoreInstancePath = "D:\Sitecore\$sitecoreInstanceName"
      $publishingWebsitePath = "D:\Sitecore\$publishingWebsiteName"

      # app pool user names
      $theWebsiteAppPoolUserName = "IIS AppPool\websitename.com"            # name of the IIS app pool identity of the Sitecore website
      $thePublishWebsiteAppPoolUserName = "IIS AppPool\publish.website.com" # name of the IIS app pool identity of the publishing website

      function InstallPackage($theFolder, $thePackage) {
          $theProcess = Start-Process $thePackage -WorkingDirectory $theFolder -ArgumentList "/q /norestart" -Verb RunAs -Wait -PassThru
          if ($theProcess.ExitCode -eq 0) {
              Write-Host "Installation completed for package: $thePackage"
          }
          else {
              Write-Error "An error was encountered during the installation of package: $thePackage"
          }
      }

      $theDotNetCoreHostingModule = Get-WebGlobalModule | Where-Object { $_.name.ToLower() -eq "aspnetcoremodule" }
      if (!$theDotNetCoreHostingModule)
      {
          Write-Host "Installing .NET Core hosting module"

          # copy the installer packages locally (assumes $theSoftwareSource is reachable as a path; download the files first if it is only a URL)
          $theDestination = 'D:\PublishingDServicesSofts'
          Copy-Item -Recurse -Filter *.* -Path $theSoftwareSource -Destination $theDestination -Force

          $workingDirectory = $theDestination
          $theVCRedistPackage = "vc_redist.x64.exe"
          $theDotNetCoreRuntimePackage = "dotnet-win-x64.1.1.2.exe"
          $theDotNetCoreHostingPackage = "DotNetCore.1.0.5_1.1.2-WindowsHosting.exe"

          InstallPackage $workingDirectory $theVCRedistPackage
          InstallPackage $workingDirectory $theDotNetCoreHostingPackage
      }
      else {
          Write-Host ".NET Core hosting module is already installed"
      }

      ##########################################################################################################

      # Step 2 – Installation of the Publishing Service host
      # The helper functions for this step are defined below and are invoked after the definitions
      # (PowerShell requires a function to be defined before it is called).

      function RemovePublishingServiceWebsite() {
          $publishingSiteName = $publishingWebsiteName

          # remove the web site from IIS
          if (test-path "IIS:\Sites\$publishingSiteName") {
              write-host "Removing website $publishingSiteName"
              & "$($env:windir)\system32\inetsrv\AppCmd.exe" Stop Site "`"$publishingSiteName`""
              remove-website -name "$publishingSiteName"
          }

          # remove the app pool from IIS
          if (test-path "IIS:\AppPools\$publishingSiteName") {
              write-host "Removing AppPool $publishingSiteName"
              & "$($env:windir)\system32\inetsrv\AppCmd.exe" Stop AppPool "`"$publishingSiteName`""
              remove-webapppool -name "$publishingSiteName"
          }

          & {iisreset}
      }

      function DeployPublishingServiceFiles() {
          $publishingSiteRoot = $publishingWebsitePath
          if (!(test-path $publishingSiteRoot)) {
              # 'unzip' is assumed to be an available helper (e.g. a wrapper around Expand-Archive / 7-Zip)
              unzip "$theSoftwareSource\Sitecore Publishing Service 2.1.0 rev. 171009.zip" $publishingSiteRoot
              unzip "$theSoftwareSource\Sitecore Publishing Module 2.1.0 rev. 171009.update" $publishingSiteRoot
          }
      }

      function CleanPublishingServiceBindings() {
          $publishingSiteName = $publishingWebsiteName
          $website = get-website | ? { $_.name -eq $publishingSiteName -and $_.bindings.collection.count -ne 0 }
          if ($website) { get-webbinding -name $publishingSiteName | remove-webbinding }
      }

      function UpdatePublishingServiceBindings() {
          $publishingSiteName = $publishingWebsiteName
          $hostNames = @($publishingWebsiteName)

          & "$($env:windir)\system32\inetsrv\AppCmd.exe" Stop Site "`"$publishingSiteName`""

          # remove existing bindings
          CleanPublishingServiceBindings

          # create bindings
          write-host "Creating bindings for $publishingSiteName"
          new-webbinding -name $publishingSiteName -protocol http -port 80 -ipaddress "*" -hostheader $hostNames[0]

          & "$($env:windir)\system32\inetsrv\AppCmd.exe" Start Site "`"$publishingSiteName`""
          #start-website $publishingSiteName
      }
      function CreatePublishingServiceWebsite([string] $runtimeAccountUserName, [string] $runtimeAccountPassword) {
          $publishingSiteRoot = $publishingWebsitePath
          $publishingSiteName = $publishingWebsiteName

          # remove the existing site if it exists
          RemovePublishingServiceWebsite

          # create the app pool
          write-host "Creating app pool for $publishingSiteName"

          $appPool = "IIS:\AppPools\$publishingSiteName"
          new-webAppPool -name "$publishingSiteName"

          $pool = get-item $appPool
          $pool.startMode = 'AlwaysRunning'
          set-item $appPool $pool

          set-itemProperty -Path $appPool -Name managedRuntimeVersion -value ""
          set-itemProperty -Path $appPool -Name recycling.disallowOverlappingRotation -Value $true
          set-itemProperty -Path $appPool -Name processModel.idleTimeout -value ([TimeSpan]::FromMinutes(0))
          set-itemProperty -Path $appPool -Name recycling.periodicrestart.time -value ([TimeSpan]::FromMinutes(0))

          # set the app pool identity
          if ([string]::IsNullOrEmpty($runtimeAccountUserName) -or [string]::IsNullOrEmpty($runtimeAccountPassword)) {
              set-itemProperty -path $appPool -Name processModel.identityType -Value 4 # AppPoolIdentity
          } else {
              set-itemProperty -Path $appPool -Name processModel.identityType -Value 3 # Specific User
              set-itemProperty -Path $appPool -Name processModel.userName -Value $runtimeAccountUserName
              set-itemProperty -Path $appPool -Name processModel.password -Value $runtimeAccountPassword
          }

          # create the web site
          write-host "Creating website $publishingSiteName"
          $id = (dir iis:\sites | foreach {$_.id} | sort -Descending | select -first 1) + 1
          new-webSite -name $publishingSiteName -physicalpath "$publishingSiteRoot" -port 80 -applicationpool $publishingSiteName -Id $id

          # update the bindings
          UpdatePublishingServiceBindings
      }
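
      # With the helper functions defined, perform Step 2: remove any previous publishing site,
      # create the publishing website and deploy the service files.
      RemovePublishingServiceWebsite
      CreatePublishingServiceWebsite "admin" "b"   # replace with non admin/b credentials for non-default/local instances
      DeployPublishingServiceFiles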

      ##########################################################################################################

      # Step 3 – Add database users
      # The AddDatabaseUser helper (and a small Test-Database helper) are defined below and invoked after the definitions.

      $dbInstance = "."   # refers to the current default local instance; for non-local instances, give the actual name of the named DB instance

      # requires the SQL Server Management Objects (SMO) assemblies, e.g. via the SqlServer PowerShell module
      $dbServer = new-object Microsoft.SqlServer.Management.Smo.Server($dbInstance)

      function AddDatabaseUser($databaseServer, $databaseName, $userName)
      {
          if (Test-Database $databaseServer $databaseName) {
              # if the login doesn't exist at the server level, create it
              if (!(($databaseServer.logins).Name -contains $userName)) {
                  write-host "Adding login $userName"
                  $login = new-object Microsoft.SqlServer.Management.Smo.Login($databaseServer, $userName)
                  $login.loginType = 'WindowsUser'
                  $login.create()
              }

              # add the user to the database
              $db = $databaseServer.databases[$databaseName]
              if (!(($db.users).Name -contains $userName)) {
                  write-host "Adding $userName to database $databaseName"
                  $user = new-object Microsoft.SqlServer.Management.Smo.User($db, $userName)
                  $user.login = $userName
                  $user.create()

                  # grant db_owner permissions
                  write-host "Adding db_owner permission for $userName to database $databaseName"
                  $db.roles['db_owner'].addMember($userName)
              }
          } else {
              write-warning "$databaseName does not exist"
          }
      }
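
      # 'Test-Database' is not a built-in cmdlet – a minimal helper (assumption) that checks whether
      # a database exists on the SMO server object:
      function Test-Database($databaseServer, $databaseName) {
          return (($databaseServer.databases).Name -contains $databaseName)
      }

      # With the helpers defined, add the publishing app pool user to each database
      # (replace core/master/web/preview/reporting with your actual database names if they differ):
      AddDatabaseUser $dbServer core $thePublishWebsiteAppPoolUserName
      AddDatabaseUser $dbServer master $thePublishWebsiteAppPoolUserName
      AddDatabaseUser $dbServer web $thePublishWebsiteAppPoolUserName
      AddDatabaseUser $dbServer preview $thePublishWebsiteAppPoolUserName
      AddDatabaseUser $dbServer reporting $thePublishWebsiteAppPoolUserName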

      ##########################################################################################################

      # Step 4 – Update Database schema
      if (test-path $publishingWebsitePath) {
          write-host "Updating schema by executing $publishingWebsitePath\Sitecore.Framework.Publishing.Host.exe schema upgrade --force"
          Invoke-Expression "$publishingWebsitePath\Sitecore.Framework.Publishing.Host.exe schema upgrade --force"
          & {iisreset}
      }

      ##########################################################################################################
      # Step 5 – Install the Publishing Service UI files into the Sitecore instance
      $thePublishingServiceFilesSource = "$theSoftwareSource/Sitecore_PublishingService_files.7z"
      if ((test-path "$sitecoreInstancePath\website")) {
          write-host "Copying Publishing Service files to $sitecoreInstancePath\website"
          unzip $thePublishingServiceFilesSource "$sitecoreInstancePath\website"
      }

      & {iisreset}

      ##########################################################################################################

Security Hardening of admin pages in Sitecore

  • Scenario:
    • The Sitecore admin pages (/sitecore/admin, /sitecore/login, /sitecore/shell, etc.) should not be reachable through the public (external) URL of the site, but should remain available through the internal URL.
  • Solutions:
    • There are many methods which can be used for security hardening of the Sitecore admin pages, but URL Rewrite is one of the easiest to implement.
    • First of all, the URL Rewrite module must be installed on IIS.
    • Now, in the “rewrite” section of your web.config file, add the rule below.
      • The rule defines a match pattern: if the URL contains sitecore/admin, sitecore/login, sitecore/diag, sitecore/debug or sitecore/shell, and the condition that the host matches the EXTERNAL website URL is met, redirect to the CD login page. If the EXTERNAL URL condition does not match, the user can still open these admin pages using the INTERNAL URL.
      • <rewrite>
          <rules>
            <rule name="Disable Admin tool" stopProcessing="true">
              <match url="^sitecore/(admin|login|diag|debug|shell).*$" />
              <conditions>
                <add input="{HTTP_HOST}" pattern="Your_Website_ExternalDNSName" />
              </conditions>
              <action type="Redirect" url="https://{HTTP_HOST}/Login/Login.aspx" appendQueryString="false" />
            </rule>
          </rules>
        </rewrite>

 

Dockers 101 – Series 9 of N – How to setup Apache and NGinx server using docker

  • Requirement:
    • Setup Apache and NGinx server using docker
  • Strategy:
    • First of all, create source folders and HTML files for Apache and NGinx on the host drive so that they can be used as mounts/volumes for the Apache and NGinx containers respectively
    • Use the docker run command with the -v and -p options to specify the volume and port mappings
    • Test the containers
  • Solution:
    • If you have not installed Docker on your system, do that first
    • Create the source folders and HTML files for Apache and NGinx on the host
      • rootPath=$(pwd)
        hostNginxPath="$rootPath/docker/www/nginx"
        containerNginxPath="/usr/share/nginx/html"
        hostNginxHtml="$rootPath/docker/www/nginx/index.html"
        hostHttpdPath="$rootPath/docker/www/httpd"
        hostHttpdHtml="$rootPath/docker/www/httpd/index.html"
        containerHttpdPath="/usr/local/apache2/htdocs"

        test -d $hostNginxPath && echo "$hostNginxPath exists, continuing…" || echo "$hostNginxPath does not exist, creating…" && mkdir -p $hostNginxPath
        echo "
        <html>
        <head>
        <title>Custom test page – My NGinx Server… </title>
        </head>
        <body>
        <hr/>
        <h1>This is a custom test page from My NGinx Server…</h1>
        <hr/>
        </body>
        </html>
        " > $hostNginxHtml
        cat $hostNginxHtml

        test -d $hostHttpdPath && echo "$hostHttpdPath exists, continuing…" || echo "$hostHttpdPath does not exist, creating…" && mkdir -p $hostHttpdPath
        echo "
        <html>
        <head>
        <title>Custom test page – My Httpd Server… </title>
        </head>
        <body>
        <hr/>
        <h1>This is a custom test page from My Httpd Server…</h1>
        <hr/>
        </body>
        </html>
        " > $hostHttpdHtml
        cat $hostHttpdHtml

    • Use docker run command with -v and -p options
      • docker stop $(docker ps -a -q)
        docker rm $(docker ps -a -q)
        docker rmi $(docker images -q)
        docker run -d --name mynginxserver -p 8080:80 -v $hostNginxPath:$containerNginxPath nginx:latest
        docker exec mynginxserver cat $containerNginxPath/index.html
        docker run -d --name myhttpdserver -p 8090:80 -v $hostHttpdPath:$containerHttpdPath httpd:latest
        docker exec myhttpdserver cat $containerHttpdPath/index.html

    • Test the containers – browse to http://localhost:8080 for NGinx and http://localhost:8090 for Apache (or curl those URLs) and verify that the custom test pages are served

AWS 101 – Series 1 of N – Monitoring and Maintenance – How to check if an Amazon instance is a valid approved golden image

Problem:

  • How to check if an Amazon instance is a valid approved golden image

Solution:

  • Problem Detection
    • You can use the CLI as well as the Console to perform this action
    • Using the CLI
      • Save the below script as checkIfValidApprovedImage.sh
      • Change the mode – chmod u+x checkIfValidApprovedImage.sh
      • The AWS command ‘aws ec2 describe-instances’ returns all instances in a specific region, and the output is reduced to the ImageId of each instance
      • The array arrImages holds the list of ImageIds
      • Run a for loop with the command ‘aws ec2 describe-images’ to get the image owner
      • If the image owner is not “self”, it is not a valid approved image built from your own customized base image

        #!/bin/bash

        # regions to check
        arrRegions=("us-east-1" "us-east-2");

        # image list
        for regionId in "${arrRegions[@]}"
        do
        echo "Instances for $regionId region:"
        arrImages=$(aws ec2 describe-instances --region $regionId --output text --query 'Reservations[*].Instances[*].ImageId');

        # get the image owner
        for imgId in ${arrImages}
        do
        aws ec2 describe-images --region $regionId --output text --image-ids $imgId --query 'Images[*].ImageOwnerAlias'
        done
        done

    • Using the Console
      • Go to the EC2 Dashboard, select the Instances tab and then select a specific instance
      • In the Description tab, click on the AMI ID link and copy the AMI ID from the pop-up
      • (screenshot: instance Description tab showing the AMI ID)
      • Now go to the AMIs tab in the Images section of the EC2 Dashboard
      • Select “Owned by me” from the drop down and filter by AMI ID, pasting the copied AMI ID into the filter
      • (screenshot: AMIs list filtered by the copied AMI ID)
      • If no rows are returned, it means the image is either from the Marketplace or Amazon and is not a “self”-owned, customized, approved valid image.

 

  • Problem Remedy – to continue

Adding new Rendering to Presentation details using PowerShell

  • Adding, removing or updating the presentation details of a page is very easy to implement using a PowerShell script.
  • First of all, get the list of pages where you want to add the new rendering using Get-Item and/or Get-ChildItem.
  • Then, for each such page, use the “Get-Rendering” cmdlet to check whether the rendering you want to add is already present.
  • Then add the new rendering using the “Add-Rendering” cmdlet.
  • Please see the example below.
  • cd "master:/sitecore"
    $db = "master:"
    $newRenderingInstance = gi -Path "master:/sitecore/layout/Renderings/MyNewHeaderRendering" | New-Rendering

    function GetAllPages() {
    $homePage = Get-Item -Path ($db + "/Sitecore/Content/Home") -Language *
    $pages = Get-ChildItem -Path ($db + "/Sitecore/Content/Home") -Recurse -Language *
    $pages += $homePage
    $pages | ? { $_ -ne $null } `
    | % {
    AddMyNewHeaderRendering $_
    }
    }

    function AddMyNewHeaderRendering ($page){
    $renderings = $page | Get-Rendering -ErrorAction SilentlyContinue
    if($renderings -ne $null -and $renderings.length -gt 1 -and $newRenderingInstance.Id -notin $renderings.ItemID ){
    Add-Rendering -Item $page -Rendering $newRenderingInstance -Index 1 -Placeholder "header"
    }
    }

    GetAllPages

DevOps 101 – Series 1 of N – DevOps Refresher

What is DevOps?

  • What is SDLC?
    • SDLC (Software Development Life Cycle) is an integral part of the development, testing, deployment and maintenance of any software.
    • There are many development models for making a project a success (I will not list all of them, only two to give an example):
      • Waterfall model:
        • The project is divided into large chunks of phases, e.g. Phase 1, 2 and 3.
        • Success or failure of the current phase has an impact on future phases.
        • It is a rigid approach: unless Phase 1 succeeds we can’t start Phase 2, as there are interdependencies.
        • Releases and rollbacks are a fearsome process and greatly impact the team’s success and failure.
      • Agile model:
        • The project is divided into small sprints and stories (weekly, bi-weekly or even smaller).
        • There are daily or weekly scrum or status meetings to check the status of the project, bottlenecks etc.
        • Releases and rollbacks are a common day-to-day practice, helping the project run smoothly.
    • There are two major groups involved in the life cycle of any project:
      • Developers (includes developers, testers, QAs, BAs)
      • Operations (includes system/server admins or web masters, release managers)
  • Who are the Developers?
    • Developers are responsible for change (functionality) in the state of a project.
    • They love to bring in as many changes as the business needs.
    • Their productivity is measured by the number of changes they bring to the state of the project.
  • Who are the Operations?
    • Operations are responsible for the smooth functioning, releases and deployments of the project.
    • They love to keep the system stable and prefer not to bring too many changes to the state of the project, so that a working system stays stable.
    • Their productivity is measured by fewer hours of downtime, keeping most of the servers up and running most of the time.
  • So what is wrong with this approach or model?
    • The developers’ objective – bring as much change as possible without caring much about stability.
    • The operations’ objective – bring stability to the system with as few changes as possible.
    • The two objectives work against each other, so the probability of the project running smoothly decreases, bottlenecks increase, and the project becomes susceptible to failure.
  • What is DevOps?
    • DevOps is rather a culture to resolve the above problems.
    • The DevOps culture works in a more aggressive agile model with small sprints, but all changes to the system are driven by automated builds, deployments, testing and rollbacks.
    • Deployments can happen even on an hourly basis, as the smallest changes to the system go through automated builds, deployments and testing, so the chance of a change breaking the system is minimized. If a change breaks the automated build or deployment, or fails the automated testing, it is immediately rolled back to the previous running state.
    • Tools and techniques are used to give developers an environment similar to production using automated setup, so that they can’t use the excuse that something which works on their local systems is not working in staging or production.
    • Automated build, deployment and testing tools give operations the confidence to accept as many changes as possible, because they no longer have an excuse about the stability of the system.
    • Thus DevOps = Dev (Developers) + Ops (Operations).
      • It is a culture or set of best practices
      • Smaller development cycles
      • More frequent deployments
      • Better collaboration between developers and operations, with both now working towards the same objective:
        • bring quick and stable changes to the system
      • DevOps is NOT a standard, tool, or job title.
        • It uses tools to automate the process as much as possible, and
        • the tools can vary based on languages, platforms and business needs, as
        • there is no standard saying that a tool which works for, say, company A will also work for company B.
        • It is more a set of best practices based on the needs.
    • Salient features of a DevOps culture:
      1. Build Automation:
        • It is the process of building the code to make it runnable using an automated tool or script
        • It is independent of the IDE
        • Benefits of build automation –
          • fast (since there are no/few manual tasks),
          • repeatable (runs the same way any time),
          • consistent (produces similar results any time),
          • reliable (will alert about build errors and will run tasks based on a predefined set of commands in the same way) and
          • portable (will run on any similar environment in the same way)
        • Tools:
          • Tools are based on the programming language:
            • Java – Maven, Ant
            • JavaScript – npm
            • Make – Unix based
            • Visual Studio / MSBuild – C#, .NET etc.
      2. Continuous Integration:
        • It is the continuous process of merging the developers’ code into the master (the deployable, release-ready branch)
        • It uses the automated test cases to pass or fail each developer’s change
        • It uses a CI server which detects any changes, runs the automated test cases against the new build and passes or fails the change
        • If a developer’s code “breaks the build”, they are alerted first and the change is rolled back for the developer to fix
        • Benefits:
          • Continuous testing (changes are tested continuously)
          • Early detection of problems (due to continuous changes, problems are detected early)
          • No rush for deployments (developers don’t rush to push their code for release)
          • Frequent releases (due to continuous changes there are small but frequent releases)
        • Tools:
          • Jenkins – open source, widely used, easy integration
          • TravisCI – open source, GitHub integration
          • Bamboo – an enterprise product with strong integration with JIRA
      3. Continuous Delivery and Deployment:
        • Continuous Delivery is the practice of keeping the code always in a deployable state
        • Continuous Deployment is the actual process of deploying the code
        • The two are not interchangeable terms
        • There is no standard for how often to deploy; it depends on the company’s needs
        • Benefits:
          • Faster time to market (due to fewer problems in the whole process)
          • Less risk (due to increased reliability)
          • Reliable deployments and rollbacks (due to a consistent process)
        • Tools:
          • Tool
      4. Infrastructure as Code:
        • The process of provisioning and managing resources or infrastructure using code
        • Benefits:
          • Reusable (we can execute it as many times as we want)
          • Scalable (we can execute it against as many servers as we want)
          • Consistent (it runs the same way wherever it runs)
          • Trackable/documenting (any infrastructure changes are well documented, as these changes can be committed just like normal code)
        • Tools:
          • Tool
        • Please see the code below as an example of infrastructure as code (the Dockerfile below uses Docker Hub to install Python 2.7, sets up a working directory, copies content into it and then runs the TicTacToe.py Python file to run the game):
        • # Use an official Python runtime as a parent image
          FROM python:2.7-slim
          
          # Set the working directory to /app
          WORKDIR /app
          
          # Copy the contents to /app container folder
          ADD . /app
          
          # Run TicTacToe.py when the container launches
          CMD ["python", "TicTacToe.py"]
      5. Configuration Management:
        • It is the process of managing and maintaining the state of the infrastructure in a consistent, stable and maintainable way
        • We use infrastructure as code to ensure efficient configuration management
        • Benefits:
          • Time saving (the infrastructure code can be executed any number of times on any number of servers)
          • Consistent (the same changes are made wherever the change is executed)
          • Maintainable (it stays maintainable because it is well documented)
          • Less configuration drift (since the same code is executed there is less configuration drift, and what drift there is, is well documented)
        • Tools:
          • Ansible – open source, uses YAML config files, does not need a server-and-agent model, uses declarative configuration
          • Puppet – open source, needs a server-and-agent model, uses declarative configuration
          • Chef – open source, needs a server-and-agent model, uses procedural configuration
          • Salt – needs a server-and-agent model, uses declarative configuration, uses YAML config files
      6. Orchestration:
        • It is the process of using a tool which automates the whole workflow or process
        • E.g. docker-compose, Kubernetes etc.
        • Benefits:
          • Scalability (the orchestration tool can be used to execute the changes on any number of servers)
          • Stable (the changes are stable as they are always executed in the same fashion)
          • Self-servicing (it enables auto-healing)
          • Granularity (there is full control over the whole process, as each step is well defined)
          • Time saving (automation leads to a quick turnaround time)
        • Tools:
          • Kubernetes – the biggest hit these days
          • Docker Compose
      7. Monitoring:
        • It is the process of monitoring the state of a system, alerting on any change in the state, and presenting the state changes in a meaningful manner
        • You can monitor system resources like CPU, memory, I/O, network, logging etc.
        • Benefits:
          • Fast recovery from failures (automated alerts help in recovering from failures by provisioning resources based on the alert)
          • Automated alerting and response (the alerts give the impulse for auto-healing systems)
          • Root cause and visibility (helps in root cause analysis based on the information tracked and monitored)
          • Auto-healing (with proper health checks and an alerting system, we can configure the system to auto-heal in the event of the failure of a few servers by provisioning more servers)
        • Tools:
          • AppDynamics
          • New Relic
    • Here is a website with links to and details of most of the DevOps tools; you can click the links to see the details of each one. That is why, although I listed quite a few tools, I did not give much detail about them – you can see the basic details in the periodic table itself, and some of the tools I will cover in detail later: https://xebialabs.com/periodic-table-of-devops-tools/