Visualising Continuous Deployment

An excellent lean coach at my workplace encouraged me to post this quick article about how [macro|micro]-services teams can quickly visualise how successful their continuous deployment implementation is. A picture says a thousand words and all that… so:


 

I forgot to grab a snap of it today; here's one Gus posted earlier this week. I'll add more later.

Too many teams, too few monitors

We had 10 teams all trying to integrate. Normally we could list five or six teams vertically using the standard Jenkins Build Pipeline View, but we were short on real estate here – apparently a ticket to solve it was floating around somewhere 😉

To conserve space I checked out a copy of Dashing and re-organised the screen into what you see now, but with just three colours:

  • Red: Build has failed
  • Grey: Building
  • Light Green: No current failures

But… I got a lot of questions, and just general confusion. So I added the Production version to the screen too, indicating that x.x.x was successfully deployed to Prod. It still lacked something though: you glanced at it and came away more confused than when you started.

What message is it supposed to impart?

This got me thinking: we're a CI/CD driven floor – what we really need to know is how much we are honouring this principle –

is the latest stable build in Production?

 

More colours to the rescue

  • Red: Build has failed – stage indicated – e.g. Acceptance (probably CloudStack ;-))
  • Grey: Building, with % indicator
  • *Light Green: All good, the last stable build is in Production
  • Less Light Green: Could be better, maybe 2–3 stable builds since the Prod release
  • Getting Yellow: More work to be done here
  • Yellowy: 5+ stable builds since the last release – let the floggings commence

*Colours are approximate – it's a (dodgy) function mapping the build distance to an RGB value.
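For the curious, the mapping is roughly this shape – a sketch in Scala rather than the actual widget code, with the clamp at 5 builds and the colour constants being my own choices:

// Sketch: blend from light green towards yellow as the number of stable
// builds since the last Prod release grows (clamped at 5+).
def buildDistanceToRgb(buildsSinceProd: Int): String = {
  val distance = math.min(buildsSinceProd, 5) / 5.0
  val red = (255 * distance).toInt // raising red shifts green -> yellow
  f"#$red%02xc800" // 0 builds => #00c800 (light green), 5+ => #ffc800
}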

Now it relays more information: I can quickly see which teams are making good progress – and there's probably a lot more you could do with this.

 

Building a GPS tracking device for your kayak

I like kayaking, but I'm terrible for having a charged phone – or a waterproof case, for that matter, to keep it safe. My friends always have great trip maps, with photos plotted nicely on Ramblr or the like. In an effort to keep up, I'm going to try to build my own version.

The idea is to intermittently post batched geo information back to a server for real-time plotting (another blog post).

Here's my shopping list – it has arrived, but be warned: I have no idea what I am doing, really.

  • Female to Male Breadboard Wires for Electronic DIY – 22cm
  • Ceramic Capacitor for DIY Electronic Circuit – Red (270-Piece Pack)
  • Temperature Humidity Sensor DHT11 Module for Arduino – Deep Blue (works with official Arduino boards)
  • 121305 3.3V / 5V Power Module for Breadboard
  • GY-GPS6MV1 GPS APM2.5 Module with Antenna – Deep Blue (3~5V)
  • DIY LM317 Adjustable Voltage Regulator ICs (5 PCS)
  • 433M Superregeneration Wireless Transmitter Module (Burglar Alarm) and Receiver Module Accessories for Arduino
  • SD Card Module Slot Socket Reader and Accessories for Arduino
  • Geeetech Updated GPRS/GSM SIM900 Shield V2.0 Compatible with Arduino
  • Aluminum Electrolytic Capacitor for DIY Project (120-Piece Pack)


GPS

 

I got it working: some quick soldering, download TinyGPSPlus and off you go. I still need to add two resistors to protect the GPS device, but it's working.
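For reference, the read-the-GPS part really is only a few lines with TinyGPSPlus. A minimal sketch – the pins are assumptions for my wiring rather than anything canonical:

#include <SoftwareSerial.h>
#include <TinyGPS++.h>

static const int RX_PIN = 4, TX_PIN = 3; // hypothetical wiring: GPS TX -> pin 4
TinyGPSPlus gps;
SoftwareSerial gpsSerial(RX_PIN, TX_PIN);

void setup() {
  Serial.begin(115200);
  gpsSerial.begin(9600); // the GY-GPS6MV1 default baud rate
}

void loop() {
  // Feed every NMEA byte from the module into the parser
  while (gpsSerial.available() > 0) {
    gps.encode(;
  }
  // Print a fix whenever the location is updated
  if (gps.location.isUpdated()) {
    Serial.print(gps.location.lat(), 6);
    Serial.print(",");
    Serial.println(gps.location.lng(), 6);
  }
}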

Getting started with load testing

Tools for Analysis

I find it's easier to get something done if I can see the results quickly. This post, maybe series (wishful thinking), covers that: a simple, quick Docker setup to pipe stats into InfluxDB and graph them with Grafana – giving you nice graphs with real data quickly. See below.

docker-clients

Eventually, I'll add Prometheus and update the clients to allow use of Zipkin for distributed tracing analysis.

Docker to satisfy the Instant Gratification Monkey

For prototyping, or even Prod for the explorers, I find Docker gives the best return time. You can spin up complex networks in minutes; here's what we're trying to build.

monitoring1

Setup

The aim is to generate some load using Gatling. Once we get a baseline from each client, we can try adding load balancers or even on-demand instances to see how much more performance we can eke out.

Load Generation

Gatling is a good candidate for this: it simulates multiple users per thread – making it more resource-efficient than JMeter – and provides a good DSL for intuitive request generation.

There is a recording tool provided, similar to JMeter's. You hit go, point your browser to proxy through it, then navigate through the sequence of clicks that will provide the test 'scenario'.

I wanted to dabble with Scala, so I took a manual approach – there's a first cut of it below. The steps are as follows:

  1. Download the payload file from the OAT server and regexp-replace all the ids (offline – see the sketch below this list).
  2. Load all requests from the file.
  3. For each user we configure, stagger their arrival, add the unique id with a regexp replace, and send each message in the payload file, pausing realistically as we proceed.
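For step 1, the offline scrub can be as simple as a one-liner – a sketch, assuming the raw dump uses numeric owner ids (raw-payloads.txt is a made-up name):

# Replace every concrete owner id with the "_" placeholder that the
# simulation later swaps for ${owner_id}
sed -E 's/"owner":"[0-9]+"/"owner":"_"/g' raw-payloads.txt > user-files/data/incidents.txt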

Gatling Source (2.2.0)

import scala.collection.mutable.MutableList
import scala.concurrent.duration._
import scala.io.Source

import io.gatling.core.Predef._
import io.gatling.core.structure.ChainBuilder
import io.gatling.http.Predef._

class ToDo extends Simulation {

  val ids = csv("user-files/data/ids.csv") // feeder (currently unused)
  val destinationUrl: String = System.getProperty("destinationUrl", "http://localhost:8080")
  val duration = 60 //3600 + 1800
  val SETUP_URL = "/AngularJSRestful/rest/todos/158"
  val MAPPING_URL = "/AngularJSRestful/rest/todos"

  val httpConf = http
    .baseURL(destinationUrl)
    .acceptHeader("application/json, text/plain, */*")
    .acceptLanguageHeader("en-US,en;q=0.5")
    .acceptEncodingHeader("gzip, deflate")
    .userAgentHeader("Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0")

  val incidents = Source.fromFile("user-files/data/incidents.txt").getLines.toList
  var chainList: MutableList[ChainBuilder] = new MutableList[ChainBuilder]()
  val pausetime = duration / incidents.size
  println(s":::>> Pausetime = ${pausetime} seconds")

  // Build the full chain: one setup request, then a POST per incident line
  def go() = {
    chainList += setupRequest()
    for (incident <- incidents) {
      chainList += generateWebRequest("New TODO", "${id}", Map("incident" -> incident))
    }
    chainList
  }

  // Fetch an existing todo and save a numeric id from the response as owner_id
  def setupRequest() = {
    exec(http(SETUP_URL).get(SETUP_URL).queryParam("x", "1")
      .check(regex("[0-9]")
        .saveAs("owner_id"))
    ).pause(10)
  }

  // Substitute the saved owner_id into the payload and POST it
  def generateWebRequest(requestName: String, sort: String, queryParamMap: Map[String, String]) = {
    val pl = queryParamMap.getOrElse("incident", "{}").replace("_", "${owner_id}")
    exec(http(requestName).post(MAPPING_URL)
      .header("Content-Type", "application/json")
      .body(StringBody(pl))).pause(pausetime)
  }

  val scn = scenario("REST Client Publishing")
    .exec(go())

  setUp(
    scn.inject(constantUsersPerSec(20) during (15 seconds))
  ).protocols(httpConf)
}

 

The payload file, from ./user-files/data/incidents.txt:

{"name":"Alan","owner":"_","priority":"Low"}
{"name":"Alan","owner":"_","priority":"High"}
{"name":"Alan","owner":"_","priority":"Low"}
{"name":"Alan","owner":"_","priority":"Low"}
{"name":"Alan","owner":"_","priority":"Medium"}
{"name":"Alan","owner":"_","priority":"Low"}
{"name":"Alan1","owner":"_","priority":"Low"}
{"name":"Alan2","owner":"_","priority":"Low"}
{"name":"Alan","owner":"_","priority":"Low"}
{"name":"Alan2","owner":"_","priority":"Low"}
{"name":"Alan2","owner":"_","priority":"Low"}
{"name":"Alan","owner":"_","priority":"Low"}
{"name":"Alan3","owner":"_","priority":"Low"}
{"name":"Alan2","owner":"_","priority":"Low"}
{"name":"Alan","owner":"_","priority":"Low"}
{"name":"Alan4","owner":"_","priority":"Low"}
{"name":"Alan","owner":"_","priority":"Low"}
{"name":"Alan","owner":"_","priority":"Low"}
{"name":"Alan","owner":"_","priority":"Low"}
{"name":"Alan","owner":"_","priority":"Low"}

 

Clients / Target

You can see from the Scala source above that each client exposes a RESTful interface on 8080. For this, each client will need a Tomcat instance along with a RESTful app – I borrowed Allen Fang's from GitHub.

FROM tomcat:8.0
RUN apt-get -y update
RUN apt-get -y install collectl-utils vim-tiny

RUN sed -i 's!^DaemonCommands.*!DaemonCommands = -f /var/log/collectl -P -m -scdmn --export graphite,influx:2003,p=.os,s=cdmn!' /etc/collectl.conf
COPY app/AngularJS-RESTful-Sample/target/AngularJSRestful.war /usr/local/tomcat/webapps/AngularJSRestful.war
CMD service collectl start && /usr/local/tomcat/bin/catalina.sh start && while true; do sleep 5; done;

 

Above, I configure the client using a pre-rolled Tomcat 8 image from Docker Hub, adding collectl and vim for tweaking in case I get it wrong.

After this, I tell collectl to post its disk, network, memory and CPU data, all in Graphite format, to our yet-to-be-created InfluxDB on influx:2003.

Finally, I copy the WAR file to $TOMCAT_HOME/webapps and start the services.

InfluxDB

This one is far less involved; it essentially just sits on the network listening for the screams from our overloaded clients.

# easy, huh!
FROM tutum/influxdb

 

This will start Influx listening on port 2003, with an admin interface available on http://$IP:8083. See the Docker Compose section below.

Influx Setup

We need to add a database called graphitedb* once the host is up, so the clients have somewhere to write to. You can log in and create one (with root/root, I think), or use curl to send a request:

curl -G http://localhost:8086/query --data-urlencode "q=CREATE DATABASE graphitedb"

 

  • This database name is specified in the config for InfluxDB
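To confirm it took, you can list the databases the same way:

curl -G http://localhost:8086/query --data-urlencode "q=SHOW DATABASES"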

Listening for Graphite data from the clients

In order for the clients to post their data to Influx, we need to enable the following in /config/config. To do this, I ran the Dockerfile and copied the directory created by the instance to my local drive. I make the modification below and then mount this as the config for the instance created under Docker Compose.

➜ ~ docker cp config/ 430b87a00999:/var/influxdb/config/

 

Enable this in /config/config

### Controls one or many listeners for Graphite data.
###
[[graphite]]
  enabled = true
  bind-address = ":2003"
  protocol = "tcp"
  consistency-level = "one"
  separator = "."
  database = "graphitedb"
  # These next lines control how batching works. You should have this enabled
  # otherwise you could get dropped metrics or poor performance. Batching
  # will buffer points in memory if you have many coming in.
  # batch-size = 1000 # will flush if this many points get buffered
  # batch-timeout = "1s" # will flush at least this often even if we haven't hit buffer limit
  batch-size = 1000
  batch-timeout = "1s"
  templates = [
     # filter + template
     #"*.app env.service.resource.measurement",
     # filter + template + extra tag
     #"stats.* .host.measurement* region=us-west,agent=sensu",
     # default template. Ignore the first graphite component "servers"
     "instance.profile.measurement*"
  ]

Grafana

Even easier: Grafana will be available on 3000. See the Docker Compose section below.

FROM grafana/grafana
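Grafana still needs to be pointed at InfluxDB. You can do this through the UI, or via Grafana's HTTP API – a sketch, assuming the admin password set in the Compose file below:

curl -X POST http://admin:secret@localhost:3000/api/datasources \
  -H "Content-Type: application/json" \
  -d '{"name":"influx","type":"influxdb","url":"http://influx:8086","access":"proxy","database":"graphitedb"}'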

 

Docker Compose

To wire all these together, we'll use a docker-compose file. The directory structure for the above looks as follows:

 

dirsturc

 

docker-compose.yml

version: "2"

services:
  grafana:
    build: grafana
    ports: ["3000:3000"]
    container_name: 'grafana'
    environment:
      -  GF_SECURITY_ADMIN_PASSWORD=secret
    links:
      - influx
      - clienta
      - clientb
      - clientc


  influx:
    build: influx
    ports: ["8083:8083","8086:8086"]
    container_name: 'influx'
    volumes:
      - '/var/influxdb:/data'
      - '/var/influxdb/config:/config'

  clienta:
    hostname: 'clienta'
    build: client
    ports: ["8080:8080"]
    container_name: 'clienta'

  clientb:
    hostname: 'clientb'
    build: client
    ports: ["8081:8080"]
    container_name: 'clientb'

  clientc:
    hostname: 'clientc'
    build: client
    ports: ["8084:8080"]
    container_name: 'clientc'
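With that in place, bringing the whole stack up is one command, run from the directory containing docker-compose.yml:

docker-compose up -d --build   # build the images and start all five containers
docker-compose ps              # check grafana, influx and the clients are running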

 

Running Load Tests

 

Now, if you set up Gatling, create the file above and run:

./gatling.sh -s ToDo

We can get graphs like the following:

docker-grafana

 

It's hard to tell without Prometheus or something similar for monitoring, but the Tomcat container dropped about 6 requests around 3 minutes in – more than likely thread pool exhaustion. Next post, we'll add monitoring.
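If it is thread pool exhaustion, the knob to experiment with is the HTTP connector in Tomcat's conf/server.xml – the values below are roughly the Tomcat 8 defaults, so treat them as a starting point:

<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           maxThreads="200"
           acceptCount="100"
           redirectPort="8443" />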

 

 

 

Latest Buildbot (0.9.0b6) on CentOS 7

Shortly after I wrote that last post, I realised there was a newer version of Buildbot, and it has a far nicer UI. I've linked to the new Dockerfiles below.


The main changes in the Dockerfile are

  • forcing a version from pip
  • manually installing waterfall & console plugins

The main changes in the master.cfg are:

  • Removing old WWW config
  • Removing old Auth settings

Both of the above are replaced by a short block of code:

# www
c['buildbotURL'] = "http://localhost:8020/"
c['www'] = dict(port=8020,
                plugins=dict(waterfall_view={}, console_view={}))

which is enabled by a new Web Server module. Note the port change.

New Master Dockerfile, no change to slave: http://pastebin.com/Gz30Qe0f

Buildbot – developer sourced pipelines

Cheap-Ass / DevOps CD

I was working on a project at home in conjunction with a few mates who are scattered about: one in Galway, one in the UK – both with differing IT abilities. The UI guy will never run release scripts or really get git branching, and the other guy is a developer, but a pretty busy one. I thought a CD pipeline similar to what we have at work (and Docker) would mean the UI guy would only need to master a graphical git client, and it's push-to-production for the busy developer. It needed to be cheap, essentially open source and low maintenance, especially for setup.

BuildBot to the rescue

Buildbot is a very lightweight and configurable build framework, with laborious documentation – it's written in Python and requires getting hands-on with Python to really make it do what you need. It's a 1-N master/slave setup. Its build customisation is similar to Gradle in that you code the sequence of steps – all of this is done in a single file, master.cfg, created by running:

>> buildbot create-master linode-master
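For orientation, a master.cfg ends up holding factories, builders and schedulers – roughly the shape below. This is a sketch against the 0.9-era API; the repo URL and names are placeholders, not my actual config:

# (inside master.cfg, where c = BuildmasterConfig = {})
from buildbot.plugins import schedulers, steps, util

factory = util.BuildFactory()
factory.addStep(steps.Git(repourl="git://linode1/myproject.git"))
factory.addStep(steps.ShellCommand(command=["make", "test"], haltOnFailure=True))

c['builders'] = [
    util.BuilderConfig(name="test", slavenames=["docker-slave"], factory=factory)
]
c['schedulers'] = [
    schedulers.ForceScheduler(name="force", builderNames=["test"])
]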

In my case I wanted a pipeline providing test, acceptance and systest stages that could deploy my Dockerised production web application (Node.js/Scala/Redis):

Admittedly, I was a few nights mucking about with it before the penny dropped. It has a plugin architecture, which provides SVN pollers, schedulers, triggers etc. – but other than that it's essentially down to taking your individual build stages (e.g. test or UAT) and transcribing them into a series of steps, usually using something like:

factory.addStep(steps.ShellCommand(
    command=["git", "checkout", ...],  # etc.
    haltOnFailure=True))
factory.addStep(steps.Trigger(schedulerNames=["unit-test"], waitForFinish=True))
factory.addStep(steps.Trigger(schedulerNames=["acceptance"], waitForFinish=True))

 

BuildBotArch
BuildBot Component Architecture

Project Build Architecture

I got two identical Linode servers: one for running git and the test environments, one for prod deployment. My setup looks like this:

BuildConfig-1

 

 

Customising the Build

I was relying heavily on Docker for this deployment; I needed each of the developer slaves to be Dockerised for ease of use and versioning. Dockerfiles for the master and slave are attached.
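The attached Dockerfiles are the real thing; for orientation, the slave image is roughly this shape (the master host, slave name and credentials here are placeholders, not the attached files):

FROM centos:7
RUN yum -y install epel-release && \
    yum -y install python-pip python-devel gcc git && \
    pip install buildbot-slave
# Register against the master (placeholder host/name/password)
RUN buildslave create-slave /slave linode1:9989 docker-slave pass
CMD ["buildslave", "start", "--nodaemon", "/slave"]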

As the Dockerised Acceptance & Systest stages needed to run on bare metal, the BuildConfig had to be aware that it needed to use a different slave for these stages.

With buildbot that equates to something like:

buildBranch = Interpolate("scraper-release-candidate-%(prop:buildnumber)s")
1: f.addStep(steps.ShellCommand(command=["git", "push", "--set-upstream",
       "origin", buildBranch], haltOnFailure=True))
2: f.addStep(steps.Trigger(schedulerNames=['baremetal-slave'],
       waitForFinish=True,
       copy_properties=['buildnumber'], haltOnFailure=True))
  1. On completing the Docker slave's build & test stage, branch for the release candidate.
  2. If the previous stage was a success, launch the baremetal-slave, which has a builder for acceptance, systest and Prod.

Admin and Dashboards

The UI is not so pretty; at first I was struggling with it, but once you add your own stages it becomes more intuitive. The screenshot below is from a build that passed the test stage and triggered the baremetal builder for the next stage (it was a test run, simply echoing to the console). The build starts on the bottom right and works up and to the left.

 

firstbuild

Organic DevOps

In line with DevOps self-sufficiency practices, teams could self-enable, given sufficient access to the IaaS layer.

 

OrganicDevops

Config Files for Docker

 

VisualVM Basics

We recently deployed AppDynamics at work, and not to be leaving all the good stuff to the new tools, I thought I'd have a look at VisualVM. I've wanted to have a good look at it for a while; it has shipped with the JDK for the last couple of versions too.

I wrote some bad code and cracked it open to get a feel for it. I made the performance bottleneck obvious so that it would be easy to find in the output of VisualVM. (Code is below.)

Download VisualVM, or use the one shipped with your JDK: $JAVA_HOME/bin/jvisualvm

Running VisualVM, I see my application on the left-hand side (Illustration A, 1 and 2); double-clicking this opens a tabbed pane to the right (Illustration A, 3).

 

Illustration A
Illustration A

From here I can choose various 'profiling' options. I'm going to use Profiler, as my program is trivial and I don't care about instrumentation overhead. Selecting the Profiler tab, I profile Memory, after editing the settings and selecting the option to Profile Stack Traces (Illustration A, 4 and 6).

The char[] allocation here is clearly marked by VisualVM; this is the large String I am building up in badAppend(..). (Profile the CPU to see the method execution info.)

Interestingly, there is a TreeMap coming in strongly behind the expected char[] array. To investigate the origins of this, I can right-click the list item during profiling and select Take Snapshot and Show Stack Trace – in reality I had to take the snapshot first and then right-click; my target program crashes out otherwise. The graph presented identifies the garbage collector (Illustration A, 5), which would have been working hard. Take a look at the heap/memory allocation rate in the first tab, Monitor, and you'll see it climbing to a peak, then dropping as the garbage collector reclaims the thousands of unnecessary objects we are allocating in createDataSize.

 

 

Illustration B

heapusage

 

Notice the pattern of the peaks here; I'm wondering if this is a sweep of the Eden space, followed by some admin, then a sweep of the survivor spaces?
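One way to check without guessing is jstat, which prints the utilisation of each space as the program runs:

jstat -gcutil <pid> 1000   # S0, S1, Eden, Old utilisation and GC counts, every second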

 

I couldn't find the VM arguments in VisualVM*, so I ran jps -v on the command line, giving me this:

4529 Main -Xms128m -Xmx750m -XX:MaxPermSize=350m -XX:ReservedCodeCacheSize=96m -XX:+UseCodeCacheFlushing -ea -Dsun.io.useCanonCaches=false -Djava.net.preferIPv4Stack=true -Djb.vmOptionsFile=/home/alan/apps/idea-IU-129.1359/bin/idea64.vmoptions -Xbootclasspath/a:/home/alan/apps/idea-IU-129.1359/bin/../lib/boot.jar -Didea.paths.selector=IntelliJIdea12 -Djb.restart.code=88
11096 Jps -Dapplication.home=/home/alan/apps/jdk1.8.0 -Xms8m

 

* Actually there they are, just below the Heap Tab in Illustration B

 

The -Xmx750m is my maximum heap size – when we near this, the GC must intervene. Next time I'll investigate the length of these GC runs; in theory they should be quick, as we are following the 'expected' many-short-lived-objects pattern, meaning most garbage will be collected in the small Eden space, with fewer objects in the survivor spaces. To check this, I installed the excellent VisualGC plugin:

Illustration C

VisualGC Plugin
VisualGC Plugin

I can see the Eden space being GC'd quite often, and the survivor spaces less often – but climbing. I think I took this screenshot early in the run, which would explain the increasing allocation rate.

 

Watching the full animation of the VisualGC plugin is well worth it. This VM is using a parallel GC, and we can see it in action as the program runs. When the Eden space fills – at about 50% of max Eden space on this run – a GC occurs and S0 (Survivor 0) is populated with a '2nd generation' of objects, freeing up some room in Eden (the cheapest memory space to GC); S1 is often purged at this point too. This allows the application to optimistically run on with a 'healthy' Eden space available. The GC people wouldn't be catering for these intentional heap-filling executions!! When we quickly fill Eden once again, S0 is purged, promoting its objects to their 3rd generation and into S1. My large heap space / Old Gen fills far slower relative to the other spaces. This cycle repeats as the program runs, explaining, I believe, the shape of the heap allocation graph in Illustration B.

None of this is science, just me passing time!

I'm using Java 1.8 and VisualVM 1.3.8.

 

 

package tests;

// http://stackoverflow.com/questions/2474486/create-a-java-variable-string-of-a-specific-size-mbs
public class SimpleStringHogger {

    public static void main(String... args) {
        new SimpleStringHogger().createDataSize(Integer.valueOf(args[0]));
        System.out.println("Done");
    }

    // Grow a string to roughly msgSize KB, one character at a time
    private StringWrapper createDataSize(int msgSize) {
        StringWrapper data = new StringWrapper("a");
        while (data.length() < (msgSize * 1024) - 6) {
            data = data.badAppend("s");
        }
        return data;
    }

    private class StringWrapper {
        private String s;

        private StringWrapper(String s) {
            this.s = s;
        }

        // Deliberately wasteful: allocates a brand-new String (and char[])
        // on every call, giving the GC plenty to do
        public StringWrapper badAppend(String s) {
            return new StringWrapper(this.s + s);
        }

        public int length() {
            return s.length();
        }
    }
}

dpkg: error processing archive /var/cache/apt/archives/openjdk-7-jre-headless_7u65-2.5.1-4ubuntu1~0.14.04.1_amd64.deb (--unpack):

I’ve been getting this error during package install lately:

Preparing to unpack .../openjdk-7-jre-headless_7u65-2.5.1-4ubuntu1~0.14.04.1_amd64.deb ...
Unpacking openjdk-7-jre-headless:amd64 (7u65-2.5.1-4ubuntu1~0.14.04.1) over (7u55-2.4.7-1ubuntu1) ...
dpkg: error processing archive /var/cache/apt/archives/openjdk-7-jre-headless_7u65-2.5.1-4ubuntu1~0.14.04.1_amd64.deb (--unpack):
 trying to overwrite shared '/etc/java-7-openjdk/content-types.properties', which is different from other instances of package openjdk-7-jre-headless:amd64
dpkg-deb: error: subprocess paste was killed by signal (Broken pipe)
Errors were encountered while processing:
/var/cache/apt/archives/openjdk-7-jre-headless_7u65-2.5.1-4ubuntu1~0.14.04.1_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)

I'm running:

Distributor ID:    Ubuntu
Description:    Ubuntu 14.04.1 LTS
Release:    14.04
Codename:    trusty

The main culprit is the 'trying to overwrite shared …' line. I checked the archive against the files it was trying to overwrite, and since they're the same, I just omitted them from the install. I thought about holding the package (see below), but that might not be such a good idea. Anyhow, below is how to get around this, once you check that the files you are omitting are the same as the currently installed ones.

Workaround

alan@desktop:~$ cd /tmp
alan@desktop:/tmp$ mkdir pkg
alan@desktop:/tmp$ cd pkg
alan@desktop:/tmp/pkg$ wget https://launchpad.net/~ubuntu-security-proposed/+archive/ubuntu/ppa/+build/6233940/+files/openjdk-7-jre-headless_7u65-2.5.1-4ubuntu1%7E0.14.04.1_amd64.deb
alan@desktop:/tmp/pkg$ dpkg-deb -x openjdk-7-jre-headless_7u65-2.5.1-4ubuntu1~0.14.04.1_amd64.deb openjdkdeb
alan@desktop:/tmp/pkg$ dpkg-deb --control openjdk-7-jre-headless_7u65-2.5.1-4ubuntu1~0.14.04.1_amd64.deb openjdkdeb/DEBIAN
alan@desktop:/tmp/pkg$ vi openjdkdeb/DEBIAN/conffiles

Remove all the lines with files that clash with those in /etc/java-7-openjdk/.

Once you have checked that the files in openjdkdeb/etc/ are identical to your current bunch in /etc/java-7-openjdk/, carry out the following. If not, first copy all the properties files listed in DEBIAN/conffiles to another tmp directory; to copy them:


alan@desktop:/tmp/pkg$ mkdir newprops && for file in $(cat openjdkdeb/DEBIAN/conffiles); do cp openjdkdeb/etc/java-7-openjdk/$(echo $file | cut -d/ -f4) newprops/; done

Remove the files from the archive

alan@desktop:/tmp/pkg$ find openjdkdeb/etc/ -name "*.properties" -exec rm '{}' \;

Continue: rebuild the package

alan@desktop:/tmp/pkg$ rm openjdk-7-jre-headless_7u65-2.5.1-4ubuntu1~0.14.04.1_amd64.deb && dpkg -b openjdkdeb/ openjdk-7-jre-headless_7u65-2.5.1-4ubuntu1~0.14.04.1_amd64.deb
alan@desktop:/tmp/pkg$ sudo dpkg -i openjdk-7-jre-headless_7u65-2.5.1-4ubuntu1~0.14.04.1_amd64.deb

If you made a copy of the .properties files from the extracted .deb archive, then copy them back to /etc/java-7-openjdk/.

As always, your mileage may vary – I need to get some things installed, and I'll deal with any issues I've caused in openjdk as they arise…

Blacklist the file

alan@desktop:~$ sudo apt-mark hold openjdk-7-jre-headless:amd64
openjdk-7-jre-headless set on hold.