
Friday, 20 May 2016

Inverting Ansible Execution Flow: The Pull Mode


In most cases where I've seen Ansible implemented to automate ops tasks, it runs in Push mode. In this approach, playbooks run from a given host where Ansible is set up. The Ansible host interprets the tasks and applies them, "pushing" them to all target hosts through SSH.
What often goes unnoticed when starting to play with it is that the same results can be achieved using a totally different flow: the Pull mode.


There isn't much about it in the official documentation, but the idea is pretty simple. Instead of pushing playbooks to the target hosts, in Pull mode you make each target host "pull" them from a given repository. By doing this, there is no need for a single machine playing the Ansible host role; that responsibility is spread across the machines in the datacenter.
There is nothing special required to get it running. Both Pull and Push are available after following the installation steps available here.
Let's say I want to deploy the application I built here using the Pull mode on all my cluster machines. After Ansible is properly installed on the target hosts, the following command should be run:
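A minimal sketch of the command (the repository URL is a placeholder, not the original one):

ansible-pull -U https://github.com/<user>/<playbooks-repo>.git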


This command connects to GitHub and downloads the entire repository locally. After doing this, Ansible looks for a file named local.yml. This file should contain all the tasks, or references to the files that have them, in order to run a playbook.
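For illustration, a minimal local.yml sketch (the JDK installation task is an assumption, not from the original post):

---
# local.yml - ansible-pull applies this playbook to the machine itself
- hosts: localhost
  tasks:
    - name: Ensure the JDK is installed
      apt: name=openjdk-7-jdk state=present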
An interesting approach is to make the target hosts pull the remote repository from time to time. By doing this, changes are applied to all target machines asynchronously and in the background as soon as they are available in the repository. That can be quite interesting when provisioning hundreds or thousands of machines; this mode scales much better than the Push mode. It can be achieved by simply setting up a cron job that calls a script encapsulating the pull command described before, like this:
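A sketch of such a cron entry (paths, the log file and the 15-minute interval are assumptions):

# /etc/cron.d/ansible-pull - pull and apply playbooks every 15 minutes
*/15 * * * * root /usr/local/bin/ansible-pull -U https://github.com/<user>/<playbooks-repo>.git >> /var/log/ansible-pull.log 2>&1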

The Pull mode can also be useful for changing application configuration more dynamically. Using tags, I can apply just the log4j config tasks as soon as they hit the remote repository:
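Something along these lines, assuming the tasks that manage the log4j config are tagged log4j (the tag name is an assumption) and that your Ansible version passes --tags through to ansible-playbook:

ansible-pull -U https://github.com/<user>/<playbooks-repo>.git --tags log4j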

As we can see, there is a range of scenarios where the Pull mode can be useful. That said, it could be a bit more flexible by letting the user specify which playbook to run (it only looks for a file named local.yml; anything different produces an error). Users also need to be careful when pushing code to the repository while using this feature: badly written code can break an entire datacenter without you noticing.

Cheers,


Sunday, 1 May 2016

Organizing Automation - Ansible Roles

When talking about automation, Ansible is definitely one of the simplest and easiest frameworks to use. It has a pretty low learning curve thanks to its comprehensive DSL, which is easy to understand. You also don't need to install anything on the server being provisioned (agentless architecture), which makes the setup simple. Everything looks great while the provisioning process has only two or three script files, but as soon as you add more functionality, there are some issues to deal with:

  • Reuse: certain provisioning tasks are common to all servers. How can they be organised in such a way that they can be reused easily?
  • Organisation: just like any programming code, without maintenance and good engineering practices the provisioning process becomes difficult to maintain and understand. Naming, module organisation and conventions are all aspects that need to be taken into account.

Ansible Roles

Ansible Roles are conventions that you, as a programmer, need to follow in order to achieve a good level of reuse and modularisation. These conventions were added in version 1.2; before that, the way to achieve better reuse was to separate scripts into different files and include them in the scripts where you wanted to reuse them.
The documentation is very sparse when describing how Roles work, but the idea is pretty simple. Using Roles, you can automatically load tasks, variables, files and handlers when provisioning a server or a group of servers.
Let's look at an example. Here I'm provisioning a Java application service. The server running this application needs the following roles:
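A minimal sketch of such a playbook (the host group name is an assumption):

---
# site.yml - assigns the two roles to the application servers
- hosts: app-servers
  roles:
    - common
    - service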


The common role is one that any server in my infrastructure needs to have (reuse), which in this case means having the JDK installed. The other role is called service, which is basically everything needed to run the service itself.

Ansible will automatically look for directories called common and service inside the main roles directory and execute all steps defined for them.
For the role service, we have (see the sketch after this list):

  • vars


  • tasks



  • handlers


  • files: all files used in tasks are loaded from here.
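For illustration, a minimal sketch of how the service role could be laid out and what its main files might contain (task and handler contents are assumptions, not the original gists):

roles/
  common/
    tasks/main.yml
  service/
    vars/main.yml
    tasks/main.yml
    handlers/main.yml
    files/

# roles/service/tasks/main.yml - illustrative only
- name: Copy the application jar
  copy: src=app.jar dest={{ app_home }}/app.jar
  notify: restart app

# roles/service/handlers/main.yml - illustrative only
- name: restart app
  service: name=app state=restarted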
There are still more directories that can be defined, like templates and defaults. They aren't present in this example but are still useful. This is the full working example that provisions a server able to run this Java application.

Using roles is great because they are expressive. Working with them properly, you can say what a given server is, which is much more declarative than just using include directives. The directory conventions are good for defining patterns for the whole team to follow from day one, and reuse is achieved by defining very granular roles that can be used in different playbooks.

Cheers,

Friday, 15 April 2016

Releasing Applications as Native OS Packages

Agility when building, testing, packaging and deploying software is certainly a key quality to pursue. That said, there is no specific recipe for getting there; among the several things that need to be done, avoiding manual steps and not reinventing the wheel by relying on solutions well established in the community are two of them. In this direction, there is a very interesting project maintained by Netflix called Nebula.

The Nebula project is a series of individual Gradle plugins, each focused on providing one very specific piece of functionality within the usual development pipeline tasks. Today I'm going to talk about one of them, nebula-os-package.

The main idea

The main idea behind it is to package a JVM-based application and its metadata as a native OS package. The plugin is able to generate Deb and RPM packages, which are the most popular package formats in the Linux world.

Using


First of all, apply the plugin in your build.gradle file.
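Something like this (double-check the plugin id against the Nebula release you use):

apply plugin: 'nebula.ospackage'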


Then, specify the plugin dependency:
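A sketch of the buildscript block (the version is an assumption; pick the latest release):

buildscript {
    repositories {
        jcenter()
    }
    dependencies {
        classpath 'com.netflix.nebula:gradle-ospackage-plugin:3.+'
    }
}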

Now you need to describe how the application is going to be laid out on the host after the package is installed.
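A minimal sketch of such an ospackage block (the package name, version and paths are assumptions):

ospackage {
    packageName = 'my-service'       // package name
    version = '1.0.0'                // package version

    into '/opt/my-service'           // install root on the target host

    from(jar.outputs.files) {        // the application jar
        into 'lib'
    }
    from(configurations.runtime) {   // its runtime dependencies
        into 'lib'
    }
    from('src/main/resources') {     // configuration files
        into 'conf'
    }
    from('build/scripts') {          // start-up scripts generated by Gradle
        into 'bin'
        fileMode 0755
    }
}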

A couple of important things are happening here:
  • The package name and version are specified.
  • The into directive says under which directory the package contents are placed after installation on the target host.
  • All jars produced by Gradle during the build (the application jar and its dependencies) are placed under /opt/packageName/lib on the target host.
  • The same goes for configuration files under the resources folder.
  • The scripts generated by Gradle when building a Java application are used to start it up on the target host.
With everything properly set, just execute the build command followed by the package task specified in the build file. The Debian package will be placed at projectName/build/distributions.
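For example, assuming the Gradle wrapper and the buildDeb task contributed by the plugin:

./gradlew build buildDeb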

So What?


Someone could argue:
  • Why should I use this?
  • Isn't it better to build a fat jar with all the dependencies inside?
  • The Gradle application plugin takes care of the whole application start-up for me by generating useful scripts.
Yes, these are all valid points. Actually, this is the way we've been doing it so far when releasing applications outside of the J2EE world. But done like this, tasks such as deploying, starting/stopping, updating and removing applications are all on you. Scripts need to be created to manage all of this: one more thing for Ops and Dev teams to care about.
When deploying applications as native OS packages, you can leverage a whole set of tools that are already there, and none of the scripts mentioned before are needed. That directly improves agility when releasing and maintaining software.
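For instance, with a Debian package the standard tooling already covers the whole lifecycle (the package and file names here are the assumed ones from the sketch above):

sudo dpkg -i my-service_1.0.0_all.deb   # install or upgrade
dpkg -L my-service                      # list the files it installed
sudo apt-get remove my-service          # uninstall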

Here I have a working example in case you want to try it out.

Cheers,



Sunday, 28 June 2015

AWS-Lambdas - Automating function deploys

In this post, I talked about the ideas behind the AWS Lambda computation service and how it works. The example presented shows how a function can be deployed and used. Even with a working example, there is an issue with the way I'm using it: all steps of the deploy process are manual, which just goes against agility. Manual deploys like that are error prone; the more complex the application gets, the more expensive it is to maintain. The side effects of maintaining manual deploy steps are endless, so there should be an alternative that automates them and makes AWS Lambda as cost effective as it promises to be.
Kappa seems to fill this gap. It is a command-line tool that greatly simplifies the process of deploying lambdas to the cloud. All the steps described in the mentioned post can be automated. Now we're talking!

Setup

Before starting, be sure you have Python (2.7.x) and pip available on the command line.

Installing kappa: 

I strongly advise building it from source, since there are important bug fixes that landed only recently:

git clone https://github.com/garnaat/kappa.git
cd kappa
pip install -r requirements.txt
python setup.py install

Installing awscli:


sudo pip install awscli


Configuration:

The first thing to do is create the kappa configuration file. This is where I tell it how to deploy my lambda function (config.yml):
---
profile: my-default-profile
region: us-west-2
iam:
  policy:
    name: AWSLambdaExecuteRole
  role:
    name: lambda_s3_exec_role
lambda:
  name: myLambdaFuncId
  zipfile_name: integrator.zip
  description: Something that helps describe your lambda function
  path: src/
  handler: Integrator.handler
  runtime: nodejs
  memory_size: 128
  timeout: 3
  mode: event
  test_data: input.json
  event_sources:
    -
      arn: arn:aws:s3:::[set your bucket name]
      events:
        - s3:ObjectCreated:*


Let's see what is going on:

Line 2: there should be a profile that kappa uses to authenticate itself with Amazon and create the function on my behalf. We'll see it later in the awscli configuration;
Line 4: the policies assigned to this lambda. In case they aren't there yet, kappa will create them for me.
Lines 9-18: the function's runtime configs.
Line 19: the file containing an example request used to test the function. It is useful when we want to be sure everything is working fine after the deploy is over.
Line 20: here I'm setting where events will come from. In this case, any change to the given bucket will trigger a call to my function.

Now it's time to configure aws-cli. The only configuration needed is the security profile, which kappa will use as stated before:

Create the following file in case it isn't already there: ~/.aws/credentials, and put in the following content:
[my-default-profile]
aws_access_key_id=[YOUR KEY ID ]
aws_secret_access_key=[YOUR ACCESS KEY ]


With that set, it's time to deploy the function using kappa tasks:
kappa config.yml create
kappa config.yml add_event_source
kappa config.yml invoke
kappa config.yml status


That should be enough to see the function deployed in the AWS console. The previous commands, in order, did the following:

  • create the function on Amazon
  • make it listen to changes on a given bucket
  • test the deployed function using fake data (simulating an event)
  • check the status of the deployed function on Amazon.


Since Kappa lets me automate all the deploy tasks, I'm able to create a smarter deploy process. I worked on an example of how it could be done here. I may have forgotten to mention some detail about getting it to work, so in that case leave me a message and I'll be glad to help.



Sunday, 10 May 2015

Devops Culture - Why Silos are an issue?

DevOps is becoming a hot subject nowadays. There are some nice materials here and here explaining the principles behind it. At this point, we know that DevOps isn't only about automation tools and frameworks. It's about a different way of thinking about IT. One of these principles is Culture.
This is actually an essential dimension when talking about changing the way companies build software. They can pick the best automation tools available or hire the best market consultants to build their architecture, but if cultural aspects aren't improved, they may not ship software as they expect. Today I'm going to talk about one of these cultural aspects: Silos.


The Traditional Division


It's very common to see companies splitting teams up by their skills. A traditional division following this idea would look like this:


In this model, software creation is supported by a series of specialised teams. These teams will "somehow" talk to each other in order to build a solution. Using this structure, companies can have specialists in each area leading people and ensuring issues and solutions are addressed properly. The structure can vary a bit from case to case. Some companies nominate people to supervise (manage) the whole pipeline; they can be responsible for making everybody talk, ensuring environments are delivered on schedule, that issues are addressed by the right people or groups, and so on.
Software has been delivered on top of this model for years. But we're living in a different era now, and there are scenarios this model does not handle well.

The Software Complexity


The problems we need to solve with software today are different from the ones we had 10 years ago. Let's see some examples that illustrate this:

  • The data volume systems need to handle is way bigger than before. Architectural decisions need to be made in order to build systems that meet business expectations. Such decisions can even change the way a business sells its products to its consumers.
  • The decision to move to the cloud or not is today a strategic business decision, not one made by some geek in the ops area.
  • Solutions need to support traffic growth on a scale of seconds. Systems need to be elastic, but architectural decisions can make that impossible to achieve. Ops and architecture teams need to be on the same page here before taking any architectural decision.
  • How do you test a solution with the characteristics highlighted above? The QA team will need special skills.

Teams need to be on the same page from the beginning here. Even small decisions can affect everybody and make the release process slower, release after release, as the application grows. What happens in the traditional model is that people tend to ignore the software complexity as a whole and focus on the area where they are experts. People can work around this issue and achieve the necessary coordination even in the traditional model, but it will demand more effort from everybody involved.
Complexity is handled well when people understand the system as a whole. Adding barriers between teams does not help achieve this goal.

The Barriers Between Teams


When workstreams are used as silos, the development cycle is oversimplified. Software is treated as a simple package that can be sent here and there. The issue is that the whole software complexity described before is ignored. As complexity grows, releases and maintenance start to take more time and so become more expensive. When dealing with complex software, communication, alignment and proper architectural understanding are essential to support the whole process. The barriers created by oversimplification won't help in this scenario. Companies tend to formalise the communication between teams in order to reduce misunderstanding, but they actually just add more noise. The idea behind "breaking silos" is to remove the barriers so people can collaborate and then understand the system as a whole.

The more barriers are created between business and production, the less agile the development cycle will be. It doesn't matter how many experts are in each silo or how many automation tools they implement; the barriers between them will always be something to improve. By removing silos, people are able to see the system as a whole rather than only their "working area". The real benefits come when the system is optimised as a whole instead of in specific parts.