miércoles, abril 01, 2015

Apache Mesos + Marathon and Java EE


Apache Mesos is an open-source cluster manager that provides efficient resource isolation and sharing across distributed applications or frameworks.

Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively. It uses dynamic allocation of applications inside machines.

In summary Apache Mesos is composed by master and slaves. Masters are in charge of distributing work across several slaves and knowing the state of each slave. You may have more than one master for fault-tolerant.

And then we have the slaves which are the responsible of executing the applications. Slaves isolate executors and tasks (application) via containers (cgroups). 

So each slave offers its resources and Apache Mesos is in charge of schedule which slave will execute it. Note that each slave may execute more than one task if it has enough resources to execute it.



For example let's say that an Slave has 4 CPUs (to simplify I am not going to take into account other parameters), then it could execute 1 task of 4 CPU, 2 tasks of 2CPUs, ...

But Apache Mesos only manages resources, but for building a PaaS we need something more like service discovery or scaling features. And this is what Marathon does.

Marathon is a framework that runs atop of Apache Mesos and offers:

  • Running Linux binary
  • Cluster-wide process supervisor
  • Service Discovery and load balancing (HAProxy)
  • Automated software and hardware failure handling
  • Deployment and scaling
  • REST friendly

But one of the main advantages of using Marathon is that it simplifies and automates all those common tasks.

So main task of Marathon is deploy an application to different salves, so if one salve fails there are other slaves to serve incoming communications. But moreover Marathon will take care of reallocating the application to another slave so the amount of slaves per application is maintained constant. 



Installing Apache Mesos and Marathon in a developer machine is as easy as, having VirtualBox, Vagrant and git installed.

Cloning next repo:


And simply run vagrant-up command from the directory:

cd playa-mesos
vagrant up

First time it will take some time because it need to download several components.

After that you can check that it is correctly installed by connecting to Mesos and Marathon Web Console. http://10.141.141.10:5050 and http://10.141.141.10:8080

Next step is installing HAProxy. Although it is not a requirement HAProxy is "required" if you want to do Service Discovery and Load Balancing.

Run vagrant ssh.

Install HAProxy

sudo apt-get install haproxy

Download haproxy-marathon-bridge script:

chmod 755 haproxy-marathon-bridge

./haproxy_marathon_bridge localhost:8080 > haproxy.cfg
haproxy -f haproxy.cfg -p haproxy.pid -sf $(cat haproxy.pid)

And this configures HAproxy. To avoid having to run this command manually for every time topology change you can run:

./haproxy_marathon_bridge install_haproxy_system localhost:8080 

which installs the script itself, HAProxy and a cronjob that once a minute pings one of the Marathon servers specified and refreshes HAProxy if anything has changed.

And that's all, now we have our Apache Mesos with Mesosphere and HAProxy installed. Now it is time to deploy the Java EE application server. In this case we are going to use Apache TomEE.

The only thing we need to do is sending a JSON document as POST to http://10.141.141.10:8080/v2/apps 

{
  "id": "projectdemo",
  "cmd": "cd apache-tomee-plus* && sed \"s/8080/$PORT/g\" < ./conf/server.xml > ./conf/server-mesos.xml && ./bin/catalina.sh run -config ./conf/server-mesos.xml",
  "mem": 256,
  "cpus": 0.5,
  "instances": 1,
  "ports":[10000],
  "constraints": [
    ["hostname", "UNIQUE"]
  ],
  "uris": [
  ]
}

This JSON document will make Marathon to deploy the application in one node. Let's explain each attributes:

id: is the id of the application, not much secret here.

cmd: the command that will execute when node is chosen an ready. In this case note that we are creating a server-mesos.xml file which is a modified version of server.xml file but replacing 8080 port to $PORT var. For now is enough. Finally it starts TomEE with server-mesos.xml configuration file.

mem: Memory that will require in the node.

cpus: Cpu resources that will require in the node.
instances: number of nodes that we want to replicate this application. In this case only one because we are running locally.

ports: which ports will group all application instances. Basically this port is used by HAProxy to route to the correct instance. We are going to explain deeply in next paragraph.

constraints: constraints control where apps run to allow optimizing for fault tolerance or locality. In this case we are setting that each application should be in a different slave. With this approach you can avoid port collision.

uris: Sets the URI to execute before executing the cmd part. In case of a known compressed algorithm, it is automatically uncompressed. For this reason you can do a cd command in cmd directly without having to uncompress it manually.

So let me explain what's happening here or what Mesosphere does:

First of all reads the JSON document and inspect which slave has a node that can process this service. In this case it only needs to find one. (instances = 1).

When it is found, then the uri element is downloaded, uncompressed and then executes the commands specified in cmd attribute in current directory.
And that's all.

But wait what is ports and $PORT thing?

$PORT is a random port that Mesosphere will assign to a node to communicate with. This port is used to ensure no two applications can be run using Marathon with overlapping port assignments.

But also it is used for Service Discovery and Load Balancing by running a TCP proxy on each host in the cluster, and transparently forward a static port on localhost to the hosts that are running the app. That way, clients simply connect to that port, and the implementation details of discovery are completely abstracted away.

So the first thing we need to do is modifying the configuration of the TomEE to start at random port assigned by Marathon, for this reason we have created a new server.xml file and setting listening port to $PORT.

So if port is random, how a client may connect if it doesn't know in which port is started? And this is the ports attribute purpose. In this attribute we are setting that when I connect to port 10000 I want to connect to the application defined and deployed to any slave and independently of the number of instances.

Yes it may be a bit complicated but let me explain with a simple example:

Let's say I have the same example as before but with two instances (instances = 2). Both TomEE instances will be started in two different slaves (so in different nodes) and in different ports. Let's say 31456 and 31457. So how we can connect to them?

Easy. You can use the IP of Marathon and the random port (http://10.141.141.10:31456/) which you will access to that specific server, or you can use the global defined port (http://10.141.141.10:10000/) which in this case HAProxy will route to one of instances (depending on load balancing rules).

Note this has a real big implication on how we can communicate between applications inside Marathon, because if we need internal communication between applications that are deployed in Marathon, we only need to know that global port, because the host can be set to localhost as HAProxy will resolve it. So from within Marathon application we can communicate to TomEE by simply using http://localhost:10000/ as HAProxy will then route the request to a host and port where an instance of the service is actually running. In next picture you can see the dashboard of Marathon and how the application is deployed. Note that you can see the IP and port of deployed application. You can access by clicking on it or by using Marathon IP (the same as provided in that link) but using the port 10000. Remember that HAProxy is updated every minute so if it works by using random port and not using port 10000 probably you need to wait some time until HAProxy database is refreshed.


And that's all, as you may see Apache Mesos and Marathon is not as hard as you may expect at first.

Also note that this is a "Hello World" post about Mesos and Java EE, but Mesos and Mesosphere is much more than this like healthy checks of the services, running Docker containers, Artifact Storage or defining dependencies, but I have found that running this simple example, helps me so much clarifying the concepts of Mesosphere and it was a good point of start for more complex scenarios.

We keep learning,
Alex.
Dilegua, o notte!, Tramontate, stelle!, Tramontate, stelle!, All'alba vincerò!, Vincerò! Vincerò! (Nessun dorma - Giacomo Puccini)