HAProxy and ejabberd

A while ago I posted an article about load balancing ejabberd using Cisco SLB. It was targeted at people who are able to use Cisco-based solutions - which are kind of expensive.

Today I will write a little bit about HAProxy, in case you don't have Cisco hardware handy.
Many administrators set up XMPP clusters and route traffic to them using SRV records, as the RFC recommends. Unfortunately this is not the best option; SRV has several weaknesses (mostly related to its DNS nature):
  • DNS has no information about the reliability or load of the nodes
  • SRV needs to be implemented correctly on the client side (which is often a problem due to DNS caching etc.)
  • With multiple entries, a failure of one node can delay the client connection significantly
That’s why I think DNS balancing (SRV or round-robin) should never be considered a “load balancer”. There is a better solution, and it is called HAProxy.
It is only one of the available solutions, but in my experience it is the best one can choose. HAProxy “is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications”. Because it offers TCP proxying, it is a good fit for load-balancing the XMPP protocol.
OK, so how do we do it? Here is an overview of a simple setup:
[Diagram: XMPP clients → HAProxy → 3-node ejabberd cluster]
As you can see we are using ejabberd - the best industry-standard XMPP server, developed by ProcessOne. The setup is a cluster with 3 nodes. In this example we will set up one SRV record pointing to HAProxy, so all connections are handled by the load balancer. As a side note, a single HAProxy introduces a single point of failure. To mitigate it you can run a second HAProxy instance and add it as a second SRV entry, or add SRV records pointing directly at individual nodes as a fallback - but remember to give those direct records a higher (less preferred) priority value, so HAProxy is always selected first. A sketch of such records follows.
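To illustrate, here is a minimal sketch of what the zone entries could look like. The domain example.com, the load balancer name xmpp-lb.example.com and the node names are my own placeholders; adjust names, TTLs and priorities to your environment (remember: the lowest priority value is tried first):

; client connections (c2s) - HAProxy is preferred (lowest priority value)
_xmpp-client._tcp.example.com. 86400 IN SRV 5  0 5222 xmpp-lb.example.com.
; optional last-resort fallback straight to the nodes (higher value = less preferred)
_xmpp-client._tcp.example.com. 86400 IN SRV 20 0 5222 node1.example.com.
_xmpp-client._tcp.example.com. 86400 IN SRV 20 0 5222 node2.example.com.
_xmpp-client._tcp.example.com. 86400 IN SRV 20 0 5222 node3.example.com.
; server-to-server connections (s2s)
_xmpp-server._tcp.example.com. 86400 IN SRV 5  0 5269 xmpp-lb.example.com.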

Next: haproxy.conf. Here is a sample config, tested on some not-so-small environments:

global
        log /var/run/syslogd.sock local0
        maxconn 60000
        nosplice
        chroot /usr/share/haproxy
        uid 65534
        gid 65534
        pidfile /var/run/haproxy.pid
        stats socket /usr/share/haproxy/haproxy-stats.sock
        daemon
defaults
        log     global
        mode    tcp
        retries 2
        option redispatch
        option tcplog
        option tcpka
        option clitcpka
        option srvtcpka
        timeout connect 5s      #timeout during connect
        timeout client  24h     #timeout client->haproxy(frontend)
        timeout server  60m     #timeout haproxy->server(backend)

frontend access_clients 213.134.1.1:5222
        default_backend cluster_clients

frontend access_clients_ssl 213.134.1.1:5223
        default_backend cluster_clients_ssl

frontend access_servers 213.134.1.1:5269
        default_backend cluster_servers

backend cluster_clients
        log global
        balance leastconn
        option independant-streams
        server  server1 10.0.0.1:5222 check fall 3 id 1005 inter 5000 rise 3 slowstart 120000 weight 50
        server  server2 10.0.0.2:5222 check fall 3 id 1006 inter 5000 rise 3 slowstart 120000 weight 50
        server  server3 10.0.0.3:5222 check fall 3 id 1007 inter 5000 rise 3 slowstart 120000 weight 50

backend cluster_clients_ssl
        log global
        balance leastconn
        option independant-streams
        server  server1 10.0.0.1:5223 check fall 3 id 1008 inter 5000 rise 3 slowstart 240000 weight 50
        server  server2 10.0.0.2:5223 check fall 3 id 1009 inter 5000 rise 3 slowstart 240000 weight 50
        server  server3 10.0.0.3:5223 check fall 3 id 1010 inter 5000 rise 3 slowstart 240000 weight 50

backend cluster_servers
        log global
        balance leastconn
        option independant-streams
        server  server1 10.0.0.1:5269 check fall 3 id 1011 inter 5000 rise 3 slowstart 60000 weight 50
        server  server2 10.0.0.2:5269 check fall 3 id 1012 inter 5000 rise 3 slowstart 60000 weight 50
        server  server3 10.0.0.3:5269 check fall 3 id 1013 inter 5000 rise 3 slowstart 60000 weight 50

I will not explain every single option, as that is covered in the excellent documentation, but I will briefly describe what is going on here. As you can see, the config reflects the diagram introduced before: we have 3 “frontend” services (5222, 5223, 5269 - client STARTTLS, legacy client SSL, and server-to-server), each pointing at a backend made of the three cluster nodes. HAProxy in this example will spread the load equally across all backend servers for all services (leastconn + equal weights), and it will ramp traffic back up gradually after a failure (slowstart) to prevent a connection storm from hitting a server when it comes back. There are a couple of options you can fine-tune for your needs - like timeouts and fail counts. This is a base for your experiments with the LB topic. Go and play with it!
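Once HAProxy is running, you can do a quick sanity check of the health checks straight from the stats socket defined in the global section. This is just a sketch - it assumes socat is installed and that the socket path matches the config above:

# list proxies and servers with their health status (the 18th CSV field is "status": UP/DOWN)
echo "show stat" | socat stdio unix-connect:/usr/share/haproxy/haproxy-stats.sock | cut -d, -f1,2,18

# overall process information (version, uptime, current connections, ...)
echo "show info" | socat stdio unix-connect:/usr/share/haproxy/haproxy-stats.sock

If a backend server shows DOWN there, check that the ejabberd node is actually listening on the port HAProxy is probing.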
One of the main advantages of HAProxy is that it is extremely simple and fast to set up, highly reliable, and has a low hardware footprint. Thanks to all those pros we can imagine a variety of uses, like geographically distributed proxy servers for XMPP (a super interesting topic - I will write about it some day), or cross-proxying for better availability.
This is not my last post about LB and XMPP; the next stop is Amazon Elastic Load Balancing (ELB), which is a great solution for admins who host their servers on AWS. Stay tuned!
