HAProxy and ejabberd

A while ago I posted an article about load balancing ejabberd using Cisco SLB. It was targeted at people who are able to use Cisco-based solutions - which are kind of expensive.

Today I will write a little bit about HAProxy, in case you don't have Cisco hardware handy.
Many administrators set up XMPP clusters and route traffic to them using SRV records, as the RFC recommends. Unfortunately this is not the best option; SRV has several weaknesses (mostly related to its DNS nature):
  • DNS has no information about the reliability or load of the nodes
  • SRV needs to be implemented correctly on the client side (which is often a problem due to DNS caching etc.)
  • With multiple entries, a failure of one node can delay the client connection significantly
That’s why I think DNS balancing (SRV or round-robin) should never be considered a “load balancer”. There is a better solution, and it is called HAProxy.
It is only one of the available solutions, but in my experience it is the best one can choose. HAProxy “is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications”. Because it offers TCP proxying, it is a good fit for load-balancing the XMPP protocol.
OK, so how do we do it? Here is an overview of a simple setup:
[Diagram: XMPP clients → HAProxy → 3-node ejabberd cluster]
As you can see we are using ejabberd - the best industry-standard XMPP server, developed by ProcessOne. The setup is a cluster with 3 nodes. In this example we will set up one SRV record pointing to HAProxy, so all connections are handled by the load balancer. As a side note, a single HAProxy introduces a single point of failure. To mitigate it you can run a second HAProxy instance and add it as a second SRV entry, or add SRV records pointing directly at individual nodes as a fallback - but remember to give those direct records a higher (less preferred) priority value, so HAProxy is always selected first. A sketch of such records follows.
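To illustrate, here is a minimal sketch of what the zone entries could look like. The domain example.com, the load balancer name xmpp-lb.example.com and the node names are my own placeholders; adjust names, TTLs and priorities to your environment (remember: the lowest priority value is tried first):

; client connections (c2s) - HAProxy is preferred (lowest priority value)
_xmpp-client._tcp.example.com. 86400 IN SRV 5  0 5222 xmpp-lb.example.com.
; optional last-resort fallback straight to the nodes (higher value = less preferred)
_xmpp-client._tcp.example.com. 86400 IN SRV 20 0 5222 node1.example.com.
_xmpp-client._tcp.example.com. 86400 IN SRV 20 0 5222 node2.example.com.
_xmpp-client._tcp.example.com. 86400 IN SRV 20 0 5222 node3.example.com.
; server-to-server connections (s2s)
_xmpp-server._tcp.example.com. 86400 IN SRV 5  0 5269 xmpp-lb.example.com.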

Next: haproxy.conf. Here is a sample config, tested on some not-so-small environments:

global
        log /var/run/syslogd.sock local0
        maxconn 60000
        nosplice
        chroot /usr/share/haproxy
        uid 65534
        gid 65534
        pidfile /var/run/haproxy.pid
        stats socket /usr/share/haproxy/haproxy-stats.sock
        daemon
defaults
        log     global
        mode    tcp
        retries 2
        option redispatch
        option tcplog
        option tcpka
        option clitcpka
        option srvtcpka
        timeout connect 5s      #timeout during connect
        timeout client  24h     #timeout client->haproxy(frontend)
        timeout server  60m     #timeout haproxy->server(backend)

frontend access_clients 213.134.1.1:5222
        default_backend cluster_clients

frontend access_clients_ssl 213.134.1.1:5223
        default_backend cluster_clients_ssl

frontend access_servers 213.134.1.1:5269
        default_backend cluster_servers

backend cluster_clients
        log global
        balance leastconn
        option independant-streams
        server  server1 10.0.0.1:5222 check fall 3 id 1005 inter 5000 rise 3 slowstart 120000 weight 50
        server  server2 10.0.0.2:5222 check fall 3 id 1006 inter 5000 rise 3 slowstart 120000 weight 50
        server  server3 10.0.0.3:5222 check fall 3 id 1007 inter 5000 rise 3 slowstart 120000 weight 50

backend cluster_clients_ssl
        log global
        balance leastconn
        option independant-streams
        server  server1 10.0.0.1:5223 check fall 3 id 1008 inter 5000 rise 3 slowstart 240000 weight 50
        server  server2 10.0.0.2:5223 check fall 3 id 1009 inter 5000 rise 3 slowstart 240000 weight 50
        server  server3 10.0.0.3:5223 check fall 3 id 1010 inter 5000 rise 3 slowstart 240000 weight 50

backend cluster_servers
        log global
        balance leastconn
        option independant-streams
        server  server1 10.0.0.1:5269 check fall 3 id 1011 inter 5000 rise 3 slowstart 60000 weight 50
        server  server2 10.0.0.2:5269 check fall 3 id 1012 inter 5000 rise 3 slowstart 60000 weight 50
        server  server3 10.0.0.3:5269 check fall 3 id 1013 inter 5000 rise 3 slowstart 60000 weight 50

I will not explain every single option, as that is covered in the excellent documentation, but I will briefly describe what is going on here. As you can see, the config reflects the diagram introduced before: we have 3 “frontend” services (5222, 5223, 5269 - client STARTTLS, legacy client SSL, and server-to-server), each pointing at a backend made of the three cluster nodes. HAProxy in this example will spread the load equally across all backend servers for all services (leastconn + equal weights), and it will ramp traffic back up gradually after a failure (slowstart) to prevent a connection storm from hitting a server when it comes back. There are a couple of options you can fine-tune for your needs - like timeouts and fail counts. This is a base for your experiments with the LB topic. Go and play with it!
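Once HAProxy is running, you can do a quick sanity check of the health checks straight from the stats socket defined in the global section. This is just a sketch - it assumes socat is installed and that the socket path matches the config above:

# list proxies and servers with their health status (the 18th CSV field is "status": UP/DOWN)
echo "show stat" | socat stdio unix-connect:/usr/share/haproxy/haproxy-stats.sock | cut -d, -f1,2,18

# overall process information (version, uptime, current connections, ...)
echo "show info" | socat stdio unix-connect:/usr/share/haproxy/haproxy-stats.sock

If a backend server shows DOWN there, check that the ejabberd node is actually listening on the port HAProxy is probing.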
One of the main advantages of HAProxy is that it is extremely simple and fast to set up, highly reliable, and has a low hardware footprint. Thanks to all those pros we can imagine a variety of uses, like geographically distributed proxy servers for XMPP (a super interesting topic - I will write about it some day), or cross-proxying for better availability.
This is not my last post about LB and XMPP; the next stop is Amazon Elastic Load Balancing (ELB), which is a great solution for admins who host their servers on AWS. Stay tuned!
