Nowadays, a website is not only a simple HTML page. Your visitors expect dynamic, personalized information fast and you need a scalable way to deliver content as quickly as possible. This, of course, puts significant pressure on page loads and response time. In this series of posts, we’ll explore eZ’s system architecture and provide recommendations on how you can optimize caching and decrease response time with eZ software.
As we’ve talked about on this blog many times before, eZ Platform, and its predecessor eZ Publish Platform 5.x, are LAMP (Linux, Apache, MySQL and PHP) friendly and based on the Symfony full-stack framework, the leading PHP framework today. Together, eZ and Symfony provide a trusted toolbox which helps you to optimize performance.
First, let's take a look at a typical example of a system architecture used by eZ's customers. As you can see, the main components of the standard eZ architecture include Varnish, Apache and MySQL. All of these components allow high load and high performance, and in this post we’ll explain how you can take advantage of them to optimize the performance of your own eZ-powered site or app.
In the typical eZ system architecture, three layers are implemented: Load Balancer, Varnish servers and eZ servers. Load Balancer dispatches your visitors to your Varnish Servers (cache system). Varnish servers have a mechanism to load balance the requests to eZ’s servers.
For organizations using eZ Publish 5.x or eZ Platform, eZ recommends using Memcached to cache and share data (including PHP sessions) and a NAS (Network Attached Storage) within a cluster of eZ servers, as shown in the diagram above. Moreover, we recommend that each layer has at least two servers to avoid a SPOF (Single Point Of Failure). That said, eZ’s content platform is flexible and organizations can configure the architecture to meet their needs.
Now, let’s dig into the individual components.
Varnish is an HTTP accelerator that delivers cached pages. It’s used to reduce load on a web server. In a perfect world, the server would return a response immediately, without having to do any real work. In the real world, however, the server may have to do a bit of work before returning a response to the client.
Varnish also supports additional features that can provide much more value, such as Edge Side Includes (ESI), HTTP streaming, experimental support for persistent storage, DNS and load balancing.
So when and why use Varnish on a web project? The major questions you should ask yourself are:
- Does dynamic content in your application change (very) often?
- Do you want to implement a caching strategy without touching (or just barely touching) your codebase?
If you answer yes to these questions, you should use Varnish.
eZ Publish 5.x, which is based on the Symfony full-stack framework, implements HTTP Cache - a feature also implemented by Varnish. To take advantage of the available cache layers, your application must be able to communicate which responses are cacheable and the rules that govern when and how that cache should become stale. This is done by setting HTTP cache headers on the response.
The HTTP protocol uses two different models to decide when a cached response is no longer fresh:
- Expiration: HTTP caches assign expiration times, employing algorithms that use other header values to estimate a plausible expiration time. Most commonly used headers are Expires and Cache-Control.
- Validation: The gateway can send an HTTP request to the backend (with headers based on Last-Modified and ETag) to determine whether the cached copy is still up-to-date. If it is, the backend responds with a 304 status code meaning “Not Modified.”
eZ Publish uses both models in order to have fewer network requests on your architecture. Check our online documentation to configure eZ and Varnish (https://doc.ez.no/display/EZP/Using+Varnish).
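To make the two models concrete, here is a minimal sketch in Python. It is not how eZ Publish or Symfony implement HTTP Cache; the function names `make_etag` and `respond`, the hash-based ETag scheme and the 300-second lifetime are all assumptions chosen for illustration. It sets an expiration header and an ETag, then answers a revalidation request with 304 when the cached copy is still current.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an ETag from the response body (hypothetical scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict, max_age: int = 300):
    """Return (status, headers, body) for a cacheable resource."""
    etag = make_etag(body)
    headers = {
        # Expiration model: shared caches may reuse the response
        # for max_age seconds without contacting the backend.
        "Cache-Control": f"public, s-maxage={max_age}",
        # Validation model: the gateway can send this value back
        # in If-None-Match to ask whether its copy is still valid.
        "ETag": etag,
    }
    # Simplified comparison: a real server also handles lists of
    # ETags and weak validators in If-None-Match.
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""  # Not Modified: no body is sent
    return 200, headers, body


status, headers, _ = respond(b"<h1>Hello</h1>", {})
revalidated, _, _ = respond(b"<h1>Hello</h1>", {"If-None-Match": headers["ETag"]})
print(status, revalidated)
```

The first call returns the full page with both headers; the second, which carries the ETag back, gets the short 304 answer and saves the transfer of the body.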
A small tip: be careful with Varnish and query strings. It’s really important to parse and sort the parameters; otherwise, the gateway could create a different cache variation for each ordering of exactly the same parameters. If this happens, you will increase your cache size and not be able to properly cache pages containing query strings.
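The idea behind this tip can be sketched in a few lines of Python. In a real setup the normalization would live in your Varnish configuration, not in Python; the function name `normalize_url` and the example URLs are assumptions used only to show the principle.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Return the URL with its query parameters in sorted order."""
    parts = urlsplit(url)
    # Keep blank values so "?a=&b=1" is not silently changed.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs.sort()
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(pairs), parts.fragment)
    )


# Two orderings of the same parameters map to one cache key.
first = normalize_url("http://example.com/search?b=2&a=1")
second = normalize_url("http://example.com/search?a=1&b=2")
print(first)            # http://example.com/search?a=1&b=2
print(first == second)  # True
```

Without this step, the two URLs above would be stored as two separate cache entries even though they return the same page.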
Apache is an open-source HTTP server for modern operating systems, and the world’s most widely used HTTP server today. Instead of implementing one architecture, Apache provides a variety of Multi-Processing Modules (MPMs):
- Prefork. Not threaded; this means that in a process list, each Apache child is a separate process that is either servicing a request or waiting for a request.
- Worker (process and threads). It turns Apache into a multi-process and multi-threaded web server. Worker can handle more requests with fewer resources than Prefork.
- Event mode. The big difference from Worker is that Apache dedicates a thread to a request and not to the whole HTTP connection.
Side note: In this blog post, we decided to talk only about Apache. Since eZ Publish 5.x, eZ Systems also supports Nginx, an open-source HTTP server that runs only in event mode. You can find more information on the Nginx website: http://nginx.org/.
Which module should you use for your project? If your operating system (OS) supports both threads and thread-safe polling (the kqueue and epoll functions), you can use event mode. If it only supports threads, you can use worker mode. Be careful: in certain cases prefork mode could be better than event mode, for example if the server has more than 3 GB of RAM and serves only static files.
Optimizing Apache, like any other part of the LAMP stack, takes time. Here are some ways to tune your system:
- Unload unused modules: by not using them, you will save memory.
- Disable .htaccess files: Apache requires some time to check for the existence of a .htaccess file and to apply the rules it contains. Use only VirtualHost configuration to disable this behavior.
- Limit the number of Apache processes and children: To calculate a good value for the MaxClients directive (named MaxRequestWorkers in Apache 2.4), find out how much memory your Apache processes use. Using the “top” command line, check the RES column (Resident Memory Size), as shown below.
For example, here the value is 7832 (in KB), which means each process is using a little under 8MB of RAM; counting 10MB per process leaves some headroom. If I limit Apache to a maximum of 20 child processes, then it should max out at about 200MB of RAM. When your change is done, restart Apache and monitor the load average reported by the uptime command. It should not go above 1.
- Use caching: in order to reduce the overhead of your server and minimize processor requests, use disk caching (mod_disk_cache) over memory caching (mod_mem_cache), especially if you are limited by the amount of RAM available.
- Use compression: Using gzip will reduce the size of the file being transmitted.
- Turn off host name lookup: With host name lookup enabled, a DNS lookup is required every time a client connects. Check that the HostnameLookups directive is set to “off”.
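The memory arithmetic behind the process limit in the list above can be sketched as follows. This is only a back-of-the-envelope helper, not an official sizing formula; the function name `max_apache_children` and the 200MB budget are assumptions, and it presumes that top reports RES in KB and that all children use roughly the same amount of memory.

```python
def max_apache_children(ram_budget_mb: float, res_kb_per_process: float) -> int:
    """Estimate how many Apache children fit into a RAM budget."""
    per_process_mb = res_kb_per_process / 1024  # KB -> MB
    return int(ram_budget_mb // per_process_mb)


# Measured value from the example: 7832 KB per process.
print(max_apache_children(200, 7832))

# Rounding each process up to 10MB (10240 KB) for headroom,
# as the example does, gives the more conservative limit.
print(max_apache_children(200, 10240))
```

With the measured 7832 KB the budget fits 26 children; with the rounded 10MB figure it fits 20, which matches the limit used in the example and leaves a safety margin for processes that grow over time.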
MySQL and MariaDB
MySQL is an open source relational database management system (RDBMS) founded in 1994 by Michael Widenius and David Axmark. In 2008, Sun Microsystems acquired MySQL AB, then Oracle acquired Sun Microsystems in 2010. The day Oracle announced the purchase of Sun, Michael Widenius forked MySQL, launching MariaDB and taking a swath of MySQL developers with him.
eZ Publish 5.x and above support both databases. Feel free to take a look at our requirements: https://doc.ez.no/display/TECHDOC/Requirements.
MySQL engines can be really complex if you want to understand all the mechanisms. You can check the engine’s documentation:
- MySQL Storage Engines https://dev.mysql.com/doc/refman/5.7/en/storage-engines.html
- MariaDB Storage Engines https://mariadb.com/kb/en/mariadb/storage-engines/
Some tools give you some hints to improve your configuration, but sometimes it’s better not to change a value. That’s why it’s really important to understand the parameters you want to change.
To help you to brave the optimization process, many tools are freely available.
This tool will analyze the performance of your database. To use it correctly, MySQL/MariaDB should first be running for a couple of days. You can apply your modification and wait another couple of days to run it again.
The Percona tool pt-variable-advisor examines SHOW VARIABLES for bad values and settings according to different rules and returns feedback defined by Percona. You can see the complete list of rules here: https://www.percona.com/doc/percona-toolkit/2.2/pt-variable-advisor.html
Of course, when you start having high traffic, you need more than one MySQL/MariaDB server. We’re seeing more and more eZ customers starting to use MySQL Cluster or MariaDB Galera Cluster. Both clusters do the tasks we ask for, but which one should you choose? Well, this is up to you but first we recommend you get to know the limitations of each cluster:
- Known limitations of MySQL Cluster https://dev.mysql.com/doc/refman/5.0/en/mysql-cluster-limitations.html
- Known limitations of MariaDB Galera Cluster: https://mariadb.com/kb/en/mariadb/mariadb-galera-cluster-known-limitations/
Coming in Part 2
In part 2 of this series, we’ll share tips on how to optimize PHP and your eZ configuration using Memcached, Network FileSystem (NFS) and other tools. In the meantime, head over to share.ez.no to join the discussion with the eZ Community.