These are the results of some benchmark tests comparing Mailismus to Postfix, the leading open source mail server.
These benchmarks can be run on any Unix/Linux/MacOS system, or in a Cygwin environment on Windows. The only prerequisite is the usual Mailismus one: Java must be installed.
The results below were obtained on an OpenSUSE 13.1 Linux system, which comprised a 3.3GHz Intel Core i3-2120 CPU and a SATA hard drive, but the relative figures are of more interest than the absolute ones.
To provide a level playing field, all sender/recipient validation was turned off in both Mailismus and Postfix.
This test measures how long it takes to relay a batch of 500,000 messages through Mailismus (or Postfix, for comparison) thus exercising both its SMTP client and server components.
In outline, the test procedure consists of the automated mass submission of messages into a local mail server (called the Relay) which in turn relays them to another mail server on the same system (called the Sink) which simply accepts all incoming messages.
The Relay mail server is the entity which is under test and everything else is just part of the test-suite furniture.
We need to install two instances (Relay and Sink) of Mailismus, so obtain the "generic all-platforms download" from this website's Downloads page - either the compressed Tar or the ZIP form will do equally well.
Then simply unzip/untar it twice under a common parent directory (henceforth referred to as TESTROOT).
Rename each Mailismus installation directory after unzipping, so that we have TESTROOT/relay and TESTROOT/sink, with their respective bin, lib etc. subdirectories directly underneath them.
We are going to set up these two Mailismus installations so that their TCP ports do not clash with each other or a live instance of Mailismus (or any other local mail server running on port 25).
1.1.1 - Mailismus Relay Instance
Replace the named config files under TESTROOT/relay/conf with the following:
• naf.xml
• mailismus.xml
No further setup is required, but a brief explanation follows.
This naf.xml file updates the default installed version by altering the NAFMAN base port (to avoid conflicts with any live Mailismus instance that might be running).
The mailismus.xml file configures the SMTP server to listen on port 55125 (rather than the standard SMTP port of 25, once again to avoid conflicts) and routes five notional domains to the Sink mail server (which we will later configure to listen on SMTP port 55225).
It also specifies a default smarthost destination on port 55999, but the latter is simply intended as a non-existent port to safely trap any errant messages, as this test does not generate any messages other than for the notional domains.
By the same token, this test is not expected to generate any bounces, so non-delivery replies are turned off in the mta/report config block and recorded as flat files under TESTROOT/relay/var/bounces instead.
Note that these notional domainX.grey domains do not need to exist in the DNS or anywhere else in your network config; they are simply a Mailismus routing construct which contains the required TCP/IP addressing info.
Finally, note that the max concurrency (maxserverconnections) of the delivery task is set to 50. Since there are 5 destination domains (albeit they all point at the same destination IP), the effective concurrency is 5 x 50 = 250, which exactly corresponds to the number of simultaneous incoming connections made by the test script.
1.1.2 - Mailismus Sink Instance
The purpose of the Sink mail server is to accept incoming messages as quickly as possible, such that it ought not to become the bottleneck in any of these tests.
As such, we use another Mailismus instance, as that is the fastest test rig we know ...
We configure the Sink node to map all the soak-test recipients to a "sink" alias, meaning it will silently discard all received messages without writing them to disk.
This is merely to lighten the system load, since as explained above, the Relay mail server is the one under test.
Replace the named config files under TESTROOT/sink/conf with the following:
• naf.xml
• mailismus.xml
As per the Relay instance, the naf.xml config file updates the initial post-install version by altering the NAFMAN base port to avoid conflicts.
In addition, it also disables the Delivery and Reporting NAFlets, leaving the SMTP Submit task as the only active Dispatcher. This is after all a Sink mail server, so received messages will not be propagated onwards.
The mailismus.xml config file contains a Directory block to specify the required alias mapping.
It also strips the config of the unused NAFlets for simplicity, but its SMTP-server config is essentially identical to the Relay instance of Mailismus.
To conclude the Sink config, paste the following into the TESTROOT/sink/conf/ms_aliases file which is referenced by the Directory config. This should replace any existing file contents.
recip1@domain1.grey : .
recip1@domain2.grey : .
recip1@domain3.grey : .
recip1@domain4.grey : .
recip1@domain5.grey : .
1.1.3 - Postfix
Assuming Postfix is installed on your system, we are going to configure it to behave as a relay, in a manner compatible with the Mailismus Relay instance.
First, stop any running Postfix instance:
sh /etc/init.d/postfix stop
You can run postfix status to verify whether Postfix is running.
Edit /etc/postfix/master.cf to move the smtpd daemon (the Postfix SMTP server) from the standard SMTP port of 25 to 55125 (as per the Relay instance of Mailismus).
Simply comment out the smtp line and add the 55125 one, including the option to disable some checks. Note that the -o line continues the 55125 entry, so it must begin with whitespace.
#smtp inet n - n - - smtpd
55125 inet n - n - - smtpd
  -o receive_override_options=no_header_body_checks
Replace the contents of /etc/postfix/main.cf with this config.
Make sure all the referenced pathnames are appropriate for your system - they should be standard.
You can run postfix check to verify the config.
Note that we cannot configure Postfix to relay arbitrary recipient domains to a specified address, so we have simply specified the Mailismus Sink instance as a smarthost, meaning everything gets routed there unconditionally.
This means Postfix will have to do marginally less processing than Mailismus when routing its messages, but that should not give it a significant advantage.
Also note that all DNS lookups are disabled to maximise throughput, while outgoing SMTP concurrency is increased to 250 (default=20) to align with the related maxserverconnections setting in the Mailismus Relay instance.
The process limit is set to 300, to be comfortably high enough to support this concurrency.
Incoming concurrency (smtpd_client_connection_count_limit) is set to 500, to make sure the 250 simultaneous incoming connections are handled.
1.1.4 - Test Script
Paste this shell script into TESTROOT/submit.sh.
As can be seen, the Postfix smtp-source utility is used as the submission tool.
This script runs 5 smtp-source processes in parallel, each of which opens 50 concurrent SMTP sessions to the Relay mail server (so 250 simultaneous SMTP connections) and sends 100,000 messages (each on its own SMTP connection, as we didn't specify the smtp-source -d option) making for a total of 500,000 messages.
The recipients are in the notional domains we defined in the Mailismus relay instance and the sender is an arbitrary non-existent domain (this is ok because sender validation is disabled).
The message size is modest, but you can vary the msgcnt, msgsiz and sesscnt settings at will.
1.2.1 - Mailismus relay
Execute the following from the Unix shell (obviously replacing TESTROOT with your actual path):
ulimit -n 999999
cd TESTROOT/sink
rm -rf var
sh bin/mta.sh start
cd ../relay
rm -rf var
sh bin/mta.sh start
cd ..
sleep 10
sh submit.sh
... and wait for the test to complete.
You can observe progress by counting the number of spooled messages on the Relay instance's queue, by running the following from the TESTROOT directory:
find relay/var/spool -type f | wc -l
The file count will initially rise as messages are received, before dropping to zero once all the messages have been relayed onwards to the Sink instance.
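If you would rather not rerun the find command by hand, the polling can be scripted. The following is only a sketch: the function names spool_count and wait_for_drain are made up for this illustration and are not part of Mailismus, and it assumes the spool directory really does end up with no files once the run is complete. Start it only after submit.sh has begun submitting, since the spool is also empty before the first message arrives.

```shell
# Sketch: count spooled files and poll until the Relay queue has drained.
# spool_count and wait_for_drain are hypothetical helper names.

spool_count() {
    # $1 = spool directory; prints 0 if the directory does not exist yet
    find "$1" -type f 2>/dev/null | wc -l | tr -d ' '
}

wait_for_drain() {
    # $1 = spool directory, $2 = polling interval in seconds (default 5)
    while [ "$(spool_count "$1")" -gt 0 ]; do
        echo "$(date '+%H:%M:%S') spooled=$(spool_count "$1")"
        sleep "${2:-5}"
    done
    echo "spool is empty"
}
```

For example, run wait_for_drain relay/var/spool 5 from the TESTROOT directory in a second terminal.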
The Relay node's Delivery log file (TESTROOT/relay/var/logs/trace/mta-deliver-DATE.log) also reports a running total of the messages relayed at the end of each batch, which will eventually reach 500,000, while the Sink node's Submit log file (TESTROOT/sink/var/logs/trace/mta-submit-DATE.log) reports them as incoming messages and should have the following announcement of the final message being received, reporting a total message-count of 500,000.
2016-01-14 17:56:19.213 [INFO-T9] SMTP-Server/E145-189 accepted/discarding msg #51/500000: msgid=1612Z305/size=0
from tester2@domain9.grey for recips=1=>0 [recip1@domain2.grey]
Now halt the Relay and Sink mail servers:
cd TESTROOT/relay
sh bin/mta.sh stop
cd ../sink
sh bin/mta.sh stop
Finally, record how long the entire test took, which we calculate as follows:
First look in the Relay instance's SMTP-server logfile (TESTROOT/relay/var/logs/trace/mta-submit-DATE.log) and find the first and last lines announcing that a message has been accepted.
These are of the general form:
2016-01-14 17:50:17.091 [INFO-T10] SMTP-Server/E120-1 accepted msg #1/1: msgid=414244E6/size=548
from tester2@domain9.grey for recips=1 [recip1@domain2.grey]
Simply subtract the millisecond-precision timestamps of the two log lines from each other and that's our result for how long it took to accept all the messages.
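The subtraction can also be scripted. What follows is a rough sketch rather than part of the Mailismus distribution: the function name log_elapsed is invented here, and it assumes every matching log line begins with a "YYYY-MM-DD HH:MM:SS.mmm" timestamp, as in the sample above, and that the run lasts less than 24 hours (a single midnight rollover is handled, anything longer is not).

```shell
# Sketch: elapsed time between the first and last log lines matching a pattern.
# log_elapsed is a hypothetical helper name.
# Assumes lines start with "YYYY-MM-DD HH:MM:SS.mmm" and the run is under 24h.

log_elapsed() {
    # $1 = regex pattern, $2 = logfile
    grep -e "$1" "$2" | awk '
        function tod_ms(t,   p) {
            # "HH:MM:SS.mmm" -> milliseconds since midnight
            split(t, p, "[:.]")
            return ((p[1] * 60 + p[2]) * 60 + p[3]) * 1000 + p[4]
        }
        NR == 1 { first = tod_ms($2) }      # time-of-day is the 2nd field
                { last = tod_ms($2) }
        END {
            if (NR == 0) { print "no matching lines"; exit 1 }
            d = last - first
            if (d < 0) d += 86400000        # the run crossed midnight
            printf "%dm%d.%03ds\n", int(d / 60000), int((d % 60000) / 1000), d % 1000
        }'
}
```

For the accept time, call it as log_elapsed 'accepted msg' on the Relay's mta-submit-DATE.log. The same function works for any other pair of lines in these logs, given a pattern that matches both.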
Then look in the Relay instance's Delivery logfile (TESTROOT/relay/var/logs/trace/mta-deliver-DATE.log) and find the first line announcing that a batch of pending messages has been loaded from the Queue.
These are of the general form:
2016-01-14 17:50:23.755 [TRC-T11] SMTP-Delivery: Loaded queued recipients=2500 (qtime=312ms)
Then find the last line announcing that delivery of a batch of messages has been completed.
These are of the following general form (note total=500,000):
2016-01-14 17:56:19.506 [INFO-T11] SMTP-Delivery: Completed batch #200 size=2500 in time=4s827ms (qtime=92ms) - relayed=2500
with connections=50 and messages=2500 (total=500000/500000 with sendtime=26m29s996ms, qtime=17m47s281ms, launchtime=18s792ms)
As above, subtract the millisecond-precision timestamps of these two lines from each other and that's our result for how long it took to deliver all the messages.
Equally, you could simply note the timestamps of the Sink node's first and last "accepted/discarding msg" lines in TESTROOT/sink/var/logs/trace/mta-submit-DATE.log and compute their difference to represent the total test time.
This will give much the same result, but the above way gives more insight into the processing of each Mailismus task and also accounts for a few seconds of pre- and post-processing on the Relay node's queue.
1.2.2 - Postfix
In this test, Postfix (as configured in step 1.1.3) is substituted for Mailismus as the Relay instance.
The Sink instance is still Mailismus and its config is unchanged.
Execute the following from the Unix shell ...
cd TESTROOT/sink
rm -rf var
sh bin/mta.sh start
cd ..
postsuper -d ALL # clear out any existing Postfix queue
sh /etc/init.d/postfix start
sleep 10
sh submit.sh
... and wait for the test to complete.
As before, the test is complete when the Mailismus Sink instance reports the receipt of its 500,000th message - see TESTROOT/sink/var/logs/trace/mta-submit-DATE.log.
You can also observe the Postfix queue with postqueue -p, to watch it wax and wane.
Now halt the Relay and Sink mail servers:
cd TESTROOT/sink
sh bin/mta.sh stop
sh /etc/init.d/postfix stop
Finally, record how long it took.
There is only a single score for Postfix, as it doesn't have separate submission and delivery threads to be independently analysed.
The start time is measured by going to the last line in /var/log/mail.info that matches the regex pattern postfix.master.*: daemon started and then finding the next line that matches this regex pattern: postfix.smtpd.*: connect from.
Make a note of the timestamp on the latter log message, as the start time.
Then look in the same logfile for the last message matching the regex pattern postfix.qmgr.*: removed and make a note of its timestamp.
Subtract the two timestamps and that's how long the Postfix Relay instance took to process the 500,000 messages.
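Locating those two log lines by eye is tedious in a large logfile, so here is a sketch that applies the three regex patterns above in a single pass. The function name postfix_window is invented for this illustration. It prints the two lines rather than computing the difference, because the syslog timestamp format depends on your system's configuration.

```shell
# Sketch: print the start and end log lines of the most recent Postfix run.
# postfix_window is a hypothetical helper name.

postfix_window() {
    # $1 = logfile, e.g. /var/log/mail.info
    awk '
        # Every "daemon started" line resets the search, so that only the
        # run following the LAST such line is reported.
        /postfix.master.*: daemon started/ { start = ""; last = ""; armed = 1; next }
        # First SMTP connection after the daemon started = start of the test
        armed && start == "" && /postfix.smtpd.*: connect from/ { start = $0 }
        # Keep overwriting, so we end up with the last "removed" line
        /postfix.qmgr.*: removed/ { last = $0 }
        END {
            print "start: " start
            print "end:   " last
        }' "$1"
}
```

Run it as postfix_window /var/log/mail.info and subtract the timestamps of the two lines it prints.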
Alternatively, it might be simpler to just subtract the timestamps of the first and last "accepted/discarding msg" lines in the Mailismus Sink logs (TESTROOT/sink/var/logs/trace/mta-submit-DATE.log).
This might underestimate Postfix's own processing time by up to 30 seconds (as there will obviously be some queue processing before and afterwards) but that makes no significant difference to its observed times.
The Message-Size=400 column reports the results of the tests as described above, with a message size of 400 bytes, but all the tests were then rerun with a message size of 64K.
The latter was simply a matter of editing the submit.sh script (see step 1.1.4) to set msgsiz=65536 and then repeating step 1.2 in full.
The results are expressed as the time taken to relay all 500,000 messages in minutes and seconds (and hours where necessary).
Test Step | Relay Mailserver | Message-Size=400 | Message-Size=64K
1.2.1 | Mailismus | 2m17s to accept, 4m44s to deliver | 12m2s to accept, 11m58s to deliver
1.2.2 | Postfix v2.9.6 | 14m53s | 3h9m4s
Note that the single result (total elapsed time) reported for Postfix does not correspond to the sum of the two finer-grained results reported for Mailismus (time to accept and time to deliver). The accept and deliver activities overlap during all but the first few seconds of the test, so the Mailismus delivery time is effectively also its total elapsed time.
You can therefore compare the Postfix elapsed time directly against the Mailismus delivery time alone.
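To put that comparison into numbers, the sketch below converts the times from the table into seconds and derives the throughput and the Postfix/Mailismus ratio for each message size. The helper names to_seconds and compare are invented for this illustration, and the only inputs are the figures already reported above.

```shell
# Sketch: derive throughput and relative speed from the reported times.
# to_seconds and compare are hypothetical helper names.

to_seconds() {
    # "3h9m4s" -> 11344 (the hours and minutes parts are optional)
    echo "$1" | awk '{
        h = 0; m = 0; s = 0
        if (match($0, /[0-9]+h/)) h = substr($0, RSTART, RLENGTH - 1)
        if (match($0, /[0-9]+m/)) m = substr($0, RSTART, RLENGTH - 1)
        if (match($0, /[0-9]+s/)) s = substr($0, RSTART, RLENGTH - 1)
        print h * 3600 + m * 60 + s
    }'
}

compare() {
    # $1 = label, $2 = Mailismus delivery time, $3 = Postfix elapsed time
    m=$(to_seconds "$2")
    p=$(to_seconds "$3")
    awk -v l="$1" -v m="$m" -v p="$p" -v n=500000 'BEGIN {
        printf "%s: Mailismus %.0f msgs/s, Postfix %.0f msgs/s, ratio %.1fx\n", l, n / m, n / p, p / m
    }'
}

compare "400 bytes" 4m44s 14m53s
compare "64K" 11m58s 3h9m4s
```

On these figures, Postfix took roughly 3.1 times as long as Mailismus for the 400-byte messages and roughly 15.8 times as long for the 64K messages.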
While 400 bytes is unrealistically small, 64K is representative of a message with a not-too-large attachment. The main importance of the larger message size for our purposes, however, is that it virtually guarantees that TCP flow control will kick in during the SMTP DATA phase, causing socket writes to block. This demonstrates the benefits of the non-blocking-write capability provided by the NAF framework Mailismus is based on.
The Postfix 64K-msg time is obviously highly non-linear, so rather than showing a slowdown by a set factor, it probably indicates that Postfix tipped over once the queue reached a certain size and was unable to cope effectively.
3h9m was actually the fastest of several runs whose times ranged as high as 3h45m, while Mailismus demonstrated far more consistency.
The Postfix tests were run with Syslog enabled and disabled, but its logging turned out not to impose any measurable cost.
It is also important to note that the Mailismus architecture allows concurrency (see the maxserverconnections config setting) and hence throughput to be scaled up without consuming any extra threads (let alone processes), while Postfix's CPU load average was running 30-40 times higher than Mailismus during these tests, due to its high process count.