SamuelRobinson.com
A resource for Windows programmers

SMTP As A Method For Communication Between Systems

Introduction

Whenever two or more systems need to share information, the problem of getting them to talk to each other must be solved. This occurs when a sensor on a factory floor detects that a part doesn’t meet requirements, and it also occurs when a new customer is added to a customer list and the local sales force needs the name and phone number of the person to contact. We could use the same methods for each of these communications, but the requirements of the two examples are very different. What is appropriate for the factory automation case is expensive overkill in the customer information management case. It is critical that we carefully examine the needs of each system, as well as each system’s relationship with other systems, when we resolve this issue.

 

There are a large number of solutions to this problem. Understanding the requirements allows us to select the most cost-effective and appropriate solution for a given system. This paper focuses on one of the possible solutions. SMTP is the underlying standard for one of the most important Internet applications, email. We have found that, under the correct circumstances, SMTP is an easily implemented, reliable, and cost-effective method of integrating two dissimilar systems.

 

This paper gives a quick overview of data communication issues and examines the benefits and drawbacks of SMTP as a method of data communication. It details the issues involved should one decide to implement an SMTP data transfer strategy. We will contrast SMTP with other data transmission standards that are commonly used to integrate systems. We will examine when it is appropriate to use SMTP and when it is not. We will also look at the capabilities of SMTP and related messaging standards so that we can understand how to use the strengths of this technology.


 

The Problem of Communicating Between Systems

 

 

Figure 1 Isolated Systems

Systems Are No Longer Isolated

When implementing a system where no other systems previously existed, the choices for moving data between parts of the system are a matter of design. We know what the parts of the system are going to be, and we design those parts to work together. This sort of situation is becoming a rarity. In the modern enterprise, we have many existing systems, each of which represents a massive investment by its owner and each of which holds some piece of information that other systems need. Each system may contain customer or employee information, market data, audio or picture data, measurements, or any of a number of other kinds of information. Quite often these systems were intended to be integrated applications that performed all of the necessary functions on this data. Now that we have gathered the data, we are using it in ways that the original designers of these systems never envisioned.

 

 

Figure 2 Systems Sharing Information

Systems Are Being Connected Together

The new systems being built need to communicate with the systems we have. Systems that may have never needed to communicate with any other systems are being retrofitted with communication capabilities, and in many cases are now accepting input from other systems instead of people. New systems are being built where we don’t know all of the possible output formats. Where systems used to be tightly coupled together, we now see looser connections between large systems. This may be an internal connection, such as a connection between an order system and a payroll system that computes commissions for sales persons, or an external connection like the connection between a supplier and the materials management system for a factory. In either case the systems were often designed to take input from operators and now will be retrofitted to take input from some other source.

 

When and Why SMTP?

Comparing Some Solutions to the Communication Problem

Intercommunication between systems uses a number of protocols, each of which has its own benefits. We are not going to examine these in any detail, but a quick overview will help us as we examine the protocol we’re evaluating in this paper.

 

Message queuing and transaction systems are used in those cases where we need a guaranteed single delivery of a specific piece of information. This is often the case in credit card authorization and charge systems where we need to charge a credit card for a purchase.

Figure 3 Message Queuing

Enterprise Application Integration (EAI) looks a lot like message queuing except that rather than having an optional dispatcher, EAI depends on a broker that has interface “glue” to connect the communicating applications together.

Figure 4 Enterprise Application Integration

Streaming systems are used when a set of data must be delivered within a specified time and when the loss of a small percentage of the information isn’t critical to the success of the system. We tend to see streaming systems delivering telemetry and multimedia content.

Figure 5 Streaming

The World Wide Web with the Hyper Text Transfer Protocol (HTTP) has proved very useful when a connection to the web is available. This is particularly suited to providing a browse-able interface for users. It works well when used to pull information from a repository.

 

Figure 6 Web (HTTP) Technology

 

The File Transfer Protocol (FTP) can be used when the client wishes to pull data from a directory on the server. This is a fairly primitive way of moving data, but it is used in some smaller, simpler systems. One can use NFS and Samba in a similar way.

Figure 7 File Transfer Protocol (FTP)

The Simple Mail Transfer Protocol (SMTP) is the focus of this white paper. We will be discussing it further, but it is useful when you wish to have very loose coupling between systems.

 

Socket connections on arbitrary ports with application specific protocols are often found in very tightly coupled systems.

 

DCOM and CORBA are object technologies that are somewhat more loosely coupled than socket connections, but still require that the caller know a fairly large amount about the interface that it’s calling into.

 

Microsoft’s .NET, Java RMI, and SOAP are emerging standards that can be used for information interchange. They have many of the strengths of the DCOM and CORBA methods, but have other benefits as well. They are appropriate to medium-coupled systems, and each has special strengths.

System Attributes Useful For Picking A Solution

 

We need to think about the needs of the system as we examine which of these we will use. Some of the criteria are:

 

Reviewing this list, we can begin to see some of the criteria that would indicate using SMTP. Since an SMTP message can take quite a while (minutes or hours) to be delivered, we will not use it if communication is highly time sensitive. Pre-arranged formats are less important for SMTP, as we can send multiple formats in a multipart MIME message. If there are several industry formats, and we want to let the receiver select the appropriate content type, this is a positive indication for using SMTP. If there is a specific connection methodology, then we need to evaluate it to see whether SMTP is a correct decision. Transmission over the public net may add additional requirements to the communication part of the system. Email works well for this type of transmission, and there are MIME types that support encryption. Email is a best-effort delivery mechanism, so if positive validation of delivery is required, SMTP may not be appropriate.
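The multiple-format idea can be sketched with Python's standard `email` package. This is only an illustration under stated assumptions: the addresses, the subject line, the file names, and the choice of JSON and CSV as the two formats are all hypothetical and do not come from any particular system.

```python
import csv
import io
import json
from email.message import EmailMessage


def build_customer_message(sender, recipient, customers):
    """Package the same records as JSON and as CSV in one MIME message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "customer-update"  # hypothetical subject convention

    # A short human-readable body; the data travels in the attachments.
    msg.set_content("Customer records attached in JSON and CSV form.")

    # First representation: JSON.
    msg.add_attachment(
        json.dumps(customers).encode("utf-8"),
        maintype="application", subtype="json", filename="customers.json",
    )

    # Second representation of the same records: CSV.
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["name", "phone"])
    writer.writeheader()
    writer.writerows(customers)
    msg.add_attachment(
        buf.getvalue().encode("utf-8"),
        maintype="text", subtype="csv", filename="customers.csv",
    )
    return msg
```

The receiver can then pick whichever attachment it understands and ignore the other, which is what keeps the two systems loosely coupled.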

Deciding to Use SMTP

SMTP is implemented on almost all of the existing operating system platforms. Since it interoperates well across platforms we can comfortably expect that we won’t be surprised by incompatibilities between systems. This works well in a loosely coupled system, particularly in those systems where another organization may be implementing one of the communicating systems.

 

The issue of coupling is one of the most important considerations in deciding whether SMTP is a good fit for a system. Loosely coupled systems don’t know, or need to know, much about each other. They tend to use common standards and may only know each other’s names. Consider a system that accepts pictures to display in a photo gallery. There are a fairly large number of common picture formats, but there are also a fairly large number of conversion programs that can convert one format to another on almost any platform available. As long as there is a sender’s name and one attachment that can be understood, the receiving program should be able to retrieve and store a picture attached to an SMTP message. In other cases, there are industry standards for information storage that we would expect. In those cases where the message arrives without a translatable attachment, it is also reasonably easy to respond with an error message that lists the acceptable formats. If this level of coupling is appropriate, then we may continue to consider using SMTP as a transport.
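A minimal sketch of the photo gallery receiver, again using Python's standard `email` package, might look like the following. The list of accepted formats and the function names are assumptions made for illustration only.

```python
from email.message import EmailMessage

# Hypothetical policy: the picture formats this gallery can translate.
ACCEPTED_TYPES = ("image/jpeg", "image/png", "image/gif")


def pick_picture(msg):
    """Return (content_type, data) for the first usable attachment, else None."""
    for part in msg.iter_attachments():
        if part.get_content_type() in ACCEPTED_TYPES:
            return part.get_content_type(), part.get_content()
    return None


def build_rejection(msg, our_address):
    """Build a reply that tells the sender which formats are acceptable."""
    reply = EmailMessage()
    reply["From"] = our_address
    reply["To"] = str(msg["From"])
    reply["Subject"] = "Re: " + str(msg["Subject"] or "")
    reply.set_content(
        "No usable picture was found. Accepted formats: "
        + ", ".join(ACCEPTED_TYPES)
    )
    return reply
```

Note that the receiver needs only the sender's address and one understandable attachment; it knows nothing else about the sending system.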

 

The next most important question is whether there is a hard time requirement for message delivery. SMTP is often relayed through a number of delivery agents, so it is difficult to predict how long it may take for a message to reach its final destination. Delays tend to be between ten minutes and an hour, but delays of up to a few days are not unheard of. If a delay of a day isn’t critical (such a delay is unusual, but it is a useful figure for planning), then SMTP is still worth considering.

 

Also important is the existence of standards. This isn’t as critical, because Multipurpose Internet Mail Extensions (MIME) supports the definition of new content types. However, the existence of one or more pre-existing standards is very helpful and a positive factor when considering SMTP.

 

One final defining question is the problem of lost messages. Although it’s rare for a message to be lost, SMTP is a best effort system. This means that occasionally there may be a lost or garbled message. If this introduces an unrecoverable error for the system, then a more reliable transport is indicated.

Integrating SMTP Into the System Architecture

SMTP - What Is the Simple Mail Transfer Protocol?

The Simple Mail Transfer Protocol is the part of electronic mail that actually manages the transfer of a mail message. SMTP is responsible for starting communication with the receiver, negotiating how the message is to be encoded for transmission, and making sure that the message is delivered to the correct entity. This is the process that happens each time we receive an email. When using SMTP to transport messages we will either build a receiver and sender into our program, or use an existing product to handle this functionality.

SMTP Message Formatting

There are two other technologies that are closely associated with SMTP and that are important to us. The first of these is the Internet Message Format defined in RFC 2822. The other is the Multipurpose Internet Mail Extensions (MIME) defined in RFCs 2045-2049. These standards define what is in the message, how it is encoded, and other important information about how to decode the message. Because this message formatting is standardized, we have a defined way to send the information without requiring that the information itself be formatted in any particular way. This allows us to pass information about the message format to the receiver and let the receiver handle the information translation.
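To make the division of labor concrete, here is a small sketch in Python. The sample message, its addresses, and the decoder table are all hypothetical; the point is only that the headers carry the format information and the receiver chooses the translation.

```python
import json
from email import message_from_string, policy

# A hypothetical message: RFC 2822 headers, a blank line, then the body.
SAMPLE = """\
From: sensor-7@factory.example
To: quality@factory.example
Subject: part-rejected
MIME-Version: 1.0
Content-Type: application/json

{"part": "A-113", "reason": "out of tolerance"}
"""

# The receiver decides how each declared content type is translated.
DECODERS = {
    "application/json": lambda data: json.loads(data),
    "text/plain": lambda data: data.decode("utf-8"),
}


def decode_message(raw):
    """Return (sender, content_type, decoded_body) for a single-part message."""
    msg = message_from_string(raw, policy=policy.default)
    content_type = msg.get_content_type()
    decoder = DECODERS.get(content_type)
    if decoder is None:
        raise ValueError("no decoder for " + content_type)
    # get_payload(decode=True) undoes any transfer encoding and yields bytes.
    return str(msg["From"]), content_type, decoder(msg.get_payload(decode=True))
```

Supporting another format then means adding one entry to the receiver's table, with no change to the sender or to the transport.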

Buy Or Build

Email is one of the most common systems in existence, so we probably will not need to implement an SMTP system if we wish to use it. Using a commercial SMTP system can be done in several ways, depending on the base platform and the desired operational mode.

 

If we do have to implement an SMTP system ourselves, the effort is non-trivial and depends on the required feature set. Simple systems can be implemented fairly quickly, but implementing a system that meets the requirements we have been discussing can take between half a man-year and a full man-year of effort. The key requirements are direct reception of mail, MIME support, authentication, and support for binary mode. Supporting the full range of features is not always necessary.

 

Direct reception is the most expensive requirement as it means that we must somehow embed a Mail Transfer Agent (MTA) into our system. There are reasonably simple ways to do this in Linux/Unix environments, and also in Windows environments, but for embedded systems and some legacy systems this may present the need for extra effort as mentioned above.

 

To take full advantage of the strengths of SMTP, MIME support is required. Adding MIME support from scratch is always a difficult and time-consuming undertaking. Many commercial solutions have some degree of support for MIME, and this is a powerful argument for using commercial software whenever possible.

 

Authentication is not particularly difficult to implement, but great care must be taken to ensure security. Security issues can bring a large set of problems, but these are not unique to SMTP. The key issue when looking at the need for this feature is the type of connection that exists between the communicating subsystems. If the communication takes place on a LAN, or on a private network, then the overhead and additional work involved in authentication may not be warranted.

 

Binary mode requires support for ESMTP (Extended SMTP). This is additional effort in a “build” situation, but if large messages are exchanged the improvement in throughput is worth it.
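An ESMTP server advertises its extensions in the reply to the EHLO command, so a home-built client can check for binary support before relying on it. The following Python sketch parses such a reply; the sample reply lines and the function names are illustrative assumptions. For a live connection, Python's `smtplib.SMTP` offers `ehlo()` and `has_extn()` for the same purpose.

```python
def parse_ehlo(reply_lines):
    """Map each advertised extension keyword (upper case) to its parameters."""
    features = {}
    # The first line is the server greeting; the rest name extensions,
    # for example "250-PIPELINING" or "250 SIZE 10240000".
    for line in reply_lines[1:]:
        text = line[4:].strip()  # drop the "250-" or "250 " prefix
        if not text:
            continue
        keyword, _, params = text.partition(" ")
        features[keyword.upper()] = params
    return features


def can_send_binary(features):
    """Binary transfer needs both the CHUNKING and BINARYMIME extensions."""
    return "CHUNKING" in features and "BINARYMIME" in features
```

If the check fails, the sender can fall back to an encoded transfer, so the extension improves throughput without becoming a hard dependency.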

 

The buy vs. build decision is primarily driven by the availability of robust clients and servers, and the degree of integration required by the application. This must be resolved on a case-by-case basis, but in most circumstances one will acquire the needed software and just interface to it.

Receiving and Sending SMTP Messages

There are a lot of ways to handle SMTP messages in a system. The simplest of these is to invoke the mail system programs with command line arguments. This allows us to treat our input and output (I/O) as files.
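As a rough illustration of the file-oriented approach, the Python sketch below reads one complete message from a file-like stream (for example, standard input when the mail system pipes a message to our program) and writes each attachment to a directory. The function name and the fallback file naming are assumptions for this example.

```python
import os
from email import message_from_binary_file, policy


def save_attachments(stream, out_dir):
    """Read one message from a binary stream and save its attachments as files."""
    msg = message_from_binary_file(stream, policy=policy.default)
    saved = []
    for index, part in enumerate(msg.iter_attachments()):
        # Use only the base name so a hostile file name cannot escape out_dir.
        name = os.path.basename(part.get_filename() or "part-%d.bin" % index)
        path = os.path.join(out_dir, name)
        with open(path, "wb") as out:
            out.write(part.get_payload(decode=True))
        saved.append(path)
    return saved

# Typical use from a script invoked by the mail system (not run here):
#   import sys
#   save_attachments(sys.stdin.buffer, sys.argv[1])
```

From that point on, the rest of the system sees ordinary files and needs no knowledge of mail at all.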

 

More sophisticated systems will be notified, through some sort of signal or event, when there is mail. These systems tend to be more tightly integrated with the mail server. Some platforms (notably MS Windows) provide event models when using their mail infrastructure that simplify this task. We are still invoking an outside program to do the work, but it is now much more tightly integrated with our system, and we have access to some of the parsing and conversion routines provided by the commercial mail system.

 

The most integrated approach is to add MTA functionality to the system that will be receiving the messages. This allows us to filter input without storing it, and we can also parse the message as it comes in and only handle the parts of the message we need. The down side of this method is that all operations must be part of our code, with the creation, maintenance, and testing costs that come with any body of code.

 

When we consider sending messages the same general set of options is available. We will want to create our message body programmatically, and then we may give that message to a commercial mail program or we may have an embedded client to send the mail. If we wish to know that the mail was accepted, then sending the mail ourselves is most attractive. Given the simplicity of mail clients, this will probably be the preferred method. Some mail client software provides integration with their event models, and in those cases we can interact with the client software instead of writing a client.
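A small sketch of sending the mail ourselves, using Python's standard `smtplib`, shows how acceptance can be detected. The helper name and the host name in the usage comment are hypothetical. Note that acceptance by the first server is not the same as delivery to the final destination.

```python
import smtplib


def send_and_confirm(smtp, msg):
    """Return True only if the server accepted the message for every recipient.

    `smtp` is a connected smtplib.SMTP object (or anything with the same
    send_message method); `msg` is an email.message.EmailMessage.
    """
    try:
        # send_message returns a dict of refused recipients; empty means
        # that every recipient was accepted.
        refused = smtp.send_message(msg)
    except smtplib.SMTPException:
        # Raised when the sender, all recipients, or the data were rejected.
        return False
    return not refused

# Typical use (not run here; the host name is a placeholder):
#   with smtplib.SMTP("mail.example.com") as smtp:
#       accepted = send_and_confirm(smtp, msg)
```

Handing the message to an external mail program usually hides this result from us, which is the main reason an embedded client is attractive when acceptance matters.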

Making the System Faster

Two ESMTP features, Pipelining (RFC 2920) and Binary transfer with Chunking (RFC 3030), can together significantly speed up the message transfer process. Pipelining allows a group of commands to be sent at once, with status responses also returned in a grouped fashion. This removes a significant amount of time spent waiting for each response before sending the next command. Binary mode allows us to send large chunks of data in their native format, removing the need to encode and format the data.
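The cost that binary mode avoids can be estimated with a few lines of Python. Without binary mode, non-text data is normally carried in base64, which turns every 3 bytes into 4 characters and adds line breaks. The function name is an assumption, and the measurement uses LF line endings, so the figure on the wire (which uses CRLF) is slightly higher.

```python
import base64


def base64_expansion(raw_size):
    """Return encoded size divided by raw size for a payload of raw_size bytes."""
    data = bytes(raw_size)  # zero-filled payload; the content does not matter
    # encodebytes wraps its output at 76 characters per line, as MIME does.
    encoded = base64.encodebytes(data)
    return len(encoded) / raw_size
```

For payloads of any real size the ratio comes out at roughly 1.35, so about a third more data crosses the network than with a binary transfer.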

Closing the Loop

One of the most significant deficiencies in SMTP is the lack of end-to-end acknowledgement of a message as part of the process. Delivery Status Notification (DSN) exists and is mostly seen as a negative acknowledgement, but we are not guaranteed that the receiving system will send a DSN. The best way to handle this is to add acknowledgment messages to our system’s messaging protocol. This requires additional storage and coding, but offers the opportunity to re-send a message that doesn’t get acknowledged in a reasonable time. Because a re-sent message may arrive in addition to the original, this feature should also guard against multiple receipts of the same message.
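The bookkeeping for such an acknowledgement scheme is small. The Python sketch below is one possible shape, not a prescribed design: the class names, the use of a message identifier as the key, and the in-memory storage are all assumptions (a real system would persist this state).

```python
import time


class AckTracker:
    """Sender side: remember unacknowledged messages and find those to re-send."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.pending = {}  # message id -> time the message was sent

    def sent(self, message_id, now=None):
        self.pending[message_id] = time.time() if now is None else now

    def acknowledged(self, message_id):
        # Ignore acknowledgements for unknown or already cleared messages.
        self.pending.pop(message_id, None)

    def due_for_resend(self, now=None):
        now = time.time() if now is None else now
        return sorted(
            mid for mid, sent_at in self.pending.items()
            if now - sent_at >= self.timeout
        )


class DuplicateFilter:
    """Receiver side: process each message id once, even if it arrives twice."""

    def __init__(self):
        self.seen = set()

    def first_time(self, message_id):
        if message_id in self.seen:
            return False
        self.seen.add(message_id)
        return True
```

The receiver should still acknowledge a duplicate, since a duplicate usually means the earlier acknowledgement was lost or late, but it should not process the content a second time.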

Conclusion

When implementing communication between loosely coupled systems, SMTP should be considered as a good way to move information between the systems. SMTP is robust and does not require intimate knowledge of the receiving system. It can transport data in many formats in the same message, and when third-party packages are used it may provide translation services. It can be integrated into the application or may be an external subsystem. Although care must be taken when deciding to use SMTP, for many integration efforts it is the most appropriate, cost-effective choice.

 




Last revised: April 28, 2003