Remails: A European Mail Transfer Agent

Original link: https://tweedegolf.nl/en/blog/197/remails

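## Remails: A European Email Delivery Solution

Remails is a Mail Transfer Agent (MTA) that aims to provide reliable email delivery while being hosted entirely in Europe. It is designed primarily for **transactional email**, such as verification codes and password resets: it uses IP addresses with a good reputation and correctly configured headers to make sure messages reach the inbox, and it automatically retries failed deliveries. The project is open source on GitHub and can be self-hosted, and a hosted version is available at [remails.net](https://remails.net/). Remails started out as a simple VPS setup and has since grown into a **high-availability Kubernetes cluster** with a managed PostgreSQL database. This architecture splits the service into a Web API for credential management and an MTA for sending and receiving messages. Key to this evolution was gaining control over **outbound IP addresses**, which is essential for avoiding spam filters; this was achieved through a "bring your own IP" arrangement with a Finnish cloud provider and a Kubernetes architecture refactored around DaemonSets and a central message bus. Remails is currently in public beta with a free tier (up to 3,000 emails per month) and aims to offer a privacy-focused European alternative to US email services. Planned development includes DNS record and quota notifications, enhanced moderation features, and the ability to *receive* email.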

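## Remails: A New European Mail Transfer Agent

A recent Hacker News discussion covered Remails, a new European Mail Transfer Agent (MTA). Users were impressed by Remails' generous free tier (3,000 emails per month) and by its transparency, as shown in the detailed blog post about the service. However, some felt that the jump to the next tier (100,000 emails per month for €100) is too steep for projects expecting moderate volume (around 30,000 emails). One suggestion was to use alternatives such as Lettermint, which offers more gradual pricing but a less generous free plan. Interestingly, one commenter revealed that they are *also* building an MTA and acknowledged that both projects aim to improve the email delivery landscape, albeit with different goals. The discussion suggests growing interest in email solutions beyond established providers such as Postmark.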

Original text

What is Remails?

About a year ago, the founder of Remails contacted us to help build Remails, a Mail Transfer Agent (MTA) hosted fully in Europe. An MTA is a service that helps deliver emails reliably by forwarding emails via IP addresses that have earned a good reputation, ensuring the right email headers are set, and retrying delivery automatically when things go wrong. Remails’ source code is available on GitHub, which allows anyone to self-host it, but they also provide a ready-to-use working instance of Remails at remails.net.
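To make the "right email headers" point a little more concrete, here is a minimal Python sketch of one thing a relay might do before forwarding a message: fill in a `Date` and a `Message-ID` when the client left them out. RFC 5322 requires an origination date and recommends a message identifier. The function name `ensure_basic_headers` and the example domain are hypothetical, and the article does not say which headers Remails actually sets or how.

```python
from email.message import EmailMessage
from email.utils import formatdate, make_msgid


def ensure_basic_headers(msg: EmailMessage, domain: str) -> EmailMessage:
    """Add Date and Message-ID headers only if the client did not supply them."""
    if "Date" not in msg:
        # RFC 5322 date string, expressed in UTC
        msg["Date"] = formatdate(localtime=False)
    if "Message-ID" not in msg:
        # Unique identifier of the form <...@domain>
        msg["Message-ID"] = make_msgid(domain=domain)
    return msg


# Illustrative transactional message (all values are made up)
reset = EmailMessage()
reset["From"] = "noreply@shop.example"
reset["To"] = "customer@mail.example"
reset["Subject"] = "Reset your password"
reset.set_content("Use the link in this email to choose a new password.")
ensure_basic_headers(reset, domain="shop.example")
```

Headers the client already set are left untouched, so calling the function twice does not create duplicates.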

Remails is currently meant mainly for transactional emails, not broadcast emails. This means it is perfect for sending email verification codes, password reset links, and personal notifications and reminders, but it is not yet meant for sending advertising emails to hundreds or thousands of people at the same time. We might lift this limitation as development continues.

In the following sections, let’s take a look at how the development has gone so far and the technical challenges we faced along the way.

From MVP to high availability

At the start of the project, our main focus was to quickly build a minimum viable product, keeping the feedback loop short and showing results early on. Therefore, we started off with a cheap, single Virtual Private Server (VPS) from a European cloud provider. The whole application was running as a single binary in a container “orchestrated” by Docker Compose next to a simple PostgreSQL database container.

[Figure: Simple VPS deployment at development start]

After the fundamental work of implementing SMTP communication, a basic web interface, and the database integration was done, we went on to improve the deployment setup. For Remails, we had a hard requirement to use a European cloud provider. Ideally, we want to administer as little infrastructure as possible ourselves while also ensuring high availability of the service.

High availability

We set up a managed Kubernetes cluster with a managed Postgres database from the cloud provider. We split our application into two logical parts: the web interface API used to manage credentials and the actual MTA part, which sends and receives messages via SMTP. Using Kubernetes, we can run multiple replicas of each component and distribute them over multiple machines (called nodes in Kubernetes speak). This means that services will still be available on other nodes when one of the nodes goes down, thus achieving high availability.

To make sure we are on the same page about the term high availability, let’s take a brief look at the availability aspects of our threat model. Our main concern is data availability. This is partially taken care of by the cloud provider, which runs two database nodes and takes care of Point-in-Time Recovery (PITR) backups. Additionally, we run a daily job that stores a full backup at a technically and organizationally independent location. If that backup ever fails, our observability solution will alert us. The second most important availability property is that we are always ready to receive new emails from our clients. That is taken care of by the load balancers and multiple MTA pods (see below). The least critical part is ensuring that we are always ready to send out the emails we received from our clients, as we consider a small delay in sending out a message non-critical. Don’t get us wrong: we strive for disruption-free service in all parts, but it’s nevertheless essential to prioritize the most vital ones.

[Figure: Our initial Kubernetes-based setup]

The above image depicts this first Kubernetes setup, slightly simplified. Nevertheless, it's already significantly more elaborate than the single binary VPS setup. Let's go over the different parts:

- First, note that the PostgreSQL database is external to the cluster, as it is managed by the cloud provider and not part of the Kubernetes setup. All of the pods in all of the nodes are connected to this same database in order to share data.
- As mentioned, the application is split into two main parts: the Web API and the SMTP Mail Transfer Agent. Both are so-called deployments, which means Kubernetes will distribute them (randomly) over the available nodes. Connecting users will be forwarded to any healthy instance at random by the load balancers, which handle incoming connections (not shown in the image).
- Periodic tasks is a single cron job that regularly checks for emails that could not be sent and should be retried. At this development stage, it would send out those emails from its own pod, just like the MTA pods do (spoiler: this will soon change!).

In short, this Kubernetes setup allows us to reliably relay incoming email while ensuring high availability. However, there is one big challenge we still have to tackle that we haven't discussed yet: IP addresses.

Juggling IP addresses

So far, the setup is pretty standard. However, there is one more requirement that we haven't mentioned yet. A big problem with email is that unsolicited messages (also referred to as spam) get sent out to lots of people at once by spammers. These messages, often containing either plain old advertising or full-on fraudulent scams, have prompted email service providers to implement spam filters. These spam filters aim to reduce the exposure to spam by either putting suspected spam messages into a separate folder or rejecting them outright.

As the chance of an email making it to an inbox and not just the spam folder depends highly on the IP address the email is sent from, we need to control our outbound IP addresses. This is a twofold problem. Firstly, we want to use Remails’ own block of IP addresses with the cloud provider (which is usually called “bring your own IP”, or BYOIP), as it is a lot more difficult to build up a good reputation with public IPs from cloud providers. Secondly, we want to be in control of which IP address from our block is used for every mail we send.

The Finnish cloud provider UpCloud was able to fulfill the BYOIP requirement for us. For the second requirement, we need to improve our Kubernetes architecture to take control of the network interfaces of the nodes. We want to be able to pick which IP address an email is sent from based on the email's sender, which allows us to offer high-volume customers one or more IPs for their exclusive use, so that their reputation is independent of other Remails customers.
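The idea of choosing an outbound IP per sender can be sketched in a few lines of Python. This is only an illustration under assumptions: the tables `DEDICATED_IPS` and `SHARED_POOL`, the function `pick_outbound_ip`, the use of the sender's domain as the lookup key, and the hashing scheme are all hypothetical, and the addresses come from the documentation range 192.0.2.0/24. The article does not describe how Remails stores or evaluates this mapping.

```python
import hashlib

# Hypothetical configuration: high-volume customers get exclusive IPs,
# everyone else shares a common pool.
DEDICATED_IPS = {
    "bigshop.example": ["192.0.2.10", "192.0.2.11"],
}
SHARED_POOL = ["192.0.2.100", "192.0.2.101", "192.0.2.102"]


def pick_outbound_ip(sender: str) -> str:
    """Return the outbound IP to use for mail from this sender address."""
    domain = sender.rsplit("@", 1)[-1].lower()
    pool = DEDICATED_IPS.get(domain, SHARED_POOL)
    # A stable hash keeps a given sender on the same IP across calls
    # (Python's built-in hash() is randomized per process, so it is avoided).
    digest = hashlib.sha256(sender.lower().encode("utf-8")).digest()
    return pool[int.from_bytes(digest[:8], "big") % len(pool)]
```

With this layout, mail from `noreply@bigshop.example` only ever leaves from that customer's dedicated addresses, so spam complaints against other senders in the shared pool cannot affect its reputation.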

Refactoring the Kubernetes architecture

In the setup as shown in the previous image, this is not possible, as the outbound IP is based on the IP of the node from which the email is sent. To gain control over the outbound IPs, we could simply run a single binary on a single machine with many IP addresses, but that would go completely against the requirement of high availability. So instead, we refactored our Kubernetes architecture to support both high availability and managed outbound IP addresses simultaneously:

[Figure: The improved Kubernetes setup (simplified)]

The above image provides a simplified overview of the current architecture. Let's take a look at it step by step:

- We have already seen the Web API, which runs as a Kubernetes deployment randomly distributed over the nodes. Besides its original task of hosting the web interface, it also provides the public REST API and the documentation of that API.
- We separated the SMTP Mail Transfer Agent into two parts: SMTP inbound and SMTP outbound. The inbound service is a deployment just like the Web API and is thus (randomly) distributed over the available nodes. A load balancer forwards inbound traffic to one of the healthy instances, just as with the Web API.
- The SMTP outbound is one of the most significant changes compared to the previous architecture. Instead of being a deployment like most other pods, it runs as a Kubernetes DaemonSet. This ensures there is always exactly one instance running on each node. Additionally, we granted host-network access to those pods, which allows them to interact directly with the network interface installed on each node.
- Another addition is the Cloud IP manager. It is responsible for making sure each node is assigned the required IP addresses by interacting with the cloud provider’s API. Note that a node can have multiple IPs assigned, from which the SMTP outbound pod can choose using its direct access to the network devices.
- The other major change is the introduction of a central Message Bus. We designed this as a very lightweight and simple broadcast message bus without any deliverability guarantees. We'll explain the reasoning for this in the next paragraph. The message bus is used for communication between the different components. For example, if an email should be sent out, the Web API, SMTP inbound, or Periodic Tasks send a simple notification to the message bus with the message ID of the email and the outbound IP it should use. The outbound DaemonSet pods listen for these notifications, filtering for messages that should be sent from an IP they have access to. After a sending attempt, the outbound pod responds with a status update.
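How a process with direct access to a node's network interfaces can choose its source address is worth a short illustration. The sketch below uses Python's standard `socket.create_connection`, whose `source_address` argument binds the local end of the connection before connecting; `smtplib.SMTP` accepts a `source_address` parameter with the same meaning. The helper name `connect_from` is hypothetical, and the article does not show how the Remails outbound pods implement this step.

```python
import socket


def connect_from(source_ip: str, host: str, port: int,
                 timeout: float = 10.0) -> socket.socket:
    """Open a TCP connection whose local (source) address is source_ip.

    source_ip must already be assigned to an interface on this machine,
    otherwise the bind step fails with OSError.
    """
    # Port 0 lets the operating system pick any free local port.
    return socket.create_connection(
        (host, port), timeout=timeout, source_address=(source_ip, 0)
    )
```

The remote mail server sees the connection as coming from `source_ip`, which is what ties the reputation of a message to a specific address from the block.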

The design choice of using a best-effort message bus might come as a surprise. It’s only a single instance, without a retry or failover mechanism. Nevertheless, our setup is highly available, as every action we perform is stored in the database. Imagine a new email reaches an SMTP inbound pod. First, the pod stores the email in our database, and only then does it trigger a notification to send the message. If that notification does not reach the (correct) outbound pod for any reason, the periodic tasks will automatically retry it until delivery eventually succeeds. Thus, a failing message bus can cause a small delay, but it will not prevent us from accepting new emails and sending them out slightly later.
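The store-first, notify-second flow can be modelled with a small in-memory Python sketch. Everything here is a simplification under assumptions: a dictionary stands in for PostgreSQL, the class and function names are hypothetical, delivery is reduced to flipping a status field, and the outbound pod writes its status directly instead of replying over the bus.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Notification:
    message_id: int
    outbound_ip: str


class BestEffortBus:
    """Broadcast bus without guarantees: when it is down, notifications vanish."""

    def __init__(self) -> None:
        self.subscribers: list[Callable[[Notification], None]] = []
        self.up = True

    def publish(self, note: Notification) -> None:
        if not self.up:
            return  # no queue and no retry: the notification is simply lost
        for handler in self.subscribers:
            handler(note)


class OutboundPod:
    """Handles only notifications for IPs assigned to its node."""

    def __init__(self, ips: set[str], db: dict[int, dict]) -> None:
        self.ips = ips
        self.db = db

    def on_notification(self, note: Notification) -> None:
        if note.outbound_ip not in self.ips:
            return  # another node owns this IP
        row = self.db[note.message_id]
        if row["status"] == "pending":
            row["status"] = "sent"  # stand-in for the real SMTP delivery


def accept_email(db: dict[int, dict], bus: BestEffortBus,
                 message_id: int, outbound_ip: str) -> None:
    # 1. Persist first, so the email survives any later failure.
    db[message_id] = {"outbound_ip": outbound_ip, "status": "pending"}
    # 2. Then notify, best effort.
    bus.publish(Notification(message_id, outbound_ip))


def periodic_retry(db: dict[int, dict], bus: BestEffortBus) -> None:
    # Re-announce everything still pending; the database is the source of truth.
    for message_id, row in db.items():
        if row["status"] == "pending":
            bus.publish(Notification(message_id, row["outbound_ip"]))
```

If the bus is down while an email is accepted, the row stays `pending` in the store, and the next `periodic_retry` run after the bus recovers gets it delivered. That is why the bus itself does not need delivery guarantees.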

Conclusion

With the architecture described in this blog post, we managed to achieve high availability while at the same time making sure our application can choose which outbound IP to use when sending emails. This allows us to reliably send the emails from users using Remails' own IP block.

Try it out now!

Remails is in public beta, so if you're currently using a US provider and are interested in a European alternative, feel free to give it a try. There is a free¹ plan available, allowing you to send up to 3,000 emails per month. As soon as you need more, you can simply upgrade to a paid subscription.

If you’re interested in more technical details or even self-hosting, check out the code on GitHub!

Roadmap

Soon, we will add email notifications for invalid DNS records and quota warnings. We are also working on more moderation and privacy features, such as audit logs for organization admins and configurable shorter email retention periods. Furthermore, the ability to receive emails through Remails (useful for receiving bounced messages and DMARC reports) is also on our roadmap.