What all the stuff in email headers means—and how to sniff out spoofing
Parsing email headers needs care and knowledge—but it requires no special tools.
I pretty frequently get requests for help from someone who has been impersonated—or whose child has been impersonated—via email. Even when you know how to „view headers“ or „view source“ in your email client, the spew of diagnostic wharrgarbl can be pretty overwhelming if you don’t know what you’re looking at. Today, we’re going to step through a real-world set of (anonymized) email headers and describe the process of figuring out what’s what.
Before we get started with the actual headers, though, we’re going to take a quick detour through an overview of what the overall path of an email message looks like in the first place. (More experienced sysadmin types who already know what stuff like „MTA“ and „SPF“ stand for can skip a bit ahead to the fun part!)
From MUA to MTA, and back to MUA again
The basic components involved in sending and receiving email are the Mail User Agent and Mail Transfer Agent. In the briefest possible terms, an MUA is the program you use to read and send mail from your own personal computer (like Thunderbird, or Mail.app, or even a webmail interface like Gmail or Outlook), and MTAs are programs that accept messages from senders and route them along to their final recipients.
Traditionally, mail was sent to a mail server using the Simple Mail Transfer Protocol (SMTP) and downloaded from the server using the Post Office Protocol (abbreviated as POP3, since version 3 is the most commonly used version of the protocol). A traditional Mail User Agent—such as Mozilla Thunderbird—would need to know both protocols; it would send all of its user’s messages to the user’s mail server using SMTP, and it would download messages intended for the user from the user’s mail server using POP3.
As time went by, things got a bit more complex. IMAP largely superseded POP3 since it allowed the user to leave the actual email on the server. This meant that you could read your mail from multiple machines (perhaps a desktop PC and a laptop) and have all of the same messages, organized the same way, everywhere you might check them.
Finally, as time went by, webmail became more and more popular. If you use a website as your MUA, you don’t need to know any pesky SMTP or IMAP server settings; you just need your email address and password, and you’re ready to read.
Ultimately, any message from one human user to another follows the path of MUA ⟶ MTA(s) ⟶ MUA. The analysis of email headers involves tracing that flow and looking for any funny business.
TLS in-flight encryption
The original SMTP protocol had absolutely no thought toward security—any server was expected to accept any message, from any sender, and pass the message along to any other server it thought might know how to get to the recipient in the
To: field. That was fine and dandy in email’s earliest days of trusted machines and people, but it rapidly turned into a nightmare as the Internet scaled exponentially and became more commercially valuable.
It’s still possible to send email with absolutely no thought toward encryption or authentication, but such messages will very likely get rejected along the way by anti-spam defenses. Modern email typically is encrypted in-flight, and signed and authenticated at-rest. In-flight encryption is accomplished by TLS, which helps keep the content of a message from being captured or altered in-flight from one server to another. That’s great, so far as it goes, but in-flight TLS is only applied when mail is being relayed from one MTA to another MTA along the delivery path.
If an email travels from the sender through three MTAs before reaching its recipient, any server along the way can alter the content of the message—TLS encrypts the transmission from point to point but does nothing to verify the authenticity of the content itself or the path through which it’s traveling.
SPF—the Sender Policy Framework
The owner of a domain can set a TXT record in its DNS that states what servers are allowed to send mail on behalf of that domain. For a very simple example, Ars Technica’s SPF record says that email from
arstechnica.com should only come from the servers specified in Google’s SPF record. Any other source should be met with a
SoftFail error; this effectively means „trust it less, but don’t necessarily yeet it into the sun based on this alone.“
SPF headers in an email can’t be completely trusted after they’re generated, because there is no encryption involved. SPF is really only useful to the servers themselves, in real time. If a server knows that it’s at the outside boundary edge of a network, it also knows that any message it receives should be coming from a server specified in the sender’s domain’s SPF record. This makes SPF a great tool for getting rid of spam quickly.
DKIM—DomainKeys Identified Mail
Similarly to SPF, DKIM is set in TXT records in a sending domain’s DNS. Unlike SPF, DKIM is an authentication technique that validates the content of the message itself.
The owner of the sending domain generates a public/private key pair and stores the public key in a TXT record on the domain’s DNS. Mail servers on the outer boundary of the domain’s infrastructure use the private DKIM key to generate a signature (properly an encrypted hash) of the entire message body, including all headers accumulated on the way out of the sender’s infrastructure. Recipients can decrypt the DKIM signature using the public DKIM key retrieved from DNS, then make sure the hash matches the entire message body, including headers, as they received it.
If the decrypted DKIM signature is a matching hash for the entire body, the message is likely to be legitimate and unaltered—at least as verified by a private key belonging only to the domain owner (the end user does not have or need this key). If the DKIM signature is invalid, you know that the message either did not originate from the purported sender’s domain or has been altered (even if only by adding extra headers) by some other server in between. Or both!
This becomes extremely useful when trying to decide whether a set of headers is legitimate or spoofed—a matching DKIM signature means that the sender’s infrastructure vouches for all headers below the signature line. (And that’s all it means, too—DKIM is merely one tool in the mail server admin’s toolbox.)
DMARC—Domain-based Message Authentication, Reporting, and Conformance
DMARC extends SPF and DKIM. It’s not particularly exciting from the perspective of someone trying to trace a possibly fraudulent email; it boils down to a simple set of instructions for mail servers about how to handle SPF and DKIM records. DMARC can be used to request that a mail server pass, quarantine, or reject a message based on the outcome of SPF and DKIM verification, but it does not add any additional checks on its own.