In this tutorial, I'll walk you through some real-world examples of how you can use PHP and IMAP to build new features for managing your email: features that the big email providers haven't built for us.
My interest in this began in 2010 when I wrote Twelve Gmail Ideas to Revolutionize Email (Again), but most of the ideas I wished for have remained out of reach. As important as email is, innovation in email as an application has been quite slow.
We’re drowning in email, and managing our inboxes remains a heavy burden. Mail services and clients have done very little to help us with this. Most of the email we receive is sent by machines, not people, and yet we’re the ones who have to individually process all of it.
Analysis of my own email showed I was receiving email from more than 230 automated senders, and from far fewer actual people. I was tired of constructing filters in Gmail and filling in a myriad of unsubscribe forms. I wanted to have more control over managing my email and simplifying my life.
Finally, this past year, I decided to build the features I needed. The result is Simplify Email (SE), a small web app you can host yourself. It offers a variety of cool new email features, all of which you can check out on the project website.
The coolest thing about SE is that it's a platform for reading, analyzing, routing and managing your email—the possibilities abound. Simplify Email is essentially a programmable playground for "hacking" your own email.
I'll walk you through the code of three examples from SE that use PHP, IMAP, and MySQL to work with email:
- Checking your inbox and filtering messages
- Implementing a Whitelist challenge to unknown senders
- Reporting unanswered email
This tutorial will definitely give you a head start at writing IMAP code in PHP. But you can also work directly with the Simplify Email codebase. You can purchase the code for as little as $10, and there is an older open source version (which lacks some of the features we're describing below). Installation guides are provided for typical Linux configurations. I also offer pre-installed images at Digital Ocean for $25 as well as a handheld valet install. SE is written in PHP, in the Yii Framework.
Note that you won't be able to access most email servers via your local development machine unless you compile a secure IMAP library for PHP. This is one of the reasons I encourage people to run Simplify Email in droplets at Digital Ocean. There are also a few tricks to getting Google account security to let you in via IMAP.
Working with IMAP
How Simplify Email Works
With SE, you can continue using your email client of choice on both the Web and your mobile devices. You don't have to change any apps or personal habits. SE accesses your email accounts behind the scenes via IMAP; acting as a smart personal assistant, SE pre-processes your email, moving messages to the appropriate places based on everything you've told it.
When a message arrives from a familiar sender, SE moves it to the folder you've specified. When one arrives from an unknown sender for the first time, it moves it to the review folder.
Every couple of hours (or at a frequency you choose), SE will send you a summary of where it moved your messages and which messages are in review. Note, links for training senders are included for the review folder, making it quite easy to train SE over time.
At any time, you can browse your review folder—you don't have to wait for the digest to arrive. But the advantage of SE is that you no longer have to browse your folders; you can just read your digest to get a view of the email that you've received and train new senders.
1. Checking Your Inbox and Filtering Messages
SE uses several cron tasks to operate in the background on your server. Each is called from DaemonController.php.
The first, processInbox, is called frequently and needs to operate quickly. Its job is to screen email and move it out of the Inbox as quickly as possible and into the triage folder, called the filtering folder.
The second, processFiltering, is more process-intensive and performs deeper operations on email, ultimately moving messages to their final destination.
The ProcessInbox Method
The cron task calls processInbox regularly:
public function actionInbox() {
    // moves inbox messages to @filtering
    // runs frequently
    $r = new Remote();
    $r->processInbox();
}
For each account, we decrypt your e-mail credentials and then use imap_open to create an IMAP stream to your inbox folder:
public function open($account_id, $mailbox='', $options=0) {
    // opens folder in an IMAP account
    // $options defaults to 0 because imap_open expects an integer flags value
    $account = Account::model()->findByPk($account_id);
    $this->hostname = $account->address;
    if (!stristr($this->hostname,'{'))
        $this->hostname = '{'.$this->hostname.'}';
    $cred = Account::model()->getCredentials($account->cred);
    if ($account->provider == Account::PROVIDER_ICLOUD) {
        // icloud accepts only name part of mailbox e.g. stevejobs vs. [email protected]
        $temp = explode('@',$cred[0]);
        $cred[0]=$temp[0];
    }
    // print_r needs true as its second argument to return the text instead of printing it
    $this->stream = imap_open($this->hostname.$mailbox,$cred[0],$cred[1],$options,1)
        or die('Cannot connect to mail server - account_id:'.$account_id.' '.print_r(imap_errors(),true));
}
Within processInbox, we use the PHP library functions imap_search and imap_fetch_overview to retrieve an array of messages:
// lookup folder_id of this account's INBOX
$folder_id = Folder::model()->lookup($account_id,$this->path_inbox);
$this->open($account_id,$this->path_inbox);
$cnt=0;
$message_limit= 50; // break after n messages to prevent timeout
echo 'Sort since: '.date("j F Y",$tstamp);
// imap_search date format 30 November 2013
$recent_messages = @imap_search($this->stream, 'SINCE "'.date("j F Y",$tstamp).'"',SE_UID);
if ($recent_messages===false) continue; // to do - continue into next account
$result = imap_fetch_overview($this->stream, implode(',',array_slice($recent_messages,0,$message_limit)),FT_UID);
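To make the two string arguments in the calls above easier to see, here is a minimal sketch of how they are built. The helper names buildSinceCriterion and buildUidSequence are hypothetical (they are not part of SE), and the sketch formats the date in UTC purely so that the result is predictable.

```php
<?php
// Hypothetical helpers (not part of SE): build the search criterion passed to
// imap_search and the comma-separated UID list passed to imap_fetch_overview.

function buildSinceCriterion(int $timestamp): string
{
    // IMAP's SINCE key takes a date such as 30 November 2013.
    // UTC is an assumption of this sketch; the SE code uses date() instead.
    $date = (new DateTimeImmutable('@' . $timestamp))->setTimezone(new DateTimeZone('UTC'));
    return 'SINCE "' . $date->format('j F Y') . '"';
}

function buildUidSequence(array $uids, int $limit): string
{
    // Cap the batch so that one cron run cannot time out.
    return implode(',', array_slice($uids, 0, $limit));
}

echo buildSinceCriterion(gmmktime(12, 0, 0, 11, 30, 2013)), PHP_EOL; // SINCE "30 November 2013"
echo buildUidSequence([101, 102, 103, 104], 3), PHP_EOL;             // 101,102,103
```

Keeping the string building separate from the IMAP calls also makes it possible to test the date format without a mail server.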
Then we process the array of messages in the inbox:
foreach ($result as $item) {
    if (!$this->checkExecutionTime($time_start)) break;
    // get msg header and stream uid
    $msg = $this->parseHeader($item);
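The body of checkExecutionTime isn't shown in this tutorial. As a rough sketch of the idea, assuming it simply compares elapsed time against a budget, a guard like the following would do the job. The function name withinTimeBudget, the 25-second budget and the optional $now parameter are all assumptions for illustration.

```php
<?php
// Hypothetical stand-in for SE's checkExecutionTime(), whose body is not shown.
function withinTimeBudget(float $startedAt, float $budgetSeconds, ?float $now = null): bool
{
    // $now can be injected for testing; by default use the current time.
    $now = $now ?? microtime(true);
    return ($now - $startedAt) < $budgetSeconds;
}

$start = microtime(true);
foreach ([1, 2, 3] as $item) {
    if (!withinTimeBudget($start, 25.0)) {
        break; // leave the remaining messages for the next cron run
    }
    // ... process $item here ...
}
```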
Here's an adapted version of publicly available IMAP header parsing code which gathers the additional information that SE needs for a variety of tasks. It uses imap_rfc822_parse_adrlist to split the From and To headers into name, mailbox and host, and copies the message-id, in-reply-to, subject and timestamps directly from the overview object. When scanning the inbox, the From header is the sender; when scanning the sent folder, the To header tells us who the message went to:
public function parseHeader($header) {
    // parses header object returned from imap_fetch_overview
    if (!isset($header->from)) {
        return false;
    } else {
        $from_arr = imap_rfc822_parse_adrlist($header->from,'gmail.com');
        $fi = $from_arr[0];
        $msg = array(
            "uid" => (isset($header->uid)) ? $header->uid : 0,
            "personal" => (isset($fi->personal)) ? @imap_utf8($fi->personal) : "",
            "email" => (isset($fi->mailbox) && isset($fi->host)) ? $fi->mailbox . "@" . $fi->host : "",
            "mailbox" => (isset($fi->mailbox)) ? $fi->mailbox : "",
            "host" => (isset($fi->host)) ? $fi->host : "",
            "subject" => (isset($header->subject)) ? @imap_utf8($header->subject) : "",
            "message_id" => (isset($header->message_id)) ? $header->message_id : "",
            "in_reply_to" => (isset($header->in_reply_to)) ? $header->in_reply_to : "",
            "udate" => (isset($header->udate)) ? $header->udate : 0,
            "date_str" => (isset($header->date)) ? $header->date : ""
        );
        // handles fetch with uid and rfc header parsing
        if ($msg['udate']==0 && isset($header->date)) {
            $msg['udate']=strtotime($header->date);
        }
        $msg['rx_email']='';
        $msg['rx_personal']='';
        $msg['rx_mailbox']='';
        $msg['rx_host']='';
        if (isset($header->to)) {
            $to_arr = imap_rfc822_parse_adrlist($header->to,'gmail.com');
            $to_info = $to_arr[0];
            if (isset($to_info->mailbox) && isset($to_info->host)) {
                $msg['rx_email']=$to_info->mailbox.'@'.$to_info->host;
            }
            if (isset($to_info->personal))
                $msg['rx_personal']=$to_info->personal;
            if (isset($to_info->mailbox))
                $msg['rx_mailbox']=$to_info->mailbox;
            if (isset($to_info->host))
                $msg['rx_host']=$to_info->host;
        }
        return $msg;
    }
}
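If you want to see what the personal, mailbox and host parts look like without an IMAP connection, here is a deliberately simplified sketch. The function splitAddress is hypothetical and only handles the two most common shapes of a single address; it is not a replacement for imap_rfc822_parse_adrlist, which copes with the full address grammar.

```php
<?php
// Simplified illustration only: handles "Name <user@host>" and "user@host".
// Real headers need a complete parser such as imap_rfc822_parse_adrlist.
function splitAddress(string $raw): array
{
    $personal = '';
    $address = trim($raw);
    // Optional quoted or unquoted display name followed by <address>.
    if (preg_match('/^\s*"?([^"<]*?)"?\s*<([^>]+)>\s*$/', $raw, $m)) {
        $personal = trim($m[1]);
        $address = trim($m[2]);
    }
    $at = strrpos($address, '@');
    if ($at === false) {
        // No host part: return what we have and leave the email empty.
        return ['personal' => $personal, 'mailbox' => $address, 'host' => '', 'email' => ''];
    }
    $mailbox = substr($address, 0, $at);
    $host = substr($address, $at + 1);
    return [
        'personal' => $personal,
        'mailbox'  => $mailbox,
        'host'     => $host,
        'email'    => $mailbox . '@' . $host,
    ];
}

print_r(splitAddress('Jane Doe <jane@example.com>'));
```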
We create records for the sender and the message envelope within our database:
// skip any system messages
if ($msg['email']==$system_email) continue;
// if udate is too old, skip msg
if (time()-$msg['udate']>$this->scan_seconds) continue; // skip msg
// default action
$action = self::ACTION_MOVE_FILTERED;
$isNew = $s->isNew($account_id,$msg["email"]);
// look up sender, if new, create them
$sender_id = $s->add($user_id,$account_id,$msg["personal"], $msg["mailbox"], $msg["host"],0);
$sender = Sender::model()->findByPk($sender_id);
// create a message in db if needed
$message_id = $m->add($user_id,$account_id,0,$sender_id,$msg['message_id'],$msg['subject'],$msg['udate'],$msg['in_reply_to']);
$message = Message::model()->findByPk($message_id);
If the sender is new to us (unknown), we'll send a whitelist challenge email (we'll talk more about whitelist challenges in section 2):
if ($isNew) {
    $this->challengeSender($user_id,$account_id,$sender,$message);
}
Next, we determine if the user might have dragged a message from another folder back into the inbox—intending to train it via drag and drop. If so, we set the training for this sender to the inbox. In other words, next time, we'd want to just route messages from this sender to the inbox:
if ($message['status'] == Message::STATUS_FILTERED ||
    $message['status'] == Message::STATUS_REVIEW ||
    ($message['status'] == Message::STATUS_TRAINED && $message['folder_id'] <> $folder_id) ||
    ($message['status'] == Message::STATUS_ROUTED && $message['folder_id'] <> $folder_id)) {
    // then it's a training
    $action = self::ACTION_TRAIN_INBOX;
} else if (($message['status'] == Message::STATUS_TRAINED || $message['status'] == Message::STATUS_ROUTED) && $message['folder_id'] == $folder_id) {
    // if trained already or routed to inbox already, skip it
    $action = self::ACTION_SKIP;
    echo 'Trained previously, skip ';lb();
    continue;
}
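The condition above is dense, so here is the same decision as a small standalone function. This is a sketch: the constant values and the string return values are placeholders chosen for illustration, whereas SE uses its own constants on the Message model and the ACTION_* values shown above.

```php
<?php
// Hypothetical constants; SE defines its own status values on the Message model.
const STATUS_NEW = 0;
const STATUS_FILTERED = 1;
const STATUS_REVIEW = 2;
const STATUS_TRAINED = 3;
const STATUS_ROUTED = 4;

function decideInboxAction(int $status, int $messageFolderId, int $inboxFolderId): string
{
    $wasPlaced = in_array($status, [STATUS_TRAINED, STATUS_ROUTED], true);
    $placedElsewhere = $wasPlaced && $messageFolderId !== $inboxFolderId;

    if ($status === STATUS_FILTERED || $status === STATUS_REVIEW || $placedElsewhere) {
        // We processed this message before and it is back in the inbox,
        // so the user must have dragged it there: treat it as training.
        return 'train_inbox';
    }
    if ($wasPlaced) {
        // Already trained or routed to the inbox; nothing to do.
        return 'skip';
    }
    // First time we see this message: send it to the filtering folder.
    return 'move_filtered';
}

echo decideInboxAction(STATUS_REVIEW, 5, 1), PHP_EOL; // train_inbox
echo decideInboxAction(STATUS_ROUTED, 1, 1), PHP_EOL; // skip
```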
If not, we'll prepare to move the message to the Filtering folder for further processing. First, we may send notifications to the user's phone if there is a sender match or keyword match for notifications (and it's not quiet hours):
if ($action == self::ACTION_MOVE_FILTERED) {
    $cnt+=1;
    if ($sender->exclude_quiet_hours == Sender::EQH_YES or !$this->isQuietHours($user_id)) {
        // send smartphone notifications based on sender
        if ($sender->alert==Sender::ALERT_YES) {
            $this->notify($sender,$message,Monitor::NOTIFY_SENDER);
        }
        // send notifications based on keywords
        if (AlertKeyword::model()->scan($msg)) {
            $this->notify($sender,$message,Monitor::NOTIFY_KEYWORD);
        }
    }
    // move imap msg to +Filtering
    echo 'Moving to +Filtering';lb();
    //$result = @imap_mail_move($this->stream,$msg['uid'],$this->path_filtering,CP_UID);
    $result = $this->messageMoveHandler($msg['uid'],$this->path_filtering,false);
    if ($result) {
        echo 'moved<br />';
        $m->setStatus($message_id,Message::STATUS_FILTERED);
    }
}
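The notification code above depends on isQuietHours, whose body isn't shown here. The one subtle part of any such check is a window that crosses midnight, for example 22:00 to 07:00. Here is a minimal sketch of that logic, assuming quiet hours are stored as a start hour and an end hour; the function name inQuietHours and its parameters are hypothetical.

```php
<?php
// Hypothetical sketch: SE's isQuietHours() reads per-user settings that are not shown.
// Hours are integers from 0 to 23; the end hour is exclusive.
function inQuietHours(int $hourNow, int $startHour, int $endHour): bool
{
    if ($startHour === $endHour) {
        return false; // empty window: treat quiet hours as disabled
    }
    if ($startHour < $endHour) {
        // Window within a single day, e.g. 12 to 14.
        return $hourNow >= $startHour && $hourNow < $endHour;
    }
    // Window that spans midnight, e.g. 22 to 7.
    return $hourNow >= $startHour || $hourNow < $endHour;
}

var_dump(inQuietHours(23, 22, 7)); // bool(true)
var_dump(inQuietHours(12, 22, 7)); // bool(false)
```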
If the message was dragged to the inbox, then we'll update our training settings:
else if ($action == self::ACTION_TRAIN_INBOX) {
    // set sender folder_id to inbox
    echo 'Train to Inbox';lb();
    $m->setStatus($message_id,Message::STATUS_TRAINED);
    // only train sender when message is newer than last setting
    if ($msg['udate']>=$sender['last_trained']) {
        $s->setFolder($sender_id,$folder_id);
    }
}
The ProcessFiltering Method
The secondary processing method is called processFiltering, also in DaemonController.php. It does the more time-consuming aspects of moving messages to the appropriate folders:
public function actionIndex() {
    // processes messages in @Filtering to appropriate folders
    $r = new Remote();
    $r->processFiltering();
    // Record timestamp of cronjob for monitoring
    $file = file_put_contents('./protected/runtime/cronstamp.txt',time(),FILE_USE_INCLUDE_PATH);
}
This method opens your email account to search for recent messages and gather data about them. It also uses imap_search, imap_fetch_overview and parseHeader:
$tstamp = time()-(7*24*60*60); // 7 days ago
$recent_messages = @imap_search($this->stream, 'SINCE "'.date("j F Y",$tstamp).'"',SE_UID);
if ($recent_messages===false) continue; // to do - continue into next account
$result = imap_fetch_overview($this->stream, implode(',',array_slice($recent_messages,0,$message_limit)),FT_UID);
foreach ($result as $item) {
    $cnt+=1;
    if (!$this->checkExecutionTime($time_start)) break;
    // get msg header and stream uid
    $msg = $this->parseHeader($item);
The primary processing loop for each message in the filtering folder is quite detailed. First we look at the recipient address, as SE allows people to train folders by recipient address, e.g. messages to the happyvegetarian.com domain go to the veggie folder:
// Set the default action to move to the review folder
$action = self::ACTION_MOVE_REVIEW;
$destination_folder =0;
// look up & create recipient
$recipient_id = $r->add($user_id,$account_id,$msg['rx_email'],0);
$routeByRx = $this->routeByRecipient($recipient_id);
if ($routeByRx!==false) {
    $action = $routeByRx->action;
    $destination_folder = $routeByRx->destination_folder;
}
Then we look up the sender and create a new record in the database (if necessary). If training exists for the sender, we can set the destination folder:
// look up sender, if new, create them
$sender_id = $s->add($user_id,$account_id,$msg["personal"], $msg["mailbox"], $msg["host"],0);
$sender = Sender::model()->findByPk($sender_id);
// if sender destination known, route to folder
if ($destination_folder ==0 && $sender['folder_id'] > 0) {
    $action = self::ACTION_ROUTE_FOLDER;
    $destination_folder = $sender['folder_id'];
}
If an untrained (new) sender has verified themselves via a Whitelist challenge (which we'll discuss in section 2), then we'll route this message to the inbox:
// whitelist verified senders go to inbox
if ($sender->is_verified==1 && $sender['folder_id'] ==0 && UserSetting::model()->useWhitelisting($user_id)) {
    // place message in inbox
    $action = self::ACTION_ROUTE_FOLDER;
    $destination_folder = Folder::model()->lookup($account_id,$this->path_inbox);
}
Then, we create a message entry in the database with the envelope information about this message:
// create a message in db
$message = Message::model()->findByAttributes(array('message_id'=>$msg['message_id']));
if (!empty($message)) {
    // message exists already,
    $message_id = $message->id;
} else {
    $message_id = $m->add($user_id,$account_id,0,$sender_id,$msg['message_id'],$msg['subject'],$msg['udate'],$msg['in_reply_to']);
}
If it's from an unknown, unverified sender, we can move the message to the review folder. The review folder contains all the messages from senders we don't recognize.
If the message is from a known sender and we have a destination in mind, we can move it as long as it's not quiet hours (and do not disturb is off):
if ($recipient_id!==false)
    $m->setRecipient($message_id,$recipient_id);
if ($action == self::ACTION_MOVE_REVIEW) {
    echo 'Moving to +Filtering/Review';lb();
    //$result = @imap_mail_move($this->stream,$msg['uid'],$this->path_review,CP_UID);
    $result = $this->messageMoveHandler($msg['uid'],$this->path_review,false);
    if ($result) {
        echo 'moved<br />';
        $m->setStatus($message_id,Message::STATUS_REVIEW);
    }
} else if ($action == self::ACTION_ROUTE_FOLDER || $action == self::ACTION_ROUTE_FOLDER_BY_RX) {
    // lookup folder name by folder_id
    $folder = Folder::model()->findByPk($destination_folder);
    // if inbox & quiet hours, don't route right now
    if (strtolower($folder['name'])=='inbox' and $sender->exclude_quiet_hours == Sender::EQH_NO and $this->isQuietHours($user_id))
        continue;
    echo 'Moving to '.$folder['name'];lb();
    $mark_read = Folder::model()->isMarkRead($folder['mark_read']) || Sender::model()->isMarkRead($sender['mark_read']);
    //$result = @imap_mail_move($this->stream,$msg['uid'],$folder['name'],CP_UID);
    $result = $this->messageMoveHandler($msg['uid'],$folder['name'],$mark_read);
    if ($result) {
        echo 'moved<br />';
        $m->setStatus($message_id,Message::STATUS_ROUTED);
        $m->setFolder($message_id,$destination_folder);
    }
}
During quiet hours, messages are primarily kept in the filtering folder.
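Putting the routing steps of this section together, here is a minimal sketch of the order in which the destination gets decided. The function chooseDestination, its string actions and the nullable recipient folder are simplifications for illustration, not SE's actual code. Note that, as in the snippets above, the verified-sender check runs last and is not guarded by a check on the destination, so it can override a recipient rule.

```php
<?php
// Hypothetical condensation of the routing order shown in this section.
// Returns [action, folderId]; folderId 0 means "no destination chosen".
function chooseDestination(
    ?int $recipientFolderId, // folder trained for the recipient address, or null
    int $senderFolderId,     // folder trained for the sender, 0 if untrained
    bool $senderVerified,    // sender passed the whitelist challenge
    bool $whitelisting,      // user has whitelisting turned on
    int $inboxFolderId
): array {
    // Default: unknown senders go to the review folder.
    $action = 'review';
    $folder = 0;

    // 1. Training by recipient address is applied first.
    if ($recipientFolderId !== null) {
        $action = 'route';
        $folder = $recipientFolderId;
    }
    // 2. Sender training applies only if no destination was chosen yet.
    if ($folder === 0 && $senderFolderId > 0) {
        $action = 'route';
        $folder = $senderFolderId;
    }
    // 3. Verified but untrained senders go to the inbox when whitelisting is on.
    if ($senderVerified && $senderFolderId === 0 && $whitelisting) {
        $action = 'route';
        $folder = $inboxFolderId;
    }
    return [$action, $folder];
}

print_r(chooseDestination(null, 0, false, true, 1)); // review, 0
print_r(chooseDestination(null, 12, false, true, 1)); // route, 12
```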
Every couple of hours, a different process will build the message digest using the message table records to determine which emails were recently received and filtered and how they were routed.
2. Implementing a Whitelist Challenge to Unknown Senders
The goal of the whitelist challenge is to keep any message from an unknown sender, e.g. possibly a marketing bot or spammer, out of your inbox. SE places mail from unknown senders in the review folder. However, if you turn whitelisting on, we send out a challenge email that gives the sender a chance to verify that they're human. If they respond, we'll move the message to your inbox. If the email turns out to be unwanted, you can zap the message from the digest or drag it to any folder you wish to train it to.
The user can turn whitelisting on and off in settings:
To implement whitelisting, we send out email challenges whenever mail arrives from a new sender:
if ($isNew) {
    $this->challengeSender($user_id,$account_id,$sender,$message);
}
challengeSender sends a coded link to the sender for them to click. We also have some protections to make sure we don't trap ourselves in an email loop with an out-of-office message:
public function challengeSender($user_id,$account_id,$sender,$message) {
    // whitelist email challenge
    $yg = new Yiigun();
    $ac = Account::model()->findByPk($account_id);
    if (!empty($ac['challenge_name']))
        $from = $ac['challenge_name'].' <no-reply@'.$yg->mg_domain.'>';
    else
        $from = 'Filter <no-reply@'.$yg->mg_domain.'>';
    $cred = Account::model()->getCredentials($ac->cred);
    $account_email = $cred[0];
    unset($cred);
    // safety: checks no recent email
    if ($sender->last_emailed>(time()-(48*60*60))) return false;
    if ($sender->isBot($sender['email'])) {
        // to do - can also set this person to bulk by default
        return false;
    }
    $link=Yii::app()->getBaseUrl(true)."/sender/verify/s/".$sender->id."/m/".$message->id.'/u/'.$message->udate;
    $subject = 'Please verify the message you sent to '.$account_email;
    $body="<p>Hi,<br /><br /> I'm trying to reduce unsolicited email. Could you please verify your email address by clicking the link below:<br /><a href=\"".$link.'">'.$link.'</a><br /><br />Verifying your email address will help speed your message into my inbox. Thanks for your assistance!</p>';
    $yg->send_html_message($from, $sender['email'], $subject,$body);
    // update last_emailed
    $sender->touchLastEmailed($sender->id);
}
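The two safety checks in challengeSender are what keep us out of auto-responder loops: a 48-hour cooldown per sender and a bot check. The body of isBot isn't shown, so the sketch below substitutes a simple list of well-known automated local parts. That list, the function name shouldChallenge and the injected $now parameter are assumptions for illustration.

```php
<?php
// Hypothetical guard; the local-part list is an assumption, not SE's isBot().
function shouldChallenge(
    string $email,
    int $lastEmailedAt,           // unix time of our last challenge, 0 if never
    int $now,
    int $cooldownSeconds = 172800 // 48 hours, as in challengeSender
): bool {
    if ($lastEmailedAt > $now - $cooldownSeconds) {
        return false; // challenged recently: avoid ping-pong with auto-responders
    }
    // Part of the address before the @ sign, lowercased ('' if there is no @).
    $local = strtolower((string) strstr($email, '@', true));
    $automated = ['no-reply', 'noreply', 'donotreply', 'mailer-daemon', 'postmaster'];
    return !in_array($local, $automated, true);
}

var_dump(shouldChallenge('jane@example.com', 0, 1000000));      // bool(true)
var_dump(shouldChallenge('jane@example.com', 999000, 1000000)); // bool(false)
```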
Then, if the sender clicks on the coded link in the challenge email, we verify them in the database. The Sender Controller processes these requests and checks their validity:
public function actionVerify($s = 0, $m=0,$u=0) {
    // verify that secure msg url from digest is valid, log in user, show msg
    $sender_id = $s;
    $message_id = $m;
    $udate = $u;
    $msg = Message::model()->findByPk($message_id);
    if (!empty($msg) && $msg->sender_id == $sender_id && $msg->udate == $udate) {
        $result = 'Thank you for your assistance. I\'ll respond to your email as soon as possible.';
        $a = new Advanced();
        $a->verifySender($msg->account_id,$sender_id);
    } else {
        $result = 'Sorry, we could not verify your email address.';
    }
    $this->render('verify',array(
        'result'=>$result,
    ));
}
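The link above is validated by matching the sender id, message id and timestamp against the database. If you'd like a link that can't be assembled by guessing those values, one common alternative (a different technique from the one SE uses here) is to sign the parameters with a server-side secret. A minimal sketch, with hypothetical function names and a placeholder secret:

```php
<?php
// Alternative technique, not SE's implementation: sign the link parameters
// with an HMAC so that only the server can produce a valid token.
function signVerifyLink(int $senderId, int $messageId, string $secret): string
{
    $payload = $senderId . ':' . $messageId;
    return hash_hmac('sha256', $payload, $secret);
}

function isValidVerifyLink(int $senderId, int $messageId, string $token, string $secret): bool
{
    // hash_equals compares in constant time to avoid leaking timing information.
    return hash_equals(signVerifyLink($senderId, $messageId, $secret), $token);
}

$secret = 'replace-with-a-long-random-value'; // placeholder for illustration
$token = signVerifyLink(42, 1001, $secret);
var_dump(isValidVerifyLink(42, 1001, $token, $secret)); // bool(true)
var_dump(isValidVerifyLink(43, 1001, $token, $secret)); // bool(false)
```

The token would be appended to the verification URL and checked in the controller before marking the sender as verified.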
This tells our processing loops to move this and future messages from this sender to the inbox.
3. Reporting Unanswered Email
Sometimes it helps to see a summary of messages you've sent but haven't received a reply to. Simplify Email tracks these by comparing the messages in your sent folder against the messages you receive.
Every message contains a unique id, called the message_id. It comes from the Message-ID header, which is defined by the Internet Message Format specification (RFC 5322) rather than by IMAP itself. It often looks like this:
Message-Id: <[email protected]>
Furthermore, when messages are sent in reply to other messages, they have an in_reply_to field which links back to the original message_id.
So, we use a SQL query to find all the sent messages that don't have a corresponding received message referencing their message_id. For this, we use a LEFT OUTER JOIN and keep only the rows where no matching in_reply_to was found:
public function getUnanswered($account_id,$mode=0, $range_days = 7) {
    // mode 0: original messages we sent; otherwise: replies we sent (subject starts with "Re: ")
    if ($mode==0)
        $subject_compare = 'not';
    else
        $subject_compare = '';
    // values are bound as parameters rather than concatenated into the SQL
    $query = Yii::app()->db->createCommand("SELECT fi_sent_message.id, fi_sent_message.recipient_id as sender_id,
            fi_sent_message.subject, fi_sent_message.udate, fi_message.in_reply_to, fi_sent_message.message_id
        FROM fi_sent_message
        LEFT OUTER JOIN fi_message ON fi_message.in_reply_to = fi_sent_message.message_id
        WHERE fi_sent_message.account_id = :account_id
            AND fi_message.in_reply_to is null
            AND fi_sent_message.udate > :since
            AND fi_sent_message.subject ".$subject_compare." like 'Re: %'
        ORDER BY fi_sent_message.udate DESC")
        ->bindValue(':account_id',$account_id)
        ->bindValue(':since',time()-(3600*24*$range_days))
        ->queryAll();
    return $query;
}
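If the LEFT OUTER JOIN with an "is null" filter is unfamiliar, it may help to see the same anti-join expressed over plain PHP arrays. This sketch is only an illustration of the idea with made-up sample data; the function findUnanswered is hypothetical and SE does this work in SQL.

```php
<?php
// Illustration of the anti-join idea in plain PHP (SE does this in SQL).
// Keeps every sent message whose message_id is not referenced by the
// in_reply_to field of any received message.
function findUnanswered(array $sent, array $received): array
{
    // Index the ids that have been replied to for fast lookups.
    $replied = [];
    foreach ($received as $msg) {
        if (!empty($msg['in_reply_to'])) {
            $replied[$msg['in_reply_to']] = true;
        }
    }
    return array_values(array_filter($sent, function (array $msg) use ($replied): bool {
        return !isset($replied[$msg['message_id']]);
    }));
}

// Made-up sample data for illustration.
$sent = [
    ['message_id' => '<a1@example.com>', 'subject' => 'Lunch on Friday?'],
    ['message_id' => '<a2@example.com>', 'subject' => 'Invoice question'],
];
$received = [
    ['message_id' => '<b1@example.com>', 'in_reply_to' => '<a1@example.com>'],
    ['message_id' => '<b2@example.com>', 'in_reply_to' => ''],
];

foreach (findUnanswered($sent, $received) as $msg) {
    echo $msg['subject'], PHP_EOL; // Invoice question
}
```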
We use the $subject_compare mode to differentiate between our sent messages that haven't been answered and our sent replies to a thread that haven't been answered. Here's the unanswered message report in your account:
SE also offers this information as an optional digest, called the unanswered email digest. You can receive it every day, every few days or every week.
We also use similar SQL tabulation with Google Charts to provide reports of how frequently certain people email you:
public function reportInbound($account_id,$range=30,$limit = 100) {
    $result= Yii::app()->db->createCommand('SELECT fi_sender.personal, fi_sender.email, count(sender_id) as cnt
        FROM fi_message
        LEFT JOIN fi_sender ON fi_sender.id = fi_message.sender_id
        WHERE fi_sender.account_id = :account_id
            AND fi_message.created_at > DATE_SUB( NOW() , INTERVAL :range DAY )
        GROUP BY sender_id
        ORDER BY cnt desc
        LIMIT :limit ')
        ->bindValue('range',$range)
        ->bindValue('account_id',$account_id)
        ->bindValue('limit',$limit)
        ->queryAll();
    return $result;
}
I'll be writing more about Google Charts for Tuts+ soon.
Next Steps
I hope you've found Simplify Email intriguing enough to try your hand at PHP IMAP programming. There are so many cool features you can build without any need for the big email providers to do anything new.
If you have any questions or corrections, please post them in the comments. If you'd like to keep up on my future Tuts+ tutorials and other series, please follow @reifman or visit my author page. You can also contact me here.
Related Links
Here are some additional links that you may find useful:
- Simplify Email
- Introduction to Simplify Email (video)
- Twelve Gmail Ideas to Revolutionize Email (Again)
- Coverage of Simplify Email in BoingBoing here and here
- PHP IMAP Reference
- Introduction to the Yii Framework (Tuts+)