Percolation Search on Complex Networks

The journal version of the percolation search paper, entitled “Scalable Percolation Search on Complex Networks”, has appeared in the journal Theoretical Computer Science. This is joint work with Nima Sarshar and Vwani Roychowdhury.

On replying to email

Many (most?) email clients support a threaded view of the messages. This allows the reader to see the thread of messages in the order they were sent. It also allows for easier management as the entire thread may be selected and saved to a particular folder or tagged with some set of tags. The question is: how does the email reader know which messages are in which threads?

Email messages have machine-readable headers. Some headers often shown to the email user are “Subject:”, “To:”, “CC:”, and “From:”, and geeks like me even show “X-Mailer:”, which lets me see what email reader you use (are you cool enough for Mutt?). In addition, each message has a unique Message-ID header. There are also two more headers, References and In-Reply-To, which list the Message-IDs of the messages that the current message replies to or references. This is how the magic of threading works (I am excluding evil hacks like looking for string matches in the subject header).
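To make the header mechanics concrete, here is a minimal sketch in Python of how a reader could group messages into threads using only Message-ID, In-Reply-To, and References. The function names (parent_id, build_threads) are my own for illustration, and real clients are more careful than this, so treat it as a rough outline rather than how any particular mail reader does it.

```python
import re
from email import message_from_string

# Message-IDs are written in angle brackets, e.g. <abc123@example.org>.
MSG_ID = re.compile(r"<[^<>\s]+>")

def parent_id(msg):
    """Return the Message-ID this message replies to, or None."""
    in_reply_to = MSG_ID.findall(msg.get("In-Reply-To", ""))
    if in_reply_to:
        return in_reply_to[0]
    references = MSG_ID.findall(msg.get("References", ""))
    # By convention the last entry in References is the direct parent.
    return references[-1] if references else None

def build_threads(raw_messages):
    """Map each thread root's Message-ID to the Message-IDs in that thread."""
    parents = {}
    for raw in raw_messages:
        msg = message_from_string(raw)
        ids = MSG_ID.findall(msg.get("Message-ID", ""))
        if ids:
            parents[ids[0]] = parent_id(msg)

    def root(mid):
        # Follow parent links upward; the seen set guards against cycles.
        seen = set()
        while parents.get(mid) is not None and mid not in seen:
            seen.add(mid)
            mid = parents[mid]
        return mid

    threads = {}
    for mid in parents:
        threads.setdefault(root(mid), []).append(mid)
    return threads
```

Calling build_threads on a list of raw message strings returns a dictionary keyed by the first message of each conversation. Note that the subject line is never consulted: the links come entirely from the headers.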

The graph of all email messages is a rich source of information. I used some of this information for spam filtering. Just as Google has used the graph formed by the web for improving web search, email graphs could be valuable for email search and personal datamining.

Unfortunately, this graph is BROKEN (or at least noise is added) when people click “reply” when they really mean “send a new email message to this person”. The email client will assume the new message really is in reply to the previous one and will include the Message-ID of that previous message in the headers of this new, unrelated email. In this way, the user tricks the computer into believing there is a connection between the two messages when in fact no such connection exists. It is not all the user's fault. Software should make it easy to do the right thing. I think that email clients should drop all references when a user erases the entire subject of a reply. Alternatively, there could be a special command to make a “new message to sender”. Making it impossible to remove all quoted text (or again, dropping all references when a user deletes all quoted text) might be another way to keep the user on the straight and narrow.
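As a rough sketch of the first suggestion, here is how a client might strip the threading headers just before sending, using Python's standard email library. The function name drop_threading_if_new_topic and the rule "the original subject no longer appears in the draft's subject" are my own assumptions about how "erased the entire subject" could be detected. No client I know of is confirmed to work this way.

```python
from email.message import EmailMessage

def drop_threading_if_new_topic(draft, original_subject):
    """Remove threading headers if the reply's subject was fully replaced.

    Assumption: the subject counts as erased when the original subject
    text no longer appears anywhere in the draft's subject line.
    """
    subject = str(draft["Subject"] or "")
    if original_subject.strip().lower() not in subject.lower():
        # Deleting a header that is absent is a no-op, so this is safe.
        del draft["In-Reply-To"]
        del draft["References"]
    return draft
```

A draft still titled “Re: Lunch?” keeps its place in the thread, while one retitled “Quarterly budget” goes out as a fresh message. The quoted-text variant would be a similar check on the body instead of the subject.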

My final words: first, turn on threading in your email reader; second, do not use the reply button as a lazy “send new mail” button. Use your address book, or a specific feature afforded by your email client, to send a new message. Thanks for keeping our threads clean!