
Jul 26

Writing an auto-retweet service

I'm working with a couple of friends on a community service, @PortsmouthTweet. It started as an auto-retweeter: people post tweets that mention @PortsmouthTweet, and a bot picks them up and retweets them. This may seem simple, but it's surprisingly difficult.

People use Twitter for a lot of things: posting one-shot updates, having quick conversations, retweeting what someone else said, encouraging followers to check out another user (#FollowFriday). Only the first of these is worth retweeting. At best, retweeting everything will overwhelm your followers; at worst, you may look invasive and uncouth by retweeting a semi-personal conversation. So how can a bot tell the valuable content from the chatter?

Here's the code @PortsmouthTweet uses, where $t is an associative array built from a tweet in a Twitter API call:

        if (   strncasecmp($t['text'], 'rt', 2) !== 0            // not an RT
            && empty($t['in_reply_to_status_id'])                 // not a reply
            && stripos($t['text'], 'followfriday') === false      // not followfriday (or #followfriday)
            && stripos($t['text'], '#ff') === false               // not #ff
            && $t['user']['id'] != 10855142                       // not from @lazytweet
            && $t['user']['id'] != 45366008                       // not from us -- change to your id!
        ) {
            // the tweet passed every rule: retweet $t here
        }

It works, with some intelligence, at least for now. Compiling these rules has been an empirical, trial-and-error process; when I first wrote the retweeter, I didn't have any of them.
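As a rough sketch, the same rules could be wrapped in a function so they're easier to test on their own. The function name `should_retweet` and the `$blockedUserIds` parameter are hypothetical and not part of the actual bot, and the sample tweets are made up for illustration.

```php
<?php
// Hypothetical helper: returns true when a tweet passes every filter rule.
// $t has the same shape as the associative array used in the post;
// $blockedUserIds is an assumed parameter replacing the hard-coded ids.
function should_retweet(array $t, array $blockedUserIds): bool
{
    $text = $t['text'] ?? '';

    if (strncasecmp($text, 'rt', 2) === 0) {
        return false; // starts with "RT": already a retweet
    }
    if (!empty($t['in_reply_to_status_id'])) {
        return false; // part of a conversation
    }
    if (stripos($text, 'followfriday') !== false || stripos($text, '#ff') !== false) {
        return false; // a #FollowFriday recommendation
    }
    if (in_array($t['user']['id'], $blockedUserIds)) {
        return false; // from a blocked account (e.g. ourselves)
    }
    return true;
}

// Made-up sample tweets, for illustration only.
$blocked = [10855142, 45366008];
$update  = ['text' => 'Farmers market opens at 9 @PortsmouthTweet',
            'in_reply_to_status_id' => null, 'user' => ['id' => 1]];
$retweet = ['text' => 'RT @someone: great show tonight @PortsmouthTweet',
            'in_reply_to_status_id' => null, 'user' => ['id' => 2]];
$reply   = ['text' => '@PortsmouthTweet thanks, see you there',
            'in_reply_to_status_id' => '2882930112', 'user' => ['id' => 3]];
$ownPost = ['text' => 'Hello Portsmouth',
            'in_reply_to_status_id' => null, 'user' => ['id' => 45366008]];

foreach ([$update, $retweet, $reply, $ownPost] as $tweet) {
    echo (should_retweet($tweet, $blocked) ? 'retweet: ' : 'skip:    ') . $tweet['text'] . "\n";
}
```

Keeping the ids in an array means adding another account to skip is a one-line change rather than a new condition.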

Soon we'll build a website and more robust databases for @PortsmouthTweet. With a user table in particular, we can keep track of who follows us and who we follow, and use this data to filter the retweeter further. For example, if a new user tweets at us and follows a number of people we 'trust', we can likely trust the tweet. I'd also recommend checking for a followers-to-following ratio of 0.6 or higher; most real people have this.
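Here is a minimal sketch of how that trust check might look. None of this exists yet: the function name `looks_trustworthy`, the shape of the `$user` row, the threshold of three trusted accounts, and the choice to require both conditions at once are all assumptions made for illustration. Only the 0.6 ratio comes from the paragraph above.

```php
<?php
// Hypothetical trust check. $user is an assumed row from the planned user table:
//   ['followers_count' => int, 'following_count' => int, 'following_ids' => int[]]
// $trustedIds lists the accounts we already trust.
function looks_trustworthy(
    array $user,
    array $trustedIds,
    int $minTrustedFollowed = 3,   // assumed threshold, not from the post
    float $minRatio = 0.6          // followers/following ratio from the post
): bool {
    // How many of the accounts we trust does this user follow?
    $trustedFollowed = count(array_intersect($user['following_ids'], $trustedIds));

    // Followers-to-following ratio; avoid dividing by zero when the user follows nobody.
    $ratio = $user['following_count'] > 0
        ? $user['followers_count'] / $user['following_count']
        : INF;

    return $trustedFollowed >= $minTrustedFollowed && $ratio >= $minRatio;
}

// Made-up data, for illustration only.
$trusted = [11, 12, 13, 14];
$newUser = ['followers_count' => 120, 'following_count' => 150,
            'following_ids' => [11, 12, 13, 99]];
$spammy  = ['followers_count' => 5, 'following_count' => 2000,
            'following_ids' => [11, 12, 13, 14]];

var_dump(looks_trustworthy($newUser, $trusted));
var_dump(looks_trustworthy($spammy, $trusted));
```

In this example the first account follows three trusted users and has a ratio of 0.8, so it passes; the second follows all four trusted users but its ratio is far below 0.6, so it fails.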

Update: you might also want to "blacklist" @lazytweet to avoid a retweet loop if someone includes #lazytweet in a tweet. The @lazytweet user-id check in the code above does this. Thanks to @Sturta for this feedback.