Mar 16, 2013
 

If we want to post from Twitter to other social media, we have to figure out how to read tweets and how to read pictures from tweets – and let’s do it using PHP.

In the first installment of this series of articles I explained why I want to read pictures from a tweet: Google+ and Facebook show a nice big picture when the picture itself is uploaded to them, whereas posting a link to a picture only results in a thumbnail.
The second article shows how to read tweets from Twitter using PHP.

In this article, I will talk about the following:

  • Finding a picture in a tweet
  • Downloading the picture
  • Finding and replacing short links in a tweet
  • Removing link or picture from the tweet

Find a picture in a tweet

Before you actually try the code below, make sure you have read my previous article first; it shows how to read tweets using PHP.

Twitter has its own picture service: pic.twitter.com. If you create a tweet with an official Twitter app and attach a picture to it, the picture will be saved on Twitter’s servers, and it will get a shortened link that is inserted in the tweet. But something else is happening: the picture is also added as an attachment to the tweet, and if you read a tweet using an extra parameter in the API, you will get the attachments as well. Try the following code:

  1. <?php
  2.  
  3. include 'lib/EpiCurl.php';
  4. include 'lib/EpiOAuth.php';
  5. include 'lib/EpiTwitter.php';
  6. include 'cfg/secret.php';
  7.  
  8. # Create a Twitter object
  9. $twitterObj = new EpiTwitter($consumer_key, $consumer_secret, $access_token, $access_secret);
  10.  
  11. # Get tweets and attachments
  12. $status = $twitterObj->get('/statuses/user_timeline.json', array('include_entities' => 1));
  13. $response = $status->response;
  14. var_dump($response);
  15.  
  16. ?>

Have a look at line 12: I have added an array with one parameter, with key include_entities and value 1.
The returned tweet now contains extra information:

  1.     ["entities"]=>
  2.     array(4) {
  3.       ["hashtags"]=>
  4.       array(0) {
  5.       }
  6.       ["urls"]=>
  7.       array(0) {
  8.       }
  9.       ["user_mentions"]=>
  10.       array(0) {
  11.       }
  12.       ["media"]=>
  13.       array(1) {
  14.         [0]=>
  15.         array(10) {
  16.           ["id"]=>
  17.           int(305391866864607232)
  18.           ["id_str"]=>
  19.           string(18) "305391866864607232"
  20.           ["indices"]=>
  21.           array(2) {
  22.             [0]=>
  23.             int(12)
  24.             [1]=>
  25.             int(34)
  26.           }
  27.           ["media_url"]=>
  28.           string(46) "https://pbs.twimg.com/media/BDz4SO3CIAASVa_.jpg"
  29.           ["media_url_https"]=>
  30.           string(47) "https://pbs.twimg.com/media/BDz4SO3CIAASVa_.jpg"
  31.           ["url"]=>
  32.           string(22) "https://t.co/Lj5LSUooWU"
  33.           ["display_url"]=>
  34.           string(26) "pic.twitter.com/Lj5LSUooWU"
  35.           ["expanded_url"]=>
  36.           string(63) "https://twitter.com/vogon1test/status/305391866860412928/photo/1"
  37.           ["type"]=>
  38.           string(5) "photo"
  39.           ["sizes"]=>
  40.           array(4) {
  41.             ["medium"]=>
  42.             array(3) {
  43.               ["w"]=>
  44.               int(600)
  45.               ["h"]=>
  46.               int(399)
  47.               ["resize"]=>
  48.               string(3) "fit"
  49.             }
  50.             ["large"]=>
  51.             array(3) {
  52.               ["w"]=>
  53.               int(1024)
  54.               ["h"]=>
  55.               int(681)
  56.               ["resize"]=>
  57.               string(3) "fit"
  58.             }
  59.             ["thumb"]=>
  60.             array(3) {
  61.               ["w"]=>
  62.               int(150)
  63.               ["h"]=>
  64.               int(150)
  65.               ["resize"]=>
  66.               string(4) "crop"
  67.             }
  68.             ["small"]=>
  69.             array(3) {
  70.               ["w"]=>
  71.               int(340)
  72.               ["h"]=>
  73.               int(226)
  74.               ["resize"]=>
  75.               string(3) "fit"
  76.             }
  77.           }
  78.         }
  79.       }
  80.     }

As you can see, there are 4 types of entities: hashtags, urls, user_mentions and media. For now, we only look at the media, starting at line 12. Lines 37 and 38 show that the media is of type photo – currently, nothing else is supported (if I understood the documentation correctly).
The actual photo is at lines 27 and 28: the media_url. That is the one pointing to the picture we can fetch.
The picture comes in 4 sizes: thumb, small, medium and large (line 39 and further). The default seems to be medium, so if you fetch (in this example) https://pbs.twimg.com/media/BDz4SO3CIAASVa_.jpg then you get a medium-sized picture. By appending a colon and the name of the size you want, you can fetch that size. So https://pbs.twimg.com/media/BDz4SO3CIAASVa_.jpg:small will get the small picture for you. Try it in your browser.
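
The size suffix can be wrapped in a small helper. This is only a sketch: the function name media_url_for_size is my own invention (it is not part of the EpiTwitter library), and it assumes the media entry has the same layout as the dump above.

```php
<?php
// Hypothetical helper, not part of any library: build the URL for one size.
function media_url_for_size(array $media, $size = 'medium')
{
    // Only append a suffix for sizes that the media entry actually lists.
    if (!isset($media['sizes'][$size])) {
        return $media['media_url'];
    }
    return $media['media_url'] . ':' . $size;
}

// Sample entry, reduced to the fields the helper needs.
$media = array(
    'media_url' => 'https://pbs.twimg.com/media/BDz4SO3CIAASVa_.jpg',
    'sizes'     => array(
        'thumb'  => array('w' => 150, 'h' => 150, 'resize' => 'crop'),
        'small'  => array('w' => 340, 'h' => 226, 'resize' => 'fit'),
        'medium' => array('w' => 600, 'h' => 399, 'resize' => 'fit'),
        'large'  => array('w' => 1024, 'h' => 681, 'resize' => 'fit'),
    ),
);

echo media_url_for_size($media, 'small') . "\n";
```

If the requested size is not listed, the helper falls back to the plain media_url, which (as far as I can tell) gives the medium picture.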

Download the picture

Let’s write some code to read tweets, find a picture in the tweet, and download it. I am very used to wget, so why not call it from PHP? If you have a better way to do it via libcurl, let me know.

  1. <?php
  2.  
  3. include 'lib/EpiCurl.php';
  4. include 'lib/EpiOAuth.php';
  5. include 'lib/EpiTwitter.php';
  6. include 'cfg/secret.php';
  7.  
  8. # Create a Twitter object
  9. $twitterObj = new EpiTwitter($consumer_key, $consumer_secret, $access_token, $access_secret);
  10.  
  11. # Get tweets
  12. $status = $twitterObj->get('/statuses/user_timeline.json', array('include_entities' => 1));
  13. $response = $status->response;
  14.  
  15. # Find tweets
  16. foreach ($response as $tweet) {
  17.     # Find media in tweet; loop through the media array
  18.     $tw_media = array();
  19.     if (isset($tweet['entities']['media'])) {
  20.         foreach ($tweet['entities']['media'] as $media) {
  21.             $media_url = $media['media_url'];
  22.             echo "Found media in tweet: $media_url\n";
  23.  
  24.             # Strip the path part from the attachment, keeping only the file name
  25.             $img = basename($media_url);
  26.  
  27.             # Fetch the picture and put it in /tmp (arguments escaped for the shell)
  28.             system("wget -q -O " . escapeshellarg("/tmp/$img") . " " . escapeshellarg($media_url));
  29.         }
  30.     }
  31.  
  32.     # Show the tweet
  33.     echo $tweet["text"] . "\n\n";
  34. }
  35.  
  36. ?>

If one of your last 20 tweets contains a picture that was uploaded by an official Twitter client, you will see the link to the picture and find the .jpg in your /tmp directory.
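
For those who prefer not to shell out to wget, here is a minimal sketch of the same download using PHP's curl extension. It assumes the curl extension is installed; the function names local_path_for and download_file are made up for this example, and the 30 second timeout is an arbitrary choice.

```php
<?php
// Hypothetical helpers; the names are illustrative, not from any library.

// Build a local path from the last path segment of the media URL.
function local_path_for($media_url, $dir = '/tmp')
{
    $name = basename((string) parse_url($media_url, PHP_URL_PATH));
    return rtrim($dir, '/') . '/' . $name;
}

// Download $url to $dest with the curl extension; returns true on success.
function download_file($url, $dest)
{
    $fp = fopen($dest, 'wb');
    if ($fp === false) {
        return false;
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);            // write the body straight into the file
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // treat HTTP status >= 400 as a failure
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);          // assumed timeout in seconds
    $ok = curl_exec($ch);
    fclose($fp);

    if ($ok === false) {
        unlink($dest); // do not leave a partial file behind
        return false;
    }
    return true;
}

echo local_path_for('https://pbs.twimg.com/media/BDz4SO3CIAASVa_.jpg') . "\n";
```

Inside the media loop, the wget call could then be replaced by something like download_file($media_url, local_path_for($media_url)).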

Find and replace short links in a tweet

When examining a tweet, you see that there are actually three formats for a link. It doesn’t matter where the link points to (a web page, an image, whatever): in the actual tweet text the link is always shortened to a t.co link (link format 1). If you look at a link on the twitter.com site, you will see either the full link or a partial link ending in … (link format 2), called the display link. And when only a partial link is shown, the real full link (link format 3) is still available.
All three link formats are in the entities part of the tweet (like the pictures). Here is an example:

  1.   ["entities"]=>
  2.   array(3) {
  3.     ["hashtags"]=>
  4.     array(0) {
  5.     }
  6.     ["urls"]=>
  7.     array(2) {
  8.       [0]=>
  9.       array(4) {
  10.         ["url"]=>
  11.         string(22) "https://t.co/mCgHwOiXAt"
  12.         ["expanded_url"]=>
  13.         string(77) "https://www.devblog.sietse.net/2013/02/27/why-write-a-script-to-cross-post-tweets/"
  14.         ["display_url"]=>
  15.         string(36) "devblog.sietse.net/2013/02/27/why…"
  16.         ["indices"]=>
  17.         array(2) {
  18.           [0]=>
  19.           int(30)
  20.           [1]=>
  21.           int(52)
  22.         }
  23.       }
  24.       [1]=>
  25.       array(4) {
  26.         ["url"]=>
  27.         string(22) "https://t.co/3TJu7d4W3D"
  28.         ["expanded_url"]=>
  29.         string(79) "https://www.devblog.sietse.net/2013/03/02/how-to-read-tweets-from-twitter-using-php/"
  30.         ["display_url"]=>
  31.         string(36) "devblog.sietse.net/2013/03/02/how…"
  32.         ["indices"]=>
  33.         array(2) {
  34.           [0]=>
  35.           int(65)
  36.           [1]=>
  37.           int(87)
  38.         }
  39.       }
  40.     }
  41.     ["user_mentions"]=>
  42.     array(0) {
  43.     }
  44.   }

This example shows that there are two URLs in the tweet, and for each URL all three link formats are given.
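
To make the three formats concrete, the sketch below collects them per link from a tweet array. The function name link_forms and the sample tweet are assumptions for illustration; the array keys are the ones shown in the dump above.

```php
<?php
// Hypothetical helper: list the three forms of every link in one tweet.
function link_forms(array $tweet)
{
    $forms = array();
    $urls = isset($tweet['entities']['urls']) ? $tweet['entities']['urls'] : array();

    foreach ($urls as $u) {
        $forms[] = array(
            'short'    => $u['url'],          // link format 1: the t.co link in the text
            'display'  => $u['display_url'],  // link format 2: what twitter.com shows
            'expanded' => $u['expanded_url'], // link format 3: the real full link
        );
    }
    return $forms;
}

// Made-up sample tweet with one link, shaped like the API response.
$sample = array(
    'text'     => 'Read this: http://t.co/abc',
    'entities' => array('urls' => array(array(
        'url'          => 'http://t.co/abc',
        'display_url'  => 'example.com/some/long…',
        'expanded_url' => 'http://example.com/some/long/path/',
        'indices'      => array(11, 26),
    ))),
);

foreach (link_forms($sample) as $f) {
    echo $f['short'] . ' | ' . $f['display'] . ' | ' . $f['expanded'] . "\n";
}
```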

This is what we see on the Twitter website:
[Screenshot: tweet_links1]

And this is what our PHP program outputs:
Two links in a tweet; link 1: https://t.co/mCgHwOiXAt and link 2: https://t.co/3TJu7d4W3D

I want the real links in the text, not the t.co ones, when posting this tweet to Google+ or Facebook, so I am going to replace the short links in the tweet with the expanded (real) URLs.

Here we go:

  1. <?php
  2.  
  3. include 'lib/EpiCurl.php';
  4. include 'lib/EpiOAuth.php';
  5. include 'lib/EpiTwitter.php';
  6. include 'cfg/secret.php';
  7.  
  8. # Create a Twitter object
  9. $twitterObj = new EpiTwitter($consumer_key, $consumer_secret, $access_token, $access_secret);
  10.  
  11. # Get tweets
  12. $status = $twitterObj->get('/statuses/user_timeline.json', array('include_entities' => 1));
  13. $response = $status->response;
  14.  
  15. # Find tweets
  16. foreach ($response as $tweet) {
  17.     # Find media in tweet; loop through the media array
  18.     $tw_media = array();
  19.     if (isset($tweet['entities']['media'])) {
  20.         foreach ($tweet['entities']['media'] as $media) {
  21.             $media_url = $media['media_url'];
  22.             echo "Found media in tweet: $media_url\n";
  23.  
  24.             # Strip the path part from the attachment, keeping only the file name
  25.             $img = basename($media_url);
  26.  
  27.             # Fetch the picture and put it in /tmp (arguments escaped for the shell)
  28.             system("wget -q -O " . escapeshellarg("/tmp/$img") . " " . escapeshellarg($media_url));
  29.         }
  30.     }
  31.  
  32.     # Find the text of the tweet
  33.     $tweet_text = $tweet["text"];
  34.  
  35.     # Find URLs in tweet; replace shortened url by the expanded url
  36.     if (isset($tweet['entities']['urls'])) {
  37.         foreach ($tweet['entities']['urls'] as $tw_url) {
  38.             $short_url = $tw_url['url'];
  39.             $long_url  = $tw_url['expanded_url'];
  40.  
  41.             $tweet_text = str_replace($short_url, $long_url, $tweet_text);
  42.         }
  43.     }
  44.  
  45.     # Show the tweet
  46.     echo "$tweet_text\n\n";
  47. }
  48.  
  49. ?>

Lines 36 – 43 do the replacement. Now our tweet text looks like this:
Two links in a tweet; link 1: https://www.devblog.sietse.net/2013/02/27/why-write-a-script-to-cross-post-tweets/ and link 2: https://www.devblog.sietse.net/2013/03/02/how-to-read-tweets-from-twitter-using-php/.

That is a lot more informative for the reader than the t.co links.
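
The code above uses str_replace, which is fine for normal tweets. As an alternative, here is a sketch that uses the indices of each URL entry to replace exactly the right piece of text. It is not what the script in this article does. It assumes the mbstring extension is available and that the indices count characters (not bytes) in the tweet text; the function name expand_urls_by_indices is made up.

```php
<?php
// Hypothetical alternative to str_replace: expand links by position.
function expand_urls_by_indices($text, array $urls)
{
    // Work from the last link to the first so earlier offsets stay valid.
    usort($urls, function ($a, $b) {
        return $b['indices'][0] - $a['indices'][0];
    });

    foreach ($urls as $u) {
        $start = $u['indices'][0];
        $end   = $u['indices'][1];
        $text  = mb_substr($text, 0, $start, 'UTF-8')
               . $u['expanded_url']
               . mb_substr($text, $end, null, 'UTF-8');
    }
    return $text;
}

// Made-up sample; the indices match the positions of the t.co links.
$text = 'Two links: http://t.co/aaa and http://t.co/bbb';
$urls = array(
    array('url' => 'http://t.co/aaa', 'expanded_url' => 'http://example.com/one', 'indices' => array(11, 26)),
    array('url' => 'http://t.co/bbb', 'expanded_url' => 'http://example.com/two', 'indices' => array(31, 46)),
);

echo expand_urls_by_indices($text, $urls) . "\n";
```

Replacing from the back means a longer expanded URL never shifts the positions of the links that still have to be replaced.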

Remove link or picture from the tweet

Why would I want to remove a picture or a link from a tweet?
Because I am going to post it as an attachment to Google+ and Facebook. It is a bit stupid to have a link in the posted text and also have it shown as an attachment. This does require a bit of tweeting discipline: the first link will be removed from the tweet, so make sure the text is still readable once the link is gone. Best practice is to put the link or picture either at the beginning or at the end of a tweet, without it being part of the actual text. If there are multiple links, only the first one will be removed from the tweet and added as an attachment to the Google+ or Facebook post.
A picture has preference over a web link: if there is a picture in a tweet, it will be removed from the tweet and added as the attachment. If there is no picture but there is a web link, then that link will be the attachment.

Here is the final code for this article:

  1. <?php
  2.  
  3. include 'lib/EpiCurl.php';
  4. include 'lib/EpiOAuth.php';
  5. include 'lib/EpiTwitter.php';
  6. include 'cfg/secret.php';
  7.  
  8. # Create a Twitter object
  9. $twitterObj = new EpiTwitter($consumer_key, $consumer_secret, $access_token, $access_secret);
  10.  
  11. # Get tweets
  12. $status = $twitterObj->get('/statuses/user_timeline.json', array('include_entities' => 1));
  13. $response = $status->response;
  14.  
  15. # Find tweets
  16. foreach ($response as $tweet) {
  17.     # Find the text of the tweet
  18.     $tweet_text = $tweet["text"];
  19.  
  20.     # Attachment to be used for Google+ and Facebook
  21.     # Either a picture or a web link – if any
  22.     $attachment = "";
  23.  
  24.     # Find media in tweet; loop through the media array
  25.     $tw_media = array();
  26.     if (isset($tweet['entities']['media'])) {
  27.         foreach ($tweet['entities']['media'] as $media) {
  28.             $media_url = $media['media_url'];
  29.             echo "Found media in tweet: $media_url\n";
  30.  
  31.             # Strip the path part from the attachment, keeping only the file name
  32.             $img = basename($media_url);
  33.  
  34.             # Fetch the picture and put it in /tmp (arguments escaped for the shell)
  35.             system("wget -q -O " . escapeshellarg("/tmp/$img") . " " . escapeshellarg($media_url));
  36.  
  37.             # This will be our attachment if it is the first picture found
  38.             if (!$attachment) {
  39.                 $attachment = "/tmp/$img";
  40.  
  41.                 # Remove url from tweet
  42.                 $short_url = $media['url'];
  43.                 $tweet_text = str_replace($short_url, "", $tweet_text);
  44.             }
  45.         }
  46.     }
  47.  
  48.     # Find URLs in tweet; replace shortened url by the expanded url
  49.     if (isset($tweet['entities']['urls'])) {
  50.         foreach ($tweet['entities']['urls'] as $tw_url) {
  51.             $short_url = $tw_url['url'];
  52.             $long_url  = $tw_url['expanded_url'];
  53.  
  54.             # This will be our attachment if not already found one before
  55.             if (!$attachment) {
  56.                 $attachment = $long_url;
  57.  
  58.                 # Remove URL from tweet
  59.                 $tweet_text = str_replace($short_url, "", $tweet_text);
  60.             } else {
  61.                 # Not an attachment, replace short url by the long url
  62.                 $tweet_text = str_replace($short_url, $long_url, $tweet_text);
  63.             }
  64.         }
  65.     }
  66.  
  67.     # Show the tweet
  68.     echo "$tweet_text\n\n";
  69. }
  70.  
  71. ?>

The tweet with the two links now looks like:
Two links in a tweet; link 1: and link 2: https://www.devblog.sietse.net/2013/03/02/how-to-read-tweets-from-twitter-using-php/
The tweet does not look good because the first link was part of the text and is gone now. So it is up to you to make sure the tweet still looks ok if the first link is removed.
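
Removing a link also leaves stray whitespace behind (a double space in the middle, or a space at the end of the tweet). A small clean-up step can take care of that part. This is only a sketch with a made-up function name, tidy_tweet_text, and it cannot repair a sentence that no longer makes sense without its link.

```php
<?php
// Hypothetical clean-up step for the text after a link has been removed.
function tidy_tweet_text($text)
{
    // Collapse runs of spaces or tabs left behind by the removed link.
    $text = preg_replace('/[ \t]{2,}/', ' ', $text);

    // Drop whitespace at the start and end of the tweet.
    return trim($text);
}

echo tidy_tweet_text("Nice view from the office today  ") . "\n";
```

You could call it on $tweet_text just before the tweet is shown or posted.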

