Poster Grabber via Perl and IMDb

OK, so you are sitting at home, looking at your collection of legally obtained movie trailers. But what is missing? The cover art! Wouldn't it be great to have the poster image of each trailer stored with it? Yes!

Luckily, you have been a smart collector: you placed each trailer in its own folder and named each folder after the movie trailer it contains, using _ characters for spaces.
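To see why that naming convention matters, here is a minimal sketch of how a folder name can become a search term. The helper name folder_to_query and the sample folder name are made up for illustration, and the sketch assumes plain ASCII folder names. It also percent-encodes punctuation, which is an extra safety step and not something a simple underscore-to-plus swap would do.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a folder name into a URL query value.
sub folder_to_query {
    my ($folder) = @_;
    ( my $title = $folder ) =~ tr/_/ /;    # underscores stand in for spaces
    # Percent-encode anything that is not a letter, digit, "-", ".", "_", "~" or space.
    $title =~ s/([^A-Za-z0-9\-._~ ])/sprintf("%%%02X", ord($1))/ge;
    $title =~ tr/ /+/;                     # spaces become "+" in a query string
    return $title;
}

# Sample folder name (an assumption, not taken from any real collection).
print folder_to_query("The_Good,_the_Bad_and_the_Ugly"), "\n";
# prints: The+Good%2C+the+Bad+and+the+Ugly
```

The result can be appended to a search URL such as http://www.imdb.com/find?q= without tripping over commas, ampersands and the like.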

What comes next? A Perl scraper script that reads each folder, queries IMDb, grabs the poster image, and saves it inside the folder under the same name as the avi file it contains, with a .jpg extension. It needs the LWP and HTML::TreeBuilder modules and the wget command. Here is such a script. Feel free to modify it as you see fit!


#!/usr/bin/perl -w
# thumb size 101x76 ?
use strict;
use LWP::UserAgent;
use HTTP::Request;
use HTML::TreeBuilder;

# Recursively scan $dir. $query is the name of the folder being scanned, with
# "_" already replaced by "+", and is used as the IMDB search term. It is empty
# for the top-level folder, so files sitting there are skipped.
sub analyze_folder {
    my ( $dir, $query ) = @_;
    print "IN ANALYZE FOR $dir $query\n";

    foreach my $file ( glob("$dir/*") ) {
        if ( -f $file && length($query) ) {
            if ( substr( $file, rindex( $file, "." ) ) eq ".avi" ) {
                # The poster sits next to the trailer: name.avi -> name.jpg
                my $icon = substr( $file, 0, rindex( $file, "." ) ) . ".jpg";
                if ( !-f $icon ) {
                    my $url = "http://www.imdb.com/find?q=" . $query;
                    print "Grabbing thumb... $url\n";
                    my $user_agent = LWP::UserAgent->new();
                    $user_agent->timeout(30);
                    $user_agent->agent('Mozilla/5.0');
                    my $response = $user_agent->request( HTTP::Request->new( GET => $url ) );
                    if ( $response->is_success ) {
                        my $page = HTML::TreeBuilder->new();
                        $page->parse( $response->content );
                        $page->eof();
                        # Try an <img> of width 23 first, then one of height 317.
                        my $img = $page->look_down( '_tag', 'img', 'width', 23 )
                            || $page->look_down( '_tag', 'img', 'height', 317 );
                        my $remote = $img ? $img->attr('src') : undef;
                        $page->delete;    # free the parse tree
                        die "COULD NOT GRAB THUMB!\n" if !$img;
                        if ( defined $remote && length $remote ) {
                            # Swap the size suffix for a 300px variant when present.
                            my $cut = rindex( $remote, "_SY" );
                            $remote = substr( $remote, 0, $cut ) . "_SY300_SX300_.jpg" if $cut >= 0;
                            # List form of system() avoids shell quoting problems.
                            system( "wget", "-q", "-O", $icon, $remote );
                            print "GOT ICON $remote to $icon\n";
                        }
                    }
                }
            }
        }
        if ( -d $file ) {
            # The folder name, with "_" turned into "+", becomes the search term.
            my $folder = substr( $file, rindex( $file, "/" ) + 1 );
            $folder =~ s/_/+/g;
            analyze_folder( $file, $folder );
        }
    }
}

analyze_folder( "/mybook", "" );
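
If you want to tinker with the script, the two string manipulations it relies on can be pulled out and tried on their own. The sketch below is only an illustration: the helper names poster_path and poster_url, the sample paths and the example.com URL are all made up, and the "_SY" size suffix is simply the pattern the script looks for in the image address.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: map a trailer path to the poster path saved beside it.
# Returns undef (in scalar context) for anything that is not an .avi file.
sub poster_path {
    my ($trailer) = @_;
    my ( $name, $dir, $suffix ) = fileparse( $trailer, qr/\.avi/ );
    return if $suffix eq '';
    return $dir . $name . '.jpg';
}

# Hypothetical helper: swap the size suffix of a thumbnail URL for a
# 300px variant. URLs without "_SY" are returned unchanged.
sub poster_url {
    my ($src) = @_;
    my $cut = rindex( $src, '_SY' );
    return $src if $cut < 0;
    return substr( $src, 0, $cut ) . '_SY300_SX300_.jpg';
}

# Sample inputs (assumptions, for illustration only).
print poster_path('/mybook/Blade_Runner/blade_runner.avi'), "\n";
# prints: /mybook/Blade_Runner/blade_runner.jpg
print poster_url('http://example.com/images/abc._V1_SY30_SX23_.jpg'), "\n";
# prints: http://example.com/images/abc._V1_SY300_SX300_.jpg
```

Keeping these as small functions makes it easier to adjust the poster size or to support other trailer extensions. The download itself could also be done with the mirror method of LWP::UserAgent instead of wget, though that is a swap and not what the script here does.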

 

 
