Monday, June 13, 2011

How to automatically download every image linked from a web page with a bash script

This time I did not write any code; instead, I want to share with you a very useful bash script written by Darrin Goodman.

The full reference for version 1 of the script is available here, and the full reference for version 2 (the one posted in this blog entry) is available here.

Sometimes you may want to download some or all of the images that are linked from a web page, for example when a thumbnail image links to a larger version of the same image. In all those cases this script will do the job for you ;).

NOTE: This script depends on the programs "awk", "lynx", "grep" and "wget".

If you use Debian or a Debian-based distro and "lynx" is not already installed, you must install it first:

$ sudo apt-get install lynx
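
Before running the script, it can help to confirm that all four programs are actually available. The following is only a minimal sketch: the helper name "missing_tools" is made up for this example and is not part of Darrin Goodman's script.

```bash
#!/bin/bash
# Hypothetical helper (not part of the original script): print the name of
# every given command that cannot be found on the PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        # "command -v" returns non-zero when the command does not exist
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Check the four programs named in the note above
missing=$(missing_tools awk lynx grep wget)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line
    echo "Missing programs:" $missing
else
    echo "All required programs are installed."
fi
```

If anything is reported as missing, install it with your package manager in the same way as "lynx" above.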

So let's begin:

STEP 1 <- Download the script here

STEP 2 <- Make it executable 

$ chmod +x image_downloader-v2.sh

STEP 3 <- Run the script

$ ./image_downloader-v2.sh
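
The script saves both images.txt and the downloaded files in the current working directory, so you may prefer to run it from a dedicated folder. Here is a small sketch of one way to do that. The function name "run_in_dir" and the folder name "downloads" are assumptions made for this example only.

```bash
#!/bin/bash
# Hypothetical wrapper (not part of the original script): create a directory
# if needed and run the given command from inside it.
run_in_dir() {
    local dir=$1
    shift
    # The subshell keeps the "cd" from affecting the calling shell
    mkdir -p "$dir" && ( cd "$dir" && "$@" )
}

# Only start the downloader if it exists here and is executable.
# "$PWD/..." is expanded before the cd, so the path stays valid.
if [ -x ./image_downloader-v2.sh ]; then
    run_in_dir downloads "$PWD/image_downloader-v2.sh"
fi
```

With this, everything the script writes should end up inside "downloads" instead of next to the script itself.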

Enjoy ;)

Code:

 #!/bin/bash  
 # Written by Darrin Goodman with inspiration from:  
 # http://www.go2linux.org/linux/2010/09/how-download-all-links-webpage-including-hidden-776  
 # THIS PROGRAM WILL DOWNLOAD IMAGES THAT ARE LINKED FROM A WEBSITE,  
 # SUCH AS WHEN THERE IS A THUMBNAIL IMAGE THAT IS LINKED TO A LARGER VERSION OF THE SAME IMAGE.  
 # THIS IS VERSION 2 - THE SIMPLE IMAGE DOWNLOADER - NO EXTRA FRILLS  
 # THE MORE FULL-FEATURED VERSION CAN BE FOUND HERE:   
 # http://www.hilltopyodeler.com/blog/?p=324  
   
 function grabURL {
 echo -n " Enter the desired URL or 'q' to QUIT: "
 # Store the answer in the global variable "a", which imageDownload reads later
 read -r a
 # Quote "$a" so an empty answer or a URL with special characters does not break the test
 if [ "$a" == "q" ]
 then
 # figlet Done!
 echo "Done!"
 echo
 echo
 exit 0
 else
 echo " The URL that you entered is: $a"
 echo
 echo "Ok...... working........................."
 echo
 fi
 }
 function imageDownload {  
 grabURL  
 echo  
   
 # GRAB A LIST OF ALL IMAGES BEING LINKED TO AND STORE THEM IN FILE CALLED images.txt
 # THIS LIST HAS BEEN EXPANDED TO ALSO GRAB SWF's AND FLV's
 # Fetch the page only once and match every extension case-insensitively (png/PNG, jpg/JPG, ...)
 lynx --dump "$a" | awk '/http/{print $2}' | grep -iE 'png|jpg|gif|flv|swf' > images.txt
   
 # LOOP THROUGH THE LIST OF IMAGES STORED IN images.txt AND DOWNLOAD THEM TO THE CURRENT DIRECTORY  
 # Read one URL per line and quote it so the shell does not split or expand it
 while read -r i; do wget "$i"; done < images.txt
 echo "//////////////////////////////////////////////////////////////////////////////"  
 echo  
 echo "Your images have downloaded to your current working directory."  
 echo  
 echo "Thank you for using ImageDownloader"  
 echo  
 # figlet Done!  
 echo "Done!"  
 echo  
 echo  
 exit 0  
 }  
 function whatNext {  
 echo "What would you like to do?"  
 echo  
 imageDownload  
 }  
   
 # BEGINNING OF THE PROGRAM  
   
 # Clear the screen  
 clear  
 echo "This script relies on the program called \"lynx\"."  
 echo " ImageDownloader v.2 is ready"  
 echo  
   
 # PROMPT USER TO DECIDE WHAT TO DO NEXT  
 whatNext  
   
   
   
 exit 0  
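
To see how the link extraction in the script works without touching the network, the sketch below runs the same awk filter over made-up text that resembles the "References" list printed at the end of a "lynx --dump". The function name "extract_media_links" and the example.com URLs are invented for this illustration, and the single case-insensitive grep is my own shorthand for the script's extension matching.

```bash
#!/bin/bash
# Hypothetical helper: read lynx-style dump text on stdin and print the linked
# URLs that mention one of the extensions the script looks for.
extract_media_links() {
    # awk keeps the second field (the URL) of every line containing "http";
    # grep -i then matches the extensions in upper or lower case.
    awk '/http/{print $2}' | grep -iE 'png|jpg|gif|flv|swf'
}

# Made-up sample; real lynx output numbers the links in a similar way
sample='References

   1. http://example.com/index.html
   2. http://example.com/photos/cat_large.jpg
   3. http://example.com/photos/dog_large.PNG
   4. http://example.com/about.html'

printf '%s\n' "$sample" | extract_media_links
```

For this sample only the two photo URLs (links 2 and 3) are printed. Keep in mind that the pattern matches anywhere in the URL, so a page address that merely contains "gif" or "png" somewhere in its path would be picked up as well.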

Benjamin