Broken Link Checker Using PHP and cURL

Whether you run a commercial site, a directory, or a personal site, it is important to ensure you do not have ‘dead’ links on your website. Broken links, meaning links that point to inactive domains or 404 pages, are of little use to your site visitors and may jeopardise any good search engine rankings you have, since broken links can suggest that your site is not well maintained.

To remedy any potential problem, run a script that periodically checks the links on your pages, so you can quickly alter or remove links that are no longer active or useful.

The following script will do this task for you, using PHP and cURL, with a simple HTML parser to find links on a page. Simply enter a URL into the form, and the results will appear in an iframe on the same page.
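
The general approach described above (parse a page for links, then request each one and look at the HTTP status) can be sketched as follows. This is a minimal illustration, not the original script: the function names `extractLinks`, `fetchStatus`, and `isBroken` are hypothetical, it assumes PHP's DOM and cURL extensions are installed, and it uses `DOMDocument` in place of the script's own simple parser.

```php
<?php
// Hypothetical helper: collect the href of every <a> element in an HTML string.
function extractLinks(string $html): array
{
    if (trim($html) === '') {
        return []; // DOMDocument::loadHTML() rejects empty input
    }
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        // Skip empty, fragment-only, and non-HTTP links.
        if ($href === '' || $href[0] === '#'
            || preg_match('/^(mailto|javascript|tel):/i', $href)) {
            continue;
        }
        $links[] = $href;
    }
    return array_values(array_unique($links)); // de-duplicate, keep first-seen order
}

// Hypothetical helper: send a HEAD request and return the final HTTP status
// code, or 0 if the request failed (DNS error, timeout, refused connection).
function fetchStatus(string $url, int $timeoutSeconds = 5): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,  // HEAD request, no body download
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,  // let cURL follow redirects itself
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
        CURLOPT_USERAGENT      => 'Broken Link Checker',
    ]);
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}

// Assumption for this sketch: no response at all, or any 4xx/5xx status,
// counts as a broken link.
function isBroken(int $status): bool
{
    return $status === 0 || $status >= 400;
}
```

To check a page, fetch its HTML, pass it to `extractLinks()`, and call `isBroken(fetchStatus($link))` for each absolute URL. Relative links such as `/about` would first need resolving against the page URL, which this sketch leaves out. Some servers also reject HEAD requests, so a fallback to GET may be needed in practice.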

5 Replies to “Broken Link Checker Using PHP and cURL”

    1. Hello Giovanni, you’re welcome. Try running the script on its own, outside a template; I just tried that and it worked fine. If you’re running it within a content management system like WordPress, that may affect how it works. You can turn on error reporting and try things like “echo 123;exit;” to see how far the script gets before execution stops.

  1. Hi Richard,
    great script! Thanks a lot.
    I modified it a little bit:

    – replaced “yellow” with “orange” (increased readability)

    – added curl request for 301 redirect addresses:
    line 40: $string = 'curl -I -A "Broken Link Checker" -s --max-redirs 5 -m 5 --retry 1 --retry-delay 10 -w "%{url_effective}\t%{http_code}\t%{time_total}\t%{redirect_url}" -o temp2.txt '.escapeshellarg($href);

    line 69: $string = 'curl -A "Broken Link Checker" -s --max-redirs 5 -m 5 --retry 1 --retry-delay 10 -w "%{url_effective}\t%{http_code}\t%{size_download}\t%{time_total}\t%{redirect_url}" -o temp.txt '.escapeshellarg($_POST['url']);

    – looped lines 40-41 to gain information on the redirection url
    Insert before line 40:
    $startURL=$href; # remember initial url
    $fromURL=''; # initialize var for displaying initial url
    $cc=0; # set hop counter to 0
    $ex=0; # set break var to false
    while($ex==0 && $cc<5){ # loop till break or max of 5 hops
    Insert after line 41:
    # in case of redirection switch to redirection url
    if($string[1][0] == '3' && $string[3]){ # check whether redirected and redirection address given
    $fromURL=$startURL; # remember initial url
    $href=$string[3]; # set href to redirection url to validate it
    }else{
    $ex=1; # break if there is no redirection
    }
    $cc++; # count hops
    } # close loop

    – show redirect information in results
    line 48 replaced with:
    echo (++$i).'. '.$string[1].' '.$string[2].' '.str_pad($string[0],50,' ',STR_PAD_RIGHT);
    if($fromURL){echo "(redirected from $fromURL)";} # inserted info on primary url
    echo "\n";

    The same applies for the section starting at line 68

    – insert before line 68:
    $checkURL=$_POST['url']; # Important: replace $_POST['url'] with $checkURL on lines 69, 88, 89, 92-94
    $startURL=$_POST['url']; # remember initial url
    $fromURL=''; # initialize var for displaying initial url
    $cc=0; # set hop counter to 0
    $ex=0; # set break var to false
    while(!$ex && $cc<5){ # loop till break or max of 5 hops

    – insert after line 72:
    # in case of redirection switch to redirection url
    if($string[1][0]=='3' && $string[4]){ # check whether redirected and redirection address given
    $checkURL=$string[4]; # set $checkURL to redirection url to validate it
    $fromURL=$startURL; # remember initial url
    }else{
    $ex=1; # break if there is no redirection
    }
    $cc++; # count hops
    }
    – and add new information to result: replace line 81 with
    echo 'Fetched '.$string[0].' ('.$string[2].' bytes) in '.$string[3].' seconds, it returned a '.$string[1].' response';
    if($fromURL){echo ' (redirected from '.$fromURL.')';}
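
The hop-by-hop redirect handling in the comment above can also be written as one standalone function. The sketch below is an illustration under stated assumptions, not the commenter's exact code: `parseWriteOut`, `followRedirects`, and `curlWriteOut` are hypothetical names, the fetch step is passed in as a callable so the loop can be exercised without network access, and `curlWriteOut` assumes a Unix-like system with the `curl` command installed.

```php
<?php
// Hypothetical helper: split one line of curl's -w output, in the format
// "%{url_effective}\t%{http_code}\t%{time_total}\t%{redirect_url}",
// into named fields.
function parseWriteOut(string $line): array
{
    $parts = array_pad(explode("\t", rtrim($line, "\r\n")), 4, '');
    return [
        'url'      => $parts[0],
        'status'   => (int) $parts[1],
        'time'     => (float) $parts[2],
        'redirect' => $parts[3],
    ];
}

// Hypothetical helper: follow at most $maxHops redirects, one request per hop.
// $fetch is any callable that takes a URL and returns one write-out line.
function followRedirects(string $url, callable $fetch, int $maxHops = 5): array
{
    $startUrl = $url;
    $hops = 0;
    $result = parseWriteOut($fetch($url));

    // Keep going while the response is a 3xx that names a redirect target.
    while ($hops < $maxHops
        && $result['status'] >= 300 && $result['status'] < 400
        && $result['redirect'] !== '') {
        $result = parseWriteOut($fetch($result['redirect']));
        $hops++;
    }

    // Empty string when no redirect was followed, as with $fromURL above.
    $result['redirectedFrom'] = $hops > 0 ? $startUrl : '';
    return $result;
}

// One possible $fetch: shell out to the curl command line tool.
// Assumption: Unix-like system (uses /dev/null) with curl on the PATH.
function curlWriteOut(string $url): string
{
    // Single quotes keep \t literal, so curl itself expands it to a tab.
    $format = '%{url_effective}\t%{http_code}\t%{time_total}\t%{redirect_url}';
    $cmd = 'curl -I -s -A ' . escapeshellarg('Broken Link Checker')
         . ' -m 5 -o /dev/null -w ' . escapeshellarg($format)
         . ' ' . escapeshellarg($url);
    return (string) shell_exec($cmd);
}
```

Calling `followRedirects($href, 'curlWriteOut')` returns the final URL, status, and timing, with `redirectedFrom` set to the original URL whenever at least one redirect was followed. Passing the fetcher as a parameter is a design choice for this sketch: it keeps the loop logic separate from the shell call and makes it easy to try with canned responses.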
