Extract all links of a webpage in PHP

Extracting all the links on a webpage is quite easy with PHP's DOMDocument class. With this class we can load HTML directly from a file, or we can load it from a string.

The loadHTMLFile method loads HTML from a file (or a URL), whereas loadHTML loads it from a string.
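
For comparison, here is a minimal sketch of the string variant using loadHTML. The $html snippet and its URLs are made up for illustration and are not taken from any real page.

```php
<?php
// Hypothetical HTML string, used only for illustration.
$html = '<p><a href="/about">About</a> and <a href="https://example.com/">Example</a></p>';

// Keep parser warnings internal instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html); // parse from a string instead of a file

$links = array();
foreach ($dom->getElementsByTagName('a') as $a) {
    $links[] = $a->getAttribute('href');
}

print_r($links);
```

The rest of the code is the same as for a file; only the loading call changes.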

Here is a sample script that gets all the links on a web page.

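    $url = "http://google.com";

    // Collect parser warnings internally instead of printing them;
    // real-world pages are rarely valid HTML.
    libxml_use_internal_errors(true);

    // Loading from a URL requires allow_url_fopen to be enabled.
    $dom = new DOMDocument();
    $dom->loadHTMLFile($url);

    // Grab every <a> element in the document.
    $hrefs = $dom->getElementsByTagName('a');
    $urls = array();

    // Read the href attribute of each anchor.
    for ($i = 0; $i < $hrefs->length; $i++) {
        $href = $hrefs->item($i);
        $urls[] = $href->getAttribute('href');
    }

    print_r($urls);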
   
All the links will be stored in the $urls array.
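
If you need this in more than one place, the same steps can be wrapped in a function. The sketch below is one possible way to do it: extract_links is a hypothetical helper name (not a built-in PHP function), and skipping anchors without an href and dropping duplicates are extra assumptions that the script above does not make.

```php
<?php
// extract_links() is a hypothetical helper, not part of PHP itself.
function extract_links($html)
{
    // Silence parser warnings while loading, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = array();
    foreach ($dom->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        // Skip anchors without an href and URLs we have already seen.
        if ($href !== '' && !in_array($href, $urls, true)) {
            $urls[] = $href;
        }
    }
    return $urls;
}

// Made-up markup for illustration only.
$sample = '<a href="a.html">A</a><a name="top">Top</a>'
        . '<a href="a.html">A again</a><a href="b.html">B</a>';
$found = extract_links($sample);
print_r($found);
```

Relative links such as a.html are returned as written; they are not resolved against the page's address.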
Reviewed by JS Pixels on January 12, 2013. Rating: 5
