Monday, 23 December 2013

Remove script tags from HTML content



A simple way to remove script tags from HTML content is a regular expression:
 
$html = preg_replace('#<script(.*?)>(.*?)</script>#is', '', $html);
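As a quick check of the regex approach, here is a minimal sketch that runs it on a small made-up snippet. The $sample string is a hypothetical input, not something from a real page. Keep in mind that this pattern only removes well-formed <script>...</script> pairs; it does not touch inline event handlers such as onclick attributes.

```php
<?php
// Hypothetical sample input: two script blocks, one with attributes,
// one in upper case and spanning several lines.
$sample = '<p>Hello</p><script type="text/javascript">alert(1);</script>'
        . "<p>World</p><SCRIPT>\nalert(2);\n</SCRIPT>";

// i = case-insensitive match, s = let "." also match newlines in the script body
$clean = preg_replace('#<script(.*?)>(.*?)</script>#is', '', $sample);

echo $clean, "\n"; // prints <p>Hello</p><p>World</p>
```

The i and s modifiers matter here: without s, the multi-line second block would not match.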

Alternatively, use PHP's DOMDocument parser, which works on the parsed document rather than on the raw text:
$doc = new DOMDocument();

// load the HTML string we want to strip
$doc->loadHTML($html);

// get all the script tags
$script_tags = $doc->getElementsByTagName('script');

$length = $script_tags->length;

// the node list is live and shrinks as tags are removed,
// so walk it backwards to keep the remaining indexes valid
for ($i = $length - 1; $i >= 0; $i--) {
  $script = $script_tags->item($i);
  $script->parentNode->removeChild($script);
}

// get the HTML string back
$no_script_html_string = $doc->saveHTML();
This worked for me using the following HTML document:
<!doctype html>
<html>
    <head>
        <meta charset="utf-8">
        <title>
            hey
        </title>
        <script>
            alert("hello");
        </script>
    </head>
    <body>
        hey
    </body>
</html>
Just bear in mind that the DOMDocument parser requires PHP 5 or greater.
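
The DOMDocument steps can also be wrapped in a reusable function. The sketch below is only an illustration: the name strip_script_tags() is hypothetical, the type declarations assume PHP 7 or later (drop them on PHP 5), and the input string is made up. It copies the live node list into an array before removing anything, so documents with several script tags are handled, and it silences libxml warnings that HTML5 markup can trigger.

```php
<?php
// Hypothetical helper; the name strip_script_tags() is illustrative only.
function strip_script_tags(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings (e.g. for HTML5 tags) instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // getElementsByTagName() returns a live list, so copy it to a plain
    // array first; removing nodes then cannot disturb the iteration.
    $scripts = iterator_to_array($doc->getElementsByTagName('script'));
    foreach ($scripts as $script) {
        $script->parentNode->removeChild($script);
    }

    // Serialize the cleaned document back to an HTML string.
    return $doc->saveHTML();
}

// Made-up input with one script in <head> and one in <body>.
$input = '<html><head><script>alert("a");</script></head>'
       . '<body><p>hey</p><script src="x.js"></script></body></html>';
$output = strip_script_tags($input);

var_dump(stripos($output, '<script') === false); // bool(true)
```

Note that saveHTML() returns a full document, so a fragment passed in comes back wrapped in html and body tags.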

