Hudzilla.org - the homepage of Paul Hudson

4.7.14     Removing HTML from a string: strip_tags()

This is NOT the latest copy of this book.

string strip_tags ( string source [, string allowable_tags])

strip_tags() strips all HTML and PHP tags from a given string (parameter one). You can optionally use parameter two to specify a list of HTML tags you want to keep.

This function can be very helpful if you ever display user input on your site. For example, if you create your own messageboard on your site, a user could post a title along the lines of <H1>THIS SITE SUCKS!</H1>. Because you would display the title of each post on your board, their unwanted message would appear in huge letters on your visitors' screens.

Here are two examples of stripping out tags:

<?php
    $input = "<BLINK><B>Hello!</B></BLINK>";

    // No allowed tags: every tag is removed
    $a = strip_tags($input);

    // <B> and <I> are allowed, so only <BLINK> is removed
    $b = strip_tags($input, "<B><I>");
?>

After running that script, $a will be set to "Hello!", whereas $b will be set to "<B>Hello!</B>" because we had "<B>" in the list of acceptable tags. Using this method you can stop most users from adversely changing the style of your site. However, users can still cause trouble if you allow a list of certain HTML tags. For example, we could abuse the allowed <B> tag using CSS: <B STYLE="font: 72pt Times New Roman">THIS SITE SUCKS!</B>.



If you allow <B> tags, you allow all <B> tags, regardless of whether they carry extra unwanted attributes, so it is best not to allow any tags.
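
As a minimal sketch of that problem, the script below passes the styled <B> tag from above through strip_tags() with and without an allowed-tags list. The variable names are only illustrative.

```php
<?php
    // Hypothetical user input: an allowed tag carrying an unwanted attribute
    $input = '<B STYLE="font: 72pt Times New Roman">THIS SITE SUCKS!</B>';

    // <B> is allowed, so the whole tag is kept, STYLE attribute included
    $kept = strip_tags($input, "<B>");

    // With no allowed tags, only the text between the tags survives
    $clean = strip_tags($input);

    echo $kept, "\n";
    echo $clean, "\n";
```

Here $kept should come back identical to $input, while $clean holds just the text, which is why an empty allowed-tags list is the safer choice.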

This sort of attack is commonly referred to as Cross-Site Scripting (XSS), as it allows people to take advantage of user input on your site to load their own content, which may make your site look bad. Even worse, it is fairly easy for a malicious user to make their username (for example) a piece of JavaScript that redirects visitors to their own site and passes along all their cookies from your site, which can have disastrous effects. Be careful: make sure to put strip_tags() to good use.
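
A rough illustration of the username case, assuming a made-up username that wraps a harmless alert() call in <script> tags rather than a real redirect:

```php
<?php
    // Hypothetical username submitted by a malicious user
    $username = '<script>alert("hi")</script>Bob';

    // strip_tags() removes the <script> tags themselves, so the browser
    // no longer treats the contents as code. The text that sat between
    // the tags is NOT removed; it is left behind as plain text.
    $safe = strip_tags($username);

    echo $safe, "\n";
```

The result is no longer executable, but the leftover script text shows that strip_tags() removes tags, not the content inside them, so you may still want to validate usernames separately.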






Comments from other readers
A PHP User - 07 Sep 2008

Both htmlentities() and strip_tags() are good ways to protect against XSS, but is there a good way to use them together? Suppose I want to strip out all tags EXCEPT <b> bold tags, then convert all other characters such as quotes, gt, lt, etc. into their HTML entities.

Is there a way to do this?

A PHP User - 07 Sep 2008

"htmlentities is worth mentioning, BUT from a cosmetic point of view, if a user enters <b>USERNAME</b> as their username..."

You'd want to use strip_tags() on a username, but for something like this post, use htmlentities() to allow people to use HTML.

A PHP User - 07 Sep 2008

For clarification..this site does not suck..

A PHP User - 07 Sep 2008

htmlentities is worth mentioning, BUT from a cosmetic point of view, if a user enters <b>USERNAME</b> as their username, htmlentities will show the string with the <b> tags, and the user is stuck with having tags hanging off, while with strip_tags the username will simply be USERNAME.

J. Patrick - 07 Sep 2008

When sanitizing user-submitted data that contains HTML, it's best to use the htmlentities function (http://us2.php.net/htmlentities). Taken from php.net:

"…all characters which have HTML character entity equivalents are translated into these entities."

I searched Google using 'site:hudzilla.org htmlentities' and didn't find any mention of it here. This is a much better solution than removing the tags.


