URLs in regular expressions

Bill Horne ttrmt-9Ys0Lnm7GY0339nkgaCO/3viChZXdy27 at public.gmane.org
Mon Dec 14 10:41:01 EST 2009


Thanks for reading this. I need help fine-tuning a regular expression
that I use to add anchor tags to URLs. The URLs come from plain-text
emails, and I add the anchor tags before publishing them on the web.

Here's the problem: the regexp below works OK unless the URL is
followed by a period, i.e., when it's at the end of a sentence. I'd
like to tune it so that it does _NOT_ include the trailing period in
the anchor tag.

So, http://billhorne.com works fine, and produces
<a href="http://billhorne.com">http://billhorne.com</a>. However, if
the URL is at the end of a sentence, e.g., "... visit
http://billhorne.com.", then http://billhorne.com. becomes
<a href="http://billhorne.com.">http://billhorne.com.</a>, with the
period inside both the href and the link text.

All suggestions welcome. TIA.

sed '/<http:/!s/\(http:\/\/[-A-Za-z0-9:#@%/;$()~_?+=\\\.&]*\)/<a href="\1">\1<\/a>/g' c3 >c4a
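To make the problem easy to reproduce, here is a minimal sketch that
feeds one sample line through the same substitution. The sample
sentence and the linkify function name are made up for illustration,
and the s command uses | as its delimiter (an assumption made only so
the slashes need no escaping); the bracket list is the one from the
command above.

```shell
# Illustrative one-line sample; the real input is the file c3.
sample='Please visit http://billhorne.com.'

# Hypothetical wrapper around the same substitution, written with | as the
# s-command delimiter. Lines that already contain "<http:" are skipped.
linkify() {
  sed '/<http:/!s|\(http://[-A-Za-z0-9:#@%/;$()~_?+=\\\.&]*\)|<a href="\1">\1</a>|g'
}

# The period is in the bracket list, so the greedy match swallows the
# sentence-ending period along with the URL.
printf '%s\n' "$sample" | linkify
```

The same function applied to a line where nothing follows the URL
gives the anchor tag I want, so only the trailing-period case needs
fixing.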


-- 
Bill Horne





