junk DNA validation

Postby roedygr » Mon May 04, 2009 10:30 am

I was shocked when I wrote this Funduc regex:

\<span class=\"+[a-z]\"\>\</span\>

to look for junk of the form

<span class="xxx"></span>

There were pages of it.

An empty span like this can always be safely removed.

A span with whitespace inside may be intentional and meaningful, so the regex leaves those alone.
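
For anyone without Funduc, here is a rough sketch of the same search and removal in Python. It assumes Python's usual postfix notation (the + goes after the thing it repeats), and the names EMPTY_SPAN and strip_empty_spans are made up for illustration. Like the Funduc pattern, it only catches lowercase class names and leaves spans containing whitespace untouched.

```python
import re

# Postfix-notation counterpart of the Funduc pattern above: a span with a
# lowercase class name and nothing at all between the tags.
EMPTY_SPAN = re.compile(r'<span class="[a-z]+"></span>')


def strip_empty_spans(html: str) -> str:
    """Remove completely empty spans; spans holding whitespace are kept."""
    return EMPTY_SPAN.sub("", html)


sample = '<p>a<span class="xxx"></span>b<span class="keep"> </span>c</p>'
print(strip_empty_spans(sample))
```

Try it on a copy of the files first, in case a stylesheet or script depends on one of those spans.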
roedygr
Rank V - Professional
Posts: 242
Joined: Fri Feb 17, 2006 6:22 am
Location: Victoria BC Canada

Postby Albert Wiersch » Mon May 04, 2009 1:37 pm

Thanks for the tip.

BTW, I think this regular expression search should be "\<span class=\".*\"\>\</span\>" (without the outside quotes).

I will look into adding a message about "span" elements that don't contain any text in a future version.
Albert Wiersch
Site Admin
Posts: 2361
Joined: Sat Dec 11, 2004 10:23 am
Location: Near Dallas, TX

Postby Albert Wiersch » Mon May 04, 2009 2:33 pm

The next update should display this warning message for empty "span" elements:
This "span" element is empty and may be useless. Consider removing this element or placing text (or something else) in it.

Thanks for the suggestion!
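
A check along these lines can be sketched with Python's standard html.parser module. This is only an illustration of the idea, not how CSE HTML Validator implements it. The names EmptySpanFinder and find_empty_spans are hypothetical, and the sketch assumes any text (whitespace included) or any nested element counts as content.

```python
from html.parser import HTMLParser


class EmptySpanFinder(HTMLParser):
    """Collect the line numbers of span elements that contain nothing."""

    def __init__(self):
        super().__init__()
        self.open_spans = []  # stack of [start_line, has_content]
        self.warnings = []    # start lines of empty spans

    def handle_starttag(self, tag, attrs):
        # A nested element counts as content for every span still open.
        for span in self.open_spans:
            span[1] = True
        if tag == "span":
            self.open_spans.append([self.getpos()[0], False])

    def handle_data(self, data):
        # Any text, even whitespace, counts as content (assumption).
        for span in self.open_spans:
            span[1] = True

    def handle_endtag(self, tag):
        if tag == "span" and self.open_spans:
            line, has_content = self.open_spans.pop()
            if not has_content:
                self.warnings.append(line)


def find_empty_spans(html: str) -> list:
    finder = EmptySpanFinder()
    finder.feed(html)
    finder.close()
    return finder.warnings


page = '<p><span class="x"></span></p>\n<p><span class="y"> </span></p>\n<span></span>'
for line in find_empty_spans(page):
    print(f'line {line}: This "span" element is empty and may be useless.')
```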

Postby MikeGale » Mon May 04, 2009 5:04 pm

Where did the HTML come from that had this in it?

Do you know what generates it?
MikeGale
Rank VI - Professional
Posts: 604
Joined: Mon Dec 13, 2004 2:50 pm
Location: Tannhauser Gate

Postby roedygr » Mon May 11, 2009 4:50 pm

Albert Wiersch wrote: Thanks for the tip.

BTW, I think this regular expression search should be "\<span class=\".*\"\>\</span\>" (without the outside quotes).

I will look into adding a message about "span" elements that don't contain any text in a future version.


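The regex was a Funduc Search and Replace regex, which quirkily uses prefix notation for its repetition operators rather than postfix notation like Java. So "+[a-z]" in Funduc means what "[a-z]+" means in Java.
See http://mindprod.com/jgloss/searchreplace.html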
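
To make the prefix/postfix difference concrete, here is a small sketch in Python that rewrites a pattern like mine into postfix form. The function name funduc_to_postfix is made up, and it only handles the constructs used in this thread (a + or * in front of a single character or bracket class, plus the escaped <, > and " characters). It is not a general Funduc converter.

```python
import re


def funduc_to_postfix(pattern: str) -> str:
    """Rewrite a simple prefix-quantifier pattern into postfix form (sketch)."""
    # Move an unescaped + or * behind the bracket class, escaped character,
    # or single character that follows it.
    moved = re.sub(r'(?<!\\)([+*])(\[[^\]]*\]|\\.|[^\\])', r'\2\1', pattern)
    # Drop the backslashes before < > and ", which need no escaping in Python.
    return re.sub(r'\\([<>"])', r'\1', moved)


funduc = r'\<span class=\"+[a-z]\"\>\</span\>'
postfix = funduc_to_postfix(funduc)
print(postfix)
print(bool(re.fullmatch(postfix, '<span class="xxx"></span>')))
```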

Postby roedygr » Fri Jul 03, 2009 5:05 am

Where does this junk come from?

1. Manually editing content in a text editor: removing text without removing the surrounding spans.

2. Hitting the macro key that generates span boilerplate, accidentally or inappropriately.

