
roedygr wrote: The only thing it screws up is ignoring <cseignore> surrounding deliberate high-bit chars, which you explained was quite hard to fix since high bits are not checked in the parser.
final int highAt = whereBadCharFound();
if ( highAt >= "<cseignore>".length() )
{
final int start = big.lastIndexOf( "<cseignore>", highAt - "<cseignore>".length() );
if ( start >= 0 )
{
final int end = big.indexOf( "</cseignore>", start + "<cseignore>".length() );
if ( end > highAt )
{
return true; // ignore the high char, it was in a <cseignore> sandwich
}
}
}
return false; // report the high char if configured to do so, it was not in a <cseignore> sandwich
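
For anyone who wants to try the idea outside the checker, here is a minimal self-contained sketch of the same test. The class name CseIgnoreCheck and the method isInsideCseIgnore are made up for illustration, and the document text and the offset of the high char are passed in as parameters instead of coming from big and whereBadCharFound(). It only handles a single, non-nested <cseignore> pair around the char, same as the snippet above.

```java
// Hypothetical helper, not part of the actual checker: names are illustrative only.
final class CseIgnoreCheck
{
    private static final String OPEN = "<cseignore>";
    private static final String CLOSE = "</cseignore>";

    /**
     * @param big    the whole document text
     * @param highAt offset in big where the high-bit char was found
     * @return true if the char sits between <cseignore> and </cseignore>
     */
    static boolean isInsideCseIgnore( final String big, final int highAt )
    {
        // There must be room for an opening tag before the char.
        if ( highAt < OPEN.length() )
        {
            return false;
        }
        // Nearest opening tag that ends at or before the char.
        final int start = big.lastIndexOf( OPEN, highAt - OPEN.length() );
        if ( start < 0 )
        {
            return false;
        }
        // First closing tag after that opening tag.
        final int end = big.indexOf( CLOSE, start + OPEN.length() );
        // Inside the sandwich only if the closing tag comes after the char.
        return end > highAt;
    }
}
```

Calling CseIgnoreCheck.isInsideCseIgnore( big, highAt ) with the offset of a high char gives true when the char should be ignored and false when it should be reported. If a </cseignore> falls between the nearest opening tag and the char, end is smaller than highAt and the char is reported, which seems like the right behaviour.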

