I was recently faced with the task of removing a bunch of tags from a string I was manipulating in C#. The string class has a whole bunch of built-in methods for manipulating its contents, one of those being the .Replace method. I was hoping that this method could deal with wildcards, which would make removing any HTML tag easy using syntax such as:
theString = theString.Replace("<a>*</a>", string.Empty);
Unfortunately there didn’t seem to be any such functionality, so I started to write my own utility method for the task. This turned out to be more complex than I first reckoned. The code I have included below works just fine when removing <a> tags that contain an href attribute. It can be called like so:
theString = Utils.WildCardReplace(theString, @"<a href=""*>", string.Empty);
And here’s the implementation:
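public static string WildCardReplace(string strString, string strOldValue, string strNewValue)
{
    //find the wildcard; without one this is just a plain replace
    int wildcardIndex = strOldValue.IndexOf('*');
    if (wildcardIndex == -1)
    {
        return strString.Replace(strOldValue, strNewValue);
    }
    //split out the start and end values using the wildcard
    string strStart = strOldValue.Substring(0, wildcardIndex);
    string strFinish = strOldValue.Substring(wildcardIndex + 1);
    //a lone wildcard matches the entire string
    if (strStart.Length == 0 && strFinish.Length == 0)
    {
        return strNewValue;
    }
    //position to resume searching from, so replacement text is never rescanned
    int searchFrom = 0;
    while (true)
    {
        //get the index of the next occurrence of the start string
        //(StringComparison lives in the System namespace)
        int indexOfStart = strString.IndexOf(strStart, searchFrom, StringComparison.Ordinal);
        if (indexOfStart == -1)
        {
            //no more matches, so jump out of the loop
            break;
        }
        //get the index of the first appearance of the end string after the start string
        int indexOfFinish = strString.IndexOf(strFinish, indexOfStart + strStart.Length, StringComparison.Ordinal);
        if (indexOfFinish == -1)
        {
            //start string with no matching end string, so jump out of the loop
            break;
        }
        //length of the complete match, including the whole end string
        int matchLength = indexOfFinish + strFinish.Length - indexOfStart;
        //do the replace
        strString = strString.Remove(indexOfStart, matchLength).Insert(indexOfStart, strNewValue);
        //carry on searching after the text we just inserted
        searchFrom = indexOfStart + strNewValue.Length;
    }
    return strString;
}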
This will remove the opening <a> tags. The standard Replace method can then be used to remove all the orphaned closing </a> tags.
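For comparison, the same wildcard idea can also be sketched with the Regex class from System.Text.RegularExpressions. This is only a minimal sketch and not a drop-in replacement: the class name WildcardRegexDemo and its WildcardReplace method are names made up for illustration, and it assumes the * should match as little text as possible (including line breaks).

```csharp
using System;
using System.Text.RegularExpressions;

public static class WildcardRegexDemo
{
    // Hypothetical helper (not part of the .NET string class):
    // treats every '*' in pattern as "match anything, as little as possible".
    public static string WildcardReplace(string input, string pattern, string replacement)
    {
        // Escape all regex metacharacters, then turn each escaped "\*" back into a lazy wildcard.
        string regexPattern = Regex.Escape(pattern).Replace(@"\*", ".*?");

        // A MatchEvaluator is used so the replacement is inserted literally
        // (no special handling of "$1" and friends).
        // Singleline lets '.' match line breaks, so a match can span lines.
        return Regex.Replace(input, regexPattern, m => replacement, RegexOptions.Singleline);
    }

    public static void Main()
    {
        string html = "See <a href=\"http://example.com\">this page</a> now.";

        // Step 1: remove the opening tags using the wildcard.
        string noOpeningTags = WildcardReplace(html, "<a href=\"*>", string.Empty);
        Console.WriteLine(noOpeningTags); // See this page</a> now.

        // Step 2: remove the orphaned closing tags with the standard Replace.
        string clean = noOpeningTags.Replace("</a>", string.Empty);
        Console.WriteLine(clean); // See this page now.
    }
}
```

Unlike the hand-rolled loop, this version handles more than one * in the pattern, since each one simply becomes another lazy wildcard in the generated expression.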
Please feel free to reuse/improve the code as you wish.