Simple task: replace every run of consecutive whitespace (tabs, spaces, newlines) with a character of your choice, usually a space.

Fact: StackOverflow tops the Google results when searching for "C# normalize whitespace".

Why this article?

The highest-voted answers are not the best performing, and some answers are just wrong. I propose a solution based on a StackOverflow answer (linked in the remarks of the code below).

The referenced answer has a problem: it fails when input = " " (a single space). I wasn't sure that this was the only unhandled corner case, so I changed the method to use StringBuilder, which simplifies the string manipulation. Performance is probably at the same level; the code is just easier to read.

The version below should be much faster than using Regex (the highest-voted answer) and slightly faster than NormalizeWithSplitAndJoin by @JonSkeet.

```csharp
// Requires: using System.Text;

/// <summary>
/// Any consecutive white-space (including tabs, newlines) is replaced with whatever is in normalizeTo.
/// </summary>
/// <param name="input">Input string.</param>
/// <param name="normalizeTo">Character which is replacing whitespace.</param>
/// <remarks>Based on http://stackoverflow.com/a/25023688/897326 </remarks>
private static string NormalizeWhiteSpace(string input, char normalizeTo = ' ')
{
    // Null and empty input both normalize to an empty string.
    if (string.IsNullOrEmpty(input))
    {
        return string.Empty;
    }

    StringBuilder output = new StringBuilder();
    bool skipped = false; // true while we are inside a run of whitespace

    foreach (char c in input)
    {
        if (char.IsWhiteSpace(c))
        {
            // Emit the replacement once per run, then skip the rest of the run.
            if (!skipped)
            {
                output.Append(normalizeTo);
                skipped = true;
            }
        }
        else
        {
            skipped = false;
            output.Append(c);
        }
    }

    return output.ToString();
}
```
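For comparison, here is a rough sketch of what the two alternatives mentioned above might look like. These are assumptions for illustration, not the exact code from the StackOverflow answers: the class name `WhitespaceAlternatives` and both method bodies are hypothetical. One thing worth noticing is that a split-and-join approach written this way also drops leading and trailing whitespace, while a `\s+` replacement keeps a single replacement character at each end.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only to illustrate the alternatives.
public static class WhitespaceAlternatives
{
    // Assumed Regex-based variant: every run of whitespace becomes one space.
    public static string NormalizeWithRegex(string input)
    {
        return Regex.Replace(input ?? string.Empty, @"\s+", " ");
    }

    // Assumed split-and-join variant: a null separator array makes Split
    // break on whitespace; empty entries are discarded before joining.
    public static string NormalizeWithSplitAndJoin(string input)
    {
        string[] parts = (input ?? string.Empty)
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", parts);
    }

    public static void Main()
    {
        string sample = "  a \t\n b  ";

        // Brackets make leading and trailing spaces visible.
        Console.WriteLine("[" + NormalizeWithRegex(sample) + "]");        // prints "[ a b ]"
        Console.WriteLine("[" + NormalizeWithSplitAndJoin(sample) + "]"); // prints "[a b]"
    }
}
```

If the two variants are benchmarked against each other, the difference in how they treat leading and trailing whitespace should be kept in mind, since they do not produce identical output for every input.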