A little late to the party, but after wanting to create my own offline version of CodePen, I implemented my own version of HTML syntax highlighting following CodePen's theme.
This does syntax highlighting and markup formatting, though the formatting only takes effect when your HTML is well-formed.
Just add this as a class for your RichTextBox, instantiate it accordingly and call it within whichever event works for you (I'm using it with the RTB's double_click event, but that does eliminate double-click text selection). To make the highlight update a bit more automatic and less intrusive on shortcuts, I added a timer and some variables and work this within the key_up and key_down events; that setup is included below the class.
using System.Drawing;
using System.Text.RegularExpressions;
using System.Windows.Forms;
using System.Xml.Linq;

public class SyntaxHighlight
{
    public void HighlightHTM(RichTextBox Htm_Input)
    {
        // hide the control while recoloring so the selection changes don't flicker
        Htm_Input.Visible = false;
        #region store the original caret position + forecolor
        int originalIndex = Htm_Input.SelectionStart;
        int originalLength = Htm_Input.SelectionLength;
        Color originalColor = Color.FromArgb(200, 200, 200); // Grey
        #endregion
        #region try to format the markup
        // XElement.Parse throws on markup that is not well-formed; in that case the text is left as-is
        try { Htm_Input.Text = XElement.Parse(Htm_Input.Text).ToString(); } catch { }
        #endregion
        #region match everything but punctuation and equals
        Regex e = new Regex(@"(.*?|=)[^\w\s]");
        MatchCollection eMatches = e.Matches(Htm_Input.Text);
        foreach (Match m in eMatches)
        {
            Htm_Input.SelectionStart = m.Groups[1].Index;
            Htm_Input.SelectionLength = m.Groups[1].Length;
            Htm_Input.SelectionColor = Color.FromArgb(221, 202, 126); // Yellow
        }
        #endregion
        #region match tags
        Regex t = new Regex(@"(<\w+|</\w+|/>|>)[^=]");
        MatchCollection tMatches = t.Matches(Htm_Input.Text, 0);
        foreach (Match m in tMatches)
        {
            Htm_Input.SelectionStart = m.Groups[1].Index;
            Htm_Input.SelectionLength = m.Groups[1].Length;
            Htm_Input.SelectionColor = Color.FromArgb(167, 146, 90); // Brown
        }
        #endregion
        #region match quotes
        Regex q = new Regex("\".*?\"");
        MatchCollection qMatches = q.Matches(Htm_Input.Text);
        foreach (Match m in qMatches)
        {
            Htm_Input.SelectionStart = m.Index;
            Htm_Input.SelectionLength = m.Length;
            Htm_Input.SelectionColor = Color.FromArgb(150, 179, 138); // Green
        }
        #endregion
        #region match inner html
        Regex h = new Regex(">(.+?)<");
        MatchCollection hMatches = h.Matches(Htm_Input.Text);
        foreach (Match m in hMatches)
        {
            Htm_Input.SelectionStart = m.Groups[1].Index;
            Htm_Input.SelectionLength = m.Groups[1].Length;
            Htm_Input.SelectionColor = Color.FromArgb(200, 200, 200); // Grey
        }
        #endregion
        #region restoring the original caret position and color, for further writing
        Htm_Input.SelectionStart = originalIndex;
        Htm_Input.SelectionLength = originalLength;
        Htm_Input.SelectionColor = originalColor; // Grey
        #endregion
        Htm_Input.Focus();
        Htm_Input.Visible = true;
    }
}
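If you want to see which ranges a pattern would pick up before wiring it to the RichTextBox, something like the following can help. It's only a rough sketch: SpanFinder and Find are names I made up for illustration, and it just lists the index, length and text of the capture group that the highlighter would select and recolor.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the SyntaxHighlight class.
public static class SpanFinder
{
    // Returns (index, length, text) of the chosen capture group for every match,
    // i.e. the ranges that would be selected and recolored.
    public static List<(int Index, int Length, string Value)> Find(
        string text, string pattern, int group)
    {
        var spans = new List<(int Index, int Length, string Value)>();
        foreach (Match m in Regex.Matches(text, pattern))
        {
            Group g = m.Groups[group];
            spans.Add((g.Index, g.Length, g.Value));
        }
        return spans;
    }
}
```

For the input <p class="x">Hi</p>, the tag pattern with group 1 gives <p at index 0, > at index 12 and </p at index 15; the quote pattern with group 0 gives "x" at index 9; the inner html pattern with group 1 gives Hi at index 13. Note that the tag pattern needs one more character after the tag token and consumes it, so a > that sits directly after a tag name (or at the very end of the text) is not picked up. If that bothers you, swapping the trailing [^=] for a lookahead such as (?!=) is one thing to experiment with.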
Happy coding!
Edit: I should also mention that !doctype breaks formatting as it's not exactly XML-friendly in the context of "well-formed". For my purposes, all tags including body and relevant closings, css and js links are added programmatically at page save, so only the markup within the body tags is worked with inside the html RTB. This eliminates that problem.
You'll notice that this relies exclusively on Regex rather than on hard-coded tags and properties. I did this because tags and properties have a tendency to pop on and off the w3 scene quite often. That would force a dev to keep going back and editing those strings to remove deprecated tags / properties or to add new ones. Not optimal.
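To make the "well-formed" dependency a bit more concrete, here is a minimal sketch of the formatting step on its own. MarkupFormatter and TryFormat are hypothetical names of mine, and unlike the bare catch in the class this version only catches XmlException, which is what XElement.Parse throws for things like an unclosed tag.

```csharp
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper for illustration only.
public static class MarkupFormatter
{
    // Returns the markup re-indented by XElement if it parses as XML,
    // otherwise returns the input unchanged.
    public static string TryFormat(string markup)
    {
        try
        {
            return XElement.Parse(markup).ToString();
        }
        catch (XmlException)
        {
            // Not well-formed (e.g. an unclosed <br>): keep the text as typed.
            return markup;
        }
    }
}
```

With this, <div><p>a</p></div> comes back spread over three lines with the inner <p> indented, while <div><br></div> comes back exactly as it went in because the <br> is never closed.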
I also thought it prudent to go ahead and include the instantiation / usage examples to make this a bit more plug&play.
Above public Main(), instantiate like so:
#region Class Instantiation
SyntaxHighlight syntax = new SyntaxHighlight();
#endregion
... and, within your chosen event handler, call it like so:
private void htm_input_DoubleClick(object sender, EventArgs e)
{
    syntax.HighlightHTM(Htm_Input);
}
Naturally, adding a SaveFileDialog and an OpenFileDialog pretty much gives this the functionality of your very own, albeit very basic, html editor. Toss in a WebBrowser control and apply the RTB's text as the WebBrowser's source and you've upgraded to live-view.
At the very least, this should serve as a viable reference for syntax highlighting in general. It really just boils down to identifying patterns and manipulating their colors so, for example, this will work effectively with css, javascript and even C# with some light adjusting of the pattern identification parameters.
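As a rough idea of what "adjusting the patterns" could look like for css, here is a small sketch. The two patterns and the CssSpans name are my own guesses for illustration, not something I've tuned against real stylesheets; each capture would get its own color in the same way as the html version.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical patterns for illustration only.
public static class CssSpans
{
    // Property names: a run of word characters or hyphens directly before a colon.
    public static readonly Regex Property = new Regex(@"([\w-]+)\s*:");

    // Values: the text between the colon and the closing semicolon.
    public static readonly Regex Value = new Regex(@":\s*([^;{}]+);");

    // Collects the text of capture group 1 for every match of the pattern.
    public static List<string> Captures(Regex pattern, string css)
    {
        var found = new List<string>();
        foreach (Match m in pattern.Matches(css))
        {
            found.Add(m.Groups[1].Value);
        }
        return found;
    }
}
```

For a { margin-top: 0; color: #fff; } the property pattern picks out margin-top and color, and the value pattern picks out 0 and #fff. A selector such as a:hover would also be caught by the property pattern, so a real version needs a bit more care there.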
The following is how I set up the automatic refresh with key_up / key_down and a timer set to 1000 ms:
#region Globals
int r = 0;            // timer ticks counted in the current refresh cycle
bool refresh = false; // true while a refresh cycle is pending
#endregion
private void Htm_Input_KeyUp(object sender, KeyEventArgs e)
{
    refresh = true; // enter refresh cycle
}
private void Htm_Input_KeyDown(object sender, KeyEventArgs e)
{
    refresh = false; // abort refresh cycle
    r = 0;           // restart the count, otherwise fast typing between two ticks never resets it
}
private void Timer_Refresh_Tick(object sender, EventArgs e)
{
    // check if refresh cycle is entered, refresh at 3 seconds or reset the counter if aborted
    if (refresh)
    {
        if (r == 3)
        {
            syntax.HighlightHTM(Htm_Input);
            refresh = false;
            r = 0;
        }
        r++;
    }
    else
    {
        r = 0;
    }
}
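If you'd rather keep that countdown out of the form (it also makes it easy to test without a UI), the same idea can be pulled into a tiny class. This is just a sketch under my own assumptions: RefreshCountdown is a made-up name, the counter is reset on every key press, and it fires on the Nth tick after the last key release rather than copying the exact tick count of the handlers above.

```csharp
// Hypothetical helper for illustration only; no WinForms dependency.
public class RefreshCountdown
{
    private readonly int ticksToWait; // ticks that must pass after the last key release
    private int ticks;                // ticks counted in the current cycle
    private bool armed;               // true while a refresh is pending

    public RefreshCountdown(int ticksToWait)
    {
        this.ticksToWait = ticksToWait;
    }

    // Call from KeyDown: abort the pending refresh and restart the count.
    public void KeyDown()
    {
        armed = false;
        ticks = 0;
    }

    // Call from KeyUp: start waiting for the quiet period.
    public void KeyUp()
    {
        armed = true;
    }

    // Call from the timer's Tick; returns true when the highlight should run.
    public bool Tick()
    {
        if (!armed)
        {
            ticks = 0;
            return false;
        }
        ticks++;
        if (ticks < ticksToWait)
        {
            return false;
        }
        armed = false;
        ticks = 0;
        return true;
    }
}
```

In the form, the three handlers would then just forward to KeyDown(), KeyUp() and Tick(), calling syntax.HighlightHTM(Htm_Input) whenever Tick() returns true.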