Simple-Talk columnist

Sundown on Markdown?

Published 28 February 2014 10:47 am

Markdown is a way of using plain text to create markup, usually, but not exclusively, HTML. It is based on the conventions of plain-text email messages. It has never been a formal standard. If you’re a developer or DBA, and use Stack Exchange, Ask.SQLServerCentral.com, Stack Overflow or GitHub, then you’ll have used it.

The idea behind Markdown, which seems to be a development of AsciiDoc, is to make what is written as readable as possible in its source form. It’s not a new idea by any means. There even used to be a text-based spreadsheet, but for some reason the Markdown version got a lot of traction. Markdown has had three main uses:

  • Producing content quickly, for authors uninterested in the finer points of layout.

  • Producing academic work as a substitute for, or preliminary to, LaTeX.

  • Wikis, forums and blogs. Wiki mark-up conventions were developed so that users of websites would not have to add their contributions in HTML mark-up. HTML isn’t always entirely logical for this, and is obvious overkill where a user just wants to put in a couple of paragraphs and maybe a photo of themselves skiing. Besides, a mark-up convention makes it easier to ensure that text conforms with site conventions and, if it prevents HTML additions, it allows the site to stop malicious users adding cross-site scripting or other JavaScript poison.

Programmers seized upon Markdown as a great way to simplify the input of content and code snippets, and as a text-to-HTML conversion tool that lets web writers work in plain text and then convert it to structurally valid HTML. GitHub and Stack Overflow both use dialects of Markdown (GFM and Stack Overflow Markdown), with extensions that make it useful for mixing text and code when communicating about development work. Markdown is also used by some documentation generators, most notably Doxygen.

There is a continuing need for a common mark-up convention for wikis, blogs and forums. At the moment, Markdown seems to be what is most commonly used, apart from Creole and the original MediaWiki mark-up, which also uses plain-text conventions and has grown enormously to cope with the demands of creating academic articles.

The problem with Markdown, in its original form, is that it only does the simplest things, and even does them badly. Try puzzling out the standard way of introducing a line-break into a simple paragraph, or doing nested bullet points with paragraphs and code blocks! It makes a mess of lists, doesn’t do hyperlinks well, and has no standard way of representing tables and no understanding of semantic mark-up or style classes. It won’t do definition lists, citations, mathematics, or footnotes. It isn’t versatile enough to support its main use-case, rendering code, because there’s no support for labelling the language used in code snippets. It is also insecure, since it allows embedded HTML and therefore opens the door to cross-site scripting. You have to disable this entirely or add your own HTML sanitization.
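To make the line-break puzzle concrete: in the original Markdown syntax, a hard line-break inside a paragraph is written by ending the line with two or more spaces before the newline, which is invisible in most editors. The Python sketch below handles that one rule and nothing else. The function name render_paragraph is made up for illustration; it is not taken from any Markdown implementation.

```python
import re


def render_paragraph(source: str) -> str:
    """Render a single paragraph, applying only the hard line-break rule.

    Illustrative sketch: a real Markdown converter does far more than this.
    """
    text = source.strip("\n")
    # Two or more spaces immediately before a newline become a <br /> tag.
    text = re.sub(r" {2,}\n", "<br />\n", text)
    return f"<p>{text}</p>"


# The only difference between these two inputs is two trailing spaces.
soft = "Roses are red\nViolets are blue"
hard = "Roses are red  \nViolets are blue"

print(render_paragraph(soft))  # no <br />: the newline stays a soft wrap
print(render_paragraph(hard))  # <br /> inserted after the first line
```

The two inputs look identical on screen, which is much of the reason the convention trips people up.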

In short, the original Markdown, the de-facto standard, is illogical and ambiguous. As a result, its users have tried to improve it, and remedy some of the above failings, with their own extensions. GitHub and Doxygen both use extensions, and then there are Pandoc, MultiMarkdown, and Markdown Extra. The problem is that, in the absence of any evolving standard, these extensions have tended to drift off in their own incompatible ways, with each implementation doing things slightly differently, mainly due to the confusion over what was originally intended. Who knows what was intended when there is no public standard anyway? The original vague specification hasn’t changed since 2004.

So what’s the solution? We can adopt one of the “extended” Markdown-based languages (Pandoc seems to have the most energy behind it), but what is really lacking is an agreed international standard that is logical and consistent. For me, Asciidoctor, a development of AsciiDoc, which pre-dates Markdown, looks very hopeful if it can outgrow its Ruby roots. Markdown’s successor should allow the average user to do average things without feeling constrained, but should flex sufficiently to provide for more specialised use, such as is required by GitHub. More to the point, we can do nothing without an adopted standard that can be developed. I never expected that in 2014, I’d be pining for mark-up as simple as that in Ventura, or, heaven help us, something as versatile as LaTeX!

5 Responses to “Sundown on Markdown?”

  1. Stephanie Locke says:

    I’m a big fan of LaTeX and Pandoc. I use LaTeX for all my documents that need analytics and write it in either RStudio or writelatex.com. If you want a nice markdown editor I can recommend Texts (from texts.io).

  2. Keith Rowley says:

    What would be great would be if this new standard was built into SQL Server, or someone wrote an extension to SQL Server that allowed us to use it from there.
    I know this “should” be taken care of at the application level, but I have sent emails directly from SQL more times than I like to admit, and have to manually format them every time.

  3. Phil Factor says:

    @Keith
    In Doxygen, the application’s documentation can be embedded in the code that Doxygen is processing, within multi-line comments as Markdown, and generated from it. I don’t think that anyone has thought of a version of Doxygen for SQL Server, but a documentation system like that would certainly help a lot with keeping the source of your documentation all in one place. I’ve been able to use MultiMarkdown from SQL Server but it is a bit clunky.

  4. paschott says:

    I know we use it within our web app to handle basic formatting. We combine that with an RTE plug-in to make it easier for the users. It avoids issues w/ injection and gives the users access to most of the formatting they need. It definitely was painful for our dev team to pick a type of markdown and implement a decent RTE.

    A standard of some sort would be nice. Without a clear front-runner in the tools, it’s going to be hard to have a standard unless devs come together to create that standard. Until then, I just see this becoming more fragmented until people decide they need better standards. I think that will eventually happen, but not sure when.

  5. EdCharbeneau says:

    I use markdown to get content into CMSs. Generally clients send some awfully formatted Word document, which I then copy and paste into my favorite markdown editor and get back a simple HTML document which my CMS can understand.

    I’ve worked with many CMSs over the years and this is by far the best option I’ve found. Letting non-technical users enter HTML, Markdown, BB Code, etc. always ends in an ugly disaster.
