a system of words, letters, figures, or other symbols substituted for other words, letters, etc.

JavaScript, HTML, and CSS are all code. Each is its own language, with its own rules. They share many of the same symbols, but attribute different meanings to them. The process of substitution, or re-encoding, further disambiguates them, as they target different domains.


HTML is a markup language which annotates textual information with type information. It denotes a particular string as a header, or a paragraph.

CSS is a styling language which describes the desired look of a particular type of element. It denotes that a header should look this way, and a paragraph should look that way.

JavaScript is a computing language which describes procedures or functional relationships between fixed or variable datapoints.

My Preference

For the most part, I don't really like styling. I'd rather someone else did it. That isn't to say that I don't think it's valuable. CSS and HTML have certainly improved over the years, which is to say that browsers which render them have improved, but they are still immature technologies.

There are tools like LESS which allow you to describe your styles in more functional terms, but browsers do not yet natively support LESS as an alternative to CSS. That means that if you write LESS, you have to run it through a compiler to translate it into the necessary format. That's only a minor annoyance, and given that it is likely to save writing a bunch of CSS, it's probably worth it. Still, I look forward to the day when browsers break the monopoly which CSS has over the web's native ecosystem.


Though I'd rather spend my time on computing, it doesn't make sense for most of my projects to pay someone else to style them for me. As such, it's a practical necessity that I improve my design skills. I understand a fair bit about CSS, but the sheer size of modern stylesheets can be daunting.

While the JavaScript for my page (not including my Markdown parser) only weighs in at about 80 lines, my CSS is well over a thousand lines. For such a simple page, that seems excessive. I started ripping sections out, but decided that I may at some point need those portions of the code.

By spending a little bit of time programmatically editing my CSS, I can effectively cut down the size of these files. More importantly, I can gain a better understanding of what each line of code is accomplishing.

I know there are already tools that will do this, but I want the experience of doing it myself. I may even end up doing a better job of it.


It's always a good idea to identify your assumptions before jumping into a programming task. For this exercise, I'll assume that the file I'm trying to process contains valid CSS3.

I intend for my script to extract the styling rules, and output them in an easily parsed format. I'm only interested in the rules which will actually be applied. As such, if a rule is defined, and then replaced by a later rule applied to the same elements, I only want the second rule.

I want to be able to call this script from the command line, and pass any single CSS file as input.


Since I only care about machine readability, I'm going to start by stripping all of the comments and removing redundant whitespace.

var fs=require("fs"); // load the filesystem module

var fn=process.argv[2]; // fn is the filename passed at the command line

if(!fn || !fn.match(/^.*\.css$/)){  // if the user didn't pass a valid filename
  console.log("try providing the name of a css file"); // complain
  process.exit(1); // and quit with a non-zero status
}

var css=fs.readFileSync(fn,"utf8"); // read and store the file

var uncomment=function(css){ // take css as input
  return css.split("\n") // split the file by newlines
    .map(function(line){ // for each line
      return line.replace(/\/\/.*$/,""); // remove line comments
    })
    .filter(function(line){ // filter out blank lines
      return !line.match(/^\s*$/);
    }).join(" ") // join each line with a single space
    .replace(/\s+/g," ") // remove extraneous spaces
    .replace(/\/\*([\s\S]*?)\*\//g,""); // remove multiline comments
    // that's it!
};

var trimmed=uncomment(css); // call the function and store the return value
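
One caveat with the line-comment step: `//` is not a comment in standard CSS, and a regex that cuts everything after `//` will also cut a value such as `url(http://example.com/a.png)`. As a rough sketch of an alternative, assuming the stylesheet only uses `/* ... */` comments, the following variant leaves `//` alone. The name `stripBlockComments` is made up for this example, and it does not try to handle comment markers inside quoted strings.

```javascript
// A variant that leaves "//" alone, so URLs survive.
// 'stripBlockComments' is a hypothetical name, not part of the original script.
var stripBlockComments=function(css){
  return css
    .replace(/\/\*[\s\S]*?\*\//g,"") // remove /* ... */ comments first
    .replace(/\s+/g," ") // collapse runs of whitespace (including newlines)
    .replace(/^ | $/g,""); // drop a leading or trailing space
};

var sample="/* header */\n.logo{\n  background:url(http://example.com/a.png);\n}\n";
console.log(stripBlockComments(sample));
// prints .logo{ background:url(http://example.com/a.png); }
```

Removing the comments before collapsing whitespace means a comment that spans several lines is handled the same way as one that sits on a single line.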

Pulling apart the pieces

The web these days is all abuzz over 'responsiveness'. In case you've been living under a rock, that means that you should be able to view a page on different screen sizes, and it should respond to your particular device's limitations.

This is usually accomplished with a combination of two techniques:

Grid classes

These divide sections of the screen into even portions and allow you to direct elements to occupy some portion of the screen.

Media queries

These ask the device for the dimensions of its screen, and allow you to write rules which only take effect under certain conditions.

Depending on your target market, it's possible (though usually not the case) that you will only need one or the other.

Media queries differ from other CSS directives in that they contain nested style directives. As such, it will simplify the task of parsing the rest of the file if we remove them now.

var findMQ=function(css){
  // this is a common pattern I use when pulling components out of strings
  var MQ=[]; // create an accumulator in the form of an array
  return { // return an object containing the results
    // the first result is the CSS without the media queries
    css:css.replace( // we replace the media queries with an empty string
      /@media[^\{]*\{[\s\S]*?\}\s*\}/g // after regexing them (lazily, so each query is matched on its own)
      ,function(m){ // but store the queries themselves by passing a function
        MQ.push(m); // which pushes the recognized patterns to an array
        return ""; // before removing them from the string
      }) // so css is a little cleaner now
    ,mq:MQ // and we also return the media queries
  };
};

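var queryFree,queries;
(function(){ // let's make a closure
  var temp=findMQ(trimmed); // and throw away this intermediary value
  queryFree=temp.css; // keep the CSS with the media queries removed
  queries=temp.mq; // and the media queries themselves
})(); // immediate invocation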
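
A regex can only guess where a nested block ends, so it is sensitive to how the braces happen to be spaced. As a hedged alternative, here is a minimal sketch that counts braces instead. The helper name `extractMedia` is invented for this example, it returns the same `{css, mq}` shape as `findMQ`, and it assumes braces never appear inside quoted strings.

```javascript
// 'extractMedia' is a hypothetical helper, not part of the original script.
var extractMedia=function(css){
  var queries=[]; // accumulates each complete @media block
  var rest=""; // everything outside of @media blocks
  var i=0;
  while(i<css.length){
    var start=css.indexOf("@media",i);
    if(start===-1){ // no more media queries
      rest+=css.slice(i);
      break;
    }
    rest+=css.slice(i,start); // keep what came before this query
    var open=css.indexOf("{",start);
    if(open===-1){ // malformed input: no block follows
      rest+=css.slice(start);
      break;
    }
    var depth=0;
    var j=open;
    for(;j<css.length;j++){ // walk forward, tracking nesting depth
      if(css[j]==="{"){ depth++; }
      else if(css[j]==="}"){
        depth--;
        if(depth===0){ break; } // found the brace that closes the query
      }
    }
    queries.push(css.slice(start,j+1));
    i=j+1; // continue after the closing brace
  }
  return {css:rest,mq:queries};
};

var out=extractMedia("a{ color:red; } @media (max-width:600px){ a{ color:blue; } } b{ margin:0; }");
console.log(out.mq); // the single media query, nested rule included
console.log(out.css); // the two remaining top-level rules
```

The trade-off is length: the loop is longer than a one-line regex, but it does not care whether the closing braces are written as `} }` or `}}`.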

Getting organized

Now that we've isolated the media queries, we shouldn't have to deal with any other exceptions to the basic syntax. Every style rule should come in one format: a selector, followed by a sequence of rules enclosed in a pair of curly braces, like so:

.toc{ clear:left; margin:3%; width:200px; }

So that simplifies things. Order matters when it comes to CSS. When two rules share the same selector, directives occurring later in the source override conflicting directives which precede them.

We can reuse the pattern employed in findMQ, and create an accumulator which we can then use to collect the results of our search over the various rules. By using an object instead of an array, and using functionally unique selectors as keys, we can ensure that all instances of a particular selector are stored in one place.
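
To make the idea concrete, here is a small sketch of an object used as an accumulator. The sample data and the name `seen` are invented for illustration; the point is only that writing to the same key twice keeps the later value.

```javascript
var seen={}; // accumulator: one entry per selector
[
  {sel:"h1",rules:{color:"red",margin:"0"}},
  {sel:"p",rules:{color:"black"}},
  {sel:"h1",rules:{color:"blue"}} // a later rule for the same selector
].forEach(function(rule){
  seen[rule.sel]=seen[rule.sel]||{}; // create the entry on first sight
  Object.keys(rule.rules).forEach(function(prop){
    seen[rule.sel][prop]=rule.rules[prop]; // later values overwrite earlier ones
  });
});
console.log(JSON.stringify(seen));
// prints {"h1":{"color":"blue","margin":"0"},"p":{"color":"black"}}
```

Note that `margin` survives from the first `h1` rule, since only the conflicting `color` was overwritten.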

We can break this code into two problems: dividing the CSS blob into valid rules which we can iterate over, and actually performing the iteration. Once we've defined our function, we'll be able to apply it to the contents of our media queries as well.

var splitByRule=function(trimmed){
  var temp=[]; // an accumulator in the form of an array
  trimmed.replace( // walk over the CSS, one rule at a time
    /[^\{]*\{[^\}]*}/g // regex for (selectors -> rules in curly braces)
    ,function(rule){ // for every rule
      temp.push(rule.replace(/^\s+/,"")); // push a trimmed rule to the array
      return ""; // replace it with the empty string
    });
  return temp; // return the array
};

var rules=splitByRule(queryFree);
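
Using `replace` purely for its callback works, but when nothing needs to be removed from the string, `String.prototype.match` with a global regex returns the same pieces directly. A minimal sketch, using the made-up name `splitWithMatch`:

```javascript
// 'splitWithMatch' is a hypothetical name used only for this sketch.
var splitWithMatch=function(css){
  var found=css.match(/[^{]*\{[^}]*\}/g)||[]; // match returns null when nothing is found
  return found.map(function(rule){
    return rule.replace(/^\s+/,""); // trim leading whitespace from each rule
  });
};

console.log(splitWithMatch("h1{ color:red; } p{ margin:0; }"));
// prints [ 'h1{ color:red; }', 'p{ margin:0; }' ]
```

The `||[]` guard matters: without it, an empty stylesheet would make the subsequent `map` call throw.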

Treading carefully

It's generally a good idea before iterating over a large amount of data to make sure that the function you are applying can be safely applied to each element. After all, it would be easy to miss any errors.

var parseRule=function(rule){
  var O={ // we're going to return an object
    src:rule // store the source of the rule
    ,rules:{} // instantiate another object to store the rules
  };
  var temp=""; // stays empty if the rule has no body
  var sel=rule.replace(/\{[^\}]*\}/ // the selector is what's left over
    ,function(attr){ // once we pull out the block of attributes
      temp=attr.slice(1,-1); // strip out the curly braces
      return ""; // replace it with the empty string
    })
    .replace(/^\s*/,"") // trim leading spaces
    .replace(/\s*$/,""); // trim trailing spaces
  temp // at this point, temp consists of a series of rules
    .split(";") // splitting by semicolon produces an array
    .filter(function(rules){ // we only want valid rules
      return rules.match(/:/);  // they must contain a colon
    })
    .map(function(r){ // for every valid rule
      var key,val; // to store the key and val
      r.replace(/[^\s:]+\:/,function(k){ // grab the first bit, up to the first colon
        key=k.slice(0,-1); // throw away the colon and assign to 'key'
        return ""; // replace with the empty string
      }).replace(/[\s\S]+$/,function(v){ // get the rest of the rule
        val=v.replace(/^\s+/,""); // assign it to 'val'
        return ""; // replace it with the empty string
      });
      O.rules[key]=val; // write (or overwrite) this attribute
    });
  O.sel=sel; // store the selector
  return O; // return the produced rule
};

Now we can compare the source to the result. I passed it the CSS for my table of contents.

{ src: '.toc{ clear:left; margin:3%; width:200px; border:3px dotted #AAA; float:right; display:hidden; }',
  rules:
   { clear: 'left',
     margin: '3%',
     width: '200px',
     border: '3px dotted #AAA',
     float: 'right',
     display: 'hidden' },
  sel: '.toc' }

Looks good to me! It should be safe to use on the whole array of rules, then...

var parsedRules=rules.map(parseRule);
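
Checking one rule by hand is a good start. For a larger input, a small harness can apply a function to every element and collect the failures instead of stopping at the first one. The sketch below is only an illustration: `checkAll` and the stand-in parser `strictParse` are invented names, not part of the script above.

```javascript
// 'checkAll' is a hypothetical helper, not part of the original script.
var checkAll=function(items,fn){
  var failures=[]; // collect problems instead of stopping at the first one
  items.forEach(function(item,index){
    try{
      fn(item);
    }catch(err){
      failures.push({index:index,item:item,message:err.message});
    }
  });
  return failures;
};

// a stand-in parser that rejects rules without a closing brace
var strictParse=function(rule){
  if(!/\}\s*$/.test(rule)){ throw new Error("missing closing brace"); }
  return rule;
};

var problems=checkAll(["h1{ color:red; }","p{ margin:0;"],strictParse);
console.log(problems); // one entry, describing the second (unterminated) rule
```

An empty result means the function ran over every element without throwing, which is a weaker guarantee than "every result is correct", but it catches the crashes that would otherwise abort the whole run.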

Finishing the job

I said I wanted a nice, machine readable format for the final data. For that, I'm going to use JSON. I just need to map over this final array, using each selector as a key, adding to that key's value as I go, or instantiating the value as necessary.

var CSS={}; // the object we'll be populating
parsedRules.map(function(R){ // for every rule object 'R'
  var sel=R.sel;

  if(!CSS[sel]){ // if the CSS selector does not exist
    CSS[sel]={}; // instantiate it
  }

  Object.keys(R.rules).map(function(r){ // for every sub-rule
    CSS[sel][r]=R.rules[r]; // write (or overwrite) the attribute
  });
}); // that's it!

console.log(JSON.stringify(CSS)); // totally machine readable, no redundancy!
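
Because the output is plain JSON, other programs can consume it with very little code. As one possible use, here is a sketch that turns an object of that shape back into CSS text. The helper name `toCSS` is hypothetical, and the sample input is a shortened version of the `.toc` rule.

```javascript
// 'toCSS' is a hypothetical helper, not part of the original script.
var toCSS=function(obj){
  return Object.keys(obj).map(function(sel){ // for every selector
    var body=Object.keys(obj[sel]).map(function(prop){ // for every attribute
      return prop+":"+obj[sel][prop]+";";
    }).join(" ");
    return sel+"{ "+body+" }";
  }).join("\n"); // one rule per line
};

var parsed=JSON.parse('{".toc":{"clear":"left","margin":"3%"}}');
console.log(toCSS(parsed));
// prints .toc{ clear:left; margin:3%; }
```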

I called this script on the CSS for this website, and redirected it to a JSON file using the following command:

node cut.js pure.dark.css > pure.dark.css.json

You can check out the results here! Thanks for reading.


I forgot about the media queries!

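var parseQuery=function(q){
  var qObj={}; // assumed shape: {query: condition, rules: {selector: attributes}}
  var temp={};
  var body=q.replace(/@media\s*/,"") // you can throw this away, we know what it is
    .replace(/^[^\{]*/,function(cond){ // what precedes the brace is the condition
      qObj.query=cond.replace(/\s+$/,""); // store it without trailing spaces
      return "";
    });
  var rules=splitByRule(body.slice(1,-1)).map(parseRule);

  rules.map(function(R){ // merge the rules by selector, as before
    var sel=R.sel;
    if(!temp[sel]){ temp[sel]={}; }
    Object.keys(R.rules).map(function(r){
      temp[sel][r]=R.rules[r];
    });
  });

  qObj.rules=temp;
  return qObj;
};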

Having defined a function that we can safely map over an array of queries, let's output everything together!

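console.log(JSON.stringify({ // the key names here are assumed
  css:CSS // the rules outside of any media query
  ,queries:queries.map(parseQuery) // and each media query, parsed
})); // totally machine readable, no redundancy!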

You can find the exact script I used to process my CSS here.

You can see its output here.