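Understanding How a Template System Parses and Renders Data Through Mustache's render Function

24 min read · Jun 30, 2023

Hi everyone. Today we'll look at Mustache, a classic template system, and read its source code to understand how it works.

What is Mustache?

Mustache was first released in 2009, so it is a fairly old template system; the JS version was last updated more than two years ago. Here is a sample: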

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta http-equiv="X-UA-Compatible" content="IE=edge" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Document</title>
  </head>
  <body>
    <div id="app"></div>
    <script src="https://cdn.jsdelivr.net/npm/mustache@4.2.0/mustache.js"></script>
    <script>
      const view = {
        name: "Eason",
        age: () => new Date().getFullYear() - 2004,
      };

      const output = Mustache.render("Hello, I'm {{name}} and I'm {{age}} years old.", view);
      document.querySelector("#app").innerHTML = output;
    </script>
  </body>
</html>

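The page renders "Hello, I'm Eason and I'm 19 years old." (as of 2023). Mustache.render is called with a string and an object holding two keys, name and age. Compare the two and you'll see that the object's values have been substituted into {{name}} and {{age}} in the string; because age is a function, Mustache calls it and uses its return value. Arrays work too: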

const view = {
  items: [
    {
      id: 1,
      fruit: "apple",
    },
    {
      id: 2,
      fruit: "banana",
    },
    {
      id: 3,
      fruit: "pear",
    },
  ],
};

const output = Mustache.render(
  "{{#items}}<p>{{id}}-{{fruit}}</p>{{/items}}",
  view
);
document.querySelector("#app").innerHTML = output;

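Mustache iterates over items and renders one <p> element per entry, in order: 1-apple, 2-banana, and 3-pear.

Now that we know the basic usage, let's look at the source.

Understanding Mustache.render

We'll use the string "{{#items}}<p>{{id}}-{{fruit}}</p>{{/items}}" as the running example. Reading the whole source would make this post too long, so we'll focus on its most important function, render. Let's start with mustache.render: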

mustache.render = function render (template, view, partials, config) {
  if (typeof template !== 'string') {
    throw new TypeError('Invalid template! Template should be a "string" ' +
                        'but "' + typeStr(template) + '" was given as the first ' +
                        'argument for mustache#render(template, view, partials)');
  }

  return defaultWriter.render(template, view, partials, config);
};

Here it first checks the template argument and throws a TypeError if it is not a string. The error message uses a custom typeStr function:

function typeStr (obj) {
  return isArray(obj) ? 'array' : typeof obj;
}
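As a quick illustration of what this helper adds over plain typeof, here is a standalone sketch. It calls Array.isArray directly instead of the library's isArray alias, which is a simplification made for this example.

```javascript
// Standalone copy of the helper; Array.isArray stands in for mustache's isArray alias.
function typeStr(obj) {
  return Array.isArray(obj) ? 'array' : typeof obj;
}

const samples = [[1, 2], { a: 1 }, 'text', 42, undefined];
const report = samples.map((value) => ({
  plain: typeof value,    // what typeof alone reports
  helper: typeStr(value)  // what the error message would show
}));

console.log(report[0]); // { plain: 'object', helper: 'array' }
console.log(report[1]); // { plain: 'object', helper: 'object' }
```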

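Because typeof [] returns 'object', the helper adds a thin layer on top so that the error message can tell objects and arrays apart, which makes debugging easier. isArray maps directly to Array.isArray, with a polyfill for older browsers that lack it. Next, when template passes the type check: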

return defaultWriter.render(template, view, partials, config);

Tracing defaultWriter back to where it is defined, we find:

var defaultWriter = new Writer();

Following Writer's prototype, we find Writer.prototype.render:

Writer.prototype.render = function render (template, view, partials, config) {
  var tags = this.getConfigTags(config);
  var tokens = this.parse(template, tags);
  var context = (view instanceof Context) ? view : new Context(view, undefined);
  return this.renderTokens(tokens, context, partials, template, config);
};

Let's look at this line first:

var tags = this.getConfigTags(config);

getConfigTags here refers to Writer.prototype.getConfigTags:

Writer.prototype.getConfigTags = function getConfigTags (config) {
  if (isArray(config)) {
    return config;
  }
  else if (config && typeof config === 'object') {
    return config.tags;
  }
  else {
    return undefined;
  }
};
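To see the three branches in isolation, the sketch below copies the logic into a free function and feeds it each kind of input. The name getConfigTagsSketch and the sample delimiters are made up for this example.

```javascript
// Free-function copy of the logic above; the name is hypothetical.
function getConfigTagsSketch(config) {
  if (Array.isArray(config)) return config;                      // tags passed directly
  if (config && typeof config === 'object') return config.tags;  // config object
  return undefined;                                              // nothing usable passed
}

const fromArray = getConfigTagsSketch(['<%', '%>']);
const fromObject = getConfigTagsSketch({ tags: ['[[', ']]'] });
const fromNothing = getConfigTagsSketch(undefined);

console.log(fromArray, fromObject, fromNothing);
```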

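Read alongside the comment on the render function, this is straightforward: if an array is passed in, it is returned directly as the tags; if an object is passed in, its config.tags is returned. In the example above: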

const output = Mustache.render(
  "{{#items}}<p>{{id}}-{{fruit}}</p>{{/items}}",
  view
);

Because we passed nothing as the fourth argument, config, getConfigTags returns undefined here. Next comes the parse function on the following line:

Writer.prototype.parse = function parse (template, tags) {
  var cache = this.templateCache;
  var cacheKey = template + ':' + (tags || mustache.tags).join(':');
  var isCacheEnabled = typeof cache !== 'undefined';
  var tokens = isCacheEnabled ? cache.get(cacheKey) : undefined;

  if (tokens == undefined) {
    tokens = parseTemplate(template, tags);
    isCacheEnabled && cache.set(cacheKey, tokens);
  }
  return tokens;
};

Before we can make sense of parse, let's spend some time on the parseTemplate function:

function parseTemplate (template, tags) {
  if (!template)
    return [];
  var lineHasNonSpace = false;
  var sections = [];     // Stack to hold section tokens
  var tokens = [];       // Buffer to hold the tokens
  var spaces = [];       // Indices of whitespace tokens on the current line
  var hasTag = false;    // Is there a {{tag}} on the current line?
  var nonSpace = false;  // Is there a non-space char on the current line?
  var indentation = '';  // Tracks indentation for tags that use it
  var tagIndex = 0;      // Stores a count of number of tags encountered on a line

  // Strips all whitespace tokens array for the current line
  // if there was a {{#tag}} on it and otherwise only space.
  function stripSpace () {
    if (hasTag && !nonSpace) {
      while (spaces.length)
        delete tokens[spaces.pop()];
    } else {
      spaces = [];
    }

    hasTag = false;
    nonSpace = false;
  }

  var openingTagRe, closingTagRe, closingCurlyRe;
  function compileTags (tagsToCompile) {
    if (typeof tagsToCompile === 'string')
      tagsToCompile = tagsToCompile.split(spaceRe, 2);

    if (!isArray(tagsToCompile) || tagsToCompile.length !== 2)
      throw new Error('Invalid tags: ' + tagsToCompile);

    openingTagRe = new RegExp(escapeRegExp(tagsToCompile[0]) + '\\s*');
    closingTagRe = new RegExp('\\s*' + escapeRegExp(tagsToCompile[1]));
    closingCurlyRe = new RegExp('\\s*' + escapeRegExp('}' + tagsToCompile[1]));
  }

  compileTags(tags || mustache.tags);

  var scanner = new Scanner(template);

  var start, type, value, chr, token, openSection;
  while (!scanner.eos()) {
    start = scanner.pos;

    // Match any text between tags.
    value = scanner.scanUntil(openingTagRe);

    if (value) {
      for (var i = 0, valueLength = value.length; i < valueLength; ++i) {
        chr = value.charAt(i);

        if (isWhitespace(chr)) {
          spaces.push(tokens.length);
          indentation += chr;
        } else {
          nonSpace = true;
          lineHasNonSpace = true;
          indentation += ' ';
        }

        tokens.push([ 'text', chr, start, start + 1 ]);
        start += 1;

        // Check for whitespace on the current line.
        if (chr === '\n') {
          stripSpace();
          indentation = '';
          tagIndex = 0;
          lineHasNonSpace = false;
        }
      }
    }

    // Match the opening tag.
    if (!scanner.scan(openingTagRe))
      break;

    hasTag = true;

    // Get the tag type.
    type = scanner.scan(tagRe) || 'name';
    scanner.scan(whiteRe);

    // Get the tag value.
    if (type === '=') {
      value = scanner.scanUntil(equalsRe);
      scanner.scan(equalsRe);
      scanner.scanUntil(closingTagRe);
    } else if (type === '{') {
      value = scanner.scanUntil(closingCurlyRe);
      scanner.scan(curlyRe);
      scanner.scanUntil(closingTagRe);
      type = '&';
    } else {
      value = scanner.scanUntil(closingTagRe);
    }

    // Match the closing tag.
    if (!scanner.scan(closingTagRe))
      throw new Error('Unclosed tag at ' + scanner.pos);

    if (type == '>') {
      token = [ type, value, start, scanner.pos, indentation, tagIndex, lineHasNonSpace ];
    } else {
      token = [ type, value, start, scanner.pos ];
    }
    tagIndex++;
    tokens.push(token);

    if (type === '#' || type === '^') {
      sections.push(token);
    } else if (type === '/') {
      // Check section nesting.
      openSection = sections.pop();

      if (!openSection)
        throw new Error('Unopened section "' + value + '" at ' + start);

      if (openSection[1] !== value)
        throw new Error('Unclosed section "' + openSection[1] + '" at ' + start);
    } else if (type === 'name' || type === '{' || type === '&') {
      nonSpace = true;
    } else if (type === '=') {
      // Set the tags for the next time around.
      compileTags(value);
    }
  }

  stripSpace();

  // Make sure there are no open sections when we're done.
  openSection = sections.pop();

  if (openSection)
    throw new Error('Unclosed section "' + openSection[1] + '" at ' + scanner.pos);

  return nestTokens(squashTokens(tokens));
}

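This function turns the template passed in into a tree of tokens. A token is a data structure defined by Mustache; in essence it is an array with at least four elements.

The first element is roughly its type, and there are three possibilities: the tag symbol itself, name, or text. For example, {{#items}} yields #, {{fruit}} yields name, and anything else yields text.

The second element is its value: items for {{#items}}, fruit for {{fruit}}, and otherwise the literal string itself (for example <p> or -).

The third and fourth elements are its start and end indices. For example, <p> occupies indices 10 to 13 of the string, so they are 10 and 13.

The section {{#items}}…{{/items}}, which is the outermost root node in our string, also carries a fifth and a sixth element: its child nodes, stored as tokens in the same format, and the index at which its closing tag starts.

Now for the code. The beginning only declares variables and functions, so we start from the first real function call: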

compileTags(tags || mustache.tags);

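Since we did not pass custom tags earlier, the || acts as a fallback and mustache.tags, that is ['{{', '}}'], is used. The function takes the tags and assigns the variables declared just above it (var openingTagRe, closingTagRe, closingCurlyRe;) regular expressions in which the delimiters are backslash-escaped. By default the three values are /\{\{\s*/, /\s*\}\}/, and /\s*\}\}\}/. The \s* is there because Mustache also accepts tags written with whitespace, such as {{# items }}; the backslashes keep the braces from being treated as regex metacharacters.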
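The following sketch rebuilds the three expressions outside the library so the result can be inspected. The function names escapeRegExpSketch and compileTagsSketch are invented for this example, and the escaping character class is my approximation of what the library does, not a verified copy.

```javascript
// Escape characters that have a special meaning inside a regular expression.
// The character class is an approximation written for this sketch.
function escapeRegExpSketch(string) {
  return string.replace(/[\-\[\]{}()*+?.,\\\^$|#\s]/g, '\\$&');
}

// Build the three expressions from a [open, close] pair of delimiters.
function compileTagsSketch(tags) {
  return {
    openingTagRe: new RegExp(escapeRegExpSketch(tags[0]) + '\\s*'),
    closingTagRe: new RegExp('\\s*' + escapeRegExpSketch(tags[1])),
    closingCurlyRe: new RegExp('\\s*' + escapeRegExpSketch('}' + tags[1]))
  };
}

const compiled = compileTagsSketch(['{{', '}}']);
console.log(compiled.openingTagRe.source);   // \{\{\s*
console.log(compiled.closingTagRe.source);   // \s*\}\}
console.log(compiled.closingCurlyRe.source); // \s*\}\}\}

// Whitespace just inside the delimiters is absorbed by the \s* parts.
const spaced = '{{ name }}';
const openMatch = compiled.openingTagRe.exec(spaced)[0];  // "{{ "
const closeMatch = compiled.closingTagRe.exec(spaced)[0]; // " }}"
console.log(JSON.stringify(openMatch), JSON.stringify(closeMatch));
```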

Next we see:

var scanner = new Scanner(template);

Scanner is a utility defined inside the Mustache source, used to scan tokens out of the string:

function Scanner (string) {
  this.string = string;
  this.tail = string;
  this.pos = 0;
}

The while loop uses eos, a function on Scanner's prototype, as its condition. eos checks whether the Scanner's tail is an empty string, so the loop ends once tail is empty:

while (!scanner.eos()){}
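To make the loop easier to picture, here is a simplified stand-in for the scanner. MiniScanner is written for this post and is not the library's class; it only assumes that scan consumes a match at the current position and scanUntil consumes everything before the next match, which is how parseTemplate uses them.

```javascript
// Simplified stand-in for mustache's Scanner, written for illustration only.
class MiniScanner {
  constructor(string) {
    this.tail = string; // the part of the input not consumed yet
    this.pos = 0;       // how many characters have been consumed
  }

  // End of string: nothing left to scan.
  eos() {
    return this.tail === '';
  }

  // Consume `re` only if it matches right at the current position.
  scan(re) {
    const match = this.tail.match(re);
    if (!match || match.index !== 0) return '';
    this.tail = this.tail.substring(match[0].length);
    this.pos += match[0].length;
    return match[0];
  }

  // Consume and return everything up to (not including) the next match of `re`.
  scanUntil(re) {
    const index = this.tail.search(re);
    const skipped = index === -1 ? this.tail : this.tail.substring(0, index);
    this.tail = this.tail.substring(skipped.length);
    this.pos += skipped.length;
    return skipped;
  }
}

const demoScanner = new MiniScanner('<p>{{id}}</p>');
const pieces = [];
while (!demoScanner.eos()) {
  pieces.push(demoScanner.scanUntil(/\{\{\s*/)); // text before the next tag
  if (!demoScanner.scan(/\{\{\s*/)) break;       // no more tags
  pieces.push(demoScanner.scanUntil(/\s*\}\}/)); // the tag's content
  demoScanner.scan(/\s*\}\}/);                   // step over the closing delimiter
}
console.log(pieces); // [ '<p>', 'id', '</p>' ]
```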

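Each iteration then does the following:

Try to find the text between tags (for example <p> or -)
If a value is found, walk through it character by character and build tokens like this: ['text', '<', 10, 11]
Scan the string for the tag's type, then use that type to decide how to read the tag's value. At this point the variables we put into the string, such as items, id, and fruit, have been separated out.
Convert it into a token and push it onto the tokens array

At this point tokens holds a flat list of tokens whose types are a tag symbol, name, or text. To turn it into the structure Mustache defines, squashTokens and then nestTokens are called at the end, and the result is returned.

Let's look at squashTokens first: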

function squashTokens (tokens) {
  var squashedTokens = [];

  var token, lastToken;
  for (var i = 0, numTokens = tokens.length; i < numTokens; ++i) {
    token = tokens[i];

    if (token) {
      if (token[0] === 'text' && lastToken && lastToken[0] === 'text') {
        lastToken[1] += token[1];
        lastToken[3] = token[3];
      } else {
        squashedTokens.push(token);
        lastToken = token;
      }
    }
  }

  return squashedTokens;
}
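As a small worked example, the sketch below runs the same merging logic on the first few per-character tokens of our template. squashSketch is a standalone copy made for this post, and the input list is written out by hand.

```javascript
// Standalone copy of the merging logic, for illustration.
function squashSketch(tokens) {
  const squashed = [];
  let lastToken;
  for (const token of tokens) {
    if (!token) continue; // slots deleted by stripSpace are skipped
    if (token[0] === 'text' && lastToken && lastToken[0] === 'text') {
      lastToken[1] += token[1]; // append the character
      lastToken[3] = token[3];  // extend the end index
    } else {
      squashed.push(token);
      lastToken = token;
    }
  }
  return squashed;
}

const perCharacter = [
  ['#', 'items', 0, 10],
  ['text', '<', 10, 11],
  ['text', 'p', 11, 12],
  ['text', '>', 12, 13],
  ['name', 'id', 13, 19]
];
const squashedDemo = squashSketch(perCharacter);
console.log(JSON.stringify(squashedDemo));
// [["#","items",0,10],["text","<p>",10,13],["name","id",13,19]]
```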

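It iterates over the tokens passed in and checks each tokens[i]. The first iteration always takes the else branch, because lastToken is only assigned there. During the scan above, Mustache stored every character of a text run as its own text token; this function simply joins consecutive ones back together, producing for example ['text', '<p>', 10, 13]. Next is nestTokens: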

function nestTokens (tokens) {
  var nestedTokens = [];
  var collector = nestedTokens;
  var sections = [];
  var token, section;
  for (var i = 0, numTokens = tokens.length; i < numTokens; ++i) {
    token = tokens[i];

    switch (token[0]) {
      case '#':
      case '^':
        collector.push(token);
        sections.push(token);
        collector = token[4] = [];
        break;
      case '/':
        section = sections.pop();
        section[5] = token[2];
        collector = sections.length > 0 ? sections[sections.length - 1][4] : nestedTokens;
        break;
      default:
        collector.push(token);
    }
  }

  return nestedTokens;
}

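Again it iterates over the tokens passed in and checks each tokens[i]:

The first token visited is ['#', 'items', 0, 10], so it is pushed onto both collector and sections, and its fifth element (index 4) is set to an empty array that will hold what sits inside {{#items}}…{{/items}}. collector now points at that array.
The following tokens, such as ['text', '<p>', 10, 13], are all pushed onto collector, until the final ['/', 'items', 33, 43] is reached.
The # token saved earlier is then popped off sections and its sixth element (index 5) is set to 33, the start index of the closing tag. Because collector was that token's own child array, the children are already in place; collector is simply pointed back at the parent level, which here is nestedTokens.

Once parseTemplate finishes, the result for our template is a one-element array: the # token for items, whose fifth element holds the five child tokens (<p>, id, -, fruit, </p>) and whose sixth element is 33.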
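Written out as a literal, that tree looks like the sketch below. The indices were worked out by hand from the template "{{#items}}<p>{{id}}-{{fruit}}</p>{{/items}}" and the code above rather than captured from a run, and the describe helper is only there to print the nesting.

```javascript
// Token tree for "{{#items}}<p>{{id}}-{{fruit}}</p>{{/items}}", derived by hand.
const tokenTree = [
  ['#', 'items', 0, 10, [
    ['text', '<p>', 10, 13],
    ['name', 'id', 13, 19],
    ['text', '-', 19, 20],
    ['name', 'fruit', 20, 29],
    ['text', '</p>', 29, 33]
  ], 33]
];

// Return one line per token, indenting the children of a section.
function describe(tokens, depth = 0) {
  const lines = [];
  for (const token of tokens) {
    lines.push('  '.repeat(depth) + token[0] + ' ' + JSON.stringify(token[1]));
    if (Array.isArray(token[4])) lines.push(...describe(token[4], depth + 1));
  }
  return lines;
}

console.log(describe(tokenTree).join('\n'));
```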

Now we can go back to Writer.prototype.render and look at:

var context = (view instanceof Context) ? view : new Context(view, undefined);

This checks whether view (the data object we passed in) is already a Context instance; if not, it is wrapped in a new Context. Next, let's look at renderTokens:

Writer.prototype.renderTokens = function renderTokens (tokens, context, partials, originalTemplate, config) {
  var buffer = '';

  var token, symbol, value;
  for (var i = 0, numTokens = tokens.length; i < numTokens; ++i) {
    value = undefined;
    token = tokens[i];
    symbol = token[0];

    if (symbol === '#') value = this.renderSection(token, context, partials, originalTemplate, config);
    else if (symbol === '^') value = this.renderInverted(token, context, partials, originalTemplate, config);
    else if (symbol === '>') value = this.renderPartial(token, context, partials, config);
    else if (symbol === '&') value = this.unescapedValue(token, context);
    else if (symbol === 'name') value = this.escapedValue(token, context, config);
    else if (symbol === 'text') value = this.rawValue(token);

    if (value !== undefined)
      buffer += value;
  }
  
  return buffer;
};

Our outermost token has type #, so renderSection runs first. Roughly, renderSection does the following:

1. Look up the value for the token's second element, token[1] (items here):

[
    {
        "id": 1,
        "fruit": "apple"
    },
    {
        "id": 2,
        "fruit": "banana"
    },
    {
        "id": 3,
        "fruit": "pear"
    }
]

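2. Iterate over the array above, passing the child tokens and the current object into renderTokens again. This recursion renders the content inside {{#items}}…{{/items}} once per array element, in order.

3. The first token in our child array is ["text", "<p>", 10, 13], so the recursive renderTokens call goes straight to Writer.prototype.rawValue. Since this is plain text, rawValue's logic is trivial: it returns the string stored in the token.

4. Next comes the name token ["name", "id", 13, 19], which maps to Writer.prototype.escapedValue. It likewise looks up the value of the variable id and checks it: a number is converted to a string and rendered, and anything else has its special characters escaped (for example, & becomes &amp;) so that it cannot break the HTML when it is parsed.

5. Once rendering completes we have a string that can be placed into a DOM node. Note that the caching has already happened earlier, inside parse: Mustache stores the parsed tokens under a key built from the template and the tags, so the next render of the same template looks in the cache first and can skip parsing, which improves performance: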

isCacheEnabled && cache.set(cacheKey, tokens);

6. Return the resulting string:

<p>1-apple</p><p>2-banana</p><p>3-pear</p>

Finally, with our

document.querySelector("#app").innerHTML = output;

the result is rendered to the page.
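The rendering steps above can be condensed into a small sketch. It is not the library's code: it handles only text, name, and # tokens, treats a section value as either an array or a plain truthy flag, uses a stack of plain objects in place of Context, and escapes with a reduced entity table. All function names are invented for this example, and the token tree is the hand-derived one for our template.

```javascript
// Token tree for "{{#items}}<p>{{id}}-{{fruit}}</p>{{/items}}" (worked out by hand).
const tree = [
  ['#', 'items', 0, 10, [
    ['text', '<p>', 10, 13],
    ['name', 'id', 13, 19],
    ['text', '-', 19, 20],
    ['name', 'fruit', 20, 29],
    ['text', '</p>', 29, 33]
  ], 33]
];

// Reduced escape table; an assumption for this sketch, not the library's full table.
const ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
function escapeHtmlSketch(value) {
  return String(value).replace(/[&<>"']/g, (chr) => ENTITIES[chr]);
}

// `scopes` is a stack of view objects, innermost last (a stand-in for Context).
function lookupSketch(scopes, name) {
  for (let i = scopes.length - 1; i >= 0; i--) {
    const scope = scopes[i];
    if (scope != null && typeof scope === 'object' && name in scope) return scope[name];
  }
  return undefined;
}

function renderTokensSketch(tokens, scopes) {
  let buffer = '';
  for (const [symbol, value, , , children] of tokens) {
    if (symbol === 'text') {
      buffer += value; // raw text is copied as-is
    } else if (symbol === 'name') {
      const found = lookupSketch(scopes, value);
      if (found != null) buffer += escapeHtmlSketch(found);
    } else if (symbol === '#') {
      const found = lookupSketch(scopes, value);
      if (Array.isArray(found)) {
        // Render the child tokens once per array element, with the element as the innermost scope.
        for (const item of found) buffer += renderTokensSketch(children, [...scopes, item]);
      } else if (found) {
        buffer += renderTokensSketch(children, scopes);
      }
    }
  }
  return buffer;
}

const fruitView = {
  items: [
    { id: 1, fruit: 'apple' },
    { id: 2, fruit: 'banana' },
    { id: 3, fruit: 'pear' }
  ]
};
const rendered = renderTokensSketch(tree, [fruitView]);
console.log(rendered); // <p>1-apple</p><p>2-banana</p><p>3-pear</p>
```

Passing the scopes as a stack means a name missing from the current item is looked up in the enclosing objects, which mirrors the outward lookup behaviour in spirit only.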

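In broad strokes, render does the following:

Converts the template passed in into Mustache's own token structure
Combines the tokens with the matching data to produce a string that can be inserted into HTML
Caches the parsed tokens so that later render calls with the same template can skip parsing and save resources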
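The caching step can be sketched on its own as follows. This is a minimal illustration that follows the shape of the parse function shown earlier: fakeParse, tokenCache, DEFAULT_TAGS, and parseWithCache are names invented here, and a Map stands in for the library's cache object.

```javascript
// Hypothetical stand-in for parseTemplate that counts how often it runs.
let parseCalls = 0;
function fakeParse(template) {
  parseCalls += 1;
  return [['text', template, 0, template.length]];
}

const DEFAULT_TAGS = ['{{', '}}'];
const tokenCache = new Map();

// Build a key from the template and the tags, and reuse the tokens on a hit.
function parseWithCache(template, tags) {
  const cacheKey = template + ':' + (tags || DEFAULT_TAGS).join(':');
  let tokens = tokenCache.get(cacheKey);
  if (tokens === undefined) {
    tokens = fakeParse(template);
    tokenCache.set(cacheKey, tokens);
  }
  return tokens;
}

const firstParse = parseWithCache('Hello');
const secondParse = parseWithCache('Hello');                // cache hit, same array
const customParse = parseWithCache('Hello', ['<%', '%>']);  // different tags, new key

console.log(firstParse === secondParse, firstParse === customParse, parseCalls); // true false 2
```

Including the tags in the key matters because the same template text parses differently under different delimiters.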

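Mustache has many more features, and as a very mature template system it is highly fault-tolerant and flexible (for example, when a variable is not found at the current level, it looks in the enclosing levels). The source is still fairly complex, though, and some parts have not been abstracted into more readable functions or instances, so reading it felt rough at times.

In addition, render also handles partials (templates embedded in templates) and inverted sections (^). I did not spend much time on those, so I have not read that part thoroughly.

Even so, I learned quite a bit from reading this source.

I hope this post helps you understand Mustache and how its render works. If you find any mistakes, feel free to point them out.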

References:

GitHub - janl/mustache.js: Minimal templating with {{mustaches}} in JavaScript (github.com)


Written by Eason Lin
