    tree-sitter-hast

    1.1.1 • Public • Published

    NPM package to convert tree-sitter parsed syntax trees to syntax-highlighted hast (Hypertext Abstract Syntax Trees).

    The main motivation is to let tree-sitter syntax-highlight code in unified projects such as remark or rehype, via plugins such as remark-tree-sitter.

    Installation

    npm install tree-sitter-hast

    or

    yarn add tree-sitter-hast

    tree-sitter-hast is written in TypeScript and includes type definitions, so there is no need to install a separate @types/tree-sitter-hast package if you are using TypeScript.

    Usage

    Scope Mappings

    For syntax highlighting, this package uses the same process that Atom uses with tree-sitter. The HTML classes used for syntax highlighting do not correspond directly to nodes in the tree produced by tree-sitter, so scope mappings specify which classes should be applied to which syntax nodes. (You can read more in Atom's documentation on Creating a Grammar.)

    Every Atom package that provides language support using the new tree-sitter mechanism also includes a scope mapping, and this package provides functionality to directly use these packages for highlighting.
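
    As a rough illustration of the idea (not this package's internals), a scope mapping can be pictured as a table from syntax node types to dot-separated scope names, which are then split into HTML classes. The node types, the table contents, and the classesForNode helper below are hypothetical; real Atom grammars use richer, selector-based mappings.

```javascript
// Hypothetical, simplified scope mapping: syntax node type -> Atom-style scope.
const scopeMappings = new Map([
  ['program', 'source.ts'],
  ['let', 'storage.type'],
  ['=', 'keyword.operator.js'],
  ['number', 'constant.numeric'],
]);

// Split a dot-separated scope into the class names applied to an element.
// Node types without a mapping get no classes.
function classesForNode(nodeType, mappings) {
  const scope = mappings.get(nodeType);
  return scope ? scope.split('.') : [];
}

console.log(classesForNode('let', scopeMappings)); // [ 'storage', 'type' ]
console.log(classesForNode('identifier', scopeMappings)); // []
```

    This is why a single syntax node can end up with several classes, such as "storage" and "type", in the highlighted output.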

    To use an Atom language package, you first need to install it with npm install or yarn add, like any other package. Unfortunately, most APM packages are not published on NPM, so I've started to make some of them available under the NPM organization @atom-languages.

    After installing a language package, you can use loadLanguagesFromPackage to prepare its languages for use with tree-sitter-hast.

    Example

    npm install tree-sitter-hast @atom-languages/language-typescript

    examples/example-1.js

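    const treeSitterHast = require('tree-sitter-hast');

    // Load every language bundled in the Atom language package.
    treeSitterHast
      .loadLanguagesFromPackage('@atom-languages/language-typescript')
      .then(languages => {
        // List the names of the languages that were loaded.
        console.log(Array.from(languages.keys()));
      });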

    Output:

    [ 'flow', 'tsx', 'typescript' ]

    Highlighting

    Highlighting is made available through the following functions:

    • highlightText(language, text, [options]) - highlight some plain text, using a language that's been made available by loadLanguagesFromPackage
    • highlightText(parser, scopeMappings, text, [options]) - highlight some plain text, and use a Parser that's already been prepared
    • highlightTree(scopeMappings, text, tree, [options]) - highlight a tree that's already been parsed by tree-sitter

    Example

    The following 3 examples all produce the same output.

    npm install tree-sitter-hast @atom-languages/language-typescript

    examples/example-2-1.js

    const treeSitterHast = require('tree-sitter-hast');

    const text = 'let v = 3';

    treeSitterHast
      .loadLanguagesFromPackage('@atom-languages/language-typescript')
      .then(languages => {
        const ts = languages.get('typescript');
        // Parse and highlight in one step, using the prepared language.
        const highlighted = treeSitterHast.highlightText(ts, text);
        console.log(JSON.stringify(highlighted, null, 2));
      });

    examples/example-2-2.js

    const Parser = require('tree-sitter');
    const treeSitterHast = require('tree-sitter-hast');

    const text = 'let v = 3';

    treeSitterHast
      .loadLanguagesFromPackage('@atom-languages/language-typescript')
      .then(languages => {
        const ts = languages.get('typescript');
        // Prepare the Parser ourselves, then pass it in with the scope mappings.
        const parser = new Parser();
        parser.setLanguage(ts.grammar);
        const highlighted = treeSitterHast.highlightText(parser, ts.scopeMappings, text);
        console.log(JSON.stringify(highlighted, null, 2));
      });

    examples/example-2-3.js

    const Parser = require('tree-sitter');
    const treeSitterHast = require('tree-sitter-hast');

    const text = 'let v = 3';

    treeSitterHast
      .loadLanguagesFromPackage('@atom-languages/language-typescript')
      .then(languages => {
        const ts = languages.get('typescript');
        const parser = new Parser();
        parser.setLanguage(ts.grammar);
        // Parse first, then highlight the tree that tree-sitter produced.
        const tree = parser.parse(text);
        const highlighted = treeSitterHast.highlightTree(ts.scopeMappings, text, tree);
        console.log(JSON.stringify(highlighted, null, 2));
      });

    Output:

    {
      "type": "element",
      "tagName": "span",
      "properties": {
        "className": [
          "source",
          "ts"
        ]
      },
      "children": [
        {
          "type": "element",
          "tagName": "span",
          "properties": {
            "className": [
              "storage",
              "type"
            ]
          },
          "children": [
            {
              "type": "text",
              "value": "let"
            }
          ]
        },
        {
          "type": "text",
          "value": " v "
        },
        //...
      ]
    }
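
    Because the result is a plain hast tree, it can be inspected with ordinary tree walking. The sketch below uses a hand-written tree shaped like the output above for `let v = 3` (so it runs without any packages installed) and collects its text back out; the el, txt, and textContent helpers are illustrative names, not part of tree-sitter-hast.

```javascript
// Small builders for the two node kinds that appear in the output.
const el = (className, children) => ({
  type: 'element',
  tagName: 'span',
  properties: { className },
  children,
});
const txt = (value) => ({ type: 'text', value });

// Hand-written stand-in for the highlighted tree of `let v = 3`.
const highlighted = el(['source', 'ts'], [
  el(['storage', 'type'], [txt('let')]),
  txt(' v '),
  el(['keyword', 'operator', 'js'], [txt('=')]),
  txt(' '),
  el(['constant', 'numeric'], [txt('3')]),
]);

// Concatenate every text node in document order.
function textContent(node) {
  if (node.type === 'text') return node.value;
  return (node.children || []).map(textContent).join('');
}

console.log(textContent(highlighted)); // let v = 3
```

    Highlighting only wraps text in elements, so the concatenated text of the tree equals the original source string.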

    Exporting HTML

    From this point, converting the hast tree to HTML can be done in a single call using hast-util-to-html (part of rehype):

    npm install hast-util-to-html tree-sitter-hast @atom-languages/language-typescript

    examples/example-3.js

    const toHtml = require('hast-util-to-html');
    const treeSitterHast = require('tree-sitter-hast');

    const text = 'let v = 3';

    treeSitterHast
      .loadLanguagesFromPackage('@atom-languages/language-typescript')
      .then(languages => {
        const ts = languages.get('typescript');
        const highlighted = treeSitterHast.highlightText(ts, text);

        // stringify to HTML
        console.log(toHtml(highlighted));
      });

    Output:

    <span class="source ts"><span class="storage type">let</span> v <span class="keyword operator js">=</span> <span class="constant numeric">3</span></span>
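
    To make the mapping from hast to HTML concrete, here is a minimal sketch of a serializer that handles only span-like elements with classes and text nodes. It is a simplification for illustration, not a replacement for hast-util-to-html, which also deals with void elements, other properties, comments, and more careful escaping; serialize and escapeHtml are hypothetical names.

```javascript
// Escape the characters that are significant in HTML text and attribute values.
function escapeHtml(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Serialize a hast node made only of elements (with an optional className
// list) and text nodes.
function serialize(node) {
  if (node.type === 'text') return escapeHtml(node.value);
  const classes = (node.properties && node.properties.className) || [];
  const attrs = classes.length > 0 ? ` class="${escapeHtml(classes.join(' '))}"` : '';
  const inner = (node.children || []).map(serialize).join('');
  return `<${node.tagName}${attrs}>${inner}</${node.tagName}>`;
}

// Hand-written fragment of a highlighted tree.
const node = {
  type: 'element',
  tagName: 'span',
  properties: { className: ['source', 'ts'] },
  children: [
    {
      type: 'element',
      tagName: 'span',
      properties: { className: ['storage', 'type'] },
      children: [{ type: 'text', value: 'let' }],
    },
    { type: 'text', value: ' v' },
  ],
};

console.log(serialize(node));
// <span class="source ts"><span class="storage type">let</span> v</span>
```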

    Whitelisting Classes

    Sometimes the full list of classes applied by the scope mappings is more than you need, and you would rather include only those you have stylesheets for.

    To do this, pass a classWhitelist in the options parameter of highlightText or highlightTree.

    npm install hast-util-to-html tree-sitter-hast @atom-languages/language-typescript

    examples/example-4.js

    const toHtml = require('hast-util-to-html');
    const treeSitterHast = require('tree-sitter-hast');

    const text = 'let v = 3';

    treeSitterHast
      .loadLanguagesFromPackage('@atom-languages/language-typescript')
      .then(languages => {
        const ts = languages.get('typescript');
        // Only classes named in classWhitelist are kept in the output.
        const highlighted = treeSitterHast.highlightText(ts, text, {classWhitelist: ['storage', 'numeric']});

        // stringify to HTML
        console.log(toHtml(highlighted));
      });

    Output:

    <span><span class="storage">let</span> v = <span class="numeric">3</span></span>
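
    Judging from the output above, whitelisting drops the classes that are not listed, removes inner spans that are left with no classes, and joins the neighbouring text. The sketch below reproduces that behaviour on a hand-written tree. It is an assumption about the observable result, not necessarily how tree-sitter-hast implements it, and applyWhitelist, filterNode, and mergeText are hypothetical names.

```javascript
// Join adjacent text nodes so that unwrapped spans leave no fragments behind.
function mergeText(nodes) {
  const out = [];
  for (const node of nodes) {
    const last = out[out.length - 1];
    if (node.type === 'text' && last && last.type === 'text') {
      out[out.length - 1] = { type: 'text', value: last.value + node.value };
    } else {
      out.push(node);
    }
  }
  return out;
}

// Returns the list of nodes that replace `node` after filtering.
function filterNode(node, allowed, isRoot) {
  if (node.type === 'text') return [node];
  const classes = (node.properties && node.properties.className) || [];
  const className = classes.filter((name) => allowed.has(name));
  const children = mergeText(
    (node.children || []).flatMap((child) => filterNode(child, allowed, false))
  );
  // An inner element with no classes left is replaced by its children.
  if (!isRoot && className.length === 0) return children;
  const properties = className.length > 0 ? { className } : {};
  return [{ ...node, properties, children }];
}

function applyWhitelist(root, classWhitelist) {
  return filterNode(root, new Set(classWhitelist), true)[0];
}

const span = (className, children) => ({
  type: 'element',
  tagName: 'span',
  properties: { className },
  children,
});
const text = (value) => ({ type: 'text', value });

// Hand-written stand-in for the unfiltered tree of `let v = 3`.
const tree = span(['source', 'ts'], [
  span(['storage', 'type'], [text('let')]),
  text(' v '),
  span(['keyword', 'operator', 'js'], [text('=')]),
  text(' '),
  span(['constant', 'numeric'], [text('3')]),
]);

const filtered = applyWhitelist(tree, ['storage', 'numeric']);
const summary = filtered.children.map((child) =>
  child.type === 'text' ? child.value : child.properties.className
);
console.log(JSON.stringify(summary)); // [["storage"]," v = ",["numeric"]]
```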

    TODO

    • Move prepare-language.ts over to highlight-tree-sitter
    • Flesh out documentation
    • Pull out HAST type definitions into own repo (DefinitelyTyped?)
    • Update highlight-tree-sitter to not produce HTML when not needed
    • Move over matches patch to highlight-tree-sitter

    License

    MIT
