    wordsoap

    0.2.0 • Public • Published

    Clean up dirty HTML output from Microsoft Word

    Usage

    command line

    $ npm install -g wordsoap
    $ cat msword_garbage.html | wordsoap

    module

    $ npm install --save wordsoap
    var wordsoap = require('wordsoap')
     
    var dirty = "<p class=MsoNormal style='font-size:12pt'>Text</p>"
    var clean = wordsoap(dirty) // <p>Text</p>
     
    // access individual regex strings
    wordsoap.regexes.msoAttributes // <(\w+)(?: (?:class|lang|style|size|face|[ovwxp]))=(?:'[^']*'|""[^""]*""|[^\s>]+)(?:[^>]*)>
     
    // access individual regexes compiled with 'gi' flags
    wordsoap.regexesCompiled.msoAttributes // /<(\w+)(?: (?:class|lang|style|size|face|[ovwxp]))=(?:'[^']*'|""[^""]*""|[^\s>]+)(?:[^>]*)>/gi
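
The exposed regex strings can be compiled and applied on their own. The following is a minimal, self-contained sketch of that idea and does not call wordsoap. The pattern `msoAttributesSource` and the helper `stripMsoAttributes` are hypothetical names written for illustration. The pattern only resembles the shape of `wordsoap.regexes.msoAttributes`, and replacing each match with `<$1>` is an assumption about how such a pattern might be applied, not a description of the library's internals.

```javascript
// Hypothetical pattern string, modeled on the shape of wordsoap.regexes.msoAttributes.
// It is NOT copied from the library and may differ from what wordsoap actually uses.
var msoAttributesSource =
  "<(\\w+) (?:class|lang|style|size|face)=(?:'[^']*'|\"[^\"]*\"|[^\\s>]+)[^>]*>"

// Compile with the 'gi' flags: global (every tag) and case-insensitive.
var msoAttributes = new RegExp(msoAttributesSource, 'gi')

// Hypothetical helper: rewrite each matching opening tag to its bare form.
// Assumption: replacing with '<$1>' keeps only the captured tag name.
function stripMsoAttributes(html) {
  return html.replace(msoAttributes, '<$1>')
}

var dirty = "<p class=MsoNormal style='font-size:12pt'>Text</p>"
console.log(stripMsoAttributes(dirty)) // <p>Text</p>
```

Note that this sketch only matches a tag whose first attribute is one of the listed names, and it then drops every attribute on that tag, because the trailing `[^>]*` consumes the rest of the opening tag.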

    License

    ISC © Raine Lourie
