Have ideas to improve npm?Join in the discussion! »

    string-similarity-algorithm
    TypeScript icon, indicating that this package has built-in type declarations

    1.1.0 • Public • Published

    string-similarity-algorithm

    A set of string similarity algorithm implementations.

    [TOC]

    Install

    npm i string-similarity-algorithm --save

    Usage

    import similarity from 'string-similarity-algorithm'
     
    const x = '赵丽颖否认产子'
    const y = '赵丽颖极力否认生子'
     
    const lcsScore = similarity(x, y, 'lcs') // 0.75
    const levenshteinScore = similarity(x, y, 'levenshtein') // 0.6666666666666667

    API

    similarity (x: string, y: string, type: Type = 'lcs', options?: SimhashOptions): SimilarityReturn

    Calculate the similarity of string x and string y.

    type Type = 'lcs' | 'levenshtein' | 'simhash'
    type SimilarityReturn = number | SimhashSimilarityReturn
    • type
      • lcs: use function lcs.
      • levenshtein: use function levenshtein.
      • simhash: use function simhashSimilarity.

    lcs (x: string, y: string): number

    Calculates the similarity between strings x and y using longest common subsequence.

    levenshtein (x: string, y: string): number

    Calculates the similarity between strings x and y using levenshtein distance(edit distance).

    lcslen (x: string, y: string): number

    Return the longest common subsequence length of string x and string y.

    levenshteinDistance (x: string, y: string): number

    Return the edit distance of string x and string y.

    simhash (s: string, options: SimhashOptions = {}): number

    Return simhash of string s.

    interface SimhashOptions {
      hashType?: HashType, // default is hashlittle
      kshinglesN?: number  // default is 3
    }

    hammingDistance (x: number, y: number): number

    Return hamming distance of x and y.

    hammingWeight (x: number): number

    Return hamming weight(number of 1 bits) of x.

    simhashSimilarity (x: string, y: string, options: SimhashOptions = {}): SimhashSimilarityReturn

    Calculates the similarity between strings x and y using simhash, hamming distance and hammingWeight).

    interface SimhashSimilarityReturn {
      score: number, // similar score of x and y, scope: [0, 1]
      hammingDistance: number // hamming distance of x and y
    }

    Others

    If strings x and y is short(eg x and y are doc titles), the best is lcs, worst is simhash.

    Install

    npm i string-similarity-algorithm

    DownloadsWeekly Downloads

    33

    Version

    1.1.0

    License

    ISC

    Unpacked Size

    31 kB

    Total Files

    18

    Last publish

    Collaborators

    • avatar