A Delphi Fuzzy String Compare Algorithm

FuzzyStringCompare calculates an approximate similarity percentage between two strings.
It uses a heuristic to count matching characters that are roughly in similar positions, allowing for a tolerance based on string lengths. The tolerance is about one-third of the longer string’s length plus the length difference.

The algorithm greedily matches characters from the shorter string to unique positions in the longer string, if within the tolerance distance and not previously matched. It advances positions and backtracks when necessary to skip unmatched sections.

Comparisons are case-sensitive, as in the original. You might want to lowecase your strings.

function FuzzyStringCompare(const S1, S2: string): Integer;
var
StrLong, StrShort: string;
Matches: Integer; // Number of matching characters found
Pos1, Pos2: Integer; // Current positions in StrLong and StrShort
LenLong, LenShort: Integer; // Lengths of the longer and shorter strings
I: Integer; // Loop counter for initialization
Tolerance: Integer; // Allowed position difference (fuzz factor)
Temp: string; // Temporary for swapping strings
Tested: TArray<Boolean>; // Tracks matched positions in StrLong (dynamic to handle any length)
begin
// Ensure StrLong is the longer string
if Length(S1) >= Length(S2) then
begin
StrLong := S1;
StrShort := S2;
end
else
begin
StrLong := S2;
StrShort := S1;
end;

LenLong := Length(StrLong);
LenShort := Length(StrShort);

// Early exit for empty short string
if LenShort = 0 then
Exit(0);

// Calculate tolerance: ~1/3 of longer length + length difference
Tolerance := LenLong div 3 + (LenLong – LenShort);

// Initialize dynamic array for tested positions
SetLength(Tested, LenLong + 1);
for I := 1 to LenLong do
Tested[I] := False;

Pos1 := 1;
Pos2 := 1;
Matches := 0;

repeat
if not Tested[Pos1] then
begin
// Check for match within tolerance
if (StrLong[Pos1] = StrShort[Pos2]) and (Abs(Pos1 – Pos2) <= Tolerance) then
begin
Tested[Pos1] := True;
Inc(Matches);
Inc(Pos1);
Inc(Pos2);
if Pos1 > LenLong then
Pos1 := 1;
end
else
begin
// No match: advance Pos1
Inc(Pos1);
if Pos1 > LenLong then
begin
// Backtrack to last matched or start
Pos1 := LenLong;
while (Pos1 > 1) and not Tested[Pos1] do
Dec(Pos1);
Inc(Pos2);
end;
end;
end
else
begin
// Position already matched: skip it
Inc(Pos1);
if Pos1 > LenLong then
begin
// Backtrack to last matched or start
Pos1 := LenLong;
repeat
Dec(Pos1);
until (Pos1 = 1) or Tested[Pos1];
Inc(Pos2);
end;
end;
until Pos2 > LenShort;

// Compute percentage based on longer string
Result := 100 * Matches div LenLong;
end;

 

Full code and test in my https://github.com/GabrielOnDelphi/Delphi-LightSaber

Leave a Comment

Scroll to Top