Skrypt

Rev 2.1


Skrypt is a simple yet powerful declarative text transformation language, designed to easily transcribe, transliterate, or transform text in flexible ways, helping one avoid learning regular expressions and scripting languages. Whether you're adapting text between writing systems, cleaning up data, or restructuring it, Skrypt allows you to focus on what needs to be changed, rather than how to do it.

Skrypt lets you define rules that pair specific patterns with their replacements. These can range from simple expressions like "replace A with B" to complex conditional substitutions with lookarounds and quantifiers. Under the hood, Skrypt compiles your rules into regular expressions and uses a special algorithm to resolve conflicts between them. The compiled rules can be exported as a JavaScript function to include in your own websites or applications.

You can see a sample rules file in the Editor. This website is a concept demonstration to help you get a sense of the language and its capabilities. You can read the exhaustive documentation, test your rules by instantly transforming input text, and generate a JavaScript function online. Keep in mind that the project is still a work in progress and may contain bugs, so please leave your Feedback ;)

Using the Editor

This web interface is divided by a slider and contains the Editor area to the left and a few tabs to the right, as you can see. The Editor component is powered by CodeMirror 6 and offers custom-made syntax highlighting, basic linting, autocompletion, and hover tooltips to aid you in writing your rules. Tab serves its traditional function of advancing the cursor to the next tab stop. The Editor supports all the common keybindings, such as Ctrl-Z, Ctrl-Shift-Right, Shift-Home, Ctrl-Backspace, as well as a few worth listing:

  • Shift-Alt-Up — copy line up
  • Shift-Alt-Down — copy line down
  • Ctrl-L — select line
  • Ctrl-I — select parent element
  • Ctrl-Shift-K — delete line
  • Ctrl-Shift-\ — move cursor to matching bracket
  • Ctrl-/ — toggle comment (comment/uncomment selection/line)
  • Ctrl-M — toggle tab-focus mode (allows you to use Tab to navigate out of the Editor)

The Toolbar above the Editor has buttons to Open (upload) a file, Save (download) it, Parse the rules, Transform input text by applying the rules (see the Try tab), and Generate a JavaScript function, as described on the About tab. Parsing is the syntactic analysis of your rules, which prepares them to be applied to text or compiled into a function. To avoid wasting resources, it is not done automatically, so make sure to press [Parse] every time you finish changing the rules!

Finally, you can test your rules on a text by typing or pasting it into the Input field on the Try tab and pressing [Transform]. The current version of Skrypt is not yet optimized, so it may be slow with large inputs.

General

Skrypt is a character-based text transformation language. It operates by matching sequences of characters in text and replacing them according to the specified formal rules. A character is any single written symbol — a letter, digit, punctuation mark, emoji, shape, or even a space. A string is simply a sequence of such characters.

Any text after # to the end of a line is considered a comment, which is ignored by Skrypt and can be used to describe or annotate your rules. You are free to use them to leave notes for yourself or others reading or working on your rules. In fact, it may be useful to comment out a rule to temporarily disable it.

To include special characters such as ~ as regular ones, place a backslash \ in front of them (this is called escaping in computer science). For example, \~ matches a literal tilde. There are also some escape sequences with a special meaning:

  • \r: carriage return, used before \n on Windows
  • \n: newline
  • \t: tab
  • \v: vertical tab
  • \d: any digit (Arabic numeral); equivalent to <0-9>
  • \D: any non-digit
  • \s: any whitespace character; equivalent to <\r\n\t\f\v >
  • \S: any non-whitespace character
  • \0: null character

Furthermore, arbitrary Unicode characters can be written as \uXXXX, where XXXX is a hexadecimal number that represents the character's Unicode code point. This can be useful when you want to use a character that is hard to see or may be displayed incorrectly on the screen.
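
If you are not sure which code point a hard-to-see character has, a few lines of JavaScript can tell you. The sketch below is only an illustration; the helper name toUnicodeEscape is made up, and it assumes the four-digit \uXXXX form described above, so it refuses characters whose code point needs more than four hexadecimal digits.

```javascript
// Hypothetical helper: build the \uXXXX escape for a single character.
function toUnicodeEscape(ch) {
  const code = ch.codePointAt(0);
  if (code > 0xffff) {
    // Assumption: only the four-digit form is handled in this sketch.
    throw new RangeError("code point does not fit into four hex digits");
  }
  return "\\u" + code.toString(16).toUpperCase().padStart(4, "0");
}

const noBreakSpace = toUnicodeEscape("\u00A0"); // looks like a plain space on screen
const eAcute = toUnicodeEscape("é");

console.log(noBreakSpace); // \u00A0
console.log(eAcute); // \u00E9
```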

As of revision 2.0, in order to prevent breaking changes, the dollar sign $, dot ., and quotation marks ' " are reserved for future use and must be escaped.

New in 2.0: Alternatively, you can use `raw strings` that preserve most characters as they are written (literally), including most escape sequences (except for \r, \n, and \t), all the operators, and brackets.

Directives (new in 2.0)

Directives are processing instructions that specify how certain statements should be interpreted or modify the code generation process. They can be used to define multiple functions in one file, split rules into several application stages, execute additional code, etc. They start with an exclamation mark and may contain zero or more values, separated by commas, after an equals sign: ! name = value1, value2.

Name              Values  Description
----------------  ------  -----------
case sensitive    None    Marks following rules to match case strictly
case insensitive  None    Marks following rules to ignore case when matching
function          name    Defines a new function or renames the current one if it's empty
stage             None    Declares a new transformation stage within a function

Functions make it possible to define many independent sets of transformations in a single file. Each function contains its own options (the function's parameters), templates, and rules. A Skrypt file always defines at least one function, implicitly called "transform" by default. If the ! function directive is used before any rules have been defined in the current function, it renames that function instead of leaving it empty and creating a new one. That means you can easily change the name of the first (default) function by placing the directive at the beginning of a file.

As Skrypt was designed primarily for transliteration, in most cases, one would expect rules to be resolved and applied simultaneously, such that A → B and B → C result in "AB" being transformed to "BC", not "CC". However, sometimes you may need to apply a few subsets of rules consecutively, e.g. to replace all As by Bs, and only then all Bs by Cs. This is where you use the ! stage directive — to separate such consecutive subsets of simultaneous rules.
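
To make the difference concrete, here is a minimal JavaScript sketch of simultaneous versus staged replacement. It is not Skrypt's actual algorithm: the helper names applySimultaneously and applyInStages are made up for illustration, rules are plain literal strings, and matching is case-sensitive for simplicity.

```javascript
// Hypothetical helper: apply all rules in one left-to-right pass.
function applySimultaneously(rules, text) {
  // Escape regex metacharacters so that rule keys are matched literally.
  const escape = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(Object.keys(rules).map(escape).join("|"), "g");
  // Every match is replaced once; the replaced text is never rescanned.
  return text.replace(pattern, (match) => rules[match]);
}

// Hypothetical helper: each stage works on the output of the previous one.
function applyInStages(stages, text) {
  return stages.reduce(
    (current, rules) => applySimultaneously(rules, current),
    text
  );
}

const simultaneous = applySimultaneously({ A: "B", B: "C" }, "AB");
const staged = applyInStages([{ A: "B" }, { B: "C" }], "AB");

console.log(simultaneous); // BC
console.log(staged); // CC
```

In the staged call, each object in the array plays the role of one group of rules separated by ! stage.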

Options

Options let you control whether certain rules are applied by making them conditional. They are also made available as parameters when Skrypt generates JavaScript functions. Use @ name = value to define an option, for example: @ soft = true. Possible values are currently backend-specific, and commonly include true (yes) and false (no) or integer numbers; other arbitrary strings must be enclosed in double quotes, e.g. `"disable"`.

Marking a rule as dependent on the values of some options is done by appending a so-called When clause to it. A When clause starts with a question mark and contains an expression with options. New in 2.0: there are three logical operators related to options: ~ (not), & (and), | (or). New in 2.1: for non-boolean options (with values other than true/false) you can additionally use = and ~= to test for equality or, with ~=, inequality (e.g. devoicing = `"allowed"`), and for numeric options specifically, also <, <=, >, >=. For instance, ? ~silentH & nasal at the end of a rule means the rule is active only when the option "silentH" is not enabled (equals false) and "nasal" is enabled (equals true).
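
One way to think about a When clause is as a yes/no test over the current option values. The JavaScript sketch below shows that reading only; it is not the code Skrypt generates, and the names defaults, whenNotSilentHAndNasal, and whenDevoicingAllowed are invented for the example.

```javascript
// Hypothetical option values, as they might be passed to a generated function.
const defaults = { silentH: false, nasal: true, devoicing: "allowed" };

// `? ~silentH & nasal` read as a predicate: Not silentH And nasal.
const whenNotSilentHAndNasal = (options) => !options.silentH && options.nasal;

// `? devoicing = "allowed"` read as an equality test on a string option.
const whenDevoicingAllowed = (options) => options.devoicing === "allowed";

console.log(whenNotSilentHAndNasal(defaults)); // true
console.log(whenNotSilentHAndNasal({ ...defaults, silentH: true })); // false
console.log(whenDevoicingAllowed({ ...defaults, devoicing: "disable" })); // false
```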

In Skrypt, statements (directives, options, templates, rules) are separated by newlines. Since whitespace is normally ignored, it doesn't matter whether you write @a=b, @a = b, or even @ a =b; they will be treated the same.

Templates

Templates (also may be called constants or even variables) are reusable left-hand side expressions, defined with % name = expression. For instance: % vowel = <aeiou>. You can reference (substitute) a template by writing its name inside curly braces: {vowel}. This helps you avoid repeating the same long expressions across multiple rules.
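
Conceptually, referencing a template is plain text substitution. The following JavaScript sketch shows the idea under that assumption; expandTemplates is a made-up helper and says nothing about how the Skrypt engine really resolves templates.

```javascript
// Hypothetical helper: replace every {name} with the expression stored under that name.
function expandTemplates(expression, templates) {
  // \w+ requires at least one character, so the void value {} is left untouched.
  return expression.replace(/\{(\w+)\}/g, (whole, name) => {
    if (!Object.prototype.hasOwnProperty.call(templates, name)) {
      throw new ReferenceError("Unknown template: " + name);
    }
    return templates[name];
  });
}

const templates = { vowel: "<aeiou>" };
const expanded = expandTemplates("{vowel}h{vowel}", templates);

console.log(expanded); // <aeiou>h<aeiou>
```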

The special right-hand side value {} (also written as ∅) is called void and means "nothing"; it is used to remove a match completely from the text. For example, A → ∅ deletes all 'A's from the text.

Rules

A rule is the main "building block" that specifies what to replace and how. It consists of one or more patterns, separated by commas (and optionally newlines), that form a matching expression (LHS) and a replacement string (RHS): FROM → TO (or FROM -> TO). Matching patterns are case-insensitive by default.

Skrypt takes care of the bleeding and the feeding order problems. If there are conflicting rules, the first one that matches is used. Changed in 2.0: to keep the behavior well-defined, the length of rules no longer has any effect on their priority, so in many cases you should write the longest rules first.
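
The effect of rule order can be shown with a toy scanner in JavaScript. This is a simplified sketch, not the Skrypt library: applyRules is a hypothetical name, rules are literal, non-empty, case-sensitive strings, and at every position the first rule in the list that matches wins.

```javascript
// Hypothetical scanner: walk the text left to right and try the rules in order.
function applyRules(rules, text) {
  let result = "";
  let i = 0;
  while (i < text.length) {
    // The first rule whose left-hand side starts at position i is used.
    const rule = rules.find(([from]) => text.startsWith(from, i));
    if (rule) {
      result += rule[1];
      i += rule[0].length; // skip the consumed match
    } else {
      result += text[i]; // no rule applies, copy the character as is
      i += 1;
    }
  }
  return result;
}

const longestFirst = applyRules([["ch", "x"], ["c", "k"]], "chic");
const shortestFirst = applyRules([["c", "k"], ["ch", "x"]], "chic");

console.log(longestFirst); // xik
console.log(shortestFirst); // khik
```

With the shorter rule first, the rule for "ch" can never fire, because "c" always matches before it.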

As you should already know, it's possible to filter some rules out by adding an option check after them. However, each pattern can also include any number of word delimiters and non-capturing pre- and post-conditions (changed in 2.0); the conditions are written in square brackets before and after the primary expression:

  • [X]A: match A only if it comes after X — lookbehind
  • A[X]: match A only if it comes before X — lookahead
  • [X]A[Y]: match A only if it is between X and Y
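
Since Skrypt compiles rules into regular expressions, the three forms above correspond naturally to lookaround assertions. The JavaScript below shows that correspondence with hand-written patterns; the exact expressions Skrypt emits may differ, and case-sensitive matching is used here for simplicity.

```javascript
// [X]A    -> lookbehind: A must be preceded by X.
const afterX = /(?<=X)A/g;
// A[Y]    -> lookahead: A must be followed by Y.
const beforeY = /A(?=Y)/g;
// [X]A[Y] -> both; X and Y are checked but never consumed or replaced.
const betweenXandY = /(?<=X)A(?=Y)/g;

const sample = "XAY AY XA";

console.log(sample.replace(afterX, "a")); // XaY AY Xa
console.log(sample.replace(beforeY, "a")); // XaY aY XA
console.log(sample.replace(betweenXandY, "a")); // XaY AY XA
```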

Use _ within an expression to match any letter. The word boundary marks (delimiters) / can be explained as follows:

  • /A: match A only if it's at the start of a word
  • A/: match A only if it's at the end of a word
  • /A/: match only standalone A

These use your letters template if defined; otherwise, they rely on the Unicode classification of characters as letters or non-letters.
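
In plain JavaScript, the fallback behavior can be approximated with the Unicode property escape \p{L}. The sketch below is an assumption about how such boundaries could be expressed, not the pattern Skrypt actually produces, and it ignores a custom letters template.

```javascript
const LETTER = "\\p{L}"; // any character Unicode classifies as a letter

// /a  -> not preceded by a letter (start of a word)
const atWordStart = new RegExp(`(?<!${LETTER})a`, "gu");
// a/  -> not followed by a letter (end of a word)
const atWordEnd = new RegExp(`a(?!${LETTER})`, "gu");
// /a/ -> neither preceded nor followed by a letter (standalone)
const standalone = new RegExp(`(?<!${LETTER})a(?!${LETTER})`, "gu");

const sample = "a banana, area";

console.log(sample.replace(atWordStart, "A")); // A banana, Area
console.log(sample.replace(atWordEnd, "A")); // A bananA, areA
console.log(sample.replace(standalone, "A")); // A banana, area
```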

New in 2.0: A block is a construction (statement) that can wrap several rules with common options and/or a repetitive expression. It starts with a header of the form ? when_clause, expression: and terminates with a semicolon ;. The header must contain either ? when_clause:, or expression:, or both with a comma in between. The expression, if provided, can be substituted using the caret ^ within the rules. The options of a block are combined with rule-local options using the And operator. Blocks can also be nested. You may read them as "if [when clause options], with [expression], do".

Operators

Expressions are composed of one or more terms — smallest inseparable units, like strings, charsets, _ or substitutions. Operators are used to build complex expressions:

Operator  Name        Description
--------  ----------  -----------
A|B       Or          Either expression A or B; charset union
A-B       Difference  Charset difference (new in 2.0)
<abc>     Charset     Any of the characters between <>; roughly, a shorthand for a|b|c
~A        Not         Negates term A, e.g. ~<ab> or [~A]
(A)       Group       Overrides precedence (the order of operations), e.g. a(b|c)

Because the underlying implementation of Skrypt doesn't have a Not operator, a negated character ~a is interpreted as an inverted set containing only this single character, ~<a>. It's recommended to avoid matching "anything but this character" for clarity and performance.

Precedence from strongest to weakest is:

  1. Character concatenation (implicit)
  2. Difference (-)
  3. Not (~)
  4. Quantification (?, +, *, ×)
  5. Term concatenation (implicit)
  6. Or (|)

New in 2.1: Although it is not considered an operator on its own in this context, - can also be used inside charsets to define ranges, which match any character whose Unicode code point lies between the two endpoints. For instance, <a-z0-9é> is a charset that would match any (lowercase) basic Latin letter (abcdefghijklmnopqrstuvwxyz), a digit, or the character é (e acute).
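
Charsets with ranges look very much like regular expression character classes, so a rough JavaScript equivalent is easy to write. The helper charsetToRegExp below is hypothetical and only approximates the idea; it is case-sensitive and is not taken from the Skrypt engine.

```javascript
// Hypothetical helper: turn the text between < and > into a JS character class
// that must match exactly one character.
function charsetToRegExp(body, negated = false) {
  // Escape the characters that are special inside [...], but keep "-" for ranges.
  const escaped = body.replace(/[\]\\^]/g, "\\$&");
  return new RegExp(`^[${negated ? "^" : ""}${escaped}]$`, "u");
}

const alnumOrEAcute = charsetToRegExp("a-z0-9é"); // <a-z0-9é>
const notAorB = charsetToRegExp("ab", true); // ~<ab>

console.log(alnumOrEAcute.test("q")); // true
console.log(alnumOrEAcute.test("7")); // true
console.log(alnumOrEAcute.test("Q")); // false
console.log(notAorB.test("c")); // true
```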

Quantifiers denote repetition of terms:

  • X?: zero or one X (optional X)
  • X+: one or more X
  • X*: zero or more X
  • X×n: exactly n times; × can also be written as *= (new in 2.0)
  • X×n+: n or more times (new in 2.0)
  • X×n-m: from n to m times, both ends inclusive (new in 2.0)
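
If you already know regular expressions, the quantifiers map onto familiar syntax. The JavaScript sketch below spells out one plausible mapping; quantifierToRegExp is an invented helper, it expects the quantifier without spaces, and the real code generator may work differently.

```javascript
// Hypothetical helper: translate a Skrypt quantifier into regex syntax.
function quantifierToRegExp(quantifier) {
  if (quantifier === "?" || quantifier === "+" || quantifier === "*") {
    return quantifier; // same meaning in both notations
  }
  // × or *=, then n, then optionally "+" (n or more) or "-m" (n to m).
  const match = /^(?:×|\*=)(\d+)(?:(\+)|-(\d+))?$/.exec(quantifier);
  if (!match) throw new SyntaxError("Unknown quantifier: " + quantifier);
  const [, n, plus, m] = match;
  if (plus) return `{${n},}`;
  if (m !== undefined) return `{${n},${m}}`;
  return `{${n}}`;
}

console.log(quantifierToRegExp("×3")); // {3}
console.log(quantifierToRegExp("*=3")); // {3}
console.log(quantifierToRegExp("×2+")); // {2,}
console.log(quantifierToRegExp("×2-4")); // {2,4}

// a×2-4 then behaves like /^a{2,4}$/ when applied to a whole string.
const twoToFourAs = new RegExp("^a" + quantifierToRegExp("×2-4") + "$");
console.log(twoToFourAs.test("aaa")); // true
```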

Generating JavaScript

Skrypt internally uses a custom UPLv1.0-licensed algorithm to transform the input by sequentially applying all rules and resolving match conflicts in-place, prioritizing the leftmost matches and rules defined earlier. This algorithm, along with a function to build the result strings, is provided in a small self-contained library that you may obtain by pressing [Download lib]. The Universal Permissive License version 1.0 is highly similar to MIT, but also provides an express patent grant and lets you include only a short-form link to the License instead of the full text.

The language offers you a way to compile your rules into an array of objects that can be fed into the algorithm, and to generate a JavaScript function that transforms input using the very same library, with options available as parameters. To do that, just press the respective button on the Toolbar after having parsed the rules. The generated function imports the library file named Skrypt.js, which you should put into the same directory.

Skrypt © 2025 Mykhailo Stetsiuk

The Skrypt runtime library is available under the Universal Permissive License, Version 1.0, as described in the Generating JavaScript section of the docs. The Skrypt engine (includes ANTLR grammars, visitors and code generator) is provided under the Apache License, Version 2.0. The engine is written using the antlr-ng Parser Generator, licensed under the Revised BSD License. The Skrypt language support for CodeMirror (npm) is distributed under the MIT License. The source code of the library and the engine are hosted in the GitHub repository of this website under /public/Skrypt.js and /src respectively.

All other materials on this website are currently copyrighted (provided without an explicit license), but you are free to share, quote, and distribute the content in other ways, examine and learn from it.

Note that the website is built upon the following third-party components:

  • TypeScript — a programming language that adds static typing to JavaScript; Apache 2.0
  • Vite — a modern frontend build tool; MIT
  • Metro UI — a framework for creating web application interfaces; MIT
  • CodeMirror — a code editor component; MIT
  • @types/node — TypeScript definitions for Node.js; MIT