Class RegexUtil


  • public final class RegexUtil
    extends Object
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  RegexUtil.LeetSpeakPattern
      A helper class for leet speak patterns.
      It produces a regex that can be used for matching.
    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      static String escapeRegex​(String literal)
      Escapes a string into a regex that matches this string literally.
      Note: This method does not use the cheap \Q...\E variant that Pattern.quote(String) uses!
      static Pattern parseWildcardToPattern​(String wildcard)
      This method turns a string that might contain wildcards into a case-insensitive regex Pattern.
      A string can either be treated as normal wildcard string or a regex string.

      Wildcard String
      Each * matches zero or more characters.
      Each ? matches zero or one character.

      Regex String
      If a passed string starts with R=, then the rest will be interpreted as an actual regex.

      This is the same as RegexUtil.parseWildcardToPattern(wildcard, Pattern.CASE_INSENSITIVE)
      static Pattern parseWildcardToPattern​(String wildcard, int flags)
      This method turns a string that might contain wildcards into a regex Pattern with the specified flags.
      A string can either be treated as normal wildcard string or a regex string.

      Wildcard String
      Each * matches zero or more characters.
      Each ? matches zero or one character.

      Regex String
      If a passed string starts with R=, then the rest will be interpreted as an actual regex.

      This is the same as RegexUtil.parseWildcardToPattern(wildcard, flags, false, false, false, false)
      static Pattern parseWildcardToPattern​(String wildcard, int flags, boolean freeMatching, boolean leetSpeak, boolean ignoreSpaces, boolean ignoreDuplicateLetters)
      This method turns a string that might contain wildcards into a regex Pattern with the specified flags.
      A string can either be treated as normal wildcard string or a regex string.

      Wildcard String
      Each * matches zero or more characters.
      Each ? matches zero or one character.

      Additionally, the following options may be applied (the string test is used as an example here): freeMatching: The match will not be bound to word borders.
    • Field Detail

      • LEET_PATTERNS

        public static final Map<String,​RegexUtil.LeetSpeakPattern> LEET_PATTERNS
        A map containing all supported leet speak alternatives. Each key is a single uppercase letter as a string.
        Current Mapping:
         A: 4 /\ @ /-\ ^ aye (L Д
         B: I3 8 13 |3 ß !3 (3 /3 )3 |-] j3 6
         C: [ ¢ { < ( ©
         D: ) |) (| [) I> |> ? T) I7 cl |} > |]
         E: 3 & £ € ë [- |=-
         F: |= ƒ |# ph /= v
         G: & 6 (_+ 9 C- gee (?, [, {, <- (.
         H: # /-/ [-] ]-[ )-( (-) :-: |~| |-| ]~[ }{ !-! 1-1 \-/ I+I /-\
         I: 1 [] | ! eye 3y3 ][
         J: ,_| _| ._| ._] _] ,_] ] ; 1
         K: >| |< /< 1< |c |( |{
         L: 1 £ 7 |_ |
         M: /\/\ /V\ JVI [V] []V[] |\/| ^^ <\/> {V} (v) (V) |V| nn IVI |\|\ ]\/[ 1^1 ITI JTI
         N: ^/ |\| /\/ [\] <\> {\} |V /V И ^ ท
         O: 0 Q () oh []
         P: <> Ø |* |o |º ? |^ |> |" 9 []D |° |7
         Q: (_,) 9 ()_ 2 0_ <| &
         R: I2 |` |~ |? /2 |^ lz |9 2 12 ® [z Я .- |2 |-
         S: 5 $ z § ehs es 2
         T: 7 + -|- '][' † "|" ~|~
         U: (_) |_| v L| µ บ
         V: \/ |/ \|
         W: \/\/ VV \N '// \\' \^/ (n) \V/ \X/ \|/ \_|_/ \_:_/ Ш Щ uu 2u \\//\\// พ v²
         X: >< Ж }{ ecks × ? )( ][
         Y: j `/ Ч 7 \|/ ¥ \//
         Z: 2 7_ -/_ % >_ s ~/_ -\_ -|_
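The mapping above can be read as "each letter may also be written as any of its alternatives". A minimal sketch of how such a table could be turned into a regex, assuming a hypothetical `LeetSketch` class with only a handful of ASCII alternatives copied from the table. It does not use `RegexUtil.LeetSpeakPattern`, whose actual API is not shown on this page.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class LeetSketch {
    // A few alternatives copied from the mapping above; the real map is much larger.
    static final Map<String, List<String>> LEET = new LinkedHashMap<>();
    static {
        LEET.put("E", List.of("3", "&"));
        LEET.put("S", List.of("5", "$", "z"));
        LEET.put("T", List.of("7", "+", "-|-"));
    }

    // Builds a non-capturing group matching the letter itself or any listed alternative.
    static String letterRegex(char letter) {
        String key = String.valueOf(Character.toUpperCase(letter));
        StringBuilder sb = new StringBuilder("(?:");
        sb.append(Pattern.quote(String.valueOf(letter)));
        for (String alt : LEET.getOrDefault(key, List.of())) {
            sb.append('|').append(Pattern.quote(alt));
        }
        return sb.append(')').toString();
    }

    // Concatenates the per-letter groups; case-insensitive, as the leetSpeak option implies.
    static Pattern wordPattern(String word) {
        StringBuilder sb = new StringBuilder();
        for (char c : word.toCharArray()) {
            sb.append(letterRegex(c));
        }
        return Pattern.compile(sb.toString(), Pattern.CASE_INSENSITIVE);
    }
}
```

With this sketch, `LeetSketch.wordPattern("test")` matches `+3$t`, `TEST` and `-|-e5t`, but not `toast`. Each alternative is quoted because many of them contain regex metacharacters such as `|`, `+` or `$`.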
         
    • Method Detail

      • escapeRegex

        public static String escapeRegex​(String literal)
        Escapes a string into a regex that matches this string literally.
        Note: This method does not use the cheap \Q...\E variant that Pattern.quote(String) uses!
        Parameters:
        literal - The string to be escaped
        Returns:
        A regex string that matches the passed string literally.
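To illustrate the note about `\Q...\E`, here is a minimal sketch contrasting per-character escaping with `Pattern.quote(String)`. The class `EscapeSketch` and the helper `escapeByChar` are hypothetical stand-ins written for this example; the actual implementation of `escapeRegex` is not shown on this page and may differ.

```java
import java.util.regex.Pattern;

final class EscapeSketch {
    // Characters with special meaning in java.util.regex patterns.
    private static final String META = "\\^$.|?*+()[]{}";

    // Hypothetical stand-in for escapeRegex: prefixes each metacharacter with a backslash.
    static String escapeByChar(String literal) {
        StringBuilder sb = new StringBuilder(literal.length() * 2);
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            if (META.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String literal = "a.b*c";
        System.out.println(escapeByChar(literal));   // a\.b\*c
        System.out.println(Pattern.quote(literal));  // \Qa.b*c\E
    }
}
```

One plausible benefit of the per-character form is that the result remains an ordinary sequence of regex tokens, so further constructs can be inserted between or around individual characters. A `\Q...\E` block can only be used as a whole.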
      • parseWildcardToPattern

        public static Pattern parseWildcardToPattern​(String wildcard,
                                                     int flags,
                                                     boolean freeMatching,
                                                     boolean leetSpeak,
                                                     boolean ignoreSpaces,
                                                     boolean ignoreDuplicateLetters)
                                              throws PatternSyntaxException
        This method turns a string that might contain wildcards into a regex Pattern with the specified flags.
        A string can either be treated as normal wildcard string or a regex string.

        Wildcard String
        Each * matches zero or more characters.
        Each ? matches zero or one character.

        Additionally, the following options may be applied (the string test is used as an example here):
        • freeMatching: The match will not be bound to word borders. So abctestabc is still a (partial) match (only test itself will be matched, though).
        • leetSpeak: Expands all letters to also match leet speak: https://qntm.org/l33t. So +3$t is still a match. See LEET_PATTERNS. This flag also implies case-insensitive matching for wildcard strings!
        • ignoreSpaces: Allows any amount of whitespace between the letters. So t es t is still a match.
        • ignoreDuplicateLetters: Allows letters to be repeated any number of times. So tteeeeeeesttt is still a match.
        Regex String
        If a passed string starts with R=, then the rest will be interpreted as an actual regex.
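The two input forms described above (wildcard string and `R=` regex string) can be sketched as follows. This is a simplified illustration under stated assumptions: `WildcardSketch.toPattern` is a hypothetical helper, it ignores the four boolean options, and it does not bind the match to word borders as the real method does when freeMatching is false.

```java
import java.util.regex.Pattern;

final class WildcardSketch {
    // Hypothetical re-implementation for illustration; not the library's code.
    static Pattern toPattern(String wildcard, int flags) {
        if (wildcard.startsWith("R=")) {
            // Regex string: everything after the prefix is compiled unchanged.
            return Pattern.compile(wildcard.substring(2), flags);
        }
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");   // zero or more characters
            } else if (c == '?') {
                sb.append(".?");   // zero or one character
            } else {
                // Every other character is matched literally.
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString(), flags);
    }
}
```

For example, `toPattern("te*t", Pattern.CASE_INSENSITIVE)` fully matches `TEEEST`, and `toPattern("R=t[ae]st", 0)` fully matches `tast`. An invalid regex after the prefix, such as `R=(`, makes `Pattern.compile` throw a `PatternSyntaxException`.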
        Parameters:
        wildcard - The string that is to be parsed.
        flags - Regex flags. See: Pattern.compile(String, int)
        freeMatching - Determines whether the match may occur inside a longer word (true) or has to be a free-standing word (false).
        leetSpeak - Determines whether leet speak also matches, for example a 5 for an S.
        ignoreSpaces - Determines whether whitespace between the letters is ignored.
        ignoreDuplicateLetters - Determines whether letters may be repeated any number of times while the pattern still matches.
        Returns:
        A regex Pattern compiled from the string with the given flags set. The options are applied only if the string is not a regex string.
        Throws:
        PatternSyntaxException - If the regex syntax is invalid.
        See Also:
        Pattern.compile(String, int)
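The effect of freeMatching, ignoreSpaces and ignoreDuplicateLetters on the example word test can be sketched as below. The class `OptionsSketch` and its `build` method are hypothetical and handle only plain words without wildcards. Using `\b` for word borders, `\s*` between letters and `+` after each letter is an assumption about one possible encoding, not the library's confirmed implementation.

```java
import java.util.regex.Pattern;

final class OptionsSketch {
    // Hypothetical helper: builds a regex for a plain word under the three options.
    static Pattern build(String word, boolean freeMatching,
                         boolean ignoreSpaces, boolean ignoreDuplicateLetters) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < word.length(); i++) {
            if (i > 0 && ignoreSpaces) {
                sb.append("\\s*");   // any amount of whitespace between letters
            }
            sb.append("(?:").append(Pattern.quote(String.valueOf(word.charAt(i)))).append(')');
            if (ignoreDuplicateLetters) {
                sb.append('+');      // the letter may repeat any number of times
            }
        }
        String body = sb.toString();
        // Without freeMatching, require word boundaries around the match.
        return Pattern.compile(freeMatching ? body : "\\b" + body + "\\b");
    }
}
```

With this sketch, `build("test", true, false, false)` finds test inside abctestabc, while `build("test", false, false, false)` does not. `build("test", false, true, false)` finds a match in t es t, and `build("test", false, false, true)` finds one in tteeeeeeesttt.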