Posts

Showing posts from February, 2019

Accurate Word Counter for non-Latin characters in Javascript regex

JavaScript regular expressions have a problem with Unicode text. The word boundary \b is defined in terms of \w, and \w matches only ASCII letters, digits and the underscore. As a result, a regular expression such as /\b\S+\b/g does not count words written in many national alphabets and scripts, such as Cyrillic, Greek and Devanagari (Hindi): the engine never sees a word boundary around those characters. To solve this problem we must explicitly include the non-ASCII characters. My solution is to use the regular expression /([\u0080-\uFFFF\w]\u0027?)+/g instead. It covers the wide range of Unicode characters from U+0080 to U+FFFF, which includes the alphabets mentioned above and most others, plus the apostrophe (U+0027) so that words like "don't" count as one word. I tested this regex at https://regexr.com with a sample text that includes words from several alphabets, and it counted all 55 words accurately, ignoring special characters and punctuation.
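As a minimal sketch of how the two patterns behave, the snippet below wraps each one in a small counting function. The helper names countWords and countWordsNaive and the sample strings are my own illustrations, not part of the original test text.

```javascript
// The pattern from the post: a run of characters that are either ASCII word
// characters (\w) or anything in U+0080..U+FFFF, each optionally followed
// by an apostrophe (U+0027).
const WORD_PATTERN = /([\u0080-\uFFFF\w]\u0027?)+/g;

// Hypothetical helper: number of matches of WORD_PATTERN in the text.
function countWords(text) {
  const matches = text.match(WORD_PATTERN);
  return matches === null ? 0 : matches.length;
}

// The naive pattern from the post, kept for comparison.
function countWordsNaive(text) {
  const matches = text.match(/\b\S+\b/g);
  return matches === null ? 0 : matches.length;
}

console.log(countWords("Привіт, світ! Don't panic.")); // 4
console.log(countWordsNaive("Привіт, світ!"));         // 0
```

Note that the range U+0080 to U+FFFF also contains non-letter characters such as typographic quotes, so text with that kind of punctuation may be counted differently. In engines that support ES2018 Unicode property escapes, a pattern built on \p{L} with the u flag is a possible stricter alternative.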

Switching between keyboard layouts in Openbox (Arch Linux)

Switching between two (or more) keyboard layouts in Openbox is quite easy to accomplish, although it is not as obvious as in full desktop environments. This solution was tested on Arch Linux. You just need to edit the following file (assuming you want to switch between the English and Ukrainian Phonetic layouts with Alt-Shift):

/etc/X11/xorg.conf.d/01-keyboard-layout.conf

    Section "InputClass"
        Identifier "keyboard-layout"
        Driver "evdev"
        MatchIsKeyboard "yes"
        Option "XkbLayout" "us,ua(phonetic)"
        Option "XkbModel" "pc105"
        Option "XkbOptions" "grp:alt_shift_toggle"
    EndSection

If you have an Nvidia card, don't forget to edit /etc/X11/xorg.conf.d/20-nvidia.conf and change Driver from "kbd" to "evdev" in the InputDevice section:

    Section "InputDevice"
        Identifier "Keyboard0"
        Driver "evdev"
    EndSection

Y