thefourtheye's weblog

opinions are my own; try code/suggestions at your own risk

TypedArrays and Canonical Numeric Index Strings

This week, I saw a Tweet by Benedikt Meurer which was kind of a quiz.

At first glance, I thought all of them would be properties of the object, so all of them would be correct answers. This comes from my understanding of Array objects. As per the specification’s 9.4.2 Array Exotic Objects section,

A String property name P is an array index if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to 2^32−1.

In other words, a String property name qualifies as an array index only if converting it to an unsigned 32-bit integer and stringifying the result gives back the original string (and that integer is not 2^32−1). Otherwise it is treated as a normal property of the array object.
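
The definition above can be sketched in JavaScript roughly like this. It is only an illustration: isArrayIndex is a name I made up, and the `>>> 0` shift is used as a stand-in for the specification's abstract ToUint32 operation.

```javascript
// Hypothetical helper mirroring the "array index" definition quoted above.
function isArrayIndex(p) {
  if (typeof p !== 'string') return false;
  const n = p >>> 0;                          // stand-in for ToUint32(P)
  // The round trip must give back the same string, and 2^32 - 1 is excluded.
  return String(n) === p && n !== 2 ** 32 - 1;
}

const samples = ['0', '7', '1.0', '01', '-1', '4294967294', '4294967295'];
console.log(samples.map(isArrayIndex));
// [ true, true, false, false, false, true, false ]
```

Strings such as '1.0' and '01' fail the round trip, so on a plain Array they end up as ordinary properties.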

But this is not applicable to TypedArrays. For TypedArrays, a String property name is treated as an index when it is a canonical numeric index string. In the ECMAScript 6 specification, retrieving the value of a property from a typed array is defined like this

When the [[Get]] internal method of an Integer Indexed exotic object (Typed Arrays) O is called with property key P, as per the section 9.4.5.4 [[Get]] (P, Receiver) the following steps are taken:

  1. Assert: IsPropertyKey(P) is true.
  2. If Type(P) is String and if SameValue(O, Receiver) is true, then
    a. Let numericIndex be CanonicalNumericIndexString (P).
    b. Assert: numericIndex is not an abrupt completion.
    c. If numericIndex is not undefined, then
      i. Return IntegerIndexedElementGet (O, numericIndex).
  3. Return the result of calling the default ordinary object [[Get]] internal method (9.1.8) on O passing P and Receiver as arguments.

As we see here, if the property being retrieved is a String, it is first converted to its canonical numeric index string format. It is done as per the section 7.1.16 CanonicalNumericIndexString(argument)

  1. Assert: Type(argument) is String.
  2. If argument is “-0”, return −0.
  3. Let n be ToNumber(argument).
  4. If SameValue(ToString(n), argument) is false, return undefined.
  5. Return n.

This basically converts the string to a number and then compares the stringified version of that number with the original string. If they are equal, the string is a canonical numeric index and the number is returned; otherwise undefined is returned. For example, '3.14' is converted to the number 3.14 and then back to the string '3.14', which is the same as the original string. So 3.14 is treated as a numeric index of the typed array.

In the question above, the strings '.9' and '1.0' become '0.9' and '1' respectively after the round trip, while '1.1' and '1.2' remain the same. We can confirm that like this

['.9', '0.9', '1.0', '1.', '1.1', '1.2'].forEach(i => console.log(Number(i).toString()));
// will print the following
0.9
0.9
1
1
1.1
1.2

As the strings '1.1' and '1.2' are already in the canonical numeric index string format, they are treated as numeric indices, whereas '.9' and '1.0' are not canonical and become ordinary property names. As per the 9.4.5.8 IntegerIndexedElementGet section, the passed number should be an integer. So 1.1 and 1.2 are not valid array indices, and reading them returns undefined. That is why the answer to the question is both array['.9'] and array['1.0']: only '.9' and '1.0' are treated as property names during retrieval.
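
A small sketch of the retrieval side, assuming a reasonably recent Node.js or browser. The variable names are mine, and the values written are arbitrary.

```javascript
// Write to all four keys from the quiz, then read them back.
const array = new Int8Array(2);
const keys = ['.9', '1.0', '1.1', '1.2'];
keys.forEach((key, idx) => { array[key] = idx + 1; });

// '.9' and '1.0' are ordinary properties, so their values come back.
// '1.1' and '1.2' are canonical numeric strings but not integers,
// so the writes are dropped and the reads give undefined.
const readBack = keys.map(key => array[key]);
console.log(readBack);             // [ 1, 2, undefined, undefined ]
console.log(Object.keys(array));   // [ '0', '1', '.9', '1.0' ]
```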

To confirm this behaviour I wrote the following program.

function TestIntegerIndexedObjects(obj) {
    console.log(Object.prototype.toString.call(obj));
    const items = ['.9', '1.0', '1.1', '1.2'];
    items.forEach((i, idx) => obj[i] = idx);
    console.log(obj);
    console.log(Object.getOwnPropertyNames(obj));
    console.log('\n');
}

TestIntegerIndexedObjects(new Int8Array(10));
TestIntegerIndexedObjects(new String("abcd"));
TestIntegerIndexedObjects([]);

which prints the following

[object Int8Array]
Int8Array [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, '.9': 0, '1.0': 1 ]
[ '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '.9', '1.0' ]


[object String]
[String: 'abcd']
[ '0', '1', '2', '3', 'length', '.9', '1.0', '1.1', '1.2' ]


[object Array]
[ '.9': 0, '1.0': 1, '1.1': 2, '1.2': 3 ]
[ 'length', '.9', '1.0', '1.1', '1.2' ]

CanonicalNumericIndexString

const assert = require('assert');

function CanonicalNumericIndexString(argument) {
    if (argument === '-0') return -0;                                        // Step 2 of the spec: '-0' maps to negative zero
    const NumericValue = +argument;                                          // The unary + operator converts to a number
    return NumericValue.toString() === argument ? NumericValue : undefined;
}

assert(CanonicalNumericIndexString('.9')  === undefined);
assert(CanonicalNumericIndexString('0.9') === 0.9);
assert(CanonicalNumericIndexString('1.0') === undefined);
assert(CanonicalNumericIndexString('1.')  === undefined);
assert(CanonicalNumericIndexString('1.1') === 1.1);
assert(CanonicalNumericIndexString('1.2') === 1.2);

Emacs - Cheat Sheet

After having been in the Vim camp of the Vim vs Emacs editor war for eight years, I decided to peek into the enemy’s camp. IDE users might look down upon editor users, but I still enjoy using my keyboard to do all the coding work. That is why I chose to learn Emacs rather than an IDE.

When I started with Emacs, it was kind of scary. Editing text is straightforward, in that it doesn’t require me to change modes like in Vim. But there are far more control key sequences to remember than in Vim. So I decided to learn the keys commonly used to move around the text on the screen, with the help of the Emacs Tutorial (Ctrl+h t).

This cheat sheet has a few more key combinations which are used to do common operations on Emacs.

Note: From now on, the Ctrl key will be represented as C and the Alt key (also called the Meta key) will be represented as M. For example, if the actual command is to press the Ctrl and h keys together to invoke help, it would be written as C-h.

Common Operations

Key Combination    Functionality
C-x C-c            Quit Emacs
C-h t              Open built-in Emacs Tutorial file
C-x C-f            Open a File
C-x k              Close the currently opened file on screen
C-x C-s            Save the currently opened file
C-x s              Save all the open files
C-g                Discard the current command

Split Window Operations

Key Combination              Functionality
C-x 2                        Split the view horizontally
C-x 3                        Split the view vertically
C-x o (English letter “o”)   Switch to next split window
C-x 0 (Number Zero)          Kill the current window
C-x 1                        Kill all other split windows except current window
C-x C-b                      List all open files in a new split window
C-x b                        Bring any of the open files to the front, by name

Cursor Movement

Key Combination    Functionality
C-b                Move one character backward
M-b                Move one word backward
C-f                Move one character forward
M-f                Move one word forward
C-v                Scroll one page full of information forward
M-v                Scroll one page full of information backward
C-a                Beginning of current line
M-a                Beginning of current sentence
C-e                Ending of current line
M-e                Ending of current sentence
M->                End of file
M-<                Beginning of file
C-n                Next line
C-p                Previous line
C-l                Recenter the screen on the current line; repeated presses cycle through middle, top, and bottom of screen

Edit Operations

Key Combination    Functionality
C-x u or C-/       Undo last operation
C-k                Delete till the end of the current line from current position
M-k                Delete till the end of the current sentence from current position
C-d                Delete the next character
M-d                Delete the next word
<DEL>              Delete the previous character (<DEL> is the Backspace key)
M-<DEL>            Delete the previous word

Copy and Paste

Key Combination    Functionality
C-<Space>          Set mark to select text. Now with movement keys select the text
C-x h              Select the whole buffer
C-w                Cut the selected text
M-w                Copy the selected text
C-y                Yank (paste) the most recently cut or copied text

JavaScript Lookbehind RegEx Alternative

Recently one of my colleagues came to me with a problem. He had a string which looked like JSON but was not valid JSON. For example,

[{\"name\":\"Kilua\"\\,\"age\":35}],{\"city\":\"SJC\"},\"US\"

and he wanted to split the string at all the commas, except when they are preceded by a backslash. That means, he wanted

[ '[{"name":"Kilua"\\,"age":35}]', '{"city":"SJC"}', '"US"' ]

as output. If the requirement were to split the string at commas depending on what follows them, it would have been pretty straightforward. For example, if the string is a,1,b,2,c,3 and he wanted to split it at all commas except those followed by a digit, then the solution would have been very simple with a negative lookahead, like this

console.log('a,1,b,2,c,3'.split(/,(?!\d)/));
// [ 'a,1', 'b,2', 'c,3' ]

Since JavaScript didn’t have lookbehinds at the time (they were added later, in ES2018), there was no straightforward solution to this problem. When I searched Stack Overflow, I found this solution

WTK's answer on Stack Overflow
'a\\,bcde,fgh,ijk\\,lmno,pqrst\\,uv'.replace(/([^\\]),/g, '$1\u000B').split('\u000B')

This solves the exact same problem my colleague was facing. It compensates for the absence of lookbehinds by breaking the problem down into two steps.

  1. Replace all the commas matching the condition with a marker string which does not appear anywhere in the string. (\u000B in this case)
  2. Then split the resulting string on the marker string.
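
The two steps can be wrapped into a small function and tried on my colleague's string. This is just a sketch: splitOnUnescapedCommas and its marker parameter are names I made up, and the guard against a marker that already appears in the input is my own addition. The last lines do the same split with a negative lookbehind, on the assumption that the engine supports ES2018 regular expressions.

```javascript
// Hypothetical helper: split on commas that are not preceded by a backslash.
function splitOnUnescapedCommas(input, marker = '\u000B') {
  if (input.includes(marker)) {
    throw new Error('marker already occurs in the input');
  }
  // Step 1: replace each qualifying comma with the marker, keeping the
  // character captured in front of it. Step 2: split on the marker.
  return input.replace(/([^\\]),/g, '$1' + marker).split(marker);
}

const input = '[{"name":"Kilua"\\,"age":35}],{"city":"SJC"},"US"';
const parts = splitOnUnescapedCommas(input);
console.log(parts);

// The same split using a negative lookbehind (ES2018 and later).
const viaLookbehind = input.split(/(?<!\\),/);
console.log(viaLookbehind);
```

For this input both approaches give the same three pieces. They can differ at the edges: the replace-based version needs a character before the comma, so a comma at the very start of the string, or one right after another comma, is not split.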

JS Quirks: Stringified Numbers

When I was working on Node.js PR 9492, this comment

This should probably test for a wider range of values. test/parallel/test-net-internal.js has some good examples.

made me look at that file. As I was going through the test, a few of the bad values were interesting. I normally test with stringified positive and negative decimal numbers. But here I saw stringified negative octal, binary, and hexadecimal numbers.

const bad = [-1, 'a', {}, [], false, true, 0xFFFF + 1, Infinity,
             -Infinity, NaN, undefined, null, '', ' ', 1.1, '0x',
             '-0x1', '-0o1', '-0b1', '0o', '0b'];

I got curious, as I had never used them before, and wanted to see their corresponding numeric values. So I wrote a program like this

[-0x1, '-0x1', -0o1, '-0o1', -0b1, '-0b1'].forEach(item => console.log(item, +item));

and I was expecting to see the result

-1 -1
-0x1 -1
-1 -1
-0o1 -1
-1 -1
-0b1 -1

but all I got was

-1 -1
-0x1 NaN
-1 -1
-0o1 NaN
-1 -1
-0b1 NaN

The unary - operator simply negates the numeric literals, so those behave as expected. The stringified numbers, however, were not processed in the same way as their number counterparts. So I looked at the ECMAScript specification’s ToNumber Applied to the String Type section (which is actually responsible for converting strings to numbers).

StrNumericLiteral :::
    StrDecimalLiteral
    BinaryIntegerLiteral
    OctalIntegerLiteral
    HexIntegerLiteral
...
...
StrDecimalLiteral :::
    StrUnsignedDecimalLiteral
    + StrUnsignedDecimalLiteral
    - StrUnsignedDecimalLiteral

Only the StrDecimalLiteral production allows signed numbers. If we look at the definition of others in the Numeric Literals section,

BinaryIntegerLiteral ::
    0b BinaryDigits
    0B BinaryDigits

BinaryDigits ::
    BinaryDigit
    BinaryDigits BinaryDigit

BinaryDigit :: one of
    0 1

OctalIntegerLiteral ::
    0o OctalDigits
    0O OctalDigits

OctalDigits ::
    OctalDigit
    OctalDigits OctalDigit

OctalDigit :: one of
    0 1 2 3 4 5 6 7

HexIntegerLiteral ::
    0x HexDigits
    0X HexDigits

HexDigits ::
    HexDigit
    HexDigits HexDigit

HexDigit :: one of
    0 1 2 3 4 5 6 7 8 9 a b c d e f A B C D E F

So, as per the specification, only decimal numbers can have signs in the stringified number form. That is why the signed binary, octal, and hexadecimal strings are not considered numbers and convert to NaN.
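
If a signed prefixed string does need to be converted, one possible workaround is to peel off the sign first and convert only the unsigned part. This is a sketch under my own assumptions: toSignedNumber is a hypothetical helper, not something from the specification or from Node.js.

```javascript
// Hypothetical helper: allow one optional sign in front of any literal
// form that Number() already understands without a sign.
function toSignedNumber(text) {
  const match = /^([+-]?)([^+-].*)$/.exec(text);
  if (match === null) return Number(text);   // e.g. '' or '--1': no special handling
  const magnitude = Number(match[2]);        // convert the unsigned part only
  return match[1] === '-' ? -magnitude : magnitude;
}

const signed = ['-0x1', '-0o1', '-0b1'];
console.log(signed.map(item => Number(item)));   // [ NaN, NaN, NaN ]
console.log(signed.map(toSignedNumber));         // [ -1, -1, -1 ]
```

Unsigned prefixed strings such as '0xFF' still go straight through Number, so only the sign handling is new here.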

DSL-2730U Router's IPv6 Problem in Ubuntu

Today I faced a very strange problem. I got a new Tata DOCOMO connection and I wanted to fetch upstream changes from a GitHub repository, with git fetch --all. But all I got was

fatal: unable to access 'https://github.com/<repo>': Failed to connect to github.com port 443: Network is unreachable

I thought that the internet connection had a problem. But then I was able to ping github.com and access that site in my browser, although curl still failed.

➜  io.js git:(master) ✗ curl https://github.com
curl: (7) Failed to connect to github.com port 443: Network is unreachable

At this point I became curious and tried out the verbose curl,

➜  io.js git:(master) ✗ curl -v https://github.com
* Rebuilt URL to: https://github.com/
* Hostname was NOT found in DNS cache
* Trying 192.30.252.128...
* Trying 64:ff9b::c01e:fc81...
* connect to 64:ff9b::c01e:fc81 port 443 failed: Network is unreachable
* Failed to connect to github.com port 443: Network is unreachable
* Closing connection 0
curl: (7) Failed to connect to github.com port 443: Network is unreachable

Now, it figures out both the IPv4 address and the IPv6 address but favors IPv6 over IPv4. And it looks like either the modem or the ISP doesn't support IPv6 based communication. I don't know how to confirm what the actual problem is. I tried to upgrade the firmware of my DSL-2730U router, by logging into 192.168.1.1. But it kept failing, saying that the firmware image file is too big.

So, I decided to disable IPv6 on my Ubuntu machine. I followed the instructions given here and it worked perfectly. Basically, I edited the /etc/sysctl.conf file to include the following lines

# disable IPv6 on this machine
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

and then executed sudo sysctl --system and sudo sysctl -p.