javascript

Escaping in JSON with Backslash

Escape characters are part of the syntax for many programming languages, data formats, and communication protocols. For a given alphabet an escape character’s purpose is to start character sequences (so named escape sequences), which have to be interpreted differently from the same characters occurring without the prefixed escape character.[^2]

JSON or JavaScript Object Notation is a data interchange format. It has an escape character as well.

In many programming languages such as C, Perl, and PHP and in Unix scripting languages, the backslash is an escape character, used to indicate that the character following it should be treated specially (if it would otherwise be treated normally), or normally (if it would otherwise be treated specially).[^3]

JavaScript also uses the backslash as an escape character. JSON is based on a subset of the JavaScript programming language, so it uses the backslash as its escape character too:

A string is a sequence of zero or more Unicode characters, wrapped in double quotes, using backslash escapes.[^1]

A character can be:

  • Any Unicode character except " or \ or control character
  • \"
  • \\
  • \/
  • \b
  • \f
  • \n
  • \r
  • \t
  • \u + four-hex-digits

Only a few characters can be escaped in JSON. If the escaped character is not one of those listed above:

$ cat data.json
"\a"

parsing fails with a SyntaxError[^4]:

$ node -e 'console.log(require("./data.json"))'
module.js:561
throw err;
^
SyntaxError: /home/chao/tmp/js/data.json: Unexpected token a in JSON at position 2
at Object.parse (native)
at Object.Module._extensions..json (module.js:558:27)
at Module.load (module.js:458:32)
at tryModuleLoad (module.js:417:12)
at Function.Module._load (module.js:409:3)
at Module.require (module.js:468:17)
at require (internal/module.js:20:19)
at [eval]:1:13
at ContextifyScript.Script.runInThisContext (vm.js:25:33)
at Object.exports.runInThisContext (vm.js:77:17)
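The same rule can be checked directly from JavaScript. The following is a minimal sketch using the built-in JSON.parse and JSON.stringify; the variable names are only illustrative.

```javascript
// Escapes from the list above are accepted by JSON.parse.
const parsed = JSON.parse('"tab:\\t slash:\\/ letter:\\u0041"');

// An escape that is not on the list, such as \a, is rejected.
let parseError = null;
try {
  JSON.parse('"\\a"');
} catch (err) {
  parseError = err;
}

// JSON.stringify adds the required backslashes when encoding a string.
const encoded = JSON.stringify('a"b\\c');

console.log(JSON.stringify(parsed));
console.log(parseError instanceof SyntaxError); // true
console.log(encoded);
```

Letting JSON.stringify produce the escapes is usually safer than writing them by hand.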

Inherited Accessor Property Can Only Be Overridden by Accessor Property

In JavaScript, define a property prop as an accessor property with a getter and a setter:

var obj = {
  get prop() {},
  set prop(value) {} // a setter must declare exactly one parameter
};

And the property will have the following attributes:

> Object.keys(obj)
[ 'prop' ]
> Object.getOwnPropertyDescriptor(obj, 'prop')
{ get: [Function: prop],
  set: [Function: prop],
  enumerable: true,
  configurable: true }

Now, create a new object that inherits from obj, and attempt to overwrite the accessor property with a data property:

var foo = Object.create(obj);
foo.prop = 'data';

But no own property is created on the new object:

> Object.keys(foo)
[]
> Object.getOwnPropertyDescriptor(foo, 'prop')
undefined

That is because prop is an inherited accessor property: the assignment calls the inherited setter instead of creating a data property on foo. To give foo its own prop, define it explicitly with Object.defineProperty, here as an accessor property:

Object.defineProperty(foo, 'prop', {
get: function prop() {},
set: function prop() {},
enumerable: true,
configurable: true,
});

Then, the foo object will have its own property named prop:

> Object.keys(foo)
[ 'prop' ]
> Object.getOwnPropertyDescriptor(foo, 'prop')
{ get: [Function: prop],
  set: [Function: prop],
  enumerable: true,
  configurable: true }
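To make the mechanism visible, here is a small sketch in which the inherited setter records what it receives. The names base, child, and received are made up for this illustration.

```javascript
const received = [];

const base = {
  get prop() { return 'inherited getter'; },
  set prop(value) { received.push(value); }
};

const child = Object.create(base);

// The assignment runs the inherited setter; no own property is created.
child.prop = 'data';
const ownKeysAfterAssignment = Object.keys(child);

// An explicit definition does create an own property on child.
Object.defineProperty(child, 'prop', {
  get: function () { return 'own getter'; },
  set: function (value) {},
  enumerable: true,
  configurable: true
});
const ownKeysAfterDefine = Object.keys(child);

console.log(received, ownKeysAfterAssignment, ownKeysAfterDefine, child.prop);
```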

Getter and Setter Methods without Function Identifier Restriction

A JavaScript identifier must start with either a letter, underscore, or dollar sign:

> function 7/11(){}
SyntaxError: Unexpected number
> function '7/11'(){}
SyntaxError: Unexpected string
> var 7/11 = function(){}
SyntaxError: Unexpected number
> var '7/11' = function(){}
SyntaxError: Unexpected string

But an object property name does not have such a limitation:

> var obj = { '7/11': function(){} }
> obj['7/11']
function (){}

It also works with accessor properties, that is, getter and setter methods:

> var obj = { get '7/11'(){ return '7/11'; } }
> obj['7/11']
"7/11"

The syntax get '7/11'(){} looks like function '7/11'(){}, but it is not: getters and setters are still object properties, and property names are not restricted to identifiers. That is why it works. That’s another reason to use a JavaScript object.
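As a further sketch, a getter and a setter can share the same non-identifier name, and ES2015 computed property names allow the name to be built from an expression. The store object and its property names are invented for this example.

```javascript
const store = {
  value: 0,
  // Getter and setter share a name that is not a valid identifier.
  get '7/11'() { return this.value; },
  set '7/11'(next) { this.value = next; },
  // Computed property name (ES2015): the key is 'open-24h'.
  ['open-' + '24h']() { return true; }
};

store['7/11'] = 42;
console.log(store['7/11'], store['open-24h']());
```

Such properties can only be reached with bracket notation, since dot notation requires an identifier.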

Avoid Assigning undefined to an Object Property

A property value can be any JavaScript value, but it should not be undefined. Something like this is legal:

var foo = { bar: undefined };

But it leads to confusing code. Consider accessing the property value:

console.log(foo.bar); // undefined

It returns undefined, but you cannot tell whether the property is missing or exists with its value set to undefined. Therefore, you should write:

var foo = { bar: null };

This indicates that the property is expected and has the value null.

You can still check for the existence of a property with:

Object.keys(foo); // ['bar']
foo.hasOwnProperty('bar'); // true

But more importantly, if you serialize the object with JSON.stringify, properties whose value is undefined are omitted:

JSON.stringify({ bar: undefined }); // '{}'

According to the JSON specification, a value can be a string in double quotes, a number, true, false, null, an object, or an array. undefined is not a valid JSON value.

null is fine:

JSON.stringify({ bar: null }); // '{"bar":null}'

So, as a best practice, avoid assigning undefined to a property of an object. Use null to indicate an expected property without a value. This increases portability when using JSON to serialize.
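The effect on a JSON round trip can be sketched as follows, using only built-in functions; the variable names are illustrative.

```javascript
const withUndefined = { bar: undefined };
const withNull = { bar: null };

// Serialize and parse again, as would happen when sending the object elsewhere.
const undefinedAfter = JSON.parse(JSON.stringify(withUndefined));
const nullAfter = JSON.parse(JSON.stringify(withNull));

console.log('bar' in undefinedAfter); // false: the property was dropped
console.log('bar' in nullAfter);      // true: the property survived

// Inside an array, undefined is serialized as null instead of being dropped.
const arrayText = JSON.stringify([undefined, null]);
console.log(arrayText); // [null,null]
```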

Amazon S3 Delimiter and Prefix

Amazon S3 is an inexpensive online file storage service, and there is a JavaScript SDK for it. A few things puzzled me when using the SDK:

  1. How to use parameters Delimiter and Prefix?
  2. What is the difference between CommonPrefixes and Contents?
  3. How to create a folder/directory with JavaScript SDK?

To retrieve objects in an Amazon S3 bucket, the operation is listObjects. listObjects does not return the content of the objects, only the key and metadata, such as the size and owner, of each object.

To make a call to get a list of objects in a bucket:

s3.listObjects(params, function (err, data) {
// ...
});

Where the params can be configured with the following parameters:

  • Bucket
  • Delimiter
  • EncodingType
  • Marker
  • MaxKeys
  • Prefix

But what are Delimiter and Prefix? And how to use them?

Let’s start by creating some objects in an Amazon S3 bucket similar to the following file structure. This can be easily done by using the AWS Console.

.
├── directory
│ ├── directory
│ │ └── file
│ └── file
└── file
2 directories, 3 files

In Amazon S3, the objects are:

directory/
directory/directory/
directory/directory/file
directory/file
file

One thing to keep in mind is that Amazon S3 is not a file system. There is no real concept of a file or a directory/folder. From the console, it might look like there are 2 directories and 3 files, but they are all objects, and objects are listed in lexicographic order of their keys.

To make it a little clearer, let’s invoke the listObjects method. Only the Bucket parameter is required:

params = {
Bucket: 'example'
};

The response data passed to the callback function contains:

{ Contents:
[ { Key: 'directory/',
LastModified: ...,
ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
Size: 0,
Owner: [Object],
StorageClass: 'STANDARD' },
{ Key: 'directory/directory/',
LastModified: ...,
ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
Size: 0,
Owner: [Object],
StorageClass: 'STANDARD' },
{ Key: 'directory/directory/file',
LastModified: ...,
ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
Size: 0,
Owner: [Object],
StorageClass: 'STANDARD' },
{ Key: 'directory/file',
LastModified: ...,
ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
Size: 0,
Owner: [Object],
StorageClass: 'STANDARD' },
{ Key: 'file',
LastModified: ...,
ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
Size: 0,
Owner: [Object],
StorageClass: 'STANDARD' } ],
CommonPrefixes: [],
Name: 'example',
Prefix: '',
Marker: '',
MaxKeys: 1000,
IsTruncated: false }

If this were a file system, you might expect:

directory/
file

But it is not, because a bucket does not work like a folder or a directory, where only the immediate files inside the directory are shown. The objects inside the bucket are laid out flat and sorted by key.

In UNIX, a directory is a file, but in Amazon S3, everything is an object, and can be identified by key.

So, how do you make Amazon S3 behave more like a folder or a directory? Or how do you list just the first level right inside the bucket?

To make it work like a directory, you have to use Delimiter and Prefix. Delimiter is what you use to group keys. It does not have to be a single character; it can be a string of characters. Prefix limits the response to keys that begin with the specified prefix.

Delimiter

Let’s start by adding the following delimiter:

params = {
Bucket: 'example',
Delimiter: '/'
};

You will get something more like a listing of a directory:

{ Contents:
[ { Key: 'file' } ],
CommonPrefixes: [ { Prefix: 'directory/' } ],
Name: 'example',
Prefix: '',
MaxKeys: 1000,
Delimiter: '/',
IsTruncated: false }

There is a directory directory/ and a file file. What happened is that all of the following objects except file were grouped by the delimiter /:

directory/
directory/directory/
directory/directory/file
directory/file
file

So the result is:

directory/
file

This feels more like a listing of a directory or folder. But if we change Delimiter to i, you get no Contents, just the prefixes:

{ Contents: [],
CommonPrefixes: [ { Prefix: 'di' }, { Prefix: 'fi' } ],
Name: 'example',
Prefix: '',
MaxKeys: 1000,
Delimiter: 'i',
IsTruncated: false }

All keys can be grouped into two prefixes: di and fi. Therefore, Amazon S3 is not a file system, but might act like one if using the right parameters.

As mentioned, Delimiter does not need to be a single character. For example, with Delimiter: '/directory':

{ Contents:
[ { Key: 'directory/' },
{ Key: 'directory/file' },
{ Key: 'file' } ],
CommonPrefixes: [ { Prefix: 'directory/directory' } ],
Name: 'example',
Prefix: '',
MaxKeys: 1000,
Delimiter: '/directory',
IsTruncated: false }

Recall the bucket structure:

directory/
directory/directory/
directory/directory/file
directory/file
file

Both directory/directory/ and directory/directory/file are grouped into a common prefix: directory/directory, due to the common grouping string /directory.

Let’s try another one with Delimiter: 'directory':

{ Contents:
[ { Key: 'file' } ],
CommonPrefixes: [ { Prefix: 'directory' } ],
Name: 'example',
Prefix: '',
MaxKeys: 1000,
Delimiter: 'directory',
IsTruncated: false }

Okay, one more. Let’s try ry/fi:

{ Contents:
[ { Key: 'directory/' },
{ Key: 'directory/directory/' },
{ Key: 'file' } ],
CommonPrefixes:
[ { Prefix: 'directory/directory/fi' },
{ Prefix: 'directory/fi' } ],
Name: 'example',
Prefix: '',
MaxKeys: 1000,
Delimiter: 'ry/fi',
IsTruncated: false }

So, remember that Delimiter is just providing a grouping functionality for keys. If you want it to behave like a file system, use Delimiter: '/'.
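As a sketch of how the file-system-like listing could be wrapped up, the hypothetical helper below calls listObjects with Delimiter: '/' and splits the response into folders and files. It assumes s3 is a client object exposing the callback-style listObjects used in this post; the helper name listLevel and the shape of its result are invented here. It does not handle pagination, so it only sees the keys of a single response (capped by MaxKeys).

```javascript
// Hypothetical helper: list one "directory level" of a bucket.
// `s3` is assumed to be a client with the callback-style listObjects method.
function listLevel(s3, bucket, prefix, callback) {
  const params = {
    Bucket: bucket,
    Delimiter: '/',        // group keys the way a file system would
    Prefix: prefix || ''   // '' means the top level of the bucket
  };
  s3.listObjects(params, function (err, data) {
    if (err) { return callback(err); }
    callback(null, {
      folders: data.CommonPrefixes.map(function (p) { return p.Prefix; }),
      files: data.Contents.map(function (o) { return o.Key; })
    });
  });
}
```

For the example bucket, calling listLevel(s3, 'example', '', callback) would be expected to report directory/ as a folder and file as a file, matching the Delimiter: '/' response shown earlier.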

Prefix

Prefix is much easier to understand: it is a filter that limits the response to keys that start with the specified string.

With the same structure:

directory/
directory/directory/
directory/directory/file
directory/file
file

Let’s set the Prefix parameter to directory:

{ Contents:
[ { Key: 'directory/' },
{ Key: 'directory/directory/' },
{ Key: 'directory/directory/file' },
{ Key: 'directory/file' } ],
CommonPrefixes: [],
Name: 'example',
Prefix: 'directory',
MaxKeys: 1000,
IsTruncated: false }

How about directory/:

{ Contents:
[ { Key: 'directory/' },
{ Key: 'directory/directory/' },
{ Key: 'directory/directory/file' },
{ Key: 'directory/file' } ],
CommonPrefixes: [],
Prefix: 'directory/' }

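Here, the prefixes directory and directory/ return the same result, because every key in this bucket that starts with directory also starts with directory/.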

If we try something slightly different, Prefix: 'directory/d':

{ Contents:
[ { Key: 'directory/directory/' },
{ Key: 'directory/directory/file' } ],
CommonPrefixes: [],
Prefix: 'directory/d' }

Putting it all together with both Delimiter: 'directory' and Prefix: 'directory':

{ Contents:
[ { Key: 'directory/' },
{ Key: 'directory/file' } ],
CommonPrefixes: [ { Prefix: 'directory/directory' } ],
Prefix: 'directory',
Delimiter: 'directory' }

First, list the keys prefixed by directory:

directory/
directory/directory/
directory/directory/file
directory/file

Group them by the delimiter directory, searching only the part of each key that comes after the prefix directory:

directory/directory

The resulting Contents are:

directory/
directory/file

and CommonPrefixes are:

directory/directory

Maybe changing Delimiter to i could give a better perspective:

{ Contents:
[ { Key: 'directory/' } ],
CommonPrefixes: [ { Prefix: 'directory/di' }, { Prefix: 'directory/fi' } ],
Prefix: 'directory',
Delimiter: 'i' }

as:

directory/ # key to show
directory/directory/ # group to 'directory/di'
directory/directory/file # group to 'directory/di'
directory/file # group to 'directory/fi'
file # ignored due to prefix

One advantage of Amazon S3 over listing a directory is that you don’t need to worry about nested directories, because everything is flattened. You can loop through all of the keys under a path just by specifying the Prefix parameter.
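The grouping rules walked through above can be reproduced locally without calling Amazon S3 at all. The function below is only a simulation written for this explanation, not part of the SDK: simulateList and bucketKeys are made-up names, and it returns plain strings rather than the full response objects.

```javascript
// The keys of the example bucket used throughout this post.
const bucketKeys = [
  'directory/',
  'directory/directory/',
  'directory/directory/file',
  'directory/file',
  'file'
];

// Local simulation of how Prefix and Delimiter shape a listing.
function simulateList(keys, options) {
  const prefix = (options && options.Prefix) || '';
  const delimiter = (options && options.Delimiter) || '';
  const contents = [];
  const commonPrefixes = [];

  for (const key of keys.slice().sort()) {
    // Prefix is a plain filter on the start of the key.
    if (!key.startsWith(prefix)) { continue; }

    // The delimiter is searched only in the part after the prefix.
    const at = delimiter ? key.indexOf(delimiter, prefix.length) : -1;
    if (at === -1) {
      contents.push(key);
    } else {
      // Everything up to and including the delimiter becomes the group.
      const group = key.slice(0, at + delimiter.length);
      if (commonPrefixes.indexOf(group) === -1) { commonPrefixes.push(group); }
    }
  }
  return { Contents: contents, CommonPrefixes: commonPrefixes };
}

console.log(simulateList(bucketKeys, { Delimiter: '/' }));
console.log(simulateList(bucketKeys, { Prefix: 'directory', Delimiter: 'i' }));
```

Running it with the parameter combinations from this post should give the same Contents and CommonPrefixes groupings as the responses shown above.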

Directory/Folder

If you use “Create Folder” in the AWS console, you can create a directory/folder and upload a file into it. In effect, you are creating two objects with the following keys:

directory/
directory/file

If you use the following command to upload a file, the directory is not created:

$ aws s3 cp file s3://example/directory/file

That is because Amazon S3 is not a file system, but a key/value store. If you use the listObjects method, you will see just one object. That is also the reason you cannot copy a local directory this way:

$ aws s3 cp directory s3://example/directory
upload failed: aws/ to s3://example/directory [Errno 21] Is a directory: u'/home/chao/tmp/directory/'

But we can use the JavaScript SDK to create a directory/folder:

s3.putObject({ Bucket: 'example', Key: 'directory/' }, function (err, data) {
if (err) { return console.error(err); }
console.log(data);
});

Note that you must use directory/ with a trailing slash rather than the key without it. Otherwise, it is just a regular object, not a directory.

Immediately Invoked Function Expression

What’s the difference between:

(function(){}());
(function(){})();

This is a function declaration:

function foo(){}

The identifier of the function declaration is required.

This is a function expression:

var foo = function(){};

The right-hand side of the assignment must be an expression, and the function identifier is optional. Check the previous article about the difference between function declaration and function expression.

Now, bring in the grouping operator, which returns the result of evaluating an expression:

(function(){});

If we do it with a named function:

function foo(){}
(foo);

or

(function foo(){});

But we haven’t done anything here. In other words, we haven’t invoked the function; we merely evaluated it in all three cases. A function is an object in JavaScript, and when you evaluate an object, the result is still an object:

console.log(function foo(){}); // [Function: foo]
console.log(function (){}); // [Function]
function bar(){}
console.log(bar); // [Function: bar]
console.log({}); // {}

Now, let’s apply function invocation, and do it the traditional way:

foo();

The function has been invoked, and it will return the result of the function.

Replace the identifier with the function declaration:

function foo(){}();

However, this does not work, because the interpreter treats it as a function declaration, and ( becomes an unexpected token. A function declaration cannot be invoked immediately. So we need to make it an expression by using the grouping operator:

(function foo(){})();

The function identifier is no longer necessary:

(function(){})();

Since the grouping operator returns the result of the expression, we can try to just drop it:

function(){}();

But this statement becomes a function declaration again, so we need to place the grouping operator around it:

(function(){}());

Okay, but what’s the difference between:

(function(){}());
(function(){})();

There is no difference; both are function expressions that are invoked immediately. A function can only be invoked immediately when it is a function expression. But the first one aligns better with foo(), the traditional way. According to Code Conventions for the JavaScript Programming Language by Douglas Crockford:

When a function is to be invoked immediately, the entire invocation expression should be wrapped in parens so that it is clear that the value being produced is the result of the function and not the function itself.

The invocation expression is function(){}(), and wrapping it in parentheses gives (function(){}());.

Therefore, the first form is more traditional and preferred.
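A short sketch shows that the two forms behave identically, along with one common use of an immediately invoked function expression: keeping a variable private. The counter object is an invented example.

```javascript
// Both forms invoke the function expression right away.
const first = (function () { return 'invoked'; }());
const second = (function () { return 'invoked'; })();

// A typical use: `count` is only reachable through the returned object.
const counter = (function () {
  let count = 0;
  return {
    next: function () {
      count += 1;
      return count;
    }
  };
}());

console.log(first === second); // true
console.log(counter.next(), counter.next());
```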

Function Declaration and Function Expression

In JavaScript, what is a function declaration?

function square(n) { return n * n; }

Here we declare a function and name it “square”. But what is a function expression?

var square = function (n) { return n * n; };

When a statement starts with the keyword function, the statement is a function declaration; everything else is a function expression.

A function expression can be either named or anonymous. The above is an anonymous function, and here is a named one:

var square = function square(n) { return n * n; };

There is a variable square and a function name square, but they are not the same; their scopes are different. The function name square of the function expression can only be used inside the function, mainly for debugging purposes. Accessing it from outside the function throws a ReferenceError:

var foo = function bar() {};
bar(); // ReferenceError: bar is not defined

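We can also use the name property to tell a named function from an anonymous one. Note that since ES2015, an anonymous function expression assigned directly to a variable takes the variable’s name, so check the expression itself:

console.log((function () {}).name); // empty string for an anonymous function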

However, you cannot create an anonymous function declaration, for a very obvious reason:

function (n) { return n * n; } // SyntaxError: Unexpected token (

Another difference between function declaration and function expression is hoisting:

square(10); // 100
function square(n) { return n * n; }

A function defined by a declaration is hoisted, but a function expression is not:

square(10); // TypeError: undefined is not a function
var square = function (n) { return n * n; }

The variable declaration is hoisted, but the assignment is not, which results in a TypeError. However, as a best practice, functions should be defined before being used.

In conclusion:

  • A function declaration must start with the keyword function at the beginning of the statement. Everything else is a function expression.
  • A function expression can be anonymous or named, but a function declaration can only be named.
  • A function declaration is hoisted, but a function expression is not.
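The three points can be checked with a small script. This is only a sketch; the names declared, expressed, factorial, and fact are made up for it.

```javascript
// A declaration is hoisted, so it can be called before its definition.
const hoisted = declared(3);
function declared(n) { return n * n; }

// Only the `var` is hoisted for an expression; the value is still undefined here.
let earlyCallError = null;
try {
  expressed(3);
} catch (err) {
  earlyCallError = err;
}
var expressed = function (n) { return n * n; };

// The name of a named function expression is visible only inside the function.
const factorial = function fact(n) {
  return n <= 1 ? 1 : n * fact(n - 1);
};
const innerNameVisibleOutside = typeof fact !== 'undefined';

console.log(hoisted, earlyCallError instanceof TypeError, factorial(5), innerNameVisibleOutside);
```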

This function declaration happens inside an unreachable block:

square(10); // 100
if (false) {
  function square(n) { return n * n; }
}

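In engines without block scope (before ES2015), this is still treated as a valid, hoisted function declaration, and spaces before the keyword function do not count. ES2015 and later give function declarations inside blocks block-level semantics, so in a modern engine the call above throws instead of returning 100.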

A simple way to turn a function declaration into a function expression is to wrap it in parentheses:

(function square(n) { return n * n; });

We are applying the grouping operator () here, which evaluates the function expression.

Another way:

foo(err, function(){});

Just think of , as (): a function passed as an argument sits in expression position, so it is a function expression.


null

In JavaScript, you assign value null to an object property to indicate that the property is defined but with no value. But using null outside the JavaScript domain becomes tricky. For example, in URL query parameters:

?foo=null&bar=bar

after parsing the query:

{
  "foo": "null",
  "bar": "bar"
}
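The parsing step can be reproduced with the standard URLSearchParams class, which is available globally in browsers and in recent Node.js versions. This is a sketch, not necessarily the parser used above; the variable names are illustrative.

```javascript
const withNullText = new URLSearchParams('foo=null&bar=bar');
const withEmpty = new URLSearchParams('foo=&bar=bar');
const withoutFoo = new URLSearchParams('bar=bar');

console.log(withNullText.get('foo') === 'null'); // true: a four-character string
console.log(withEmpty.get('foo') === '');        // true: an empty string
console.log(withoutFoo.get('foo') === null);     // true: the parameter is absent
```

Query parameter values are always strings here; the parser only reports null itself when the parameter is missing altogether.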

Should the value of foo be parsed from the string "null" into null? You cannot tell unless you know the context.

Using null in both schema design and URL queries gave me a few headaches. So, I have decided to use strings and throw away null, as the string is the most basic data type across many applications. For simpler design and greater compatibility and portability, use an empty string instead of null.

Therefore, the value of query parameter foo should be:

?foo=null&bar=bar // "null"
?foo=&bar=bar     // ""

and:

"null" !== ""

Working with Big Number in JavaScript

JavaScript numbers can represent integers exactly only up to 53 bits (Number.MAX_SAFE_INTEGER is 2^53 - 1). If you are working with a big number such as a Twitter ID, which is a 64-bit number, you need an external library; otherwise, precision is lost:

> num = 420938523475451904
420938523475451900
> num = 420938523475451904 + 1
420938523475451900
> num = 420938523475451904 - 1
420938523475451900

Here is one library to use in a Node environment. Install Big.js:

$ npm install big.js

Load the module:

> BigNum = require('big.js')
{ [Function: Big] DP: 20, RM: 1 }

Use a string to create the big number:

> num = BigNum('420938523475451904')
{ s: 1,
  e: 17,
  c: [ 4, 2, 0, 9, 3, 8, 5, 2, 3, 4, 7, 5, 4, 5, 1, 9, 0, 4 ] }
> num.toString()
'420938523475451904'

Perform addition:

> num.plus(1).toString()
'420938523475451905'

Perform subtraction:

> num.minus(1).toString()
'420938523475451903'
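As an alternative to a library, engines that implement the built-in BigInt type (part of the language since ES2020) can do the same integer arithmetic natively. This is a different technique from the Big.js approach above, shown only as a sketch; note that BigInt covers integers only, while Big.js also handles decimals.

```javascript
// The number literal is beyond the safe integer range.
const isSafe = Number.isSafeInteger(420938523475451904);

// Create the BigInt from a string so no precision is lost on the way in.
const id = BigInt('420938523475451904');
const next = (id + 1n).toString();
const previous = (id - 1n).toString();

console.log(isSafe);   // false
console.log(next);     // 420938523475451905
console.log(previous); // 420938523475451903
```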

There are other packages that I have yet to test.

Written in CoffeeScript, Required in JavaScript

If you are writing in CoffeeScript, then requiring another module written in CoffeeScript works the same as if both scripts were in JavaScript:

cm = require 'coffee-module'

But if you are writing in JavaScript, and the dependent module is in CoffeeScript, you have to include CoffeeScript as a dependency with require statement:

require('coffee-script');
var cm = require('coffee-module');

For better compatibility, source code written in CoffeeScript should be compiled into JavaScript. But what is the best practice? You don’t want to maintain two languages. When should you compile? Here are two options:

  1. Compile before publishing the module
  2. Compile after installing the module

The advantage of the first approach is that there is no dependency on CoffeeScript. The module has been compiled into JavaScript before being submitted to the module registry. However, this approach requires two repositories: one for the source code and another for the published module. If you are working on a private project, you are unlikely to publish your module to a public npm registry or to run your own private one. You are more likely to have a single source code repository. Therefore, the second approach might be better in this situation. However, coffee-script must be added as a dependency, or it must be installed globally so that the coffee command is available during the preinstall phase. Although this approach is not recommended in npm-scripts, it is the way to go until a private npm registry is set up.

Here are the required fields in package.json:

{
"scripts": {
"preinstall": "coffee --compile --bare --output lib/ src/"
}
}