Avoid Assigning undefined to an Object Property

A property value can be any JavaScript value, but you should avoid undefined. Although the following is legal:

var foo = { bar: undefined };

it will lead to confusing code, because when accessing the property value:

console.log(foo.bar); // undefined

It returns undefined, but you cannot tell whether the property does not exist or whether it exists with its value set to undefined. Therefore, you should do:

var foo = { bar: null };

This indicates that the property is expected, with the value null.

You are still able to check the existence of the property:

Object.keys(foo); // ['bar']
foo.hasOwnProperty('bar'); // true

More importantly, if you serialize the object with JSON.stringify, properties with undefined values will be omitted:

JSON.stringify({ bar: undefined }); // '{}'

According to the JSON specification, a value can be a string in double quotes, a number, true, false, null, an object, or an array. undefined is not a valid JSON value.

null is fine:

JSON.stringify({ bar: null }); // '{"bar":null}'

So, as a best practice, avoid assigning undefined to an object property. Use null to indicate that a property is expected but has no value. This also improves portability when serializing to JSON.
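
Here is a small sketch to summarize the difference (the property names are just for illustration):

var config = { retries: null, timeout: undefined };
'retries' in config;    // true, the property is expected but has no value yet
'timeout' in config;    // true, even though its value is undefined
'missing' in config;    // false, the property does not exist
JSON.stringify(config); // '{"retries":null}', timeout is dropped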

Proxy GitHub Files for Hotlinking

Suppose I have a library that I would like to try in JSFiddle, for example, Marked, a Markdown parser and compiler. But I cannot hotlink the minified JavaScript file directly:

<script src="https://raw.githubusercontent.com/chjj/marked/master/marked.min.js"></script>

Error:

Refused to execute script from 'https://raw.githubusercontent.com/chjj/marked/master/marked.min.js' because its MIME type ('text/plain') is not executable, and strict MIME type checking is enabled.

Double-check the headers:

$ curl -I https://raw.githubusercontent.com/chjj/marked/master/marked.min.js
HTTP/1.1 200 OK
Server: Apache
Content-Security-Policy: default-src 'none'
Access-Control-Allow-Origin: https://render.githubusercontent.com
X-XSS-Protection: 1; mode=block
X-Frame-Options: deny
X-Content-Type-Options: nosniff
Strict-Transport-Security: max-age=31536000
Content-Type: text/plain; charset=utf-8
Cache-Control: max-age=300
Content-Length: 19160
Accept-Ranges: bytes
Via: 1.1 varnish
X-Served-By: cache-lax1429-LAX
X-Cache: MISS
X-Cache-Hits: 0
Vary: Authorization,Accept-Encoding
Source-Age: 0

Two headers prevent hotlinking:

  • Content-Type is set to text/plain; we are expecting application/javascript.
  • Access-Control-Allow-Origin is not set to *, which prevents cross-origin AJAX requests.

The solution is to proxy the file, either by running your own server or by using RawGit:

$ curl -I https://rawgit.com/chjj/marked/master/marked.min.js
HTTP/1.1 200 OK
Server: nginx
Content-Type: application/javascript; charset=utf-8
Connection: keep-alive
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
X-Robots-Tag: none
RawGit-Naughtiness: 0
Access-Control-Allow-Origin: *
Cache-Control: max-age=300
Vary: Accept-Encoding
RawGit-Cache-Status: MISS

The Content-Type has been updated to application/javascript, and the CORS restriction has been relaxed via Access-Control-Allow-Origin: *.
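
As a quick check, you can verify from the browser console that the proxied file is now readable cross-origin. This is just a sketch; any page origin should work:

var xhr = new XMLHttpRequest();
xhr.open('GET', 'https://rawgit.com/chjj/marked/master/marked.min.js');
xhr.onload = function () {
  console.log(xhr.getResponseHeader('Content-Type')); // application/javascript; charset=utf-8
  console.log(xhr.responseText.length);               // the script body is readable cross-origin
};
xhr.send();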

Problem solved, and the fiddle works as expected.

Inventing on Principle

Bret Victor, “Inventing on Principle”, January 20, 2012:

Bret Victor’s principle:

Creator needs immediate connection.

Following a principle, following your passion, and doing things you love.

Understand Setup and Teardown with Jasmine Testing Framework

In Jasmine, the setup, test, and teardown structure is expressed with beforeEach, it, and afterEach. A beforeEach function is called once before each spec in the describe block, and an afterEach function is called once after each spec.
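
A bare skeleton of that structure looks like this (the suite and spec names are placeholders):

describe('suite', function () {
  beforeEach(function () {
    // Setup: runs before every spec in this suite.
  });
  it('spec', function () {
    // Test: a single spec with its expectations.
  });
  afterEach(function () {
    // Teardown: runs after every spec in this suite.
  });
});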

Let’s walk through a few examples.

Before diving into the examples, set up a function to watch how a value changes from one spec to another. Object.observe is not yet available in Firefox, but we can use the similar, Firefox-only Object.watch:

// Specific to Firefox only.
function watchHandler(prop, oldVal, newVal) {
  console.log(prop + ' changed from ' + oldVal + ' to ' + newVal);
  return newVal;
}

The first example is to perform the setup without using beforeEach:

describe('Without "beforeEach"', function () {
  var obj = {};
  var prop = 'foo1';
  obj[prop] = 0;
  obj.watch(prop, watchHandler); // Firefox only

  // Setup
  obj[prop] += 1;

  it(prop + ' should be equal to 1', function () {
    expect(obj[prop]).toBe(1);
  });

  it(prop + ' should not be equal to 2', function () {
    expect(obj[prop]).not.toBe(2);
  });
});

The obj property foo1 was changed once:

LOG: 'foo1 changed from 0 to 1'

Next, use beforeEach to perform the setup before each spec:

describe('With "beforeEach"', function () {
  var obj = {};
  var prop = 'foo2';
  obj[prop] = 0;
  obj.watch(prop, watchHandler); // Firefox only

  // Setup
  beforeEach(function () {
    obj[prop] += 1; // Should be invoked twice.
  });

  it(prop + ' should be equal to 1', function () {
    expect(obj[prop]).toBe(1);
  });

  it(prop + ' should be equal to 2', function () {
    expect(obj[prop]).toBe(2);
  });
});

Then we should expect the value to be updated twice. There are two specs, and for each spec, the beforeEach will be invoked:

LOG: 'foo2 changed from 0 to 1'
LOG: 'foo2 changed from 1 to 2'

If we change the value inside a spec, without using beforeEach:

describe('Without "beforeEach" but updating in spec', function () {
  var obj = {};
  var prop = 'foo3';
  obj[prop] = 0;
  obj.watch(prop, watchHandler); // Firefox only

  // Setup
  obj[prop] += 1;

  it(prop + ' should be equal to 1', function () {
    expect(obj[prop]).toBe(1);
    obj[prop] += 1;
  });

  it(prop + ' should be equal to 2', function () {
    expect(obj[prop]).toBe(2);
  });
});

The value has been incremented to 2 inside the first spec before the execution of the second spec:

LOG: 'foo3 changed from 0 to 1'
LOG: 'foo3 changed from 1 to 2'

Now, with beforeEach:

describe('With "beforeEach" and updating in spec', function () {
  var obj = {};
  var prop = 'foo4';
  obj[prop] = 0;
  obj.watch(prop, watchHandler); // Firefox only

  // Setup
  beforeEach(function () {
    obj[prop] += 1;
  });

  it(prop + ' should be equal to 1', function () {
    expect(obj[prop]).toBe(1);
    obj[prop] += 1;
  });

  it(prop + ' should be equal to 3', function () {
    expect(obj[prop]).toBe(3);
  });
});

Now, before each spec, the value was incremented, and inside the first spec, the value was also incremented:

LOG: 'foo4 changed from 0 to 1'
LOG: 'foo4 changed from 1 to 2'
LOG: 'foo4 changed from 2 to 3'

To ensure each spec starts with the same value, we can reset it with afterEach:

describe('Reset with "afterEach"', function () {
  var obj = {};
  var prop = 'foo5';
  obj[prop] = 0;
  obj.watch(prop, watchHandler); // Firefox only

  // Setup
  beforeEach(function () {
    obj[prop] += 1;
  });

  it(prop + ' should be equal to 1', function () {
    expect(obj[prop]).toBe(1);
    obj[prop] += 1;
  });

  it(prop + ' should still be equal to 1', function () {
    expect(obj[prop]).toBe(1);
  });

  // Teardown
  afterEach(function () {
    obj[prop] = 0;
  });
});

After each spec, the value is reset, and before the next spec starts, it is incremented from the original value 0 to 1 again:

LOG: 'foo5 changed from 0 to 1'
LOG: 'foo5 changed from 1 to 2'
LOG: 'foo5 changed from 2 to 0'
LOG: 'foo5 changed from 0 to 1'
LOG: 'foo5 changed from 1 to 0'

The positions of beforeEach and afterEach inside the describe block do not matter:

describe('Reverse the order of "beforeEach" and "afterEach"', function () {
  var obj = {};
  var prop = 'foo6';
  obj[prop] = 0;
  obj.watch(prop, watchHandler); // Firefox only

  // Teardown
  afterEach(function () {
    obj[prop] = 0;
  });

  it(prop + ' should be equal to 1', function () {
    expect(obj[prop]).toBe(1);
    obj[prop] += 1;
  });

  it(prop + ' should still be equal to 1', function () {
    expect(obj[prop]).toBe(1);
  });

  // Setup
  beforeEach(function () {
    obj[prop] += 1;
  });
});

Here we switch the setup and teardown, but the result is the same:

LOG: 'foo6 changed from 0 to 1'
LOG: 'foo6 changed from 1 to 2'
LOG: 'foo6 changed from 2 to 0'
LOG: 'foo6 changed from 0 to 1'
LOG: 'foo6 changed from 1 to 0'

This is because the entire body of the suite is executed first, registering the setup, teardown, and specs, before any of the specs run.
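
A small sketch to illustrate the registration order (the log messages here are mine):

describe('registration order', function () {
  console.log('suite body runs first, registering everything');
  afterEach(function () { console.log('teardown runs after each spec'); });
  it('spec', function () { console.log('the spec itself runs later'); });
  beforeEach(function () { console.log('setup runs before each spec'); });
});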

Setup, test, and teardown can also be asynchronous by accepting a done callback:

describe('Async with "beforeEach" and "afterEach"', function () {
  var obj = {};
  var prop = 'foo7';
  obj[prop] = 0;
  obj.watch(prop, watchHandler); // Firefox only

  // Setup
  beforeEach(function (done) {
    setTimeout(function () {
      obj[prop] += 1;
      done();
    }, 100);
  });

  it(prop + ' should be equal to 1', function (done) {
    setTimeout(function () {
      expect(obj[prop]).toBe(1);
      obj[prop] += 1;
      done();
    }, 100);
  });

  it(prop + ' should still be equal to 1', function (done) {
    setTimeout(function () {
      expect(obj[prop]).toBe(1);
      done();
    }, 100);
  });

  // Teardown
  afterEach(function () {
    setTimeout(function () {
      obj[prop] = 0;
    }, 100);
  });
});

It works the same:

LOG: 'foo7 changed from 0 to 1'
LOG: 'foo7 changed from 1 to 2'
LOG: 'foo7 changed from 2 to 0'
LOG: 'foo7 changed from 0 to 1'
LOG: 'foo7 changed from 1 to 0'

Here is the gist for all examples shown above.

Amazon S3 Delimiter and Prefix

Amazon S3 is an inexpensive online file storage service, and there is a JavaScript SDK for it. A few things puzzled me when using the SDK:

  1. How to use parameters Delimiter and Prefix?
  2. What is the difference between CommonPrefixes and Contents?
  3. How to create a folder/directory with JavaScript SDK?

To retrieve objects in an Amazon S3 bucket, the operation is listObjects. listObjects does not return the content of each object, only the key and metadata such as the size and owner.
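
If you do need the content of a single object, that is a separate getObject call. A minimal sketch, with a made-up bucket and key:

s3.getObject({ Bucket: 'example', Key: 'file' }, function (err, data) {
  if (err) { return console.error(err); }
  console.log(data.Body.toString()); // the content of the object
});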

To make a call to get a list of objects in a bucket:

s3.listObjects(params, function (err, data) {
  // ...
});

Where the params can be configured with the following parameters:

  • Bucket
  • Delimiter
  • EncodingType
  • Marker
  • MaxKeys
  • Prefix

But what are Delimiter and Prefix? And how to use them?

Let’s start by creating some objects in an Amazon S3 bucket similar to the following file structure. This can be easily done by using the AWS Console.

.
├── directory
│   ├── directory
│   │   └── file
│   └── file
└── file

2 directories, 3 files

In Amazon S3, the objects are:

directory/
directory/directory/
directory/directory/file
directory/file
file

One thing to keep in mind is that Amazon S3 is not a file system; there is no real concept of files and directories/folders. From the console, it might look like there are 2 directories and 3 files, but they are all objects, listed alphabetically by their keys.

To make this a little clearer, let's invoke the listObjects method. The only required parameter is Bucket:

params = {
  Bucket: 'example'
};

The response data received in the callback function looks like this:

{ Contents:
[ { Key: 'directory/',
LastModified: ...,
ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
Size: 0,
Owner: [Object],
StorageClass: 'STANDARD' },
{ Key: 'directory/directory/',
LastModified: ...,
ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
Size: 0,
Owner: [Object],
StorageClass: 'STANDARD' },
{ Key: 'directory/directory/file',
LastModified: ...,
ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
Size: 0,
Owner: [Object],
StorageClass: 'STANDARD' },
{ Key: 'directory/file',
LastModified: ...,
ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
Size: 0,
Owner: [Object],
StorageClass: 'STANDARD' },
{ Key: 'file',
LastModified: ...,
ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
Size: 0,
Owner: [Object],
StorageClass: 'STANDARD' } ],
CommonPrefixes: [],
Name: 'example',
Prefix: '',
Marker: '',
MaxKeys: 1000,
IsTruncated: false }

If this were a file structure, you might expect:

directory/
file

But it is not, because a bucket does not work like a folder or a directory, where only the immediate files inside the directory are shown. The objects inside the bucket are laid out flat, in alphabetical order.

In UNIX, a directory is a file, but in Amazon S3, everything is an object, and can be identified by key.

So, how to make Amazon S3 behave more like a folder or a directory? Or how to just list the content of first level right inside the bucket?

To make it work like a directory, you have to use Delimiter and Prefix. Delimiter is what you use to group keys; it does not have to be a single character, it can be a string of characters. And Prefix limits the response to keys that begin with the specified prefix.

Delimiter

Let’s start by adding the following delimiter:

params = {
  Bucket: 'example',
  Delimiter: '/'
};

You will get something more like a listing of a directory:

{ Contents:
[ { Key: 'file' } ],
CommonPrefixes: [ { Prefix: 'directory/' } ],
Name: 'example',
Prefix: '',
MaxKeys: 1000,
Delimiter: '/',
IsTruncated: false }

There are a directory directory/ and a file file. What happened is that all of the following objects, except file, were grouped by the delimiter /:

directory/
directory/directory/
directory/directory/file
directory/file
file

So, the result is:

directory/
file

This feels more like the listing of a directory or folder. But if we change Delimiter to i, you get no Contents and just the prefixes:

{ Contents: [],
CommonPrefixes: [ { Prefix: 'di' }, { Prefix: 'fi' } ],
Name: 'example',
Prefix: '',
MaxKeys: 1000,
Delimiter: 'i',
IsTruncated: false }

All keys can be grouped into two prefixes: di and fi. So Amazon S3 is not a file system, but it can act like one if you use the right parameters.

As mentioned, Delimiter does not need to be a single character. Here is Delimiter: '/directory':

{ Contents:
[ { Key: 'directory/' },
{ Key: 'directory/file' },
{ Key: 'file' } ],
CommonPrefixes: [ { Prefix: 'directory/directory' } ],
Name: 'example',
Prefix: '',
MaxKeys: 1000,
Delimiter: '/directory',
IsTruncated: false }

Recall the bucket structure:

directory/
directory/directory/
directory/directory/file
directory/file
file

Both directory/directory/ and directory/directory/file are grouped into a common prefix: directory/directory, due to the common grouping string /directory.

Let’s try another one with Delimiter: 'directory':

{ Contents:
[ { Key: 'file' } ],
CommonPrefixes: [ { Prefix: 'directory' } ],
Name: 'example',
Prefix: '',
MaxKeys: 1000,
Delimiter: 'directory',
IsTruncated: false }

Okay, one more. Let’s try ry/fi:

{ Contents:
[ { Key: 'directory/' },
{ Key: 'directory/directory/' },
{ Key: 'file' } ],
CommonPrefixes:
[ { Prefix: 'directory/directory/fi' },
{ Prefix: 'directory/fi' } ],
Name: 'example',
Prefix: '',
MaxKeys: 1000,
Delimiter: 'ry/fi',
IsTruncated: false }

So, remember that Delimiter simply provides a way to group keys. If you want it to behave like a file system, use Delimiter: '/'.

Prefix

Prefix is much easier to understand: it is a filter that limits the results to keys beginning with the specified prefix.

With the same structure:

directory/
directory/directory/
directory/directory/file
directory/file
file

Let’s set the Prefix parameter to directory:

{ Contents:
[ { Key: 'directory/' },
{ Key: 'directory/directory/' },
{ Key: 'directory/directory/file' },
{ Key: 'directory/file' } ],
CommonPrefixes: [],
Name: 'example',
Prefix: 'directory',
MaxKeys: 1000,
IsTruncated: false }

How about directory/:

{ Contents:
[ { Key: 'directory/' },
{ Key: 'directory/directory/' },
{ Key: 'directory/directory/file' },
{ Key: 'directory/file' } ],
CommonPrefixes: [],
Prefix: 'directory/' }

Both the directory and directory/ prefixes give the same result here.

If we try something slightly different, Prefix: 'directory/d':

{ Contents:
[ { Key: 'directory/directory/' },
{ Key: 'directory/directory/file' } ],
CommonPrefixes: [],
Prefix: 'directory/d' }

Putting it all together with both Delimiter: 'directory' and Prefix: 'directory':

{ Contents:
[ { Key: 'directory/' },
{ Key: 'directory/file' } ],
CommonPrefixes: [ { Prefix: 'directory/directory' } ],
Prefix: 'directory',
Delimiter: 'directory' }

First, list the keys prefixed by directory:

directory/
directory/directory/
directory/directory/file
directory/file

Group them by the delimiter directory with prefix directory:

directory/directory

The resulting Contents are:

directory/
directory/file

and CommonPrefixes are:

directory/directory

Maybe changing Delimiter to i could give a better perspective:

{ Contents:
[ { Key: 'directory/' } ],
CommonPrefixes: [ { Prefix: 'directory/di' }, { Prefix: 'directory/fi' } ],
Prefix: 'directory',
Delimiter: 'i' }

as:

directory/ # key to show
directory/directory/ # group to 'directory/di'
directory/directory/file # group to 'directory/di'
directory/file # Group to 'directory/fi'
file # ignored due to prefix

One advantage of Amazon S3 over listing a directory is that you don't need to worry about nested directories; everything is flattened. So you can loop through all keys just by specifying the Prefix property.
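
For example, here is a rough sketch of walking every key under a prefix by following the Marker/IsTruncated pagination (the bucket name and the helper function are mine, not part of the SDK):

function listAllKeys(s3, prefix, marker, keys, callback) {
  s3.listObjects({ Bucket: 'example', Prefix: prefix, Marker: marker }, function (err, data) {
    if (err) { return callback(err); }
    data.Contents.forEach(function (item) { keys.push(item.Key); });
    if (data.IsTruncated) {
      // Without a Delimiter, the last key returned serves as the next Marker.
      return listAllKeys(s3, prefix, keys[keys.length - 1], keys, callback);
    }
    callback(null, keys);
  });
}

listAllKeys(s3, 'directory/', '', [], function (err, keys) {
  console.log(keys);
});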

Directory/Folder

If you use the AWS console to “Create Folder” and then upload a file into that folder, you are actually creating two objects with the following keys:

directory/
directory/file

If you use the following command to upload a file, the directory is not created:

$ aws s3 cp file s3://example/directory/file

This is because Amazon S3 is not a file system but a key/value store. If you use the listObjects method, you will see just that one object. For the same reason, you cannot copy a local directory:

$ aws s3 cp directory s3://example/directory
upload failed: aws/ to s3://example/directory [Errno 21] Is a directory: u'/home/chao/tmp/directory/'

But we can use the JavaScript SDK to create a directory/folder:

s3.putObject({ Bucket: 'example', Key: 'directory/' }, function (err, data) {
  if (err) { return console.error(err); }
  console.log(data);
});

Note that you must use directory/ with the trailing slash, not the key without it. Otherwise, it is just a regular object, not a directory.

Node Version Manager

Node Version Manager (NVM) is a simple Bash script to manage multiple active Node versions. Why would you want to use it? If you are like me, you want to try out the latest unstable version of Node but still need the older, stable versions to develop and maintain your projects. If so, you should use NVM.

You can find more information on the NVM GitHub page; below is just my take on installing and using it.

Install

You can install NVM via the install script, but I always do it the hard way and install it manually:

$ git clone https://github.com/creationix/nvm.git ~/.nvm

Check out the latest version:

$ cd ~/.nvm && git checkout v0.7.0

Enable NVM:

$ source ~/.nvm/nvm.sh

Add this to ~/.bashrc to make it available upon login:

$ echo "\n# Enable NVM"           >> ~/.bashrc
$ echo 'source $HOME/.nvm/nvm.sh' >> ~/.bashrc

Enable Bash completion:

$ echo 'source $HOME/.nvm/bash_completion' >> ~/.bashrc

Try it by opening a new terminal:

$ nvm

Usage

Get help:

$ nvm help

Show current NVM version:

$ nvm --version

Display the currently active Node version:

$ nvm current

Install Node v0.10.28:

$ nvm install 0.10.28

The installed Node version will reside in ~/.nvm/v0.10.28.

List installed versions:

$ nvm ls
    .nvm
v0.10.28

Use a specific Node version:

$ which node
/usr/local/bin/node
$ node -v
v0.11.13
$ nvm use 0.10.28
Now using node v0.10.28
$ node -v
v0.10.28
$ which node
/home/chao/.nvm/v0.10.28/bin/node

Basically, NVM modifies the search path:

$ echo $PATH
/home/chao/.nvm/v0.10.28/bin

To roll back:

$ nvm deactivate
/home/chao/.nvm/*/bin removed from $PATH
/home/chao/.nvm/*/share/man removed from $MANPATH
/home/chao/.nvm/*/lib/node_modules removed from $NODE_PATH

List Node versions available to install:

$ nvm ls-remote

Use .nvmrc file:

$ echo '0.10.28' >> .nvmrc
$ nvm use
Found '/home/chao/.nvmrc' with version <0.10.28>
Now using node v0.10.28
$ node -v
v0.10.28

Split a Large JSON file into Smaller Pieces

In the previous post, I wrote about how to split a large JSON file into multiple parts, but that was limited to the default behavior of mongoexport, where each line in the output file represents a JSON string. If you have to deal with a large JSON file, such as one generated with the --jsonArray option in mongoexport, you need to parse the file incrementally, or in a streaming fashion.

I have downloaded a large JSON data set (about 144MB) from Data.gov. If you try to read the entire data set into memory:

> var json = require('./data.json')
Killed

The process cannot handle it; streaming is necessary. Luckily, our command line JSON processing tool, jq, supports streaming.

The parts we are interested in are encapsulated in an array under the data property of the data set. We are going to split each element of the array into its own file.

Don't try to use the -f option in jq to read the file from the command line; it will read everything into memory. Instead, do cat data.json | jq:

$ mkdir parts
$ cat data.json | jq -c -M '.data[]' | \
  while read line; do echo $line > parts/$(date +%s%N).json; done

The entire data set is piped into jq to filter and compress each array element. Each element is printed on its own line, and each line is saved to its own JSON file, using the UNIX timestamp plus nanoseconds as the filename. All pieces are saved into the parts/ directory.

But there is one problem with embedded JSON strings, which has to do with how echo handles backslashes. For example, when echoing the following string:

{"name":"{\"first\":\"Foo\",\"last\":\"Foo\"}","username":"foo","id":1}

It will be printed as invalid JSON:

{"name":"{"first":"Foo","last":"Foo"}","username":"foo","id":1}

The backslashes are stripped. To fix this problem, we can simply double the backslashes:

$ cat data.json | jq -c -M '.data[]' | sed 's/\\"/\\\\"/g' | \
  while read line; do echo $line > parts/$(date +%s%N).json; done

You can even curl the remote JSON file instead of using cat on the downloaded file. But you might want to try a smaller file first, because on my slow machine, it took nearly an hour to split it into 678,733 parts:

real    49m35.780s
user    2m42.888s
sys     6m48.048s

To take it a little bit further, the next step is to decide how many lines or array elements to write into a single file.
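
If you would rather stay in Node, a streaming parser such as the JSONStream module can handle the batching as well. This is only a sketch, assuming the same layout (an array under the data property), an existing parts/ directory, and an arbitrary batch size:

var fs = require('fs');
var JSONStream = require('JSONStream'); // npm install JSONStream

var batchSize = 1000; // number of array elements per output file
var batch = [];
var count = 0;

fs.createReadStream('data.json')
  .pipe(JSONStream.parse('data.*')) // emit each element of the "data" array
  .on('data', function (item) {
    batch.push(item);
    if (batch.length === batchSize) {
      fs.writeFileSync('parts/' + (count++) + '.json', JSON.stringify(batch));
      batch = [];
    }
  })
  .on('end', function () {
    if (batch.length > 0) {
      fs.writeFileSync('parts/' + count + '.json', JSON.stringify(batch));
    }
  });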

Command Line JSON Processing

What is the best command line tool to process JSON?

Hmm… Okay, let’s try different command line JSON processing tools with the following use case to decide which one is the best to use.

Here is the use case: JSON | filter | shell. A program outputs JSON data, which is piped into a JSON command line processing tool to filter the data, and then sent to a shell command for further work.

Here is a snippet of sample JSON data:

[
  {
    "id": 1,
    "username": "foo",
    "name": "Foo Foo"
  },
  {
    "id": 2,
    "username": "bar",
    "name": "Bar Bar"
  }
]

Or as a one-liner in data.json:

[{"id":1,"username":"foo","name":"Foo Foo"},{"id":2,"username":"bar","name":"Bar Bar"}]

The command line JSON processor should filter each element of the array and convert it into its own line:

{"name":"Foo Foo"}
{"name":"Bar Bar"}

The result will be piped, line by line, into a shell script echo.bash:

#!/usr/bin/env bash
while read line; do
  echo "ECHO: '"$line"'"
done

The final output should be:

ECHO: '{"name":"Foo Foo"}'
ECHO: '{"name":"Bar Bar"}'

Custom Solution

Before looking for existing tools, let's see how difficult it is to write a custom solution.

// Filter and convert array element into its own line.
var rl = require('readline').createInterface({
  input : process.stdin,
  output: process.stdout,
});
rl.on('line', function (line) {
  JSON.parse(line).forEach(function (item) {
    console.log('{"name":"' + item.name + '"}');
  });
}).on('close', function () {
  // Shh...
});

Perform a test run:

$ cat data.json | node filter.js | bash echo.bash
ECHO: '{"name":"Foo Foo"}'
ECHO: '{"name":"Bar Bar"}'

Well, it works. In essence, though, we are writing our own simple JSON processor. Unless you need to keep the footprint small, you don't want to write another one; why reinvent the wheel? Let's look at the existing solutions.

Node Modules

Let’s start with the tools from NPM registry:

$ npm search json command

Here are a few candidates that appear to match, judging by their descriptions:

  • jku - Jku is a command-line tool to filter and/or modify a JSON stream. It is heavily inspired by jq. (2 stars and not active; last update 8 months ago)
  • json or json-command - JSON command line processing toolkit. (122 stars and 14 forks; last update 9 months ago)
  • jutil - Command-line utilities for manipulating JSON. (88 stars and 2 forks; last update more than 2 years ago)

Not a lot of choices, and the modules are not actively maintained. That might be because there is already a really good solution: jq, which has 2493 stars and 145 forks, and was last updated 6 days ago.

jq

jq is like sed for JSON data - you can use it to slice and filter and map and transform structured data with the same ease that sed, awk, grep and friends let you play with text. - jq

Instead of installing from NPM, do:

$ sudo apt-get -y install jq

We don't need color or pretty-printing, just one element per line. So here is the command chain:

$ cat data.json | jq -c -M '.[] | {name}' | bash echo.bash
ECHO: '{"name":"Foo Foo"}'
ECHO: '{"name":"Bar Bar"}'

jq can do much more than the example just shown. It has zero runtime dependencies, and it is flexible enough to deal with not just arrays but objects as well.

Conclusion

jq is clearly the winner here, with the fewest dependencies, the most functionality, the most popularity, and comprehensive documentation.

Sharpening the Ax Before Chopping Down a Tree

I was helping to examine a server that was impacted by Heartbleed. According to the developer who was patching the server, he had updated the OpenSSL library to the following:

$ openssl version -a
OpenSSL 1.0.1g 7 Apr 2014
built on: Fri Apr 18 11:04:34 EDT 2014
platform: linux-x86_64
options: bn(64,64) rc4(16x,int) des(idx,cisc,16,int) idea(int)
blowfish(idx)
compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H
-Wa,--noexecstack -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DOPENSSL_IA32_SSE2
-DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m
-DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM
-DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
OPENSSLDIR: "/usr/ssl"

And the developer claimed: “According to http://heartbleed.com/. OpenSSL 1.0.1g is NOT vulnerable. Also I have restarted all services on this server.”

So, OpenSSL had been updated and all services had been restarted, but why did the problem still persist?

I took a look at the command history he ran:

wget http://www.openssl.org/source/openssl-1.0.1g.tar.gz
ls
tar xvzf openssl-1.0.1g.tar.gz
cd openssl-1.0.1g/
sudo ./config --prefix=/usr
sudo make
sudo make install
exit
openssl version -a
sudo reboot

The OpenSSL library had been built from source, which is fine, but the problem was that the Nginx server was still using the old library distributed by Ubuntu:

$ ldd `which nginx` | grep ssl
libssl.so.1.0.0 => /lib/x86_64-linux-gnu/libssl.so.1.0.0 (0x00007fafe82a3000)
$ strings /lib/x86_64-linux-gnu/libssl.so.1.0.0 | grep '^OpenSSL '
OpenSSL 1.0.1c 10 May 2012

In effect, there were two versions of the OpenSSL library installed on the system: one built from source, and another managed by dpkg:

$ dpkg -l openssl
||/ Name Version Architecture
+++-=============================-===================-===================
ii openssl 1.0.1c-4ubuntu8.2 amd64

However, the bigger problem is the version of the operating system:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 13.04
Release: 13.04
Codename: raring

Ubuntu 13.04 is no longer supported, according to https://wiki.ubuntu.com/Releases. The developer probably issued apt-get upgrade, but there was nothing to update, because Ubuntu had stopped supporting the release; therefore, no security updates. Ubuntu 13.04 is also not listed in Ubuntu Security Notice USN-2165-1. So, the developer opted to build the library from source. After installing from source, the openssl binary was overridden by the source build, and the command openssl version showed the latest, patched version 1.0.1g.

To fix the problem, we need to reinstall the package first:

$ sudo apt-get install --reinstall openssl

Now, this will revert control back to apt-get and overwrite the binary /usr/bin/openssl:

$ openssl version -a
OpenSSL 1.0.1c 10 May 2012
built on: Wed Jan 8 20:51:55 UTC 2014
platform: debian-amd64
options: bn(64,64) rc4(16x,int) des(idx,cisc,16,int) blowfish(idx)
compiler: cc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DOPENSSL_NO_TLS1_2_CLIENT -DOPENSSL_MAX_TLS1_2_CIPHER_LENGTH=50 -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
OPENSSLDIR: "/usr/lib/ssl"

Then we must perform a distribution upgrade to the latest long-term support release in order to continue receiving updates.

The lesson I learned from this is that if you are going in the wrong direction, no matter how hard you work, you are not going to make it. Make the initial investment to really understand the true cause of a problem before attempting to resolve it. And don't blindly follow a procedure; understand it first, and adapt it to your specific situation. As Abraham Lincoln reportedly said:

“If I have nine hours to chop down a tree, I’d spend the first six sharpening my ax.”

Anonymous Like

Log in Anonymously

Facebook announced a new product called “Anonymous Login” during the F8 conference on April 30, 2014. The concept is great: it allows you to try out an app without giving up your personal information. We believe we can take it a step further with Anonymous Like.

Anonymous Like Button

So, during the TechCrunch Disrupt 2014 Hackathon in New York, we hacked together an Anonymous Like Button that allows users to Like something anonymously, similar to the Facebook Like Button. However, the underlying principle is vastly different.

We want to create something that is simple to use and has privacy at its core. People no longer need to sign in to a platform or a service in order to express their opinions. We believe that people's willingness to click the Anonymous Like Button will be much higher than for the Facebook Like Button; therefore, it represents a better metric of user opinion.

Check back later as we slowly make improvements. In the meantime, feel free to click or tap the Anonymous Like button on http://anonymouslike.net.