Recently, I received a warning email from GitHub:
GitHub Pages recently underwent some improvements (https://github.com/blog/1715-faster-more-awesome-github-pages) to make your site faster and more awesome, but we’ve noticed that realguess.net isn’t properly configured to take advantage of these new features. While your site will continue to work just fine, updating your domain’s configuration offers some additional speed and performance benefits. Instructions on updating your site’s IP address can be found at https://help.github.com/articles/setting-up-a-custom-domain-with-github-pages#step-2-configure-dns-records, and of course, you can always get in touch with a human at [email protected]. For the more technical minded folks who want to skip the help docs: your site’s DNS records are pointed to a deprecated IP address.
What are the improvements to GitHub Pages? Here are the two major ones [1]:
Pages are served via CDN (Content Delivery Network)
DoS (Denial of Service) protection
But my site wasn’t properly configured to take advantage of these speed and performance benefits, because its “DNS records are pointed to a deprecated IP address”.
What are the IP addresses that GitHub Pages used before, which are now deprecated?
The domain realguess.net is a custom zone apex domain (also called a bare, naked, or root domain). The domain blog.realguess.net is not a zone apex domain, but a subdomain. “A custom subdomain will not be affected by changes in the underlying IP addresses of GitHub’s servers.” [2] But I am not using a subdomain; I have configured the zone apex domain to point to the now-deprecated IP addresses. “If you are using an A record that points to 207.97.227.245 or 204.232.175.78, you will need to update your DNS settings, as we no longer serve Pages directly from those servers.” [3] So, these IP addresses are deprecated, and I need to update my DNS records.
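A quick check with dig confirms the deprecated address (a sketch; the output shown is illustrative):

$ dig +short realguess.net A
207.97.227.245

Per GitHub’s help article at the time, the A records should point to 192.30.252.153 and 192.30.252.154 instead.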
Using a subdomain is a better solution, since then I wouldn’t need to care about GitHub Pages IP addresses changing in the future. But I don’t think the IP addresses will change very often, so I will stick with my zone apex domain.
“Google Trends is a public web facility of Google Inc., based on Google Search, that shows how often a particular search-term is entered relative to the total search-volume across various regions of the world, and in various languages.” - Google Trends - Wikipedia
How you type your search term in Google Trends determines the results you will see. As described in the Google Trends help page, there are four different formats:
A B
“A B”
A + B
A - B
I am going to illustrate a few examples to explain the difference with the following fixed parameters:
Location: United States
Time range: 2013
Categories: All categories
Type: Web Search
A B
Search term format A B is the most common format:
Must contain both words
Words can be in any order
Other words can be included, such as A B C
“No misspellings, spelling variations, synonyms, plural or singular versions” - Google Trends Help
Here, when people search for either ‘ugg’ or ‘uggs’ on Google, most likely they mean the same thing. So, the actual trend should include both search terms.
There are people who search for ‘“shoes tennis”’ exactly, but the volume is too small compared to the other two.
This example also shows that since the search term format A B allows additional words (such as A B C), its volume is higher than that of the "A B" format.
A + B
The sum of tennis and shoes is equal to either tennis + shoes or shoes + tennis.
A - B
The A - B search term format excludes searches containing B. The search words can be A C, but not A B. For example, ‘tennis’ and ‘tennis - shoes’:
Search term    | Avg
--------------------
tennis         | 35
tennis - shoes | 32
The search term tennis can include shoes; therefore, it is higher than tennis - shoes, which disallows shoes.
Special Characters
“Currently, search terms that contain special characters, such as an apostrophe, single quotes, and parentheses, will be ignored. For example, if you type women's tennis world ranking, you get results for womens tennis world ranking.” - Google Trends Help
Topics
Without variations such as misspellings, synonyms, or special characters, we cannot truly capture the user’s online search intention. As explained in the blog post An easier way to explore topics and entities in Google Trends, when we search for rice, are we looking for Rice University or the rice we eat? And how do we count all the variations when looking up Gwyneth Paltrow, such as Gwen Paltro or Lead actress in Iron Man?
That’s why Google Trends introduced topics, where the semantics of the search are used: not only variations, but actual meanings. Google Trends topics are still in beta: “Measuring search interest in topics is a beta feature which quickly provides accurate measurements of overall search interest. To measure search interest for a specific query, select the ‘search term’ option.” And “when you measure interest in a search topic (Tokyo - Capital of Japan) our algorithms (Google Trends) count many different search queries that may relate to the same topic (東京, Токио, Tokyyo, Tokkyo, Japan Capital, etc). When you measure interest in a search query (Tokyo - Search term), our systems will count only searches including that string of text (“Tokyo”).”
Here is an example of the retail company Nordstrom, comparing between search term and topic (the dotted line is topic and the solid line is search term):
Indeed, when a search is apple, it means more than just the maker of iPhone and iPad:
Search term | Avg
--------------------
Apple       | 13
Apple Inc.  | 47
apple       | 55
We can build our own topic by using the format A + B to include all variations. For example, the topic Flip-flops (Garment) closely resembles the sum of flip flop and flip flops (see the OR operator):
For some search terms, if we do not use a topic or if we omit different variations, we might get a wrong impression of the search trend. For example, comparing flip flops and Bahamas: at the peak, the number of people searching for flip-flops is almost the same as the number looking for Bahamas:
Conclusion
Using a search term in Google Trends has its limitations, because it does not capture all variations such as misspellings, spelling variations, synonyms, plural or singular versions, or special characters. More importantly, the semantics of the search term are not captured. That’s why Google introduced topics to improve the results.
Google Trends has a limited number of topics. For those search terms without a corresponding topic, we need a taxonomy system to include all the variations in order to capture the true semantics of the search.
I’ve got a new laptop, and I need to install the SSH public key of the new machine on all my AWS EC2 instances in order to enable keyless access. I can use ssh-copy-id to install the public key one instance at a time, but I can also do it all at once:
$ aws ec2 describe-instances --output text \
    --query 'Reservations[*].Instances[*].{IP:PublicIpAddress}' | \
  while read host; do \
    ssh-copy-id -i /path/to/key.pub $USER@$host; \
  done
Somehow, when using PublicIpAddress alone in the query, some IP addresses in the response were cluttered on a single line, so I used {IP:PublicIpAddress} instead.
The only problem is that it might install a duplicate key in the ~/.ssh/authorized_keys file of the remote instance if the key has already been installed. One way to solve this problem is to test the login from the new machine and generate only the IP addresses that the new machine does not have access to.
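A minimal sketch (reusing the same describe-instances query as above; ssh’s BatchMode option makes the login attempt fail instead of prompting for a password, and -n stops ssh from consuming the piped list):

$ aws ec2 describe-instances --output text \
    --query 'Reservations[*].Instances[*].{IP:PublicIpAddress}' | \
  while read host; do \
    ssh -n -o BatchMode=yes -o ConnectTimeout=5 $USER@$host true 2>/dev/null \
      || echo $host; \
  done

Feeding this list into the ssh-copy-id loop above installs the key only where it is missing.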
Users frequently have a custom domain for their blogs on Tumblr. If you need to know the Tumblr username behind a custom domain, just grab the value of the X-Tumblr-User response header:
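A quick sketch using curl (example.com stands in for the custom domain, and the username shown is a placeholder):

$ curl -sI http://example.com | grep -i 'x-tumblr-user'
X-Tumblr-User: username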
Node has a simple module loading system using require.
A module prefixed with '/' is an absolute path to the file. For example, require('/home/marco/foo.js') will load the file at /home/marco/foo.js.
A module prefixed with './' is relative to the file calling require(). That is, circle.js must be in the same directory as foo.js for require('./circle') to find it.
Without a leading '/' or './' to indicate a file, the module is either a “core module” or is loaded from a node_modules folder.
We can also use it to load JSON files by simply specifying the file path. The file path can be relative or absolute. Given the following file structure:
.
├── app.js
├── data
│   └── file.json
└── lib
    └── util.js
Let’s say the root path of the application is /home/marco/myapp. Using a relative path from the file lib/util.js:
var json = require('../data/file.json');
This will load the JSON content from data/file.json, because the require system knows that the path is relative to the calling file lib/util.js. But when using file system functions, relative paths are not treated the same way:
Relative path to filename can be used, remember however that this path will be relative to process.cwd(). - File System
process.cwd() returns the current working directory, not the root directory of the application nor the directory of the calling file. For example:
var fs = require('fs');
var data = fs.readFileSync('../data/file.json');
If your current working directory is /home/marco/myapp, the application root directory, you will get the following error message:
Error: ENOENT, no such file or directory '../data/file.json'
at Object.fs.openSync (fs.js:410:18)
at Object.fs.readFileSync (fs.js:276:15)
Because process.cwd() is /home/marco/myapp, and the file is read relative to this directory as ../data/file.json, Node will expect the file path to be /home/marco/data/file.json, which does not exist. Therefore, the error is thrown.
The difference between require and fs in loading files is that if the relative path is used in require, it is relative to the file calling the require. If the relative path is used in fs, it is relative to process.cwd().
To fix the error in file system functions, we should always use an absolute path. One thing we can do is:
var data = fs.readFileSync(__dirname + '/../data/file.json');
__dirname is “the name of the directory that the currently executing script resides in” [dirname], and it has no trailing slash. The resulting string looks odd, and even though it works, it might not work on all systems.
The correct solution here is to use path.join from Path:
var path = require('path');
var data = fs.readFileSync(path.join(__dirname, '../data/file.json'));
The path module “contains utilities for handling and transforming file paths”. The path.join function calls path.normalize to “normalize a string path, taking care of .. and . paths” [normalize]; see also the function normalizeArray in https://github.com/joyent/node/blob/master/lib/path.js for how the paths are normalized.
In conclusion:
// Use `require` to load a JSON file.
var json = require('../data/file.json');
// Use `fs` to load a JSON file.
var data = fs.readFileSync(path.join(__dirname, '../data/file.json'));
Environment variables can be defined in a crontab. Add the following to crontab -e to define MESSAGE and schedule a script to run every minute:

MESSAGE=Hello World
* * * * * /bin/sh ~/script.sh >> /tmp/script.log
Now, can the same environment variable MESSAGE be passed to the shell script ~/script.sh that is scheduled to run every minute? Let’s give it a try by adding the following line to script.sh:

echo "ECHO: ${MESSAGE}"
tail -f /tmp/script.log:
ECHO: Hello World
ECHO: Hello World
ECHO: Hello World
So the shell script does pick up the environment variables defined in the crontab. This is really convenient.
The AWS Command Line Interface (CLI) is a unified tool to manage your AWS services. With just one tool to download and configure, you can control multiple AWS services from the command line and automate them through scripts. [2]
The point here is unified, one tool to run all Amazon AWS services.
Install
The installation procedure applies to Ubuntu Linux with Zsh and Bash.
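A minimal sketch of one common install path (assuming Python and pip are available; completion setup varies by shell, and the completer path may differ on your system):

$ sudo pip install awscli
$ complete -C aws_completer aws    # Bash command completion
$ aws help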
You should see a list of all available AWS commands.
Usage
Before using aws-cli, you need to tell it about your AWS credentials. There are three ways to specify AWS credentials:
Environment variables
Config file
IAM Role
Using the config file is preferred; it is a simple INI-format file stored at ~/.aws/config. A soft link can be used to link it, or just tell aws-cli where to find it:
$ export AWS_CONFIG_FILE=/path/to/config_file
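The config file itself might look like this (a sketch; the key values below are AWS’s documented placeholders, not real credentials):

[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
region = us-east-1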
It is better to use IAM roles with any of the AWS services:
The final option for credentials is highly recommended if you are using aws-cli on an EC2 instance. IAM Roles are a great way to have credentials installed automatically on your instance. If you are using IAM Roles, aws-cli will find them and use them automatically. [4]
The default output is in JSON format. Other formats are tab-delimited text and an ASCII-formatted table. For example, using a --query filter and table output:
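A sketch (the fields picked in the query are illustrative):

$ aws ec2 describe-instances --output table \
    --query 'Reservations[*].Instances[*].{ID:InstanceId,IP:PublicIpAddress,State:State.Name}'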
This will print a nice-looking table of all EC2 instances.
The command line options also accept JSON format. But when passing in large blocks of data, referring to a JSON file is much easier. Both local files and remote URLs can be used.
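For example (a sketch; filters.json is a hypothetical file containing the same JSON you would otherwise pass inline):

$ aws ec2 describe-instances --filters file://filters.json
$ aws ec2 describe-instances --filters https://example.com/filters.json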
“Scaffolds out a new basic Yeoman generator with some sensible defaults.” (generator-generator)
So it creates some basic structure for writing your own custom generators; it is a generator for writing Node generator applications.
First Install generator-generator:
$ sudo npm install -g generator-generator
Then generate some basic files for writing your own generator by issuing:
$ yo generator
You can use generator-generator to write both a custom generator and a custom sub-generator. What is a sub-generator? For example, in the Angular generator, a directive or service generator is a sub-generator.
Use the naming convention generator-[name].
Create a directory for your new generator:
$ mkdir -p ~/generators/generator-hello && cd $_
Make sure to enter the newly created directory (cd $_); otherwise, all generated files will be placed in the current directory. Then, scaffold your generator:

$ yo generator
While in development, make it accessible globally:
$ sudo npm link
This is similar to:
$ sudo npm install -g [generator-name]
Both will make the generator discoverable from /usr/local/lib/node_modules.
There isn’t much going on yet; we need to customize our own generator. The main file is ./app/index.js.
var yeoman = require('yeoman-generator');

var HelloGenerator = yeoman.generators.Base.extend({
  // init, askFor, app, and projectfiles are defined here.
});

module.exports = HelloGenerator;
The HelloGenerator extends Base generator. There are four properties (functions) that it has extended: init, askFor, app, and projectfiles.
All these functions will be invoked by Yeoman one by one in order (the order is important), meaning that init will be called first to perform initialization, and then askFor will be called to get user prompts, then app and projectfiles will be called afterward to build the application.
However, it is not a must to have four properties; they just bring more structure and modularity. In fact, you can merge all four functions into a single one.
The method name does not matter. Just don’t name it prompt because you will override the existing property inherited from yeoman.generators.Base.
If we need to declare a “private” property, just prefix it with an underscore to make it a hidden property, e.g. _dontAskFor: function () {}. Or, if it is only used in the same file, you can simply define a plain function myFunc() outside the extend call.
Personally, I don’t think it is necessary to break it into many functions; three are enough: initialization, prompting the user for input, and generating files.
When choosing property names, keep in mind that the Base object contains many properties that are not a good idea to override, such as ["append", "help", "log", "prompt", "remote"]. This design limits the choice of property names. For example, I was thinking about using prompt instead of askFor, but prompt is an inherited property. Therefore, it is a good idea to keep fewer properties in HelloGenerator; three are enough. All these properties are accessible from the context this. From this object you can access everything, from the Base object to the newly defined _dontAskFor hidden property.
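A minimal sketch of this three-function structure (the prompt, property, and template names are illustrative, following the yeoman-generator API of that era):

var yeoman = require('yeoman-generator');

var HelloGenerator = yeoman.generators.Base.extend({
  // Initialization.
  init: function () {
    this.log('Scaffolding a hello app...');
  },
  // Prompt the user for input.
  askFor: function () {
    var done = this.async();
    this.prompt([{
      name: 'appName',
      message: 'What is the name of your app?'
    }], function (answers) {
      this.appName = answers.appName;
      done();
    }.bind(this));
  },
  // Generate files.
  app: function () {
    this.mkdir('app');
    this.template('_package.json', 'package.json');
  }
});

module.exports = HelloGenerator;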
If 30 hours on the train was long, how about 50 more hours? This is the second part of the train journey from New York to Los Angeles. Traveling from New York to New Orleans took about 30 hours with one night’s sleep on the train; this trip, from New Orleans to Los Angeles, is about 50 hours, with two nights spent on the train.
The Route
The train travels from New Orleans, Louisiana to Los Angeles, California, passing notable cities such as Houston, San Antonio, El Paso, and Tucson. Not many states are involved compared to the first leg of the trip: only Louisiana, Texas, New Mexico, Arizona, and California.
If you look at the Amtrak system map, New Orleans is a hub for three long-distance Amtrak routes: the Sunset Limited, the Crescent, and the City of New Orleans (between Chicago and New Orleans). The next hub is San Antonio. The Sunset Limited used to extend to Jacksonville, Florida, but due to Hurricane Katrina, that service is suspended.
The train travels close to the border between the United States and Mexico; sometimes it was very close, like in El Paso, where I was able to see the Mexican side from the train.
The Train
The type of train on this route is called the Superliner, a double-decker train. We stayed on the second floor, on the left side, and were able to see the border between the United States and Mexico.
The Superliner has a sightseeing lounge, with two decks of windows on each side of the car, and you can turn the chairs to face the windows. This provides a really comfortable view of the scenery along the route.
The Room
The Superliner roomette we booked still has two beds: one is assembled by pulling the seats together, and the other can be pulled down from the ceiling. But it has no sink or toilet like the one on the last leg, so we had to use the shared toilet at the end of the car. By the way, there is a shower room. Fancy taking a shower on a moving train?
Our Superliner Roomette is ideal for one or two passengers, with two comfortable reclining seats on either side of a big picture window. At night, the seats convert to a comfortable bed, and an upper berth folds down from above. Roomettes are located on both upper and lower levels of our double-decker Superliner train cars.
No in-cabin toilet or shower; restrooms, showers nearby in same train car
The View
Leaving New Orleans, the train crossed the Mississippi River on the Huey P. Long Bridge, leaving the Superdome and the Big Easy behind us.
Bayous of Louisiana:
The first stop in Texas is Beaumont.
The South is truck-dominated territory:
The Conclusion
The biggest problem is Internet access.
☑ Travel from the East Coast to the West Coast of the United States by train