google-cloud-datastore

Scaffold an Explicit Document to Avoid Error When Creating Indexes in Google Cloud Datastore

When working with Google Cloud Datastore, I would like to design each kind (collection or table) with its own schema and index files:

1
2
3
4
$ tree
.
├── index.yml
└── schema.yml

But not every kind has to have something in the index.yml file. So, what to do? Will the index creation command accept an empty index.yml file with zero byte? Let’s give try:

1
2
3
4
$ gcloud preview datastore create-indexes index.yml
ERROR: (gcloud.preview.datastore.create-indexes) An error occurred while parsing file: [/home/chao/kind/index.yml]
The file is empty

No, it does not like it. Then, what’s the minimum required? The answer is an explicit document with an empty document content:

1
---

The three dashes form a directives end marker line. YAML uses three dashes to separate directives from document content.

When creating indexes with the file, it will proceed without errors or warnings:

1
2
3
4
5
$ gcloud preview datastore create-indexes index.yml
You are about to update the following configurations:
- myproj/index From: [/home/chao/kind/index.yml]
Do you want to continue (Y/n)?

Therefore, to avoid error, when scaffold an index.yml file for indexing, use an explicit document with an empty document content.

Google Cloud Datastore Starter: Dataset

Google Cloud Datastore dataset starter, part of a starter collection:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
// Google Cloud Datastore Starter: Dataset
// =======================================
'use strict';
// Define a dataset.
var params = {
projectId: 'MY_PROJECT',
keyFilename: '/path/to/key/file.json',
namespace: 'MY_NAMESPACE'
};
var dataset = require('gcloud').datastore.dataset(params);
// Define an entity (including both key and data).
var kind = 'MY_KIND';
var key = dataset.key(kind);
var data = [
{
name: 'title',
value: 'Google Cloud Datastore Starter',
excludeFromIndexes: false
}
];
// var data = { title: 'Google Cloud Datastore Starter' }; // Simple version
var entity = {
key: key,
data: data
};
// Save a single entity.
dataset.save(entity, function (err) {
if (err) {
throw err;
}
});

Understand the Limitation of the String Property Type of Google Cloud Datastore

Google Cloud Datastore supports a variety of data types for property values:

  • Integers
  • Floating-point numbers
  • Strings
  • Dates
  • Binary data

Strings are likely the most frequently used. When working with strings, the most important question to ask is how many bytes can be inserted into the property/field. According to the table listed in properties and value types:

Up to 1500 bytes if property is indexed, up to 1MB otherwise

Let’s give a try:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
// String Property Type
// ====================
//
// Investigate the string property type of Google Cloud Datastore.
//
// Dependencies tested:
//
// - `gcloud@v0.21.0`
// - `lorem-ipsum@1.0.3`
'use strict';
// Define dependencies.
var params = {
projectId: 'myproj',
keyFilename: '/path/to/key/file.json',
namespace: 'test'
};
var dataset = require('gcloud').datastore.dataset(params);
var kind = 'String';
var lorem = require('lorem-ipsum');
// Save an entity to the datastore.
function run(params) {
var size = params.size;
var index = params.index;
var str = lorem({ count: size }).substr(0, size);
var key = dataset.key(kind);
var data = [
{
name: 'title',
value: str,
excludeFromIndexes: !index
}
];
var entity = {
key: key,
data: data
};
dataset.save(entity, function (err) {
console.log(size, new Buffer(str).length, index,
err ? err.code + ' ' + err.message : '200 OK');
});
}
// Explanation of fields:
//
// - `size`: Total number of bytes to produce
// - `index`: Whether to index the string field
//
// In-line comment indicates the expected result.
[
{ size: 1500, index: true }, // OK
{ size: 1501, index: false }, // OK
{ size: 1501, index: true }, // Error
{ size: 1024 * 1024 - 92, index: false }, // OK
{ size: 1024 * 1024 - 91, index: false }, // Error
].forEach(run);

Don’t ask me where 91 or 92 is coming from. Apparently, this is somewhere closer to 1 MB, but not exactly. The result of the test script: