Computing

Decrypting Password Protected PDF Files

Tired of typing the password every time you open a password-protected PDF file? Decrypt it once and save the result to a new file:

$ qpdf --password=secret --decrypt infile.pdf outfile.pdf

QPDF is a command-line tool and library for transforming PDF files, and an alternative to pdftk.

Notes:

qpdf 7.1.1
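
When several files share the same password, the command can be wrapped in a loop. The following is a minimal sketch under that assumption; the function name decrypt_all and the decrypted/ output directory are hypothetical choices, not part of qpdf:

```shell
#!/bin/sh
# decrypt_all PASSWORD [DIR]: write a decrypted copy of every PDF in DIR
# (default: current directory) to DIR/decrypted/, leaving the originals alone.
# Assumes every file is protected with the same password.
decrypt_all() {
    password=$1
    dir=${2:-.}
    mkdir -p "$dir/decrypted"
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue   # the glob matched nothing
        qpdf --password="$password" --decrypt \
            "$pdf" "$dir/decrypted/$(basename "$pdf")"
    done
}
```

Calling decrypt_all secret ~/Downloads would then process every PDF in that directory. Keep in mind that a password given on the command line is visible to other users in the process list.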

Identifying Duplicate Files in the Current Directory

Here are some tools for finding duplicate files:

  • duff: Quickly find duplicate files
  • fdupes: Finds duplicate files in a given set of directories

But we can also cobble something together from a few commonly used CLI tools:

$ find -type f | xargs -n 1 -I {} md5sum '{}' | sort | \
    awk '{if(k==$1){printf("%s\n%s\n",v,$2)}else{print("")};k=$1;v=$2}' | uniq

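The command finds all files under the current directory, computes the MD5 checksum of each one, and sorts the lines by checksum and then by file name. The awk program keeps the previous line's checksum in k and its file name in v; whenever the current checksum equals k, it prints both file names, otherwise it prints an empty line. Finally, uniq collapses the repeated lines to produce the result.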

Put the command into a shell script (find-duplicates.sh), along with some dummy files, to compare it with the tools mentioned above:

#!/bin/sh
apt-get update && apt-get install -y duff fdupes
cd /tmp && seq 3 | xargs -I {} bash -c 'touch file{}; date > date{}; echo $RANDOM > rand{}'
find -type f | xargs -n 1 -I {} md5sum '{}' | sort | \
    awk '{if(k==$1){printf("%s\n%s\n",v,$2)}else{print("")};k=$1;v=$2}' | uniq
duff -r .
fdupes -r .

Execute the script in a disposable environment:

$ cat find-duplicates.sh | docker run -i --rm debian:9.6

Output:

Ign:1 http://cdn-fastly.deb.debian.org/debian stretch InRelease
Get:2 http://security-cdn.debian.org/debian-security stretch/updates InRelease [94.3 kB]
Get:3 http://cdn-fastly.deb.debian.org/debian stretch-updates InRelease [91.0 kB]
Get:4 http://cdn-fastly.deb.debian.org/debian stretch Release [118 kB]
Get:5 http://security-cdn.debian.org/debian-security stretch/updates/main amd64 Packages [467 kB]
Get:6 http://cdn-fastly.deb.debian.org/debian stretch Release.gpg [2434 B]
Get:7 http://cdn-fastly.deb.debian.org/debian stretch-updates/main amd64 Packages [5152 B]
Get:8 http://cdn-fastly.deb.debian.org/debian stretch/main amd64 Packages [7089 kB]
Fetched 7868 kB in 1s (4119 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
The following NEW packages will be installed:
duff fdupes
0 upgraded, 2 newly installed, 0 to remove and 3 not upgraded.
Need to get 52.4 kB of archives.
After this operation, 146 kB of additional disk space will be used.
Get:1 http://cdn-fastly.deb.debian.org/debian stretch/main amd64 fdupes amd64 1:1.6.1-1+b1 [21.2 kB]
Get:2 http://cdn-fastly.deb.debian.org/debian stretch/main amd64 duff amd64 0.5.2-1.1+b2 [31.2 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 52.4 kB in 0s (110 kB/s)
Selecting previously unselected package fdupes.
(Reading database ... 6498 files and directories currently installed.)
Preparing to unpack .../fdupes_1%3a1.6.1-1+b1_amd64.deb ...
Unpacking fdupes (1:1.6.1-1+b1) ...
Selecting previously unselected package duff.
Preparing to unpack .../duff_0.5.2-1.1+b2_amd64.deb ...
Unpacking duff (0.5.2-1.1+b2) ...
Setting up duff (0.5.2-1.1+b2) ...
Setting up fdupes (1:1.6.1-1+b1) ...
+ xargs -n 1 -I '{}' md5sum '{}'
+ sort
+ find -type f
+ awk '{if(k==$1){printf("%s\n%s\n",v,$2)}else{print("")};k=$1;v=$2}'
+ uniq
./file1
./file2
./file3
./date1
./date2
./date3
+ duff -r .
3 files in cluster 1 (0 bytes, digest da39a3ee5e6b4b0d3255bfef95601890afd80709)
./file3
./file1
./file2
3 files in cluster 2 (29 bytes, digest c20aeceea6a6b4bb1903d9124ea69223da08c69c)
./date2
./date1
./date3
+ fdupes -r .
./date2
./date1
./date3
./file3
./file1
./file2

Looks similar, same result.

However, those tools provide additional options, such as following symbolic links, so they are worth installing.
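
One limitation of the awk pipeline is that it prints only the second whitespace-separated field, so file names containing spaces are cut short. A possible variant is sketched below, assuming GNU coreutils (for md5sum and for the -w and --all-repeated options of uniq); the function name find_duplicates is hypothetical:

```shell
#!/bin/sh
# find_duplicates [DIR]: print groups of files under DIR (default: current
# directory) that share the same MD5 checksum, one path per line, with an
# empty line between groups.
# Assumes GNU md5sum and GNU uniq.
find_duplicates() {
    find "${1:-.}" -type f -exec md5sum {} + |
        sort |
        uniq -w 32 --all-repeated=separate |  # compare only the 32-char checksum
        cut -c 35-                            # drop the checksum and 2-char separator
}
```

Here uniq does the grouping that the awk program did by hand: it compares only the first 32 characters of each line and prints every line that belongs to a repeated group. File names containing backslashes or newlines are still not handled, because md5sum escapes them in its output.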

sudo -s or sudo -i

sudo allows users to run programs with the security privileges of another user (superuser or other users). Of the supported options, what’s the difference between -s and -i?

Both options run an interactive shell if no command is specified:

$ sudo -s
$ sudo -i

The difference is that when using -i:

sudo attempts to change to that user’s home directory before running the shell. The command is run with an environment similar to the one a user would receive at log in. - man sudo

Let’s assume the current directory is at /:

$ sudo -s pwd
/
$ sudo -i pwd
/root

No change of directory is attempted when using the -s option.

sudo is commonly used to elevate privileges and run a command as the superuser, usually in the current directory rather than in the superuser’s home directory, for example:

$ sudo -s chown $USER:$GROUP file

Therefore, when running as superuser, use -s:

$ sudo -s

When running as another user, use -i:

$ sudo -u foo -i

Amazon API Gateway No Integration Defined for Method

Encountered the following problem when deploying with the Serverless Framework (v1.27.3) to Amazon API Gateway:

CloudFormation - CREATE_IN_PROGRESS - AWS::ApiGateway::Deployment - ApiGatewayDeployment
CloudFormation - CREATE_FAILED - AWS::ApiGateway::Deployment - ApiGatewayDeployment
An error occurred: ApiGatewayDeployment - No integration defined for method

The problem was not a bug in the Serverless Framework. It originated from a manually created resource that had no integration defined for its POST method:

$ aws apigateway get-integration --rest-api-id xxxxxxxxxx --resource-id 0xxxxx --http-method post
An error occurred (NotFoundException) when calling the GetIntegration operation: No integration defined for method

After removing the resource, the deployment works again.

In conclusion, if a resource has a method with no integration, the REST API cannot be deployed. Either removing the resource or creating an integration resolves the problem.

The following script searches a REST API for methods that have no integration defined:

#!/bin/sh
#
# Search Amazon API Gateway for resources that have no integration defined for
# method.
API_NAME=example.com
REST_API_ID=$(aws apigateway get-rest-apis --query='items[?name==`'$API_NAME'`].id | [0]' --output=text)
RESOURCES=$(aws apigateway get-resources --rest-api-id=$REST_API_ID --query='items[*].id' --output=text)
for resource in $(echo "$RESOURCES")
do
    for method in $(aws apigateway get-resource \
        --query='resourceMethods && keys(resourceMethods)' \
        --output=text \
        --rest-api-id=$REST_API_ID \
        --resource-id=$resource)
    do
        if [ "$method" != "None" ]
        then
            aws apigateway get-integration \
                --rest-api-id=$REST_API_ID \
                --resource-id=$resource \
                --http-method=$method > /dev/null
            if [ $? -ne 0 ]
            then
                aws apigateway get-resource \
                    --rest-api-id=$REST_API_ID \
                    --resource-id=$resource
            fi
        fi
    done
done


A CLI Method to Check SSL Certificate Expiration Date

Browsers check this automatically, but it can come in handy to check the expiration date of an SSL certificate from the CLI. The key is openssl, the OpenSSL command line tool.

$ echo | openssl s_client -connect example.com:443 2> /dev/null | \
openssl x509 -noout -enddate
notAfter=Nov 28 12:00:00 2018 GMT

The command consists of two parts:

  • Retrieve SSL certificate from the server
  • Extract the expiration date data

The openssl program is a command line tool for using the various cryptography functions of OpenSSL’s crypto library from the shell. It can be used for[^1]

  • Creation and management of private keys, public keys and parameters
  • Public key cryptographic operations
  • Creation of X.509 certificates, CSRs and CRLs
  • Calculation of Message Digests
  • Encryption and Decryption with Ciphers
  • SSL/TLS Client and Server Tests
  • Handling of S/MIME signed or encrypted mail
  • Time Stamp requests, generation and verification

What we need here is to perform SSL/TLS Client and Server Tests.

s_client is one of the standard commands of the openssl command line tool:

This implements a generic SSL/TLS client which can establish a transparent connection to a remote server speaking SSL/TLS. It’s intended for testing purposes only and provides only rudimentary interface functionality but internally uses mostly all functionality of the OpenSSL ssl library.[^1]

Digging deeper into the s_client command:

The s_client command implements a generic SSL/TLS client which connects to a remote host using SSL/TLS. It is a very useful diagnostic tool for SSL servers.[^2]

Option -connect host:port:

This specifies the host and optional port to connect to. If not specified then an attempt is made to connect to the local host on port 4433.[^2]

And the format is:

$ openssl s_client -connect servername:443 > data

If a connection is established, openssl enters interactive mode:

If a connection is established with an SSL server then any data received from the server is displayed and any key presses will be sent to the server. When used interactively (which means neither -quiet nor -ign_eof have been given), the session will be renegotiated if the line begins with an R, and if the line begins with a Q or if end of file is reached, the connection will be closed down.[^2]

To quit, type Q or <ctrl>+d (EOF).

$ openssl s_client -connect example.com:443 > /tmp/example.com
depth=1 C = US, O = DigiCert Inc, OU = www.digicert.com, CN = DigiCert SHA2 High Assurance Server CA
verify error:num=20:unable to get local issuer certificate
verify return:0
Q
DONE

Dump the session data:

$ cat /tmp/example.com
CONNECTED(00000003)
---
Certificate chain
0 s:/C=US/ST=California/L=Los Angeles/O=Internet Corporation for Assigned Names and Numbers/OU=Technology/CN=www.example.org
i:/C=US/O=DigiCert Inc/OU=www.digicert.com/CN=DigiCert SHA2 High Assurance Server CA
1 s:/C=US/O=DigiCert Inc/OU=www.digicert.com/CN=DigiCert SHA2 High Assurance Server CA
i:/C=US/O=DigiCert Inc/OU=www.digicert.com/CN=DigiCert High Assurance EV Root CA
---
Server certificate
-----BEGIN CERTIFICATE-----
MIIF8jCCBNqgAwIBAgIQDmTF+8I2reFLFyrrQceMsDANBgkqhkiG9w0BAQsFADBw
MQswCQYDVQQGEwJVUzEVMBMGA1UEChMMRGlnaUNlcnQgSW5jMRkwFwYDVQQLExB3
d3cuZGlnaWNlcnQuY29tMS8wLQYDVQQDEyZEaWdpQ2VydCBTSEEyIEhpZ2ggQXNz
dXJhbmNlIFNlcnZlciBDQTAeFw0xNTExMDMwMDAwMDBaFw0xODExMjgxMjAwMDBa
MIGlMQswCQYDVQQGEwJVUzETMBEGA1UECBMKQ2FsaWZvcm5pYTEUMBIGA1UEBxML
TG9zIEFuZ2VsZXMxPDA6BgNVBAoTM0ludGVybmV0IENvcnBvcmF0aW9uIGZvciBB
c3NpZ25lZCBOYW1lcyBhbmQgTnVtYmVyczETMBEGA1UECxMKVGVjaG5vbG9neTEY
MBYGA1UEAxMPd3d3LmV4YW1wbGUub3JnMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8A
MIIBCgKCAQEAs0CWL2FjPiXBl61lRfvvE0KzLJmG9LWAC3bcBjgsH6NiVVo2dt6u
Xfzi5bTm7F3K7srfUBYkLO78mraM9qizrHoIeyofrV/n+pZZJauQsPjCPxMEJnRo
D8Z4KpWKX0LyDu1SputoI4nlQ/htEhtiQnuoBfNZxF7WxcxGwEsZuS1KcXIkHl5V
RJOreKFHTaXcB1qcZ/QRaBIv0yhxvK1yBTwWddT4cli6GfHcCe3xGMaSL328Fgs3
jYrvG29PueB6VJi/tbbPu6qTfwp/H1brqdjh29U52Bhb0fJkM9DWxCP/Cattcc7a
z8EXnCO+LK8vkhw/kAiJWPKx4RBvgy73nwIDAQABo4ICUDCCAkwwHwYDVR0jBBgw
FoAUUWj/kK8CB3U8zNllZGKiErhZcjswHQYDVR0OBBYEFKZPYB4fLdHn8SOgKpUW
5Oia6m5IMIGBBgNVHREEejB4gg93d3cuZXhhbXBsZS5vcmeCC2V4YW1wbGUuY29t
ggtleGFtcGxlLmVkdYILZXhhbXBsZS5uZXSCC2V4YW1wbGUub3Jngg93d3cuZXhh
bXBsZS5jb22CD3d3dy5leGFtcGxlLmVkdYIPd3d3LmV4YW1wbGUubmV0MA4GA1Ud
DwEB/wQEAwIFoDAdBgNVHSUEFjAUBggrBgEFBQcDAQYIKwYBBQUHAwIwdQYDVR0f
BG4wbDA0oDKgMIYuaHR0cDovL2NybDMuZGlnaWNlcnQuY29tL3NoYTItaGEtc2Vy
dmVyLWc0LmNybDA0oDKgMIYuaHR0cDovL2NybDQuZGlnaWNlcnQuY29tL3NoYTIt
aGEtc2VydmVyLWc0LmNybDBMBgNVHSAERTBDMDcGCWCGSAGG/WwBATAqMCgGCCsG
AQUFBwIBFhxodHRwczovL3d3dy5kaWdpY2VydC5jb20vQ1BTMAgGBmeBDAECAjCB
gwYIKwYBBQUHAQEEdzB1MCQGCCsGAQUFBzABhhhodHRwOi8vb2NzcC5kaWdpY2Vy
dC5jb20wTQYIKwYBBQUHMAKGQWh0dHA6Ly9jYWNlcnRzLmRpZ2ljZXJ0LmNvbS9E
aWdpQ2VydFNIQTJIaWdoQXNzdXJhbmNlU2VydmVyQ0EuY3J0MAwGA1UdEwEB/wQC
MAAwDQYJKoZIhvcNAQELBQADggEBAISomhGn2L0LJn5SJHuyVZ3qMIlRCIdvqe0Q
6ls+C8ctRwRO3UU3x8q8OH+2ahxlQmpzdC5al4XQzJLiLjiJ2Q1p+hub8MFiMmVP
PZjb2tZm2ipWVuMRM+zgpRVM6nVJ9F3vFfUSHOb4/JsEIUvPY+d8/Krc+kPQwLvy
ieqRbcuFjmqfyPmUv1U9QoI4TQikpw7TZU0zYZANP4C/gj4Ry48/znmUaRvy2kvI
l7gRQ21qJTK5suoiYoYNo3J9T+pXPGU7Lydz/HwW+w0DpArtAaukI8aNX4ohFUKS
wDSiIIWIWJiJGbEeIO0TIFwEVWTOnbNl/faPXpk5IRXicapqiII=
-----END CERTIFICATE-----
subject=/C=US/ST=California/L=Los Angeles/O=Internet Corporation for Assigned Names and Numbers/OU=Technology/CN=www.example.org
issuer=/C=US/O=DigiCert Inc/OU=www.digicert.com/CN=DigiCert SHA2 High Assurance Server CA
---
No client certificate CA names sent
---
SSL handshake has read 3393 bytes and written 421 bytes
---
New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES128-GCM-SHA256
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
Protocol : TLSv1.2
Cipher : ECDHE-RSA-AES128-GCM-SHA256
Session-ID: C828441A824CE7B0F6A74BBE890AB23727445EAE8521E19F438E679C39E969B1
Session-ID-ctx:
Master-Key: 38BE4F754FBCB5F41650AD91AA5588ACD88B75D7939487052D9FD2790476E7C6D2512A6451A3FC102958488BF173CB54
Key-Arg : None
PSK identity: None
PSK identity hint: None
SRP username: None
TLS session ticket lifetime hint: 300 (seconds)
TLS session ticket:
0000 - 83 70 c4 28 23 ee 9c 9e-87 1b 96 bf 44 76 ee d3 .p.(#.......Dv..
0010 - 45 c9 be ee a5 c5 42 49-c9 08 35 10 ba 79 03 b4 E.....BI..5..y..
0020 - 46 99 9a f2 d3 7b b5 f2-ad 9e 10 5c 7a 61 c3 0e F....{.....\za..
0030 - e0 09 aa 7a 5e 2a 2e bb-42 6a 08 18 16 ae 56 66 ...z^*..Bj....Vf
0040 - 11 0c 96 1a 4a 20 9f 50-6d f7 e2 53 00 75 6f 07 ....J .Pm..S.uo.
0050 - 7f 94 bf 4a 5f e1 f6 3b-d5 b7 6c 11 bc 33 7b 10 ...J_..;..l..3{.
0060 - 78 e3 81 a0 0b 83 25 d6-e6 a5 64 90 59 24 a6 e9 x.....%...d.Y$..
0070 - 9b b6 4b be 9e 42 1b 03-e0 d7 76 e9 32 87 3e 0d ..K..B....v.2.>.
0080 - 3d 09 09 32 18 fd 04 63-93 fe 33 9f 47 50 d4 c1 =..2...c..3.GP..
0090 - e1 a9 21 cc 67 30 ea 03-7f c1 ee 2a 54 02 c8 11 ..!.g0.....*T...
Start Time: 1475971200
Timeout : 300 (sec)
Verify return code: 20 (unable to get local issuer certificate)
---

To avoid the interactive mode, we can pipe an empty string into the command:

$ echo | openssl s_client -connect example.com:443 > /tmp/example.com 2> /dev/null

Now we have retrieved the SSL certificate from the server. Next, extract the expiration date. This is done by using the standard command x509:
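
As a minimal sketch of this step, the x509 invocation from the one-liner at the top of this post can read the saved session data directly, because it skips the text around the PEM block. The function names cert_enddate and cert_expires_within are hypothetical; -enddate and -checkend are options of openssl x509:

```shell
#!/bin/sh
# cert_enddate FILE: print the notAfter date of the first PEM certificate
# found in FILE (for example, the s_client output saved earlier).
cert_enddate() {
    openssl x509 -noout -enddate -in "$1"
}

# cert_expires_within FILE SECONDS: succeed if the certificate expires within
# SECONDS from now. openssl's -checkend exits non-zero in that case, so the
# exit status is inverted here.
cert_expires_within() {
    ! openssl x509 -noout -checkend "$2" -in "$1" > /dev/null
}
```

For example, cert_enddate /tmp/example.com should print a notAfter= line like the one shown at the top, and cert_expires_within /tmp/example.com 2592000 could serve as a check for certificates expiring within 30 days.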

Randomizing an Array with Sort

How do you randomize an array? Use the sort command with the following option:

-R, --random-sort
sort by random hash of keys

For example:

$ seq 1 10 | sort -R
4
2
10
6
3
9
7
5
8
1
$ seq 1 10 | sort --random-sort
9
6
1
3
2
8
7
5
4
10
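
Because sort -R orders lines by a random hash of their keys, identical lines always end up next to each other. When the input may contain duplicates, shuf produces a random permutation instead. A minimal sketch, assuming GNU coreutils is installed; shuffle_args is a hypothetical helper name:

```shell
#!/bin/sh
# shuffle_args ARG...: print each argument on its own line, in random order.
# Unlike sort -R, shuf does not group identical values together.
shuffle_args() {
    printf '%s\n' "$@" | shuf
}
```

For example, shuffle_args a b c a prints the four values in a random order, and the two a lines are not forced to be adjacent.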

Fixing Authorization Failure in AWS CLI by Synchronizing the Clock

I ran into an error when executing an AWS command:

$ aws ec2 describe-instances
An error occurred (AuthFailure) when calling the DescribeInstances operation: AWS was not able to validate the provided access credentials

From the error message, it appears to be a problem with the access credentials. But after switching to new credentials, and even updating the AWS package, the error persisted. After trying other commands, one of them returned an error message containing “signature not yet current” along with timestamps. So the actual problem was an inaccurate local clock. Hence, the solution is to sync the local date and time by polling a Network Time Protocol (NTP) server:

$ sudo ntpdate pool.ntp.org

ntpdate can be run manually as necessary to set the host clock, or it can be run from the host startup script to set the clock at boot time. This is useful in some cases to set the clock initially before starting the NTP daemon ntpd. It is also possible to run ntpdate from a cron script. However, it is important to note that ntpdate with contrived cron scripts is no substitute for the NTP daemon, which uses sophisticated algorithms to maximize accuracy and reliability while minimizing resource use. Finally, since ntpdate does not discipline the host clock frequency as does ntpd, the accuracy using ntpdate is limited.[^1]

From the description, we learn that we can make things even easier by installing the NTP package:

$ sudo apt-get install -y ntp

Network Time Protocol daemon and utility programs. NTP, the Network Time Protocol, is used to keep computer clocks accurate by synchronizing them over the Internet or a local network, or by following an accurate hardware receiver that interprets GPS, DCF-77, NIST or similar time signals.[^2]

Verify the installation and execution:

$ ps -e | grep ntpd
4964 ? 00:00:00 ntpd

The environment used:

$ aws --version
aws-cli/1.10.53 Python/2.7.6 Linux/3.13.0-92-generic botocore/1.4.43

[^1]: $ man ntpdate
[^2]: $ apt-cache show ntp

HTTP Methods Truth Table

My take on HTTP methods and resources:

+----------------------------------------------------------+
| # | Request-URI              | Method        | RE  | RNE |
+----------------------------------------------------------+
| 0 | GET /resources           | list          | 200 | 200 |
| 1 | GET /resources/entity    | load/insert   | 200 | 404 |
|----------------------------------------------------------|
| 2 | POST /resources          | create        | 201 | 409 |
| 3 | POST /resources/entity   | N/A           | N/A | N/A |
|----------------------------------------------------------|
| 4 | PUT /resources           | (batch)       | 200 | 200 |
| 5 | PUT /resources/entity    | replace/save  | 204 | 201 |
|----------------------------------------------------------|
| 6 | PATCH /resources         | (batch)       | 200 | 200 |
| 7 | PATCH /resources/entity  | update        | 204 | 404 |
|----------------------------------------------------------|
| 8 | DELETE /resources        | (batch)       | 200 | 200 |
| 9 | DELETE /resources/entity | remove/delete | 204 | 404 |
+----------------------------------------------------------+

Notes:

  1. RE: resource exists
  2. RNE: resource does not exist
  3. For batch requests, the resulting HTTP status code is always 200 whether or
    not the resource/entity exists, because the code indicates the status of the
    operation. The actual status code of each entity is enclosed in the response
    array. When there are no matching entities, the response is an empty array;
    therefore, status code 204 is not used.
  4. In the two situations where a new resource is created (status code 201),
    the Location header must indicate the fully qualified resource URI.
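
The table can also be written down as a lookup function, which may be handy in API test scripts. This is only a sketch that encodes the table as given; the function name expected_status and its collection/entity and yes/no arguments are hypothetical conventions:

```shell
#!/bin/sh
# expected_status METHOD TARGET EXISTS
#   METHOD: GET, POST, PUT, PATCH or DELETE
#   TARGET: "collection" (/resources) or "entity" (/resources/entity)
#   EXISTS: "yes" (RE column) or "no" (RNE column)
# Prints the status code from the table, or N/A for POST on an entity.
expected_status() {
    case "$1 $2 $3" in
        "GET collection "*)     echo 200 ;;
        "GET entity yes")       echo 200 ;;
        "GET entity no")        echo 404 ;;
        "POST collection yes")  echo 201 ;;
        "POST collection no")   echo 409 ;;
        "POST entity "*)        echo N/A ;;
        "PUT collection "*)     echo 200 ;;
        "PUT entity yes")       echo 204 ;;
        "PUT entity no")        echo 201 ;;
        "PATCH collection "*)   echo 200 ;;
        "PATCH entity yes")     echo 204 ;;
        "PATCH entity no")      echo 404 ;;
        "DELETE collection "*)  echo 200 ;;
        "DELETE entity yes")    echo 204 ;;
        "DELETE entity no")     echo 404 ;;
        *) echo "unknown combination: $*" >&2; return 1 ;;
    esac
}
```

For example, expected_status PUT entity no prints 201, matching row 5 of the table.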