How do I deploy a Java Lambda function behind an API Gateway REST interface, including caching of POST methods, using the AWS Serverless Application Model (SAM)?
Tim's questions
I'm running certbot on Ubuntu 20.04 in AWS, installed as a snap package. I'm not sure if certbot renewal is running properly. I'd appreciate some help working out how to best get it working.
This is a new server, which I turn on and off while I'm getting it ready for production. It runs about 8 - 10 hours a day at the moment. It's not often running at midnight, which I think is when the cron job runs. It will be on 24/7 in a few days once I finish the configuration.
One thing I found is this question and answer, which says:
You shouldn't have to set up anything. Any recent Debian/Ubuntu install of certbot should install a systemd timer and a cron job (and the cron job will only run certbot if systemd is not active, so you don't get both running).
It looks to me like the certbot timer isn't running, and even if it did run, it appears to point at /dev/null. Because systemd is active, I wonder whether the cron job is actually doing anything.
Timers and systemd
I found a comment suggesting there may be an issue with timers and snap, so maybe this is a known issue.
systemctl list-timers
The timer is scheduled but doesn't appear to have ever run (LAST and PASSED are n/a):
NEXT LEFT LAST PASSED UNIT ACTIVATES
Wed 2021-03-17 23:44:00 UTC 3h 24min left n/a n/a snap.certbot.renew.timer snap.certbot.renew.service
Certbot timer appears to point at /dev/null. This question indicates that's not how it should be.
> root@aws2:/etc/systemd/system# ls -l | grep certbot
lrwxrwxrwx 1 root root 9 Jan 9 06:38 certbot.timer -> /dev/null
I can see the following in syslog but I'm not sure what it means
Mar 17 16:51:02 aws2 systemd[1]: Started Timer renew for snap application certbot.renew.
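A few checks I plan to run to see whether the snap timer is actually doing its job (these assume the standard snap unit names shown above; the dry run doesn't touch real certificates):
systemctl status snap.certbot.renew.timer
systemctl list-timers | grep certbot
# What did the renew service do on previous runs, if anything?
journalctl -u snap.certbot.renew.service --no-pager
# Exercise the renewal plumbing without issuing real certificates
sudo certbot renew --dry-run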
Cron
In syslog I can see that the cron job is running, but there's no output
Mar 16 00:00:01 aws2 CRON[2072]: (root) CMD (test -x /usr/bin/certbot -a \! -d /run/systemd/system && perl -e 'sleep int(rand(43200))' && certbot -q renew)
The following is in /etc/cron.d/certbot (presumably put there by the certbot installation; I get the general idea of what it does, but I don't know what the test / perl parts do)
SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
0 */12 * * * root test -x /usr/bin/certbot -a \! -d /run/systemd/system && perl -e 'sleep int(rand(43200))' && certbot -q renew
When I run the whole command (root test -x etc.) I get the message below.
Command 'root' not found, but can be installed with: snap install root-framework
When I run just this part I get no output (note that I've removed the "-q" from certbot for testing). I'm not sure what the test part is doing, but certbot doesn't seem to do anything when I run this command.
> test -x /usr/bin/certbot -a \! -d /run/systemd/system && perl -e 'sleep int(rand(43200))' && certbot renew
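For what it's worth, here's my reading of that cron entry, with the comments being my interpretation rather than anything from the certbot packaging:
# The leading "root" in /etc/cron.d/certbot is the cron user field, not a command,
# which is why pasting the whole line into a shell gives "Command 'root' not found".
#
#   test -x /usr/bin/certbot          -> true only if certbot exists and is executable
#   -a \! -d /run/systemd/system      -> AND /run/systemd/system does NOT exist, i.e. systemd is not running
#   perl -e 'sleep int(rand(43200))'  -> sleep a random time, up to 12 hours, to spread renewal load
#   certbot -q renew                  -> quietly renew any certificates that are due
#
# On a systemd machine the test fails by design, so cron exits silently and the
# systemd/snap timer is expected to do the renewing instead - which would explain
# why running it by hand produces no output.
test -x /usr/bin/certbot -a \! -d /run/systemd/system && perl -e 'sleep int(rand(43200))' && certbot -q renew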
Key questions
- Any ideas what's up with the systemd timer and why it's pointing at /dev/null? Or should I just ignore this as a known issue?
- Should I "snap install root-framework" to install "root" like Ubuntu is suggesting?
- The cron job "root test" doesn't appear to be doing anything... can anyone explain what is being tested there and whether "certbot renew" is actually running?
Update - Proposed Solution
In the /etc/cron.daily folder I've created the following file. I think it will do what I want; I'll check the logs at some point to see. I'm still interested in the questions I asked above.
SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
/usr/bin/certbot -q renew
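If the cron.daily route stays, the file needs to be a script rather than a crontab: run-parts only executes files that are executable, and by default it skips names containing dots. A minimal sketch of what I think it should look like (the file name is my choice):
#!/bin/sh
# /etc/cron.daily/certbot-renew  (hypothetical name; no dots, and remember chmod +x)
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/snap/bin
/usr/bin/certbot -q renew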
How do I enable and enforce / mandate encryption in transit for AWS RDS Oracle instances when setting up the RDS database using CloudFormation YAML?
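I don't have a confirmed answer for this yet, but my understanding is that in-transit encryption for RDS Oracle is enabled through the Oracle SSL option in an option group, and "enforcing" it mostly comes down to only exposing the TLS listener port in the security group. A rough CloudFormation sketch, with the edition, engine version, and settings as assumptions to verify against the RDS documentation:
OracleSslOptionGroup:
  Type: AWS::RDS::OptionGroup
  Properties:
    EngineName: oracle-ee            # assumption - match your edition (oracle-se2, etc.)
    MajorEngineVersion: "19"         # assumption - match your engine version
    OptionGroupDescription: TLS for Oracle connections
    OptionConfigurations:
      - OptionName: SSL
        Port: 2484                   # conventional Oracle TLS listener port
        VpcSecurityGroupMemberships:
          - !Ref DatabaseSecurityGroup   # hypothetical security group resource
        OptionSettings:
          - Name: SQLNET.SSL_VERSION
            Value: "1.2"
OracleInstance:
  Type: AWS::RDS::DBInstance
  Properties:
    # engine, storage, credentials, etc. omitted
    OptionGroupName: !Ref OracleSslOptionGroup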
I have an Amazon Linux v1 instance in us-west-2 (Oregon) that is failing yum update as per below. This is an old instance that's been working fine for a few years, updated to a t3a.nano a few months ago. It has an S3 gateway in the VPC.
I created an m3.large instance in the same region and had no issues with updates.
Any ideas how to resolve this? I don't have AWS support so I can't ask them, but if it persists I will try again to reproduce within an account that does have support.
sudo yum update amazon-ssm-agent
Loaded plugins: update-motd, upgrade-helper
Resolving Dependencies
--> Running transaction check
---> Package amazon-ssm-agent.x86_64 0:2.3.662.0-1.amzn1 will be updated
---> Package amazon-ssm-agent.x86_64 0:2.3.714.0-1.amzn1 will be an update
--> Finished Dependency Resolution
Dependencies Resolved
===============================================================================================================
Package Arch Version Repository Size
===============================================================================================================
Updating:
amazon-ssm-agent x86_64 2.3.714.0-1.amzn1 amzn-updates 25 M
Transaction Summary
===============================================================================================================
Upgrade 1 Package
Total download size: 25 M
Is this ok [y/d/N]: y
Downloading packages:
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.us-east-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
To address this issue please refer to the below knowledge base article
https://access.redhat.com/solutions/69319
If above article doesn't help to resolve this issue please open a ticket with Red Hat Support.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.ap-northeast-2.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.ap-east-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.eu-central-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.sa-east-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.ap-southeast-2.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.us-west-2.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.us-west-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.ap-southeast-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.ap-northeast-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.eu-west-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
Error downloading packages:
amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64: [Errno 256] No more mirrors to try.
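One thing I intend to rule out, since the VPC has an S3 gateway endpoint and the Amazon Linux repositories are served out of S3: a restrictive endpoint policy can produce exactly this kind of 403 (this is a guess, not a confirmed cause). The endpoint policy can be inspected with something like the following (the VPC ID is a placeholder):
aws ec2 describe-vpc-endpoints \
  --filters Name=vpc-id,Values=vpc-0123456789abcdef0 \
  --query 'VpcEndpoints[].{Id:VpcEndpointId,Service:ServiceName,Policy:PolicyDocument}' \
  --output json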
Update
Based on David's suggestion of "yum clean all" I logged in and tried it. After I did this, yum now says no updates are available - even though Amazon Linux tells me there are three updates available when I log in via SSH.
Notes:
- Current amazon-ssm-agent is 2.3.662.0
- Previously yum said it would update from 0:2.3.662.0 to 0:2.3.714.0-1
Here's the SSH session
Last login: Thu Dec 19 09:28:01 2019 from (IP address removed)
3 package(s) needed for security, out of 8 available <-- ***
Run "sudo yum update" to apply all updates.
sudo yum clean all
Loaded plugins: update-motd, upgrade-helper
Cleaning repos: amzn-main amzn-updates epel-debuginfo epel-source
Cleaning up everything
sudo yum update -y
Loaded plugins: update-motd, upgrade-helper
amzn-main | 2.1 kB 00:00
amzn-updates | 2.5 kB 00:00
epel-debuginfo/x86_64/metalink | 17 kB 00:00
epel-debuginfo | 3.0 kB 00:00
epel-source/x86_64/metalink | 17 kB 00:00
epel-source | 4.1 kB 00:00
(1/8): amzn-main/latest/group_gz | 4.4 kB 00:00
(2/8): amzn-updates/latest/group_gz | 4.4 kB 00:00
(3/8): epel-source/x86_64/updateinfo | 792 kB 00:00
(4/8): amzn-updates/latest/updateinfo | 615 kB 00:00
(5/8): epel-source/x86_64/primary_db | 1.9 MB 00:00
(6/8): epel-debuginfo/x86_64/primary_db | 831 kB 00:00
(7/8): amzn-main/latest/primary_db | 4.0 MB 00:01
(8/8): amzn-updates/latest/primary_db | 2.5 MB 00:01
No packages marked for update
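To reconcile the login banner with what yum reports, these are the follow-up checks I'd run (my understanding is the banner is generated periodically by update-motd, so it can lag behind reality):
# Ask yum directly which updates it currently thinks are pending
sudo yum check-update
# Confirm the installed agent version
rpm -q amazon-ssm-agent
# Regenerate the login banner so it reflects the current state
sudo update-motd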
Is it possible to enforce that all accounts within an AWS organization can only create encrypted EBS volumes?
I know you can enforce it using IAM roles, but I want to know if it can be done with SCP.
Here's what I've come up with so far, but it doesn't work. I've attached this to an account within my organisation, but I can still create both encrypted and unencrypted volumes.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": "ec2:CreateVolume",
      "Resource": "*",
      "Condition": {
        "Bool": {
          "ec2:Encrypted": "false"
        }
      }
    }
  ]
}
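Not an SCP answer, but a complementary control I'm considering alongside it: account-level EBS encryption by default, which is a per-region setting and makes unencrypted volumes the exception regardless of what IAM allows. (Also worth remembering when testing: SCPs never apply to the organization's management account, so testing from that account can be misleading.)
# Turn on EBS encryption by default for this account in this region, then confirm it
aws ec2 enable-ebs-encryption-by-default --region us-west-2   # placeholder region
aws ec2 get-ebs-encryption-by-default --region us-west-2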
What's the best way to deploy dozens of resources such as CloudFormation templates, Stack Sets, and Lambda functions using Code Pipeline?
In AWS I have a multi-account architecture running an AWS Organization. I want a pipeline running in a single account. That pipeline will deploy CloudFormation templates to one or more accounts within the Organization.
The options I've found so far are:
Have a pipeline stage or action for each source file. This works quite well, but it means every time you add a source file you need to modify your pipeline, which seems like overhead that could be automated or eliminated. You can't deploy StackSets with this approach. You also need a stage per template per target account, so it's impractical.
Use nested stacks. The problems with this are: 1) within the master stack I don't know what naming convention to use to call the other stacks directly from CodeCommit - I could work around that by having CodeBuild copy all the files to S3, but it seems inelegant; 2) nested stacks are more difficult to debug, as they're torn down and deleted if they fail, so it's difficult to find the cause of the problem.
Have CodeBuild run a bash script that deploys all the templates using the AWS CLI.
Have CodeBuild run an Ansible playbook to deploy all the templates.
Have Lambda deploy each template, after being invoked by CodePipeline. This is likely not a great option as each invocation of Lambda would be for a single template, and there wouldn't be information about which account to deploy to. A single Lambda function that does all the deployments might be an option.
Ideally I'd like to have CodePipeline deploy every file with specific extensions in a CodeCommit repo, or even better deploy what's listed in a manifest file. However I don't think this is possible.
I'd prefer to avoid any technologies or services that aren't necessary. I would also prefer not to use Jenkins, Ansible, Terraform, etc., as this script could be deployed at multiple customer sites and I don't want to force any third-party technology on them. If I have to use a third-party tool I'd rather have something that can run in a CodeBuild container than something that has to run on an instance, like Jenkins.
--
Experience since I asked this question
Having to write Bourne shell (sh) scripts in CodeBuild is complex, painful, and slow.
There needs to be some logic around creation or update of StackSets. If you simply call "create stack set" it will fail when the StackSet already exists and needs updating instead.
There's a reason the AWS Landing Zone pipeline is complex, using things like step functions.
If there was an easy way to write logic such as "if this StackSet exists then update it" things would be a lot simpler. The AWS CDK is one possible solution to this, as it lets you create AWS infrastructure using Java, .Net, JavaScript, or TypeScript. Third-party tools such as Terraform may also help, but I don't know enough about them to comment.
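For the "create or update" logic specifically, this is the shape of shell script I ended up wanting in CodeBuild (names and paths are made up; error handling omitted):
STACK_SET_NAME=org-baseline               # hypothetical StackSet name
TEMPLATE=file://templates/baseline.yaml   # hypothetical template path
if aws cloudformation describe-stack-set --stack-set-name "$STACK_SET_NAME" >/dev/null 2>&1; then
  aws cloudformation update-stack-set \
    --stack-set-name "$STACK_SET_NAME" \
    --template-body "$TEMPLATE" \
    --capabilities CAPABILITY_NAMED_IAM
else
  aws cloudformation create-stack-set \
    --stack-set-name "$STACK_SET_NAME" \
    --template-body "$TEMPLATE" \
    --capabilities CAPABILITY_NAMED_IAM
fi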
I'm going to leave this question open in case someone comes up with a great answer.
--
Information from AWS Support
AWS have given the following advice (I've paraphrased it, filtered through my understanding, any errors are my own rather than incorrect advice from AWS):
CodePipeline can only deploy one artifact (eg CloudFormation template) per action
CodePipeline cannot directly deploy a StackSet, which would allow for deployment of templates across accounts. StackSets can be deployed by calling CodeBuild / Lambda.
CodePipeline can deploy to other accounts by specifying a role in that other account. This only deploys to one account at a time, so you would need one action per template per account
CodeBuild started as part of a CodePipeline running in a container gives more flexibility, you can do whatever you like here really
CodePipeline can start Lambda, which is very flexible. If you start Lambda from a CodePipeline action you get the URL of a single resource, which may be limiting. (My guess) You can probably invoke Lambda in a way that lets it do the whole deployment.
Question
How can you restrict access to an S3 bucket to a single AWS instance when the instances have no public IP address and you're using an S3 endpoint in your VPC?
Background and Detail
We have a VPC that has a Virtual Private Gateway, which runs a VPN back to our on-premises data center. The VPC has no internet gateway. The servers have no public IP addresses. There's an S3 endpoint and a route table set up correctly. Let's say there's one subnet with three instances. We have two S3 buckets yet to be created, which will be encrypted.
We'd like to restrict access so that instance A can only reach S3 bucket A, and instance B only S3 bucket B. Instance C shouldn't have access to either bucket.
We also need to be able to upload information to the buckets from the public internet. This access should only be available through specific IPs, which is easy to do using bucket policies. This could make a difference to the solution though.
This page on AWS documentation says
You cannot use an IAM policy or bucket policy to allow access from a VPC IPv4 CIDR range (the private IPv4 address range). VPC CIDR blocks can be overlapping or identical, which may lead to unexpected results. Therefore, you cannot use the aws:SourceIp condition in your IAM policies for requests to Amazon S3 through a VPC endpoint. This applies to IAM policies for users and roles, and any bucket policies. If a statement includes the aws:SourceIp condition, the value fails to match any provided IP address or range. Instead, you can do the following:
- Use your route tables to control which instances can access resources in Amazon S3 via the endpoint.
- For bucket policies, you can restrict access to a specific endpoint or to a specific VPC. For more information, see Using Amazon S3 Bucket Policies.
We can't use route tables as we have two instances that need access to different buckets.
I've also read this page, which wasn't helpful for our situation.
Option: KMS keys
I suspect that if we give each bucket its own KMS key and restrict access to each key to the appropriate instance, only that instance will be able to read the decrypted data. I think this will work, but there's some complexity there, and I'm not 100% sure as I haven't done much with KMS.
An advantage of this approach is that it denies by default and grants by configuration. Because of the number of users of the account, we can't rely on configuration to deny by default, as was very helpfully suggested by Alex below.
Ideas, Thoughts, or Suggestions
Does anyone have any other suggestions or ideas we could follow up? Or any thoughts / further detail on the KMS idea?
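One shape that would fit the deny-by-default goal without KMS, assuming each instance can be given its own instance profile role (the bucket and statement names below are made up): attach a policy like this to instance A's role, the equivalent for instance B, and give instance C's role no S3 permissions at all.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "InstanceARoleUsesBucketAOnly",
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
      "Resource": [
        "arn:aws:s3:::bucket-a",
        "arn:aws:s3:::bucket-a/*"
      ]
    }
  ]
}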
I'm trying to mount an existing volume to a new EC2 Windows instance using CloudFormation. This seems like something that should be possible.
Big Picture
I have a vendor provided AMI which installs some preconfigured software. We want to create a single instance, and we'll change the EC2 instance size occasionally for performance testing. We don't want to lose the data on the single EBS disk that we'll create from the AMI.
Since we're using CloudFormation, if we simply change the AWS::EC2::Instance.InstanceType property and upload the modified stack, CloudFormation will create a new instance and volume from the AMI. That's not helpful, as we'd lose the data we'd already uploaded to the existing disk.
Volumes Method
I tried this script first.
WindowsVolume:
  Type: AWS::EC2::Volume
  Properties:
    AutoEnableIO: true
    AvailabilityZone: "ap-southeast-2b"
    Encrypted: true
    Size: 30
    SnapshotId: snap-0008f111111111
    Tags:
      - Key: Name
        Value:
          Ref: AWS::StackName
    VolumeType: gp2
EC2Instance:
  Type: AWS::EC2::Instance
  Properties:
    InstanceType: t2.micro
    ImageId: ami-663bdc04 # Windows Server stock image
    KeyName: removed
    IamInstanceProfile: removed
    InstanceInitiatedShutdownBehavior: stop
    SecurityGroupIds:
      Fn::Split: [",", "Fn::ImportValue": StackName-ServerSecurityGroup]
    SubnetId:
      !ImportValue StackName-Subnet1
    Volumes:
      - Device: "/dev/sda1"
        VolumeId:
          Ref: WindowsVolume
I got the error message
Invalid value '/dev/sda1' for unixDevice. Attachment point /dev/sda1 is already in use
BlockDeviceMappings Method
Next I tried using BlockDeviceMappings
BlockDeviceMappings:
  - DeviceName: "/dev/sda1"
    Ebs:
      Ref: WindowsVolume
The error message this time was
Value of property Ebs must be an object
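My reading of that error is that Ebs expects an inline object describing the volume (snapshot, size, type), not a Ref to a separate AWS::EC2::Volume resource. A sketch of the shape I believe it wants, restoring the root disk from the snapshot (the extra properties are my assumptions):
EC2Instance:
  Type: AWS::EC2::Instance
  Properties:
    InstanceType: t2.micro
    ImageId: ami-663bdc04
    BlockDeviceMappings:
      - DeviceName: "/dev/sda1"          # must match the AMI's root device name
        Ebs:
          SnapshotId: snap-0008f111111111
          VolumeSize: 30
          VolumeType: gp2
          Encrypted: true
          DeleteOnTermination: false     # keep the root volume if the instance is replaced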
VolumeAttachment Method
I've also tried using a VolumeAttachment instead of the Volumes property or a BlockDeviceMapping.
VolAttach:
  Type: AWS::EC2::VolumeAttachment
  Properties:
    Device: "/dev/sda1"
    InstanceId: !Ref EC2Instance
    VolumeId: !Ref WindowsVolume
This gave me the same message as above
Invalid value '/dev/sda1' for unixDevice. Attachment point /dev/sda1 is already in use
Key Question
Has anyone successfully mounted an existing root volume, or a snapshot, to a new EC2 instance? If it's possible what's the proper method?
Alternate Approaches
Happy to hear alternate approaches. For example options I've considered are:
- Creating the VPC and related resources using CloudFormation, then creating the instance manually using the console.
- Creating the VPC, related resources, and EC2 instance using CloudFormation. From that point stop using CloudFormation and simply use the web console to change instance size.
Is there an easy way to start and stop AWS EC2 instances at a given time each day? This could save me quite a lot of money for my development and test servers.
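A low-tech sketch that would work from any always-on machine with cron and the AWS CLI, given credentials allowed to call ec2:StartInstances and ec2:StopInstances (the instance ID, region, and times are placeholders); managed options such as scheduled Lambda functions exist too, but this is the minimum:
# crontab entries (times in the machine's local time): start 08:00, stop 18:00, weekdays only
0 8 * * 1-5  aws ec2 start-instances --instance-ids i-0123456789abcdef0 --region us-west-2
0 18 * * 1-5 aws ec2 stop-instances  --instance-ids i-0123456789abcdef0 --region us-west-2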
I installed fail2ban using this command on Amazon Linux
yum install fail2ban
My epel repository is defined as
mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-6&arch=$basearch
I got this error when I tried to start the service
service fail2ban start
Starting fail2ban: Traceback (most recent call last):
File "/usr/bin/fail2ban-client", line 37, in <module>
from fail2ban.version import version
ImportError: No module named fail2ban.version
I've tried this fix in this bug report using this diff, which isn't merged into the script I have. It didn't make any difference. I've also tried this, but I have no idea how it's meant to work, whether you're meant to run anything, etc.
Can anyone suggest how to get fail2ban to work on Amazon Linux?
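A few diagnostics that might narrow down the ImportError - my working assumption is that the package installed its Python modules for a different interpreter version than the one /usr/bin/fail2ban-client runs under:
# Which interpreter does the client script use, and where did the RPM put the modules?
head -1 /usr/bin/fail2ban-client
rpm -ql fail2ban | grep version.py
# Can the default interpreter see the module at all?
python -c 'import fail2ban.version; print(fail2ban.version.version)'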
Note below is what was installed with fail2ban
Background
My websites have been using CloudFlare with Let's Encrypt successfully for a year or two. The websites are hosted on EC2 and have valid Let's Encrypt certificates for the root, www, and all subdomains in use. The websites run WordPress.
What I'm doing
As a learning exercise I decided to change one of my domains, wildphotography.co.nz, over to Route53 and CloudFront. It hasn't gone well.
The Problem
After moving from CloudFlare to Route53 with CloudFront, I can't view my website. Details are below. My desired end state is for Route53 to be my DNS server, and CloudFront my CDN.
Note that I have reverted back to CloudFlare, because I need my website to be online. I had Route53 as my DNS server for 3-4 hours, and I could see that it was resolving to R53.
Problem Details
After I set things up here's the problem I see in my browser
Due to the Route53 setup, the request for the domain is being sent to CloudFront. The certificate being presented by CloudFront is for the *.cloudfront.net domain. Hence the mismatch. I believe I understand the problem but I can't work out how to solve it.
If I go to the Cloudfront URL (d1b5f3w2vf82yc.cloudfront.net) I get this error. Of course, going to this URL wouldn't typically be helpful.
Here's an SSL diagnostic
Here's my CloudFront setup. Note that I took a screenshot after I changed something minor, which is why it shows "in progress". I let it propagate before I tested it.
First the CloudFront overview
CloudFront Origin Settings
CloudFront Root Behavior
Note that I forward from http to https on my Nginx web server, so I don't bother to have CloudFront do it. That gives me additional information in my logs, useful for diagnosis.
Route53 setup
I've removed some irrelevant records relating to email. Note that both the www and non-www domains are alias records pointing at the CloudFront distribution. It won't accept a cname alias - I'm not even sure if that's a valid combination.
What I've tried
I created a new subdomain, origin.wildphotography.co.nz, which is a cname to www.wildphotography.co.nz. I believe this is necessary so CloudFront can find the IP of the origin server.
I've tried CNAMEs, Alias and not Alias, all kinds of things.
One odd thing is that while it was still set up with R53/CloudFront, some requests were getting through CloudFront. Not many, but some.
Any ideas would be appreciated. I suspect I have Route53 set up somehow incorrectly.
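For completeness, my current understanding of what CloudFront needs before the Route53 aliases will work: the distribution has to list the www and apex names as Alternate Domain Names, and it has to be given an ACM certificate issued in us-east-1 covering those names; otherwise it serves the default *.cloudfront.net certificate, which matches the error above. Rough CLI checks (not verified against my setup yet):
# CloudFront only accepts ACM certificates from us-east-1
aws acm request-certificate \
  --domain-name wildphotography.co.nz \
  --subject-alternative-names www.wildphotography.co.nz \
  --validation-method DNS \
  --region us-east-1
# Which aliases and certificate is the distribution actually configured with?
aws cloudfront list-distributions \
  --query 'DistributionList.Items[].{Id:Id,Aliases:Aliases.Items,Cert:ViewerCertificate.ACMCertificateArn}'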
I attempted to update my production web server this morning (t2 running Amazon Linux) but it failed because I ran out of RAM (php-fpm had it all). I stopped php-fpm to free up some RAM, but the yum update won't complete. The server is running ok, but I'd like to clean up this problem.
# yum update
Resolving Dependencies
--> Running transaction check
---> Package glibc-headers.x86_64 0:2.17-106.168.amzn1 will be updated
--> Processing Dependency: glibc-headers = 2.17-106.168.amzn1 for package: glibc-devel-2.17-106.168.amzn1.x86_64
---> Package glibc-headers.x86_64 0:2.17-157.169.amzn1 will be an update
--> Finished Dependency Resolution
Error: Package: glibc-devel-2.17-106.168.amzn1.x86_64 (@amzn-main)
Requires: glibc-headers = 2.17-106.168.amzn1
Removing: glibc-headers-2.17-106.168.amzn1.x86_64 (@amzn-main)
glibc-headers = 2.17-106.168.amzn1
Updated By: glibc-headers-2.17-157.169.amzn1.x86_64 (amzn-updates)
glibc-headers = 2.17-157.169.amzn1
You could try using --skip-broken to work around the problem
** Found 6 pre-existing rpmdb problem(s), 'yum check' output follows:
glibc-devel-2.17-106.168.amzn1.x86_64 has missing requires of glibc(x86-64) = ('0', '2.17', '106.168.amzn1')
glibc-devel-2.17-157.169.amzn1.x86_64 is a duplicate with glibc-devel-2.17-106.168.amzn1.x86_64
glibc-devel-2.17-157.169.amzn1.x86_64 has missing requires of glibc-headers = ('0', '2.17', '157.169.amzn1')
glibc-headers-2.17-106.168.amzn1.x86_64 has missing requires of glibc(x86-64) = ('0', '2.17', '106.168.amzn1')
subversion-1.9.4-2.55.amzn1.x86_64 has missing requires of subversion-libs(x86-64) = ('0', '1.9.4', '2.55.amzn1')
subversion-1.9.5-1.56.amzn1.x86_64 is a duplicate with subversion-1.9.4-2.55.amzn1.x86_64
Here's the glibc packages that are installed
# rpm -qa | grep glibc
glibc-devel-2.17-157.169.amzn1.x86_64
glibc-devel-2.17-106.168.amzn1.x86_64
glibc-common-2.17-157.169.amzn1.x86_64
glibc-headers-2.17-106.168.amzn1.x86_64
glibc-2.17-157.169.amzn1.x86_64
One problem appears to be that two different versions of glibc-devel are installed. It also looks like parts of glibc are on release 106.168 and others are on release 157.169.
I rebooted the server, which as expected made no difference, but it was worth a shot. I ran the following, with no effect
yum-complete-transaction
yum-complete-transaction --cleanup-only
yum clean all
In the past I've had similar problems, but with less critical packages. I just removed them all and then installed them again. I don't believe that's possible with glibc, as so many things depend on it.
I've looked on the CentOS forums, and one option seems to be downgrading some packages, but I don't know if there's a better option. Since this is my production server I would appreciate some advice before I attempt this. If this is a good approach, which packages should I downgrade? What do I do after I've downgraded - a regular yum update?
Note that I have regular backups and can restore from a recent backup if required, but I'd prefer not to as I'd have to redo some SSL certificate work that was a bit tricky. I plan to move to Ubuntu and use CloudFormation to build the server in the future, so if the server fails I can simply create another, but that's a future task.
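For reference, the route I'm inclined to investigate first (only with a fresh backup in hand, and I'd welcome correction): the yum-utils package-cleanup tool, which exists for exactly this duplicate-package state.
sudo yum install -y yum-utils
# Show the duplicate package pairs yum is complaining about
sudo package-cleanup --dupes
# Remove the older of each duplicate pair, then retry the update
sudo package-cleanup --cleandupes
sudo yum update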
I have a Windows 10 workstation used within my business for things like image processing (Photoshop) and software development (Eclipse). It's an i7-2600K based computer, Gigabyte GA-B75M-D3H B75 motherboard, 16 GB RAM. The OS is on a Samsung 850 Pro SSD, there's another 850 Pro for data, a WD Black for data, plus two 4TB HGST drives, each on SATA 3 ports, formatted ReFS, in a Storage Spaces mirror. The array has 1.63TB used, 1.99TB free.
Recently the ReFS drives in the storage spaces mirror have started dropping - so far three times in a month. This usually occurs under moderate to heavy load, after an extended period. None of the other disks drop under load as far as I can tell, so I assume it's ReFS, Storage Spaces, or a problem with an underlying disk. A reboot brings the disk online.
I can see errors in the event viewer such as those below. These are not all in one place, and while there are NTFS and Storage Spaces log areas under "application and services log -> microsoft -> windows" there doesn't seem to be one for ReFS.
I'd appreciate help tracking down what's causing these problems, and resolving them, so my system stays up.
16:27.05 (under event viewer -> application and services log -> microsoft -> windows -> storagespaces-driver-operational)
Virtual disk {26bf58b3-1cb9-4b93-a945-1b89331bb565} requires a data integrity scan.
Data on the disk is out-of-sync and a data integrity scan is required. To start the scan, run the following command:
Get-ScheduledTask -TaskName "Data Integrity Scan for Crash Recovery" | Start-ScheduledTask
Once you have resolved the condition listed above, you can online the disk by using the following commands in PowerShell:
Get-VirtualDisk | ?{ $_.ObjectId -Match "{26bf58b3-1cb9-4b93-a945-1b89331bb565}" } | Get-Disk | Set-Disk -IsReadOnly $false
Get-VirtualDisk | ?{ $_.ObjectId -Match "{26bf58b3-1cb9-4b93-a945-1b89331bb565}" } | Get-Disk | Set-Disk -IsOffline $false
16:27.05 (windows system event log): The file system was unable to write metadata to the media backing volume R:. A write failed with status "A device which does not exist was specified." ReFS will take the volume offline. It may be mounted again automatically.
16:27.06 (windows system event log): The file system detected a checksum error and was not able to correct it. The name of the file or folder is "<unable to determine file name>".
18:35.50 (windows system event log): Failed to connect to the driver: (-2147024894) The system cannot find the file specified.
18:35.50 (Kernel PNP) The driver \Driver\WudfRd failed to load for the device SWD\WPDBUSENUM\_??_USBSTOR#Disk&Ven_Generic&Prod_STORAGE_DEVICE&Rev_9451#7&2a9fd895&0#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}.
18:35.58: Virtual disk {26bf58b3-1cb9-4b93-a945-1b89331bb565} could not be repaired because there is not enough free space in the storage pool.
Replace any failed or disconnected physical disks. The virtual disk will then be repaired automatically or you can repair it by running this command in PowerShell:
Get-VirtualDisk | ?{ $_.ObjectId -Match "{26bf58b3-1cb9-4b93-a945-1b89331bb565}" } | Repair-VirtualDisk
UPDATE: as yagmoth points out, this error includes something about USB. The scenarios where I recall this error happening are a) when backing up to an external USB disk, and b) when running CrashPlan backups to another internal SATA disk.
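To help separate a failing disk (or the USB path) from an ReFS/Storage Spaces problem, these are the health checks I'd run in an elevated PowerShell session (standard Storage module cmdlets):
# Overall health of the pool, the mirrored virtual disk, and the physical disks
Get-StoragePool | Select-Object FriendlyName, HealthStatus, OperationalStatus
Get-VirtualDisk | Select-Object FriendlyName, HealthStatus, OperationalStatus
Get-PhysicalDisk | Select-Object FriendlyName, MediaType, HealthStatus, OperationalStatus
# SMART-style reliability counters for each physical disk
Get-PhysicalDisk | Get-StorageReliabilityCounter |
    Select-Object DeviceId, ReadErrorsTotal, WriteErrorsTotal, Temperature, Wear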
I run ReFS in two configurations:
- In a storage spaces mirror (2x4TB), for most valuable data
- Standalone on a single disk, for offsite backups (NB: not my only backup)
In both of these configurations the data integrity features have been enabled in ReFS / Storage Spaces.
I run the Windows defragmentation tool monthly. However, I ran Defraggler straight after the Windows defrag tool recently and it reported significant fragmentation, with some files having thousands of fragments. Defraggler can defragment the disk, but it's very slow - 2TB of data can take 16 - 24 hours. Is fragmentation a problem in ReFS?
TLDR;
- The "problem" is: ReFS disks have significant fragmentation
- The question: should I defragment these disks using defraggler, to remove all fragmentation, or let the Windows defrag tool do whatever it thinks is required?
Should a website that's trying to increase performance and uses the CloudFlare CDN (or any CDN, really, that already does OCSP stapling) also configure OCSP stapling on its own instance of Nginx when CloudFlare's "Full SSL" setting is used?
In this setup, when a browser requests a page from a CloudFlare protected/cached site it connects to CloudFlare using TLS, and CloudFlare then connects to the origin web server using TLS to retrieve the freshly generated page. This means two TLS negotiations are done, increasing the time required to retrieve the page. As an aside, with HTTP/2 the connection is typically only set up once per website, regardless of the number of resources to download.
If CloudFlare checks the CRL for the origin web server certificate, I imagine OCSP stapling could reduce the checks required, and therefore the TLS setup time. However, I'm not an expert in this area, so I'd appreciate thoughts on this.
Some information from CloudFlare regarding whether it is helpful (which suggests it won't help performance)
Thanks for your question. At this time we don't do revocation checking on the certificates served by origin. We may at some point, however, so would suggest stapling OCSP if using a publicly trusted certificate (and not much difficulty).
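If stapling on the origin does turn out to be worthwhile, the nginx side of it is small. A sketch, with the certificate path and resolver as placeholders for whatever the server already uses:
# inside the server block that terminates TLS
ssl_stapling on;
ssl_stapling_verify on;
ssl_trusted_certificate /etc/letsencrypt/live/example.com/chain.pem;  # placeholder path
resolver 8.8.8.8 valid=300s;  # nginx needs a resolver to fetch OCSP responses
resolver_timeout 5s;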
I have a snapshot in AWS Oregon that I can't delete. When I try, it says
Snapshot is in use by AMI ami-d2d83cxx
I've checked every region, I have no instance with that ID. I used to run in the Sydney region, now I use Oregon. I only have one instance running anywhere, plus an RDS instance.
The description of the snapshot is
Copied for DestinationAmi ami-d2d83xx from SourceAmi ami-55cfbbxx
for SourceSnapshot snap-3bf220xx. Task created on 1,453,573,325,838.
When I click the volume link it goes to the volume page but there's no volume with that ID.
My best guess is that the AWS console has gotten confused. I did create the odd AMI for performance testing, but those AMIs were private and I only used them for a short time. I also moved things from Sydney to Oregon.
How do I delete this snapshot? It'll be costing me money. Not much money, but some.
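My assumption is that the snapshot is still backing a registered AMI, and it can't be deleted until that AMI is deregistered. A way to confirm, using the (partially redacted) AMI ID from the error and checking each region I've used:
# Does the AMI still exist in this region? Repeat for ap-southeast-2 (Sydney), etc.
aws ec2 describe-images --image-ids ami-d2d83cxx --region us-west-2
# If it does: deregister the AMI, after which the snapshot delete should succeed
aws ec2 deregister-image --image-id ami-d2d83cxx --region us-west-2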
I'm trying to use Nginx page caching instead of WordPress caching. The caching seems to work fine, but I'm having trouble setting conditional caching headers based on a variable - whether a user is logged into WordPress. If a user is logged in I want no-cache headers applied; if not, the page can be cached for a day by both WordPress and the CDN. I'm finding I can only add one header inside an if block.
I have read (but not fully understood, because it's late here) [if is evil][1]. I also found an answer on stack exchange (on my laptop, can't find it now) that said inside an if block only one add_header works.
Can anyone give me ideas for an alternative that might work better? I know I can combine the expires with the cache-control, but I want more headers in there, plus I want to understand and learn.
Here's a significantly simplified config with the relevant parts in place.
server {
    server_name example.com;
    set $skip_cache 0;

    # POST requests and urls with a query string should always go to PHP
    if ($request_method = POST) {
        set $skip_cache 1;
    }
    if ($query_string != "") {
        set $skip_cache 1;
    }

    # Don't cache uris containing the following segments.
    if ($request_uri ~* "/wp-admin/|/admin-*|/xmlrpc.php|wp-.*.php|/feed/|index.php|sitemap(_index)?.xml") {
        set $skip_cache 1;
    }

    # Don't use the cache for logged in users or recent commenters
    if ($http_cookie ~* "comment_author|wordpress_[a-f0-9]+|wp-postpass|wordpress_no_cache|wordpress_logged_in") {
        set $skip_cache 1;
    }

    location / {
        try_files $uri $uri/ /blog/index.php?args;
    }

    location ~ \.(hh|php)$ {
        fastcgi_keep_conn on;
        fastcgi_intercept_errors on;
        fastcgi_pass php;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;

        # Cache Stuff
        fastcgi_cache CACHE_NAME;
        fastcgi_cache_valid 200 1440m;
        add_header X-Cache $upstream_cache_status;
        fastcgi_cache_methods GET HEAD;
        fastcgi_cache_bypass $skip_cache;
        fastcgi_no_cache $skip_cache;

        add_header Z_ABCD "Test header";

        if ($skip_cache = 1) {
            add_header Cache-Control "private, no-cache, no-store";
            add_header CACHE_STATUS "CACHE NOT USED";
        }
        if ($skip_cache = 0) {
            add_header Cache-Control "public, s-maxage = 240";
            expires 1d;
            add_header CACHE_STATUS "USED CACHE";
        }
        add_header ANOTHER_HEADER "message";
    }
}
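The approach I've since seen suggested for this, which avoids if for headers entirely: drive the header values from map blocks in the http context, so the same add_header lines cover both cases (an add_header with an empty value emits nothing). A sketch - the variable names are mine, the values are from the config above:
# http context
map $skip_cache $hdr_cache_control {
    default "public, s-maxage=240";
    1       "private, no-cache, no-store";
}
map $skip_cache $hdr_cache_status {
    default "USED CACHE";
    1       "CACHE NOT USED";
}

# inside the PHP location block, replacing both if blocks
add_header Cache-Control $hdr_cache_control;
add_header CACHE_STATUS $hdr_cache_status;
The expires directive can't be switched this way, but its effect can be folded into the Cache-Control value as a max-age.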
I read today that there's a significant vulnerability in OpenSSH, which is fixed by the latest version, 7.1p2. According to this story your private key is vulnerable to disclosure.
I'm using the latest Amazon Linux AMI, and everything is up to date against Amazon's repository.
[root@aws /]# ssh -V
OpenSSH_6.6.1p1, OpenSSL 1.0.1k-fips 8 Jan 2015
Here's the list of what packages are available in the Amazon yum repository
yum list | grep openssh
openssh.x86_64 6.6.1p1-22.58.amzn1 @amzn-updates
openssh-clients.x86_64 6.6.1p1-22.58.amzn1 @amzn-updates
openssh-server.x86_64 6.6.1p1-22.58.amzn1 @amzn-updates
openssh-keycat.x86_64 6.6.1p1-22.58.amzn1 amzn-updates
openssh-ldap.x86_64 6.6.1p1-22.58.amzn1 amzn-updates
It seems like the Amazon repository is around two years behind on OpenSSH updates. I have read that some vendors back port updates to older versions of OpenSSH, so this might not be an issue, or Amazon may address it relatively soon.
Questions:
- Is this really a problem?
- If it's a problem, what's the best way to update? I would typically find another yum repository, increase its priority, and update from that.
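One hedged way to answer the "is this really a problem" part: vendors that backport fixes normally record the CVE in the package changelog, and the roaming issue was assigned CVE-2016-0777 / CVE-2016-0778, so:
# Does Amazon's 6.6.1p1 build already carry the fix?
rpm -q --changelog openssh | grep -i -E 'CVE-2016-0777|CVE-2016-0778'
# Widely circulated client-side mitigation at the time: disable the roaming feature
printf 'Host *\n  UseRoaming no\n' | sudo tee -a /etc/ssh/ssh_config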
I have a weird situation with one of my websites, still in development in AWS. I have nginx 1.9.9 with HHVM 3.6.6-1.amzn1.x86_64 on a t2.micro. It's not publicly accessible.
I have a custom written website in the root of the domain, I have Wordpress in the /blog directory, and wordpress admin is in /blog/wp-admin. The custom site has various files including index.php. Wordpress has index.php and all sorts of other things in the blog directory, wp-admin uses index.php as well.
I can load the custom website, and it fully works. WordPress admin fully works. The WordPress blog home screen / story list fully works. The problem is that when I click on any of the blog article links to view it in full, it shows the custom website's home index. So, to say it another way:
http://www.example.com/index.php - custom website works
http://www.example.com/blog/index.php - blog index works
http://www.example.com/blog/2015/storyname - story load doesn't work with permalink %postname% regardless of text in post name - http://www.example.com/index.php loads
http://www.example.com/blog/2015/?p=96 - story load works
http://www.example.com/blog/wp-admin/ - admin works
When I click the story link I get the same page content as if I'd clicked http://www.example.com/index.php except the images don't load as they're done with relative URLs
http://www.example.com/blog/2015/storyname
When I load the site root /index.php I get the following debug headers back (see my config below for how they're generated)
Z_LOCATION: PHP MAIN
URI: /index.php
Z_DOCUMENT_ROOT: /var/www/hr
Z_FASTCGI_SCRIPT_NAME: /index.php
Z_REQUEST_FILENAME: /var/www/hr/index.php
When I load /wp-admin/ I get these headers back
Z_LOCATION: PHP MAIN
URI: /blog/wp-admin/index.php
Z_DOCUMENT_ROOT: /var/www/hr
Z_FASTCGI_SCRIPT_NAME: /blog/wp-admin/index.php
Z_REQUEST_FILENAME: /var/www/hr/blog/wp-admin/index.php
When I load the blog home /blog/index.php I get these headers back
Z_LOCATION: PHP MAIN
URI: /blog/index.php
Z_DOCUMENT_ROOT: /var/www/hr
Z_FASTCGI_SCRIPT_NAME: /blog/index.php
Z_REQUEST_FILENAME: /var/www/hr/blog/index.php
When I try to load the URL http://www.example.com/blog/2015/storyname I get the following headers back. Z_REQUEST_FILENAME (below) shows the wrong file being loaded.
Z_LOCATION: PHP MAIN
URI: /index.php
Z_DOCUMENT_ROOT: /var/www/hr
Z_FASTCGI_SCRIPT_NAME: /index.php
Z_REQUEST_FILENAME: /var/www/hr/index.php
I have no idea why it tries to load the site root index.php when I click that URL. Clues:
- Changing the Wordpress permalink structure from %postname% to ?p=123 fixes the issue
- None of the other permalink structures helps at all
Why would this be a problem just for viewing blog articles? I wonder if it's something to do with the try_files?
There's nothing in the hhvm error log, there's nothing in the nginx error log. The access log shows the following when I request that last URL
(IP removed) - - [10/Jan/2016:08:22:19 +0000] "GET /blog/2015/storyname HTTP/1.1" 200 4424 "http://www.example.com/blog/" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0" "-" "0.050"
Here's my nginx site config. I haven't included the main nginx.conf as I don't think it's relevant. NB: I have updated this with the working code.
server {
    server_name www.example.com;
    root /var/www/hr;
    access_log /var/log/nginx/hr.access.log main;

    # Default location to serve
    location / {
        try_files $uri $uri/ /blog/index.php?$args;
        add_header Z_LOCATION "hr_root"; add_header URI $uri; # DEBUG
    }

    location ~* \.(jpg|jpeg|png|gif|css|js)$ {
        log_not_found off; access_log off;
        add_header Z_LOCATION "STATIC RESOURCES REGEX"; add_header URI $uri; # DEBUG
    }

    # Send HipHop and PHP requests to HHVM
    location ~ \.(hh|php)$ {
        fastcgi_keep_conn on;
        fastcgi_intercept_errors on;
        fastcgi_pass php;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;

        # DEBUGGING
        add_header Z_LOCATION "PHP MAIN"; add_header URI $uri;
        add_header Z_DOCUMENT_ROOT "$document_root"; add_header Z_FASTCGI_SCRIPT_NAME "$fastcgi_script_name";
        add_header Z_REQUEST_FILENAME "$request_filename";
    }
}

# Forward non-www requests to www
server {
    listen 0;
    server_name example.com;
    return 302 http://www.example.com$request_uri;
}
Any thoughts, ideas, or help appreciated. This is a fairly curly one, for me, but I suspect it will be a simple change to fix it.
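For anyone comparing notes: the layout I understand is conventional for WordPress in a subdirectory alongside a custom site in the root is to give each prefix its own try_files fallback, so permalink misses land on the right front controller. A sketch only (my posted config instead sends every miss to /blog/index.php, which is what ended up working for me):
# custom site in the document root
location / {
    try_files $uri $uri/ /index.php?$args;
}

# WordPress lives under /blog, so its permalinks must fall back to its own index.php
location /blog/ {
    try_files $uri $uri/ /blog/index.php?$args;
}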
I'm having trouble getting hhvm to start when my Amazon Linux (which is apparently very similar to CentOS) EC2 instance starts. When I reboot the server, hhvm doesn't come up, and there's nothing in the error logs. When I use
sudo service hhvm start
it comes up just fine. Stop/restart works fine too. When I try running the following as ec2-user
service hhvm start
I get these errors
[ec2-user@ip-x ~]$ service hhvm start
Starting hhvm: [Fri Jan 8 22:35:13 2016] [hphp] [2451:7fe8751566c0:0:000001] [] Cannot open log file: /var/log/hhvm/error.log [ OK ]
touch: cannot touch ‘/var/lock/subsys/hhvm’: Permission denied
I deleted my /var/log/hhvm/error.log and restarted the server. There was nothing in the error log.
As background, I installed hhvm using 'yum install hhvm' from the Amazon repository. I'm using the /etc/init.d/hhvm that was installed by yum.
When hhvm is running after being started by root I get this from ps -ef | grep hhvm
[root@ip-x init.d]# service hhvm restart
Stopping hhvm: [ OK ]
Starting hhvm: [ OK ]
[root@ip-x init.d]# ps -ef | grep hhvm
tim 2555 1 3 22:41 ? 00:00:00 hhvm --config /etc/hhvm/server.ini -d pid=/var/run/hhvm.pid --user tim --mode daemon
root 2560 2458 0 22:42 pts/0 00:00:00 grep --color=auto hhvm
nginx comes up just fine, with its own config file. hhvm package is hhvm-3.6.6-1.amzn1.x86_64.
Any ideas? Any information anyone can give me? I understand the startup script runs as root but starts hhvm as the user specified - in my case "tim". "tim" is a member of the root group, which I added recently to try to fix the issue.
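One thing worth ruling out first, and it's an assumption on my part since the init script behaves when run by hand: whether the service is actually registered with the SysV runlevels so it starts at boot.
# Is hhvm registered, and for which runlevels?
chkconfig --list hhvm
# Register and enable it if it isn't
sudo chkconfig --add hhvm
sudo chkconfig hhvm on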
I reference this question, which is for Ubuntu. I tried it, but it didn't work.
Here's the startup file in /etc/init.d/hhvm