How do I deploy a Java Lambda function behind an API Gateway REST interface, including caching of POST methods, using the AWS Serverless Application Model (SAM)?
Tim's questions
I'm running certbot on Ubuntu 20.04 in AWS, installed as a snap package. I'm not sure if certbot renewal is running properly. I'd appreciate some help working out how to best get it working.
This is a new server, which I turn on and off while I'm getting it ready for production. It runs about 8 - 10 hours a day at the moment. It's not often running at midnight, which I think is when the cron job runs. It will be on 24/7 in a few days once I finish the configuration.
One thing I found is this question and answer, which says:
You shouldn't have to set up anything. Any recent Debian/Ubuntu install of certbot should install a systemd timer and a cron job (and the cron job will only run certbot if systemd is not active, so you don't get both running).
It looks to me like the certbot timer isn't running, and even if it did run, it appears to point at /dev/null. Because systemd is active, I wonder whether the cron job is actually doing anything.
Timers and systemd
I found a comment suggesting there may be an issue with timers and snap, so maybe this is a known issue.
systemctl list-timers
The timer is scheduled but doesn't appear to have ever run (LAST and PASSED are n/a):
NEXT LEFT LAST PASSED UNIT ACTIVATES
Wed 2021-03-17 23:44:00 UTC 3h 24min left n/a n/a snap.certbot.renew.timer snap.certbot.renew.service
Certbot timer appears to point at /dev/null. This question indicates that's not how it should be.
> root@aws2:/etc/systemd/system# ls -l | grep certbot
lrwxrwxrwx 1 root root 9 Jan 9 06:38 certbot.timer -> /dev/null
I can see the following in syslog but I'm not sure what it means
Mar 17 16:51:02 aws2 systemd[1]: Started Timer renew for snap application certbot.renew.
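A few checks I plan to run to see whether the snap timer is actually doing its job (these assume the standard snap unit names shown above; the dry run doesn't touch real certificates):
systemctl status snap.certbot.renew.timer
systemctl list-timers | grep certbot
# What did the renew service do on previous runs, if anything?
journalctl -u snap.certbot.renew.service --no-pager
# Exercise the renewal plumbing without issuing real certificates
sudo certbot renew --dry-run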
Cron
In syslog I can see that the cron job is running, but there's no output
Mar 16 00:00:01 aws2 CRON[2072]: (root) CMD (test -x /usr/bin/certbot -a \! -d /run/systemd/system && perl -e 'sleep int(rand(43200))' && certbot -q renew)
The following is in /etc/cron.d/certbot (presumably put there by the certbot installation; I get the general idea of what it does, but I don't know what the test / perl parts do)
SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
0 */12 * * * root test -x /usr/bin/certbot -a \! -d /run/systemd/system && perl -e 'sleep int(rand(43200))' && certbot -q renew
When I run the whole command (root test -x etc.) I get the message below.
Command 'root' not found, but can be installed with: snap install root-framework
When I run just this part I get no output (note that I've removed the "-q" from certbot for testing). I'm not sure what the test part is doing, but certbot doesn't seem to do anything when I run this command.
> test -x /usr/bin/certbot -a \! -d /run/systemd/system && perl -e 'sleep int(rand(43200))' && certbot renew
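For what it's worth, here's my reading of that cron entry, with the comments being my interpretation rather than anything from the certbot packaging:
# The leading "root" in /etc/cron.d/certbot is the cron user field, not a command,
# which is why pasting the whole line into a shell gives "Command 'root' not found".
#
#   test -x /usr/bin/certbot          -> true only if certbot exists and is executable
#   -a \! -d /run/systemd/system      -> AND /run/systemd/system does NOT exist, i.e. systemd is not running
#   perl -e 'sleep int(rand(43200))'  -> sleep a random time, up to 12 hours, to spread renewal load
#   certbot -q renew                  -> quietly renew any certificates that are due
#
# On a systemd machine the test fails by design, so cron exits silently and the
# systemd/snap timer is expected to do the renewing instead - which would explain
# why running it by hand produces no output.
test -x /usr/bin/certbot -a \! -d /run/systemd/system && perl -e 'sleep int(rand(43200))' && certbot -q renew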
Key questions
- Any ideas what's up with the systemd timer and why it's pointing at /dev/null? Or should I just ignore this as a known issue?
- Should I "snap install root-framework" to install "root" like Ubuntu is suggesting?
- The cron job "root test" doesn't appear to be doing anything... can anyone explain what is being tested there and whether "certbot renew" is actually running?
Update - Proposed Solution
In the /etc/cron.daily folder I've created the following file. I think it will do what I want; I'll check the logs at some point to see. I'm still interested in the questions I asked above.
SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
/usr/bin/certbot -q renew
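If the cron.daily route stays, the file needs to be a script rather than a crontab: run-parts only executes files that are executable, and by default it skips names containing dots. A minimal sketch of what I think it should look like (the file name is my choice):
#!/bin/sh
# /etc/cron.daily/certbot-renew  (hypothetical name; no dots, and remember chmod +x)
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/snap/bin
/usr/bin/certbot -q renew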
How do I enable and enforce / mandate encryption in transit for AWS RDS Oracle instances when setting up the RDS database using CloudFormation YAML?
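I don't have a confirmed answer for this yet, but my understanding is that in-transit encryption for RDS Oracle is enabled through the Oracle SSL option in an option group, and "enforcing" it mostly comes down to only exposing the TLS listener port in the security group. A rough CloudFormation sketch, with the edition, engine version, and settings as assumptions to verify against the RDS documentation:
OracleSslOptionGroup:
  Type: AWS::RDS::OptionGroup
  Properties:
    EngineName: oracle-ee            # assumption - match your edition (oracle-se2, etc.)
    MajorEngineVersion: "19"         # assumption - match your engine version
    OptionGroupDescription: TLS for Oracle connections
    OptionConfigurations:
      - OptionName: SSL
        Port: 2484                   # conventional Oracle TLS listener port
        VpcSecurityGroupMemberships:
          - !Ref DatabaseSecurityGroup   # hypothetical security group resource
        OptionSettings:
          - Name: SQLNET.SSL_VERSION
            Value: "1.2"
OracleInstance:
  Type: AWS::RDS::DBInstance
  Properties:
    # engine, storage, credentials, etc. omitted
    OptionGroupName: !Ref OracleSslOptionGroup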
I have an Amazon Linux v1 instance in us-west-2 (Oregon) that is failing yum update as per below. This is an old instance that's been working fine for a few years, updated to a t3a.nano a few months ago. It has an S3 gateway in the VPC.
I created an m3.large instance in the same region and had no issues with updates.
Any ideas how to resolve this? I don't have AWS support so I can't ask them, but if it persists I will try again to reproduce within an account that does have support.
sudo yum update amazon-ssm-agent
Loaded plugins: update-motd, upgrade-helper
Resolving Dependencies
--> Running transaction check
---> Package amazon-ssm-agent.x86_64 0:2.3.662.0-1.amzn1 will be updated
---> Package amazon-ssm-agent.x86_64 0:2.3.714.0-1.amzn1 will be an update
--> Finished Dependency Resolution
Dependencies Resolved
===============================================================================================================
Package Arch Version Repository Size
===============================================================================================================
Updating:
amazon-ssm-agent x86_64 2.3.714.0-1.amzn1 amzn-updates 25 M
Transaction Summary
===============================================================================================================
Upgrade 1 Package
Total download size: 25 M
Is this ok [y/d/N]: y
Downloading packages:
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.us-east-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
To address this issue please refer to the below knowledge base article
https://access.redhat.com/solutions/69319
If above article doesn't help to resolve this issue please open a ticket with Red Hat Support.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.ap-northeast-2.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.ap-east-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.eu-central-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.sa-east-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.ap-southeast-2.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.us-west-2.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.us-west-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.ap-southeast-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.ap-northeast-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
amazon-ssm-agent-2.3.714.0-1.a FAILED
http://packages.eu-west-1.amazonaws.com/2018.03/updates/5444ecdf4764/x86_64/Packages/amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64.rpm?instance_id=i-863eaf5c&region=us-west-2: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
Error downloading packages:
amazon-ssm-agent-2.3.714.0-1.amzn1.x86_64: [Errno 256] No more mirrors to try.
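One thing I intend to rule out, since the VPC has an S3 gateway endpoint and the Amazon Linux repositories are served out of S3: a restrictive endpoint policy can produce exactly this kind of 403 (this is a guess, not a confirmed cause). The endpoint policy can be inspected with something like the following (the VPC ID is a placeholder):
aws ec2 describe-vpc-endpoints \
  --filters Name=vpc-id,Values=vpc-0123456789abcdef0 \
  --query 'VpcEndpoints[].{Id:VpcEndpointId,Service:ServiceName,Policy:PolicyDocument}' \
  --output json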
Update
Based on David's suggestion of "yum clean all" I logged in and tried it. After I did this, yum now says no updates are available - even though Amazon Linux tells me there are three updates available when I log in via SSH.
Notes:
- Current amazon-ssm-agent is 2.3.662.0
- Previously yum said it would update from 0:2.3.662.0 to 0:2.3.714.0-1
Here's the SSH session
Last login: Thu Dec 19 09:28:01 2019 from (IP address removed)
3 package(s) needed for security, out of 8 available <-- ***
Run "sudo yum update" to apply all updates.
sudo yum clean all
Loaded plugins: update-motd, upgrade-helper
Cleaning repos: amzn-main amzn-updates epel-debuginfo epel-source
Cleaning up everything
sudo yum update -y
Loaded plugins: update-motd, upgrade-helper
amzn-main | 2.1 kB 00:00
amzn-updates | 2.5 kB 00:00
epel-debuginfo/x86_64/metalink | 17 kB 00:00
epel-debuginfo | 3.0 kB 00:00
epel-source/x86_64/metalink | 17 kB 00:00
epel-source | 4.1 kB 00:00
(1/8): amzn-main/latest/group_gz | 4.4 kB 00:00
(2/8): amzn-updates/latest/group_gz | 4.4 kB 00:00
(3/8): epel-source/x86_64/updateinfo | 792 kB 00:00
(4/8): amzn-updates/latest/updateinfo | 615 kB 00:00
(5/8): epel-source/x86_64/primary_db | 1.9 MB 00:00
(6/8): epel-debuginfo/x86_64/primary_db | 831 kB 00:00
(7/8): amzn-main/latest/primary_db | 4.0 MB 00:01
(8/8): amzn-updates/latest/primary_db | 2.5 MB 00:01
No packages marked for update
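To reconcile the login banner with what yum reports, these are the follow-up checks I'd run (my understanding is the banner is generated periodically by update-motd, so it can lag behind reality):
# Ask yum directly which updates it currently thinks are pending
sudo yum check-update
# Confirm the installed agent version
rpm -q amazon-ssm-agent
# Regenerate the login banner so it reflects the current state
sudo update-motd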
Is it possible to enforce that all accounts within an AWS organization can only create encrypted EBS volumes?
I know you can enforce it using IAM roles, but I want to know if it can be done with SCP.
Here's what I've come up with so far, but it doesn't work. I've attached this to an account within my organisation, but I can still create both encrypted and unencrypted volumes.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": "ec2:CreateVolume",
      "Resource": "*",
      "Condition": {
        "Bool": {
          "ec2:Encrypted": "false"
        }
      }
    }
  ]
}
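Not an SCP answer, but a complementary control I'm considering alongside it: account-level EBS encryption by default, which is a per-region setting and makes unencrypted volumes the exception regardless of what IAM allows. (Also worth remembering when testing: SCPs never apply to the organization's management account, so testing from that account can be misleading.)
# Turn on EBS encryption by default for this account in this region, then confirm it
aws ec2 enable-ebs-encryption-by-default --region us-west-2   # placeholder region
aws ec2 get-ebs-encryption-by-default --region us-west-2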
What's the best way to deploy dozens of resources such as CloudFormation templates, Stack Sets, and Lambda functions using Code Pipeline?
In AWS I have a multi-account architecture running an AWS Organization. I want a pipeline running in a single account. That pipeline will deploy CloudFormation templates to one or more accounts within the Organization.
The options I've found so far are:
Have a pipeline stage or action for each source file. This works quite well, but it means every time you add a source file you need to modify your pipeline, which seems like overhead that could be automated or eliminated. You can't deploy StackSets with this approach. You also need a stage per template per target account, so it's impractical.
Use nested stacks. The problems with this are: 1) within the master stack I don't know what naming convention to use to call the other stacks directly from CodeCommit - I could work around that by having CodeBuild copy all the files to S3, but it seems inelegant; 2) nested stacks are more difficult to debug, as they're torn down and deleted if they fail, so it's difficult to find the cause of the problem.
Have CodeBuild run a bash script that deploys all the templates using the AWS CLI.
Have CodeBuild run an Ansible playbook to deploy all the templates.
Have Lambda deploy each template, after being invoked by CodePipeline. This is likely not a great option as each invocation of Lambda would be for a single template, and there wouldn't be information about which account to deploy to. A single Lambda function that does all the deployments might be an option.
Ideally I'd like to have CodePipeline deploy every file with specific extensions in a CodeCommit repo, or even better deploy what's listed in a manifest file. However I don't think this is possible.
I'd prefer to avoid any technologies or services that aren't necessary. I would also prefer not to use Jenkins, Ansible, Terraform, etc., as this script could be deployed at multiple customer sites and I don't want to force any third-party technology on them. If I have to use a third-party tool I'd rather have something that can run in a CodeBuild container than something that has to run on an instance, like Jenkins.
--
Experience since I asked this question
Having to write Bourne shell (sh) scripts in CodeBuild is complex, painful, and slow.
There needs to be some logic around creation or update of StackSets. If you simply call "create stack set" it will fail when the StackSet already exists and needs updating instead.
There's a reason the AWS Landing Zone pipeline is complex, using things like step functions.
If there was an easy way to write logic such as "if this StackSet exists then update it" things would be a lot simpler. The AWS CDK is one possible solution to this, as it lets you create AWS infrastructure using Java, .Net, JavaScript, or TypeScript. Third-party tools such as Terraform may also help, but I don't know enough about them to comment.
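For the "create or update" logic specifically, this is the shape of shell script I ended up wanting in CodeBuild (names and paths are made up; error handling omitted):
STACK_SET_NAME=org-baseline               # hypothetical StackSet name
TEMPLATE=file://templates/baseline.yaml   # hypothetical template path
if aws cloudformation describe-stack-set --stack-set-name "$STACK_SET_NAME" >/dev/null 2>&1; then
  aws cloudformation update-stack-set \
    --stack-set-name "$STACK_SET_NAME" \
    --template-body "$TEMPLATE" \
    --capabilities CAPABILITY_NAMED_IAM
else
  aws cloudformation create-stack-set \
    --stack-set-name "$STACK_SET_NAME" \
    --template-body "$TEMPLATE" \
    --capabilities CAPABILITY_NAMED_IAM
fi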
I'm going to leave this question open in case someone comes up with a great answer.
--
Information from AWS Support
AWS have given the following advice (I've paraphrased it, filtered through my understanding, any errors are my own rather than incorrect advice from AWS):
CodePipeline can only deploy one artifact (eg CloudFormation template) per action
CodePipeline cannot directly deploy a StackSet, which would allow for deployment of templates across accounts. StackSets can be deployed by calling CodeBuild / Lambda.
CodePipeline can deploy to other accounts by specifying a role in that other account. This only deploys to one account at a time, so you would need one action per template per account
CodeBuild started as part of a CodePipeline running in a container gives more flexibility, you can do whatever you like here really
CodePipeline can start Lambda, which is very flexible. If you start Lambda from a CodePipeline action you get the URL of a single resource, which may be limiting. (My guess) You can probably invoke Lambda in a way that lets it do the whole deployment.
Question
How can you restrict access to an S3 bucket to a single AWS instance when the instances have no public IP address and you're using an S3 endpoint in your VPC?
Background and Detail
We have a VPC that has a Virtual Private Gateway, which runs a VPN back to our on-premises data center. The VPC has no internet gateway. The servers have no public IP addresses. There's an S3 endpoint and a route table set up correctly. Let's say there's one subnet with three instances. We have two S3 buckets yet to be created, which will be encrypted.
We'd like to restrict access so that instance A can only reach S3 bucket A, and instance B only S3 bucket B. Instance C shouldn't have access to either bucket.
We also need to be able to upload information to the buckets from the public internet. This access should only be available through specific IPs, which is easy to do using bucket policies. This could make a difference to the solution though.
This page on AWS documentation says
You cannot use an IAM policy or bucket policy to allow access from a VPC IPv4 CIDR range (the private IPv4 address range). VPC CIDR blocks can be overlapping or identical, which may lead to unexpected results. Therefore, you cannot use the aws:SourceIp condition in your IAM policies for requests to Amazon S3 through a VPC endpoint. This applies to IAM policies for users and roles, and any bucket policies. If a statement includes the aws:SourceIp condition, the value fails to match any provided IP address or range. Instead, you can do the following:
- Use your route tables to control which instances can access resources in Amazon S3 via the endpoint.
- For bucket policies, you can restrict access to a specific endpoint or to a specific VPC. For more information, see Using Amazon S3 Bucket Policies.
We can't use route tables as we have two instances that need access to different buckets.
I've also read this page, which wasn't helpful for our situation.
Option: KMS keys
I suspect that if we give each bucket its own KMS key and restrict access to each key to the appropriate instance, only that instance will be able to read the decrypted data. I think this will work, but there's some complexity there, and I'm not 100% sure as I haven't done much with KMS.
An advantage of this approach is that it denies by default and grants by configuration. Because of the number of users of the account, we can't rely on configuration to deny by default, as was very helpfully suggested by Alex below.
Ideas, Thoughts, or Suggestions
Does anyone have any other suggestions or ideas we could follow up? Or any thoughts / further detail on the KMS idea?
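One shape that would fit the deny-by-default goal without KMS, assuming each instance can be given its own instance profile role (the bucket and statement names below are made up): attach a policy like this to instance A's role, the equivalent for instance B, and give instance C's role no S3 permissions at all.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "InstanceARoleUsesBucketAOnly",
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
      "Resource": [
        "arn:aws:s3:::bucket-a",
        "arn:aws:s3:::bucket-a/*"
      ]
    }
  ]
}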
I'm trying to mount an existing volume to a new EC2 Windows instance using CloudFormation. This seems like something that should be possible.
Big Picture
I have a vendor provided AMI which installs some preconfigured software. We want to create a single instance, and we'll change the EC2 instance size occasionally for performance testing. We don't want to lose the data on the single EBS disk that we'll create from the AMI.
Since we're using CloudFormation, if we simply change the AWS::EC2::Instance.InstanceType property and upload the modified stack, CloudFormation will create a new instance and volume from the AMI. That's not helpful, as we'd lose the data we'd already uploaded to the existing disk.
Volumes Method
I tried this script first.
WindowsVolume:
  Type: AWS::EC2::Volume
  Properties:
    AutoEnableIO: true
    AvailabilityZone: "ap-southeast-2b"
    Encrypted: true
    Size: 30
    SnapshotId: snap-0008f111111111
    Tags:
      - Key: Name
        Value:
          Ref: AWS::StackName
    VolumeType: gp2
EC2Instance:
  Type: AWS::EC2::Instance
  Properties:
    InstanceType: t2.micro
    ImageId: ami-663bdc04 # Windows Server stock image
    KeyName: removed
    IamInstanceProfile: removed
    InstanceInitiatedShutdownBehavior: stop
    SecurityGroupIds:
      Fn::Split: [",", "Fn::ImportValue": StackName-ServerSecurityGroup]
    SubnetId:
      !ImportValue StackName-Subnet1
    Volumes:
      - Device: "/dev/sda1"
        VolumeId:
          Ref: WindowsVolume
I got the error message
Invalid value '/dev/sda1' for unixDevice. Attachment point /dev/sda1 is already in use
BlockDeviceMappings Method
Next I tried using BlockDeviceMappings
BlockDeviceMappings:
  - DeviceName: "/dev/sda1"
    Ebs:
      Ref: WindowsVolume
The error message this time was
Value of property Ebs must be an object
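My reading of that error is that Ebs expects an inline object describing the volume (snapshot, size, type), not a Ref to a separate AWS::EC2::Volume resource. A sketch of the shape I believe it wants, restoring the root disk from the snapshot (the extra properties are my assumptions):
EC2Instance:
  Type: AWS::EC2::Instance
  Properties:
    InstanceType: t2.micro
    ImageId: ami-663bdc04
    BlockDeviceMappings:
      - DeviceName: "/dev/sda1"          # must match the AMI's root device name
        Ebs:
          SnapshotId: snap-0008f111111111
          VolumeSize: 30
          VolumeType: gp2
          Encrypted: true
          DeleteOnTermination: false     # keep the root volume if the instance is replaced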
VolumeAttachment Method
I've also tried using a VolumeAttachment instead of the Volumes property or a BlockDeviceMapping.
VolAttach:
  Type: AWS::EC2::VolumeAttachment
  Properties:
    Device: "/dev/sda1"
    InstanceId: !Ref EC2Instance
    VolumeId: !Ref WindowsVolume
This gave me the same message as above
Invalid value '/dev/sda1' for unixDevice. Attachment point /dev/sda1 is already in use
Key Question
Has anyone successfully mounted an existing root volume, or a snapshot, to a new EC2 instance? If it's possible what's the proper method?
Alternate Approaches
Happy to hear alternate approaches. For example options I've considered are:
- Creating the VPC and related resources using CloudFormation, then creating the instance manually using the console.
- Creating the VPC, related resources, and EC2 instance using CloudFormation. From that point stop using CloudFormation and simply use the web console to change instance size.
Is there an easy way to start and stop AWS EC2 instances at a given time each day? This could save me quite a lot of money for my development and test servers.
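A low-tech sketch that would work from any always-on machine with cron and the AWS CLI, given credentials allowed to call ec2:StartInstances and ec2:StopInstances (the instance ID, region, and times are placeholders); managed options such as scheduled Lambda functions exist too, but this is the minimum:
# crontab entries (times in the machine's local time): start 08:00, stop 18:00, weekdays only
0 8 * * 1-5  aws ec2 start-instances --instance-ids i-0123456789abcdef0 --region us-west-2
0 18 * * 1-5 aws ec2 stop-instances  --instance-ids i-0123456789abcdef0 --region us-west-2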
I installed fail2ban using this command on Amazon Linux
yum install fail2ban
My epel repository is defined as
mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-6&arch=$basearch
I got this error when I tried to start the service
service fail2ban start
Starting fail2ban: Traceback (most recent call last):
File "/usr/bin/fail2ban-client", line 37, in <module>
from fail2ban.version import version
ImportError: No module named fail2ban.version
I've tried this fix in this bug report using this diff, which isn't merged into the script I have. It didn't make any difference. I've also tried this, but I have no idea how it's meant to work, whether you're meant to run anything, etc.
Can anyone suggest how to get fail2ban to work on Amazon Linux?
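A few diagnostics that might narrow down the ImportError - my working assumption is that the package installed its Python modules for a different interpreter version than the one /usr/bin/fail2ban-client runs under:
# Which interpreter does the client script use, and where did the RPM put the modules?
head -1 /usr/bin/fail2ban-client
rpm -ql fail2ban | grep version.py
# Can the default interpreter see the module at all?
python -c 'import fail2ban.version; print(fail2ban.version.version)'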
Note below is what was installed with fail2ban
Background
My websites have been using CloudFlare with Let's Encrypt successfully for a year or two. The websites are hosted on EC2 and have valid Let's Encrypt certificates for the root, www, and all subdomains in use. The websites run WordPress.
What I'm doing
As a learning exercise I decided to change one of my domains, wildphotography.co.nz, over to Route53 and CloudFront. It hasn't gone well.
The Problem
After moving from CloudFlare to Route53 with CloudFront, I can't view my website. Details are below. My desired end state is for Route53 to be my DNS server, and CloudFront my CDN.
Note that I have reverted back to CloudFlare, because I need my website to be online. I had Route53 as my DNS server for 3-4 hours, and I could see that it was resolving to R53.
Problem Details
After I set things up here's the problem I see in my browser
Due to the Route53 setup, the request for the domain is being sent to CloudFront. The certificate being presented by CloudFront is for the *.cloudfront.net domain. Hence the mismatch. I believe I understand the problem but I can't work out how to solve it.
If I go to the Cloudfront URL (d1b5f3w2vf82yc.cloudfront.net) I get this error. Of course, going to this URL wouldn't typically be helpful.
Here's an SSL diagnostic
Here's my CloudFront setup. Note that I took a screenshot after I changed something minor, which is why it shows "in progress". I let it propagate before I tested it.
First the CloudFront overview
CloudFront Origin Settings
CloudFront Root Behavior
Note that I forward from http to https on my Nginx web server, so I don't bother to have CloudFront do it. That gives me additional information in my logs, useful for diagnosis.
Route53 setup
I've removed some irrelevant records relating to email. Note that both the www and non-www domains are alias records pointing at the CloudFront distribution. It won't accept a cname alias - I'm not even sure if that's a valid combination.
What I've tried
I created a new subdomain, origin.wildphotography.co.nz, which is a cname to www.wildphotography.co.nz. I believe this is necessary so CloudFront can find the IP of the origin server.
I've tried CNAMEs, Alias and not Alias, all kinds of things.
One odd thing is that while it was still set up with R53/CloudFront, some requests were getting through CloudFront. Not many, but some.
Any ideas would be appreciated. I suspect I have Route53 set up somehow incorrectly.
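For completeness, my current understanding of what CloudFront needs before the Route53 aliases will work: the distribution has to list the www and apex names as Alternate Domain Names, and it has to be given an ACM certificate issued in us-east-1 covering those names; otherwise it serves the default *.cloudfront.net certificate, which matches the error above. Rough CLI checks (not verified against my setup yet):
# CloudFront only accepts ACM certificates from us-east-1
aws acm request-certificate \
  --domain-name wildphotography.co.nz \
  --subject-alternative-names www.wildphotography.co.nz \
  --validation-method DNS \
  --region us-east-1
# Which aliases and certificate is the distribution actually configured with?
aws cloudfront list-distributions \
  --query 'DistributionList.Items[].{Id:Id,Aliases:Aliases.Items,Cert:ViewerCertificate.ACMCertificateArn}'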
I attempted to update my production web server this morning (t2 running Amazon Linux) but it failed because I ran out of RAM (php-fpm had it all). I stopped php-fpm to free up some RAM, but the yum update won't complete. The server is running ok, but I'd like to clean up this problem.
# yum update
Resolving Dependencies
--> Running transaction check
---> Package glibc-headers.x86_64 0:2.17-106.168.amzn1 will be updated
--> Processing Dependency: glibc-headers = 2.17-106.168.amzn1 for package: glibc-devel-2.17-106.168.amzn1.x86_64
---> Package glibc-headers.x86_64 0:2.17-157.169.amzn1 will be an update
--> Finished Dependency Resolution
Error: Package: glibc-devel-2.17-106.168.amzn1.x86_64 (@amzn-main)
Requires: glibc-headers = 2.17-106.168.amzn1
Removing: glibc-headers-2.17-106.168.amzn1.x86_64 (@amzn-main)
glibc-headers = 2.17-106.168.amzn1
Updated By: glibc-headers-2.17-157.169.amzn1.x86_64 (amzn-updates)
glibc-headers = 2.17-157.169.amzn1
You could try using --skip-broken to work around the problem
** Found 6 pre-existing rpmdb problem(s), 'yum check' output follows:
glibc-devel-2.17-106.168.amzn1.x86_64 has missing requires of glibc(x86-64) = ('0', '2.17', '106.168.amzn1')
glibc-devel-2.17-157.169.amzn1.x86_64 is a duplicate with glibc-devel-2.17-106.168.amzn1.x86_64
glibc-devel-2.17-157.169.amzn1.x86_64 has missing requires of glibc-headers = ('0', '2.17', '157.169.amzn1')
glibc-headers-2.17-106.168.amzn1.x86_64 has missing requires of glibc(x86-64) = ('0', '2.17', '106.168.amzn1')
subversion-1.9.4-2.55.amzn1.x86_64 has missing requires of subversion-libs(x86-64) = ('0', '1.9.4', '2.55.amzn1')
subversion-1.9.5-1.56.amzn1.x86_64 is a duplicate with subversion-1.9.4-2.55.amzn1.x86_64
Here's the glibc packages that are installed
# rpm -qa | grep glibc
glibc-devel-2.17-157.169.amzn1.x86_64
glibc-devel-2.17-106.168.amzn1.x86_64
glibc-common-2.17-157.169.amzn1.x86_64
glibc-headers-2.17-106.168.amzn1.x86_64
glibc-2.17-157.169.amzn1.x86_64
One problem appears to be that two different versions of glibc-devel are installed. It also looks like parts of glibc are on release 106.168 and others are on release 157.169.
I rebooted the server, which as expected made no difference, but it was worth a shot. I ran the following, with no effect
yum-complete-transaction
yum-complete-transaction --cleanup-only
yum clean all
In the past I've had similar problems, but with less critical packages. I just removed them all and then installed them again. I don't believe that's possible with glibc, as so many things depend on it.
I've looked on the CentOS forums, and one option seems to be downgrading some packages, but I don't know if there's a better option. Since this is my production server I would appreciate some advice before I attempt this. If this is a good approach, which packages should I downgrade? What do I do after I've downgraded - a regular yum update?
Note that I have regular backups and can restore from a recent backup if required, but I'd prefer not to as I'd have to redo some SSL certificate work that was a bit tricky. I plan to move to Ubuntu and use CloudFormation to build the server in the future, so if the server fails I can simply create another, but that's a future task.
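For reference, the route I'm inclined to investigate first (only with a fresh backup in hand, and I'd welcome correction): the yum-utils package-cleanup tool, which exists for exactly this duplicate-package state.
sudo yum install -y yum-utils
# Show the duplicate package pairs yum is complaining about
sudo package-cleanup --dupes
# Remove the older of each duplicate pair, then retry the update
sudo package-cleanup --cleandupes
sudo yum update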
I have a Windows 10 workstation used within my business for things like image processing (Photoshop) and software development (Eclipse). It's an i7-2600K based computer, Gigabyte GA-B75M-D3H B75 motherboard, 16 GB RAM. The OS is on a Samsung 850 Pro SSD, there's another 850 Pro for data, a WD Black for data, plus two 4TB HGST drives, each on SATA 3 ports, formatted ReFS, in a Storage Spaces mirror. The array has 1.63TB used, 1.99TB free.
Recently the ReFS drives in the storage spaces mirror have started dropping - so far three times in a month. This usually occurs under moderate to heavy load, after an extended period. None of the other disks drop under load as far as I can tell, so I assume it's ReFS, Storage Spaces, or a problem with an underlying disk. A reboot brings the disk online.
I can see errors in the event viewer such as those below. These are not all in one place, and while there are NTFS and Storage Spaces log areas under "application and services log -> microsoft -> windows" there doesn't seem to be one for ReFS.
I'd appreciate help tracking down what's causing these problems, and resolving them, so my system stays up.
16:27.05 (under event viewer -> application and services log -> microsoft -> windows -> storagespaces-driver-operational)
Virtual disk {26bf58b3-1cb9-4b93-a945-1b89331bb565} requires a data integrity scan.
Data on the disk is out-of-sync and a data integrity scan is required. To start the scan, run the following command:
Get-ScheduledTask -TaskName "Data Integrity Scan for Crash Recovery" | Start-ScheduledTask
Once you have resolved the condition listed above, you can online the disk by using the following commands in PowerShell:
Get-VirtualDisk | ?{ $_.ObjectId -Match "{26bf58b3-1cb9-4b93-a945-1b89331bb565}" } | Get-Disk | Set-Disk -IsReadOnly $false
Get-VirtualDisk | ?{ $_.ObjectId -Match "{26bf58b3-1cb9-4b93-a945-1b89331bb565}" } | Get-Disk | Set-Disk -IsOffline $false
16:27.05 (windows system event log): The file system was unable to write metadata to the media backing volume R:. A write failed with status "A device which does not exist was specified." ReFS will take the volume offline. It may be mounted again automatically.
16:27.06 (windows system event log): The file system detected a checksum error and was not able to correct it. The name of the file or folder is "<unable to determine file name>".
18:35.50 (windows system event log): Failed to connect to the driver: (-2147024894) The system cannot find the file specified.
18:35.50 (Kernel PNP) The driver \Driver\WudfRd failed to load for the device SWD\WPDBUSENUM\_??_USBSTOR#Disk&Ven_Generic&Prod_STORAGE_DEVICE&Rev_9451#7&2a9fd895&0#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}.
18:35.58: Virtual disk {26bf58b3-1cb9-4b93-a945-1b89331bb565} could not be repaired because there is not enough free space in the storage pool.
Replace any failed or disconnected physical disks. The virtual disk will then be repaired automatically or you can repair it by running this command in PowerShell:
Get-VirtualDisk | ?{ $_.ObjectId -Match "{26bf58b3-1cb9-4b93-a945-1b89331bb565}" } | Repair-VirtualDisk
UPDATE: as yagmoth points out, this error includes something about USB. The scenarios where I recall this error happening are a) when backing up to an external USB disk, and b) when running CrashPlan backups to another internal SATA disk.
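To help separate a failing disk (or the USB path) from an ReFS/Storage Spaces problem, these are the health checks I'd run in an elevated PowerShell session (standard Storage module cmdlets):
# Overall health of the pool, the mirrored virtual disk, and the physical disks
Get-StoragePool | Select-Object FriendlyName, HealthStatus, OperationalStatus
Get-VirtualDisk | Select-Object FriendlyName, HealthStatus, OperationalStatus
Get-PhysicalDisk | Select-Object FriendlyName, MediaType, HealthStatus, OperationalStatus
# SMART-style reliability counters for each physical disk
Get-PhysicalDisk | Get-StorageReliabilityCounter |
    Select-Object DeviceId, ReadErrorsTotal, WriteErrorsTotal, Temperature, Wear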
I run ReFS in two configurations:
- In a storage spaces mirror (2x4TB), for most valuable data
- Standalone on a single disk, for offsite backups (NB: not my only backup)
In both of these configurations the data integrity features have been enabled in ReFS / Storage Spaces.
I run the Windows defragmentation tool monthly. However, I ran Defraggler straight after the Windows defrag tool recently and it reported significant fragmentation, with some files having thousands of fragments. Defraggler can defragment the disk, but it's very slow - 2TB of data can take 16 - 24 hours. Is fragmentation a problem in ReFS?
TLDR;
- The "problem" is: ReFS disks have significant fragmentation
- The question: should I defragment these disks using defraggler, to remove all fragmentation, or let the Windows defrag tool do whatever it thinks is required?
Should a website that's trying to increase performance and uses the CloudFlare CDN (or any CDN, really, that already does OCSP stapling) also configure OCSP stapling on its own instance of Nginx when CloudFlare's "Full SSL" setting is used?
In this setup, when a browser requests a page from a CloudFlare protected/cached site it connects to CloudFlare using TLS, and CloudFlare then connects to the origin web server using TLS to retrieve the freshly generated page. This means two TLS negotiations are done, increasing the time required to retrieve the page. As an aside, with HTTP/2 the connection is typically only set up once per website, regardless of the number of resources to download.
If CloudFlare checks the CRL for the origin web server certificate, I imagine OCSP stapling could reduce the checks required, and therefore the TLS setup time. However, I'm not an expert in this area, so I'd appreciate thoughts on this.
Some information from CloudFlare regarding whether it is helpful (which suggests it won't help performance)
Thanks for your question. At this time we don't do revocation checking on the certificates served by origin. We may at some point, however, so would suggest stapling OCSP if using a publicly trusted certificate (and not much difficulty).
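If stapling on the origin does turn out to be worthwhile, the nginx side of it is small. A sketch, with the certificate path and resolver as placeholders for whatever the server already uses:
# inside the server block that terminates TLS
ssl_stapling on;
ssl_stapling_verify on;
ssl_trusted_certificate /etc/letsencrypt/live/example.com/chain.pem;  # placeholder path
resolver 8.8.8.8 valid=300s;  # nginx needs a resolver to fetch OCSP responses
resolver_timeout 5s;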
I have a snapshot in AWS Oregon that I can't delete. When I try, it says
Snapshot is in use by AMI ami-d2d83cxx
I've checked every region, I have no instance with that ID. I used to run in the Sydney region, now I use Oregon. I only have one instance running anywhere, plus an RDS instance.
The description of the snapshot is
Copied for DestinationAmi ami-d2d83xx from SourceAmi ami-55cfbbxx
for SourceSnapshot snap-3bf220xx. Task created on 1,453,573,325,838.
When I click the volume link it goes to the volume page but there's no volume with that ID.
My best guess is that the AWS console has gotten confused. I did create the odd AMI for performance testing, but those AMIs were private and I only used them for a short time. I also moved things from Sydney to Oregon.
How do I delete this snapshot? It'll be costing me money. Not much money, but some.
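My assumption is that the snapshot is still backing a registered AMI, and it can't be deleted until that AMI is deregistered. A way to confirm, using the (partially redacted) AMI ID from the error and checking each region I've used:
# Does the AMI still exist in this region? Repeat for ap-southeast-2 (Sydney), etc.
aws ec2 describe-images --image-ids ami-d2d83cxx --region us-west-2
# If it does: deregister the AMI, after which the snapshot delete should succeed
aws ec2 deregister-image --image-id ami-d2d83cxx --region us-west-2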
I'm trying to use Nginx page caching instead of WordPress caching. The caching seems to work fine, but I'm having trouble setting conditional caching headers based on a variable - whether a user is logged into WordPress. If a user is logged in I want no-cache headers applied; if not, the page can be cached for a day by both WordPress and the CDN. I'm finding I can only add one header inside an if block.
I have read (but not fully understood, because it's late here) [if is evil][1]. I also found an answer on stack exchange (on my laptop, can't find it now) that said inside an if block only one add_header works.
Can anyone give me ideas for an alternative that might work better? I know I can combine the expires with the cache-control, but I want more headers in there, plus I want to understand and learn.
Here's a significantly simplified config with the relevant parts in place.
server {
    server_name example.com;
    set $skip_cache 0;

    # POST requests and urls with a query string should always go to PHP
    if ($request_method = POST) {
        set $skip_cache 1;
    }
    if ($query_string != "") {
        set $skip_cache 1;
    }

    # Don't cache uris containing the following segments.
    if ($request_uri ~* "/wp-admin/|/admin-*|/xmlrpc.php|wp-.*.php|/feed/|index.php|sitemap(_index)?.xml") {
        set $skip_cache 1;
    }

    # Don't use the cache for logged in users or recent commenters
    if ($http_cookie ~* "comment_author|wordpress_[a-f0-9]+|wp-postpass|wordpress_no_cache|wordpress_logged_in") {
        set $skip_cache 1;
    }

    location / {
        try_files $uri $uri/ /blog/index.php?args;
    }

    location ~ \.(hh|php)$ {
        fastcgi_keep_conn on;
        fastcgi_intercept_errors on;
        fastcgi_pass php;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;

        # Cache Stuff
        fastcgi_cache CACHE_NAME;
        fastcgi_cache_valid 200 1440m;
        add_header X-Cache $upstream_cache_status;
        fastcgi_cache_methods GET HEAD;
        fastcgi_cache_bypass $skip_cache;
        fastcgi_no_cache $skip_cache;

        add_header Z_ABCD "Test header";

        if ($skip_cache = 1) {
            add_header Cache-Control "private, no-cache, no-store";
            add_header CACHE_STATUS "CACHE NOT USED";
        }
        if ($skip_cache = 0) {
            add_header Cache-Control "public, s-maxage = 240";
            expires 1d;
            add_header CACHE_STATUS "USED CACHE";
        }
        add_header ANOTHER_HEADER "message";
    }
}
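The approach I've since seen suggested for this, which avoids if for headers entirely: drive the header values from map blocks in the http context, so the same add_header lines cover both cases (an add_header with an empty value emits nothing). A sketch - the variable names are mine, the values are from the config above:
# http context
map $skip_cache $hdr_cache_control {
    default "public, s-maxage=240";
    1       "private, no-cache, no-store";
}
map $skip_cache $hdr_cache_status {
    default "USED CACHE";
    1       "CACHE NOT USED";
}

# inside the PHP location block, replacing both if blocks
add_header Cache-Control $hdr_cache_control;
add_header CACHE_STATUS $hdr_cache_status;
The expires directive can't be switched this way, but its effect can be folded into the Cache-Control value as a max-age.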
I read today that there's a significant vulnerability in OpenSSH, which is fixed by the latest version, 7.1p2. According to this story your private key is vulnerable to disclosure.
I'm using the latest Amazon Linux AMI, and everything is up to date against Amazon's repository.
[root@aws /]# ssh -V
OpenSSH_6.6.1p1, OpenSSL 1.0.1k-fips 8 Jan 2015
Here's the list of what packages are available in the Amazon yum repository
yum list | grep openssh
openssh.x86_64 6.6.1p1-22.58.amzn1 @amzn-updates
openssh-clients.x86_64 6.6.1p1-22.58.amzn1 @amzn-updates
openssh-server.x86_64 6.6.1p1-22.58.amzn1 @amzn-updates
openssh-keycat.x86_64 6.6.1p1-22.58.amzn1 amzn-updates
openssh-ldap.x86_64 6.6.1p1-22.58.amzn1 amzn-updates
It seems like the Amazon repository is around two years behind on OpenSSH updates. I have read that some vendors back port updates to older versions of OpenSSH, so this might not be an issue, or Amazon may address it relatively soon.
Questions:
- Is this really a problem?
- If it's a problem, what's the best way to update? I would typically find another yum repository, increase its priority, and update from that.
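One hedged way to answer the "is this really a problem" part: vendors that backport fixes normally record the CVE in the package changelog, and the roaming issue was assigned CVE-2016-0777 / CVE-2016-0778, so:
# Does Amazon's 6.6.1p1 build already carry the fix?
rpm -q --changelog openssh | grep -i -E 'CVE-2016-0777|CVE-2016-0778'
# Widely circulated client-side mitigation at the time: disable the roaming feature
printf 'Host *\n  UseRoaming no\n' | sudo tee -a /etc/ssh/ssh_config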
I have a weird situation with one of my websites, still in development in AWS. I have nginx 1.9.9 with HHVM 3.6.6-1.amzn1.x86_64 on a t2.micro. It's not publicly accessible.
I have a custom written website in the root of the domain, I have Wordpress in the /blog directory, and wordpress admin is in /blog/wp-admin. The custom site has various files including index.php. Wordpress has index.php and all sorts of other things in the blog directory, wp-admin uses index.php as well.
I can load the custom website, and it fully works. WordPress admin fully works. The WordPress blog home screen / story list fully works. The problem is that when I click on any of the blog article links to view it in full, it shows the custom website's home index. So, to say it another way:
http://www.example.com/index.php - custom website works
http://www.example.com/blog/index.php - blog index works
http://www.example.com/blog/2015/storyname - story load doesn't work with permalink %postname% regardless of text in post name - http://www.example.com/index.php loads
http://www.example.com/blog/2015/?p=96 - story load works
http://www.example.com/blog/wp-admin/ - admin works
When I click the story link I get the same page content as if I'd clicked http://www.example.com/index.php except the images don't load as they're done with relative URLs
http://www.example.com/blog/2015/storyname
When I load the site root /index.php I get the following debug headers back (see my config below for how they're generated)
Z_LOCATION: PHP MAIN
URI: /index.php
Z_DOCUMENT_ROOT: /var/www/hr
Z_FASTCGI_SCRIPT_NAME: /index.php
Z_REQUEST_FILENAME: /var/www/hr/index.php
When I load /wp-admin/ I get these headers back
Z_LOCATION: PHP MAIN
URI: /blog/wp-admin/index.php
Z_DOCUMENT_ROOT: /var/www/hr
Z_FASTCGI_SCRIPT_NAME: /blog/wp-admin/index.php
Z_REQUEST_FILENAME: /var/www/hr/blog/wp-admin/index.php
When I load the blog home /blog/index.php I get these headers back
Z_LOCATION: PHP MAIN
URI: /blog/index.php
Z_DOCUMENT_ROOT: /var/www/hr
Z_FASTCGI_SCRIPT_NAME: /blog/index.php
Z_REQUEST_FILENAME: /var/www/hr/blog/index.php
When I try to load the URL http://www.example.com/blog/2015/storyname I get the following headers back. Z_REQUEST_FILENAME (below) shows the wrong file being loaded.
Z_LOCATION: PHP MAIN
URI: /index.php
Z_DOCUMENT_ROOT: /var/www/hr
Z_FASTCGI_SCRIPT_NAME: /index.php
Z_REQUEST_FILENAME: /var/www/hr/index.php
I have no idea why it tries to load the site root index.php when I click that URL. Clues:
- Changing the Wordpress permalink structure from %postname% to ?p=123 fixes the issue
- None of the other permalink structures helps at all
Why would this be a problem just for viewing blog articles? I wonder if it's something to do with the try_files?
There's nothing in the hhvm error log, there's nothing in the nginx error log. The access log shows the following when I request that last URL
(IP removed) - - [10/Jan/2016:08:22:19 +0000] "GET /blog/2015/storyname HTTP/1.1" 200 4424 "http://www.example.com/blog/" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0" "-" "0.050"
Here's my nginx site config. I haven't included the main nginx.conf as I don't think it's relevant. NB: I have updated this with the working code.
server {
    server_name www.example.com;
    root /var/www/hr;
    access_log /var/log/nginx/hr.access.log main;

    # Default location to serve
    location / {
        try_files $uri $uri/ /blog/index.php?$args;
        add_header Z_LOCATION "hr_root"; add_header URI $uri; # DEBUG
    }

    location ~* \.(jpg|jpeg|png|gif|css|js)$ {
        log_not_found off; access_log off;
        add_header Z_LOCATION "STATIC RESOURCES REGEX"; add_header URI $uri; # DEBUG
    }

    # Send HipHop and PHP requests to HHVM
    location ~ \.(hh|php)$ {
        fastcgi_keep_conn on;
        fastcgi_intercept_errors on;
        fastcgi_pass php;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;

        # DEBUGGING
        add_header Z_LOCATION "PHP MAIN"; add_header URI $uri;
        add_header Z_DOCUMENT_ROOT "$document_root"; add_header Z_FASTCGI_SCRIPT_NAME "$fastcgi_script_name";
        add_header Z_REQUEST_FILENAME "$request_filename";
    }
}

# Forward non-www requests to www
server {
    listen 0;
    server_name example.com;
    return 302 http://www.example.com$request_uri;
}
Any thoughts, ideas, or help appreciated. This is a fairly curly one, for me, but I suspect it will be a simple change to fix it.
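For anyone comparing notes: the layout I understand is conventional for WordPress in a subdirectory alongside a custom site in the root is to give each prefix its own try_files fallback, so permalink misses land on the right front controller. A sketch only (my posted config instead sends every miss to /blog/index.php, which is what ended up working for me):
# custom site in the document root
location / {
    try_files $uri $uri/ /index.php?$args;
}

# WordPress lives under /blog, so its permalinks must fall back to its own index.php
location /blog/ {
    try_files $uri $uri/ /blog/index.php?$args;
}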
I'm having trouble getting hhvm to start when my Amazon Linux (which is apparently very similar to CentOS) EC2 instance starts. When I reboot the server, hhvm doesn't come up, and there's nothing in the error logs. When I use
sudo service hhvm start
it comes up just fine. Stop/restart works fine too. When I try running the following as ec2-user
service hhvm start
I get these errors
[ec2-user@ip-x ~]$ service hhvm start
Starting hhvm: [Fri Jan 8 22:35:13 2016] [hphp] [2451:7fe8751566c0:0:000001] [] Cannot open log file: /var/log/hhvm/error.log [ OK ]
touch: cannot touch ‘/var/lock/subsys/hhvm’: Permission denied
I deleted my /var/log/hhvm/error.log and restarted the server. There was nothing in the error log.
As background, I installed hhvm using 'yum install hhvm' from the Amazon repository. I'm using the /etc/init.d/hhvm that was installed by yum.
When hhvm is running after being started by root I get this from ps -ef | grep hhvm
[root@ip-x init.d]# service hhvm restart
Stopping hhvm: [ OK ]
Starting hhvm: [ OK ]
[root@ip-x init.d]# ps -ef | grep hhvm
tim 2555 1 3 22:41 ? 00:00:00 hhvm --config /etc/hhvm/server.ini -d pid=/var/run/hhvm.pid --user tim --mode daemon
root 2560 2458 0 22:42 pts/0 00:00:00 grep --color=auto hhvm
nginx comes up just fine, with its own config file. hhvm package is hhvm-3.6.6-1.amzn1.x86_64.
Any ideas? Any information anyone can give me? I understand the startup script runs as root but starts hhvm as the user specified - in my case "tim". "tim" is a member of the root group, which I added recently to try to fix the issue.
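One thing worth ruling out first, and it's an assumption on my part since the init script behaves when run by hand: whether the service is actually registered with the SysV runlevels so it starts at boot.
# Is hhvm registered, and for which runlevels?
chkconfig --list hhvm
# Register and enable it if it isn't
sudo chkconfig --add hhvm
sudo chkconfig hhvm on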
I reference this question, which is for Ubuntu. I tried it, but it didn't work.
Here's the startup file in /etc/init.d/hhvm