Preparation
Obtain the Red Hat Pull Secret
Use this URL to download your Red Hat pull secret: https://console.redhat.com/openshift/install/pull-secret
If your organization does not have an existing Red Hat account, you can create a Red Hat trial account for a temporary OCP deployment (60 days). Instructions are available under the expandable section “Obtain a RedHat Trial Account”.
Obtain the AWS IAM credentials
If you can use permanent security credentials for the AWS account, you will need an Access Key ID and a Secret Access Key so that the deployer can set up an OpenShift cluster on AWS.
- Go to https://aws.amazon.com/
- Login to the AWS console
- Click on your user name at the top right of the screen
- Select Security credentials. You can also reach this screen via https://console.aws.amazon.com/iam/home?region=us-east-2#/security_credentials.
- If you do not yet have an access key (or you no longer have the associated secret), create an access key
- Store your Access Key ID and Secret Access Key in a safe place
Configure AWS
Run the following command and then enter your Access Key ID, Secret Access Key, default region, and output format.
aws configure
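If you prefer a non-interactive setup (for example, in a script), a sketch using aws configure set; the credential values below are placeholders:
# Non-interactive alternative to the prompts above (placeholder credentials)
aws configure set aws_access_key_id AKIAEXAMPLEACCESSKEY
aws configure set aws_secret_access_key EXAMPLESECRETACCESSKEY
aws configure set region us-east-2
aws configure set output json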
Red Hat pull secret
The Red Hat pull secret must be downloaded from https://console.redhat.com/openshift/downloads#tool-pull-secret. Rename the file from pull-secret.txt to pull-secret.json.
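A minimal sketch of the rename, assuming the file was downloaded to the current working directory:
mv pull-secret.txt pull-secret.json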
Create an S3 bucket
# bucket name must be globally unique
BUCKET="bucket-customer-date"
aws s3api create-bucket --bucket $BUCKET --region us-east-1
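Note that the form above works for us-east-1; if the bucket is created in any other region, s3api create-bucket also needs a location constraint. A hedged example for us-east-2:
# Required when creating the bucket outside us-east-1
aws s3api create-bucket --bucket $BUCKET --region us-east-2 \
  --create-bucket-configuration LocationConstraint=us-east-2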
Create a pull-secrets folder in the S3 bucket
aws s3api put-object --bucket $BUCKET --key "pull-secrets/" --region us-east-1
Upload pull-secret.json to the pull-secrets folder in the S3 bucket
aws s3 cp pull-secret.json s3://$BUCKET/pull-secrets/pull-secret.json
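An optional check that the object landed under the expected prefix:
aws s3 ls s3://$BUCKET/pull-secrets/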
Create AWS KeyPair
The following commands create the key pair and write the private key to the default SSH folder; the second command restricts the permissions of the file.
KEYPAIR_NAME=ec2_ocp
aws ec2 create-key-pair --key-name $KEYPAIR_NAME --key-type rsa --key-format pem --query "KeyMaterial" --output text > ~/.ssh/$KEYPAIR_NAME.pem
chmod 400 ~/.ssh/$KEYPAIR_NAME.pem
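An optional check that the key pair is registered with EC2:
aws ec2 describe-key-pairs --key-names $KEYPAIR_NAME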
Preparing the Installation Files
The default “AvailabilityZones” in cluster.yml are “us-east-2a,us-east-2b,us-east-2c” but can be changed. To change which Availability Zones are used, search cluster.yml for us-east-2a,us-east-2b,us-east-2c and replace it with the preferred Availability Zones.
Search cluster.yml for REPLACE_ME_WITH_SUBNET_ID and replace it with one of the public subnet IDs. This is the subnet where the bastion host will live.
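A sketch of both replacements using sed (assumes GNU sed; the replacement Availability Zones and the subnet ID shown are placeholders):
# Replace the default Availability Zones (placeholder replacement values)
sed -i 's/us-east-2a,us-east-2b,us-east-2c/us-east-1a,us-east-1b,us-east-1c/g' cluster.yml
# Set the public subnet for the bastion host (placeholder subnet ID)
sed -i 's/REPLACE_ME_WITH_SUBNET_ID/subnet-0123456789abcdef0/' cluster.yml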
Preparing Parameters Override File
Review parameters-override.yaml; the following changes will need to be made:
- Add API Key (for accessing cp.icr.io)
- Add KeyPairName
- Add Private Subnets
- Add Public Subnets
- Add Red Hat Pull Secret s3 location
- Add VPC ID
- Add Bucket Name for the s3 bucket that holds the Red Hat pull secret
- Add Domain Name
- Add Hosted Zone ID for the Private Hosted Zone associated with the VPC
- Add Cluster Name
- Add GI External Registry
Hosted Zones
If you require both public and private access, then create both a private and a public hosted zone. The private hosted zone should be associated with the VPC, and the private hosted zone ID should be configured in the parameters-override.yaml file. The public hosted zone should be a child of the registered domain name in use. For example, if the registered domain name is frwd-labs.link, then the public hosted zone should be at guardium.frwd-labs.link, and that is the DomainName that is specified in parameters-override.yaml.
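If the private hosted zone does not exist yet, a hedged sketch using the example domain above; the VPC ID is a placeholder, and the assumption that the private zone uses the same DomainName should be checked against your environment:
# Create a private hosted zone associated with the VPC (placeholder VPC ID)
aws route53 create-hosted-zone \
  --name guardium.frwd-labs.link \
  --caller-reference "gi-private-$(date +%s)" \
  --vpc VPCRegion=us-east-2,VPCId=vpc-0123456789abcdef0

# Look up the hosted zone ID to put in parameters-override.yaml
aws route53 list-hosted-zones-by-name --dns-name guardium.frwd-labs.link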
Deployment Steps
Create OCPInstall Role
Download the OCPInstallRole.yaml
Note: This template assumes that the roles BootNodeRole and DevOps-Role already exist.
- Download DefaultRoles.yaml and apply it if needed
- Edit the OCPInstall role YAML and add the account (or add a substitution) and set the role names appropriately
- Set optional tags on the command
Create the role by running the following command:
aws cloudformation deploy --stack-name OCPInstall-role-1 --template-file OCPInstallRole.yaml --capabilities CAPABILITY_NAMED_IAM --tags *add Key=Value tag here*
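An optional check, if you want to confirm from the CLI that the role stack finished before continuing:
aws cloudformation describe-stacks --stack-name OCPInstall-role-1 \
  --query "Stacks[0].StackStatus" --output text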
Create LambdaExecution Role
Download the LambdaExecutionRole.yaml template, then create the role by running the following command:
aws cloudformation deploy --stack-name LambdaExecutionRole --template-file LambdaExecutionRole.yaml --capabilities CAPABILITY_NAMED_IAM --tags *add Key=Value tag here*
Deploy CloudFormation templates using AWS CLI
If you’re creating a new VPC
export region=us-east-2 # This should be whatever region you are deploying in
export CLUSTERNAME=GI-OCP # This should be whatever you want to call your cluster
Download the VPC template file (vpc-template.yaml, referenced by the create-stack command below):
Create a file called vpc-params.json
cat << EOF > vpc-params.json
[
{
"ParameterKey": "ClusterName",
"ParameterValue": "${CLUSTERNAME}"
},
{
"ParameterKey": "AvailabilityZones",
"ParameterValue": "${region}a,${region}b,${region}c"
},
{
"ParameterKey": "CreateNATGateways",
"ParameterValue": "true"
},
{
"ParameterKey": "CreatePublicSubnets",
"ParameterValue": "true"
},
{
"ParameterKey": "CreatePrivateSubnets",
"ParameterValue": "true"
},
{
"ParameterKey": "NumberOfAZs",
"ParameterValue": "3"
},
{
"ParameterKey": "PrivateSubnet1CIDR",
"ParameterValue": "192.168.0.0/19"
},
{
"ParameterKey": "PrivateSubnet2CIDR",
"ParameterValue": "192.168.32.0/19"
},
{
"ParameterKey": "PrivateSubnet3CIDR",
"ParameterValue": "192.168.64.0/19"
},
{
"ParameterKey": "PrivateSubnet1Tag1",
"ParameterValue": "Name=${CLUSTERNAME}-private-subnet-1"
},
{
"ParameterKey": "PrivateSubnet2Tag1",
"ParameterValue": "Name=${CLUSTERNAME}-private-subnet-2"
},
{
"ParameterKey": "PrivateSubnet3Tag1",
"ParameterValue": "Name=${CLUSTERNAME}-private-subnet-3"
},
{
"ParameterKey": "PublicSubnet1CIDR",
"ParameterValue": "192.168.96.0/21"
},
{
"ParameterKey": "PublicSubnet1Tag1",
"ParameterValue": "Name=${CLUSTERNAME}-public-subnet-1"
},
{
"ParameterKey": "PublicSubnet2CIDR",
"ParameterValue": "192.168.104.0/21"
},
{
"ParameterKey": "PublicSubnet2Tag1",
"ParameterValue": "Name=${CLUSTERNAME}-public-subnet-2"
},
{
"ParameterKey": "PublicSubnet3CIDR",
"ParameterValue": "192.168.112.0/21"
},
{
"ParameterKey": "PublicSubnet3Tag1",
"ParameterValue": "Name=${CLUSTERNAME}-public-subnet-3"
},
{
"ParameterKey": "VPCCIDR",
"ParameterValue": "192.168.0.0/16"
},
{
"ParameterKey": "VPCDNSNameResolution",
"ParameterValue": "true"
},
{
"ParameterKey": "VPCDNSNameCreation",
"ParameterValue": "true"
}
]
EOF
Using the OCPInstall role ARN, run the following command to create the VPC:
aws cloudformation create-stack \
--stack-name ${CLUSTERNAME} \
--template-body file://vpc-template.yaml \
--parameters file://vpc-params.json \
--role-arn arn:aws:iam::<ACCOUNT>:role/OCPInstall
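Stack creation takes a few minutes. A hedged way to wait for it and then read back the outputs (subnet IDs, VPC ID) that parameters-override.yaml needs; the exact output keys depend on the VPC template:
# Block until the VPC stack is fully created
aws cloudformation wait stack-create-complete --stack-name ${CLUSTERNAME}

# List the stack outputs (output key names depend on vpc-template.yaml)
aws cloudformation describe-stacks --stack-name ${CLUSTERNAME} \
  --query "Stacks[0].Outputs" --output table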
Using the OCPInstall role ARN, run the following command to start the main CloudFormation deployment:
If you are developing, add the --disable-rollback flag to the following command to disable the automatic rollback and deletion of the bastion node on failures. This will allow you to troubleshoot in the event that the stack does not complete successfully.
aws cloudformation deploy \
--stack-name stack-deployment-1 \
--template-file cluster.yaml \
--parameter-overrides file://parameters-override.yaml \
--capabilities CAPABILITY_NAMED_IAM \
--tags *add Key=Value tag here* \
--role-arn arn:aws:iam::<ACCOUNT>:role/OCPInstall
Check the AWS Console to see when the CloudFormation template has progressed far enough that the boot node is online.
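If you prefer the CLI to the console, a sketch that lists the most recent events for the stack (events are returned newest first):
# Ten most recent events for the main deployment stack
aws cloudformation describe-stack-events --stack-name stack-deployment-1 \
  --query "StackEvents[0:10].[Timestamp,LogicalResourceId,ResourceStatus]" \
  --output table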
Using the SSH key from the key pair name used in parameters-override.yaml, SSH to the boot node. The status of the install can be found in the log files under ~ec2-user/cpd-status/logs.
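A minimal sketch of the SSH step, assuming the key pair created earlier and using a placeholder address for the boot node (take the real address from the EC2 console):
# Key file matches the KeyPairName in parameters-override.yaml; the address is a placeholder
ssh -i ~/.ssh/ec2_ocp.pem ec2-user@<bootnode-address>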
If the install fails but some resources were already created, some cleanup might be required in addition to deleting the stack. Find and delete the following before running the installation again (a hedged CLI sketch for locating them follows the list).
- DNS A records in the hosted zones
- EC2 instances (control plane and worker nodes)
- EC2 load balancers
- EFS file systems
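A hedged sketch for locating these leftovers from the CLI; the Name tag pattern and the hosted zone ID are assumptions you will need to adjust:
# EC2 instances whose Name tag contains the cluster name (tag pattern is an assumption)
aws ec2 describe-instances \
  --filters "Name=tag:Name,Values=*${CLUSTERNAME}*" \
  --query "Reservations[].Instances[].[InstanceId,State.Name]" --output table

# Load balancers and EFS file systems to review for cluster-related leftovers
aws elbv2 describe-load-balancers --query "LoadBalancers[].[LoadBalancerName,DNSName]" --output table
aws efs describe-file-systems --query "FileSystems[].[FileSystemId,Name]" --output table

# A records in a hosted zone (placeholder hosted zone ID)
aws route53 list-resource-record-sets --hosted-zone-id <HOSTED_ZONE_ID> \
  --query "ResourceRecordSets[?Type=='A']"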
Monitoring
SSM into bootnode
Add the AmazonSSMManagedInstanceCore policy to the role used to execute the CloudFormation template and to the user or role that will be connecting to the instance.
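A hedged example of attaching the managed policy to the OCPInstall role from the CLI (repeat for whichever user or role will open the session):
aws iam attach-role-policy \
  --role-name OCPInstall \
  --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore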
The CloudFormation template creates a boot node that begins executing commands. One set of commands installs, enables, and starts amazon-ssm-agent. It may take up to 20 minutes before this agent comes online on the boot node.
Once the instance has started the SSM agent, a connection can be initiated with the following command:
aws ssm start-session --target $InstanceID
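The $InstanceID above can be looked up from the CLI once the boot node is running; the filter below assumes the main stack name used earlier (stack-deployment-1) and that the boot node is the first running instance in that stack:
# Find the boot node created by the main deployment stack (assumption: first running instance)
InstanceID=$(aws ec2 describe-instances \
  --filters "Name=tag:aws:cloudformation:stack-name,Values=stack-deployment-1" \
            "Name=instance-state-name,Values=running" \
  --query "Reservations[0].Instances[0].InstanceId" --output text)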
Once a connection has been opened, you may need to change users to ec2-user. This can be accomplished with the following commands:
Become root
sudo su
Become ec2-user
su ec2-user
You will now be able to review deployment logs.
Fixing the aws command in SSM
SSM does not work exactly the same as SSH. If you intend to use any additional commands, such as aws, then you may need to do the following. Check the output of running the aws command; if there is an error message like this:
[47863] Error loading Python lib '/usr/bin/libpython3.11.so.1.0': dlopen: /usr/bin/libpython3.11.so.1.0: cannot open shared object file: No such file or directory
Another possible error message:
$ aws
Python path configuration:
PYTHONHOME = '/usr/bin'
PYTHONPATH = (not set)
program name = '/usr/bin/aws'
isolated = 0
environment = 0
user site = 0
safe_path = 0
import site = 0
is in build tree = 0
stdlib dir = ''
sys._base_executable = '/usr/bin/aws'
sys.base_prefix = ''
sys.base_exec_prefix = ''
sys.platlibdir = 'lib'
sys.executable = '/usr/bin/aws'
sys.prefix = ''
sys.exec_prefix = ''
sys.path = [
'/usr/bin/base_library.zip',
'/usr/bin/lib-dynload',
'/usr/bin',
]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'
Current thread 0x00007fed39a06c00 (most recent call first):
<no Python frame>
You may not have the correct $PATH.
Incorrect $PATH:
$ echo $PATH
/home/ec2-user/.local/bin:/home/ec2-user/bin:/usr/bin:/usr/sbin
How to Correct $PATH:
export PATH="/home/ec2-user/.local/bin:/home/ec2-user/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin"
How to persist the change to $PATH:
echo 'export PATH="$HOME/.local/bin:$HOME/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin"' >> ~/.bashrc
Confirm this change works:
$ echo $PATH
/home/ec2-user/.local/bin:/home/ec2-user/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
$ aws
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help

aws: error: the following arguments are required: command
Monitor the deployment
Check what folders exist in the ec2-user home directory. If the cpd-status directory has not been created yet, wait a few minutes. Once the cpd-status directory appears, run the following command:
tail -f ~/cpd-status/log/cloud-pak-deployer.log
This command follows the log file from the cloud-pak-deployer process.