Log 8
Objectives
- Deploy watsonx.ai on self-managed AWS infrastructure.
Accomplishments
AWS
- Fixes to cluster-sts.yaml and other deployment resources.
- Fixed error in cluster-sts.yml by commenting out lines 590-599.
- Changed `IamInstanceProfile: !Ref BootnodeInstanceProfile` to `IamInstanceProfile: <InstanceProfileName>`.
- Changed `SubnetId: !Ref PublicSubnet1ID` to `SubnetId: <PrivateSubnetID>` to account for private deployments.
- Updated LambdaExecutionRole.json line 14: changed `ec2.aws.com` to `lambda.aws.com` and added `cloudformation.aws.com` to the list of allowed services.
- Fixed LambdaExecutionRole ARN to proper role name.
- Commented out `/bin/bash ./cp-deploy.sh env apply -e env_id=${ClusterName} [--accept-all-licenses]`.
- Added VPC and Subnet IDs to the "CleanupLambda" Lambda function in cluster-sts, which then required adding the "ec2:CreateNetworkInterface" permission to LambdaExecutionRole.
- Added tags with Application IDs to CleanupLambda.
- Successful deployment of BootNode instance.
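The role changes above can be sketched in Python as two policy documents: a trust policy naming the services allowed to assume the role, and a permissions statement for a VPC-attached Lambda. This is a minimal illustration, not the project's actual LambdaExecutionRole.json. The helper names are hypothetical. The log abbreviates the principals as `lambda.aws.com` and `cloudformation.aws.com`; the sketch assumes the standard AWS service-principal form `<service>.amazonaws.com`. The `Describe` and `Delete` network-interface actions are an assumption added here, since VPC-attached Lambda functions generally need them alongside `ec2:CreateNetworkInterface`.

```python
import json


def build_trust_policy(services):
    """Build an IAM trust policy letting the given service principals assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": sorted(services)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_vpc_eni_policy():
    """Build a permissions policy for a Lambda function attached to a VPC.

    Only ec2:CreateNetworkInterface is named in the log; the other two
    actions are assumptions for illustration.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:CreateNetworkInterface",
                    "ec2:DescribeNetworkInterfaces",
                    "ec2:DeleteNetworkInterface",
                ],
                "Resource": "*",
            }
        ],
    }


# Assumed principal names in the standard <service>.amazonaws.com form.
trust_policy = build_trust_policy(
    ["lambda.amazonaws.com", "cloudformation.amazonaws.com"]
)
eni_policy = build_vpc_eni_policy()

print(json.dumps(trust_policy, indent=2))
print(json.dumps(eni_policy, indent=2))
```

If the function still cannot assume the role, comparing the printed trust policy against the deployed role's trust relationship is one quick check.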
RAG
- Creation of cronjob to capture logs from Python app.
- Enabled metadata insertion into chunks in the vector store, which should (hopefully) improve retrieval accuracy
- Returned context to the user (shows the sources used to generate responses)
- Added mixtral model support
- Enabled users to supply custom RAG parameters
- Migrated vector DB from FAISS to ChromaDB to enable the metadata functionality
- Wrote a script to easily test the RAG implementation and save results to CSV
- Implemented cache logic that also considers the combination of parameters before choosing to send a cached response
- Added better logic for caching to improve performance
- Removed unwanted parameters from the request body
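The parameter-aware caching mentioned above could look roughly like the following sketch, in which the cache key is derived from both the question and the full set of RAG parameters, so a cached response is returned only when both match. This is not the application's actual code: the `ResponseCache` class, the in-memory dictionary store, the question normalization, and the parameter names (`model`, `top_k`, `chunk_size`) are all assumptions for illustration.

```python
import hashlib
import json


class ResponseCache:
    """In-memory cache keyed on the question plus the RAG parameters (hypothetical)."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def make_key(question, params):
        # Serialize with sorted keys so parameter order does not change the key.
        payload = json.dumps(
            {"question": question.strip().lower(), "params": params},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, question, params):
        # Returns None when this question/parameter combination is not cached.
        return self._store.get(self.make_key(question, params))

    def put(self, question, params, response):
        self._store[self.make_key(question, params)] = response


cache = ResponseCache()

# Hypothetical parameter sets that differ only in top_k.
params_a = {"model": "mixtral", "top_k": 4, "chunk_size": 512}
params_b = {"model": "mixtral", "top_k": 8, "chunk_size": 512}

cache.put("What is watsonx.ai?", params_a, "cached answer")

hit = cache.get("What is watsonx.ai?", params_a)   # same question, same parameters
miss = cache.get("What is watsonx.ai?", params_b)  # same question, different parameters
```

Hashing a canonical JSON form keeps the key stable regardless of the order in which parameters arrive in the request body.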
In Progress
- End-to-end deployment of OCP, CP4D, and watsonx.ai (with GPU node)
- Tagging cp-deployer.sh generated resources.
- Updating solution docs with better asset linking.
- Exploring WatsonX Discovery
Next Steps
- Continue over-the-shoulder working sessions
- Kick off CloudFormation template install with updated STS templates.
- Compilation of required endpoints
- Deploy latest RAG version on AWS
- Build out actions & flow in Watsonx Assistant after properly defining personas & objectives.
- Kick off Cloud Pak for Deployment entitlement key.
- Build RAG application using WatsonX Discovery.
- Compare WatsonX Discovery RAG with existing RAG results.
Tracking (Issues)
- Require sign-off on final CloudFormation template.
- Red Hat CoreOS AMI pending approval.
- LambdaCleanup error from not being able to assume role.
- Double-checking role names in CloudFormation template.