Introduction
In the age of digital transformation, data integration and management are critical for enterprises aiming to harness the power of AI. This blog explores how Airbyte, a leading data integration platform, and Qdrant, a powerful vector database, can revolutionize your internal knowledge bases through Retrieval-Augmented Generation (RAG). We will also introduce our Knowledge-bases Integration Platform for Enterprise AI, available on AWS Marketplace, which seamlessly integrates these technologies.
Please follow our blog on RAG Data Platform for Enterprise AI for more information on RAG, Vector Databases and Data Integration.
Get Started:
Get started with our Marketplace offering in under 10 minutes.
How to Access our product?
To access our Knowledge-bases Integration platform, please visit the AWS Marketplace offering of our product.
Deploying the product involves these steps:
- Subscribe to the Marketplace product.
- Select the Fulfillment method that fits your needs (we offer Isolated VPC (recommended) and Existing VPC).
- Enter the required details in the CloudFormation UI and create the stack.
Subscribe to our AWS Marketplace product:
Our product uses the AMI + CloudFormation delivery method: CloudFormation creates the whole infrastructure, and the software runs on an EC2 instance.
Follow these steps to subscribe to our product:
- Visit this AWS Marketplace offering page and click Continue to Subscribe.
- Review and accept the Terms and Conditions that are shown.
- The subscription shows as pending. Wait until it is activated.
- After activation, configure the software. We provide this solution in 2 Fulfillment methods:
  - Isolated VPC (Recommended).
  - Existing VPC.
- Choose the Fulfillment method and region (all US-based regions are enabled).
- Click Continue to Launch.
- Choose Launch CloudFormation as the action and click Launch.
- You will be redirected to the CloudFormation console.
Understanding the Fulfillment Methods:
We are providing 2 Fulfillment methods for our solution:
- Isolated VPC (Recommended): This delivery method is designed to meet your internal security requirements. It creates the following resources:
  - Automated VPC Setup: Our CloudFormation template creates a dedicated VPC, subnet, route table, and the necessary networking components (1 public subnet, an Internet Gateway, etc.), giving your application a secure and isolated environment.
  - EC2 instance with Application: All application-related resources, such as the EC2 instance, Security Group, and Key Pair, are created inside the VPC above.
  - Default Auth Mechanism: When launching the CloudFormation template, you must enter a username and password for the Airbyte application. These are then stored in Secrets Manager for future access.
  - Pricing: This architecture costs around $40 per month; see this estimate. (Additional charges may apply for other resources you use, such as CloudFront or Route 53, which are optional in this architecture.) Refer to the architecture diagram below:
- Private Existing VPC: A cost-saving variant of the method above that deploys the whole application into your existing infrastructure and can align with it where needed. It works as follows:
  - Integration with Existing Infrastructure: Reuse your existing VPC and subnets by providing their details (such as the VPC ID and Subnet ID) during setup, so the application fits into your current AWS environment.
  - Flexible Networking Configuration: Because you supply your own VPC and subnet details, you have full control over the networking configuration and can align the setup with your specific requirements.
  - Enhanced Security: By using your existing VPC infrastructure, you keep the security and isolation already established within your AWS environment.
  - Default Auth Mechanism: When launching the CloudFormation template, you must enter a username and password for the Airbyte application. These are then stored in Secrets Manager for future access.
  - Pricing: This architecture also costs around $40 per month; see this estimate. (Additional charges may apply for other resources you use, such as CloudFront or Route 53, which are optional in this architecture.) Refer to the architecture diagram below:
Filling the CloudFormation:
Isolated VPC (Recommended):
- Application Configuration: This section contains the credential details.
  - UIUserName: The username for the application.
  - UIPassword: The password for the application. We suggest choosing a strong, unique password to keep the application secure.
- Instance Configuration: This section contains the details of the EC2 instance. We recommend using the default value for optimum performance.
  - Instance Type: The EC2 instance type. We suggest m5.xlarge for performance and t3.xlarge to save costs.
- Network & Security Configuration: This section contains the network parameters that form the infrastructure. All of these parameters expect CIDR notation; an invalid value causes an error.
  - VpcCidrBlock: The IPv4 network range for the VPC.
  - PublicSubnetCIDR: The IPv4 CIDR range of the public subnet. It must fall within VpcCidrBlock.
  - Security Group CIDR: The CIDR range, such as x.x.x.x/32, that is allowed to access the application. If you have no specific requirement, provide 0.0.0.0/0, which makes the application reachable from any IP address.
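Since a malformed CIDR value makes the stack creation fail, it can help to check the three network values locally before filling in the form. The following is a minimal sketch using Python's standard `ipaddress` module; the function name and the `SecurityGroupCIDR` label are hypothetical (the form shows the field as "Security Group CIDR"), and the addresses in the usage note are placeholder examples, not values from the template.

```python
import ipaddress


def validate_network_parameters(vpc_cidr, public_subnet_cidr, security_group_cidr):
    """Return a list of problems found; an empty list means the values look consistent."""
    problems = []
    parsed = {}
    fields = [
        ("VpcCidrBlock", vpc_cidr),
        ("PublicSubnetCIDR", public_subnet_cidr),
        ("SecurityGroupCIDR", security_group_cidr),  # hypothetical key name
    ]
    for name, value in fields:
        try:
            # strict=True rejects ranges with host bits set, e.g. 10.0.0.5/16
            parsed[name] = ipaddress.IPv4Network(value, strict=True)
        except ValueError as exc:
            problems.append(f"{name}: {exc}")

    vpc = parsed.get("VpcCidrBlock")
    subnet = parsed.get("PublicSubnetCIDR")
    if vpc is not None and subnet is not None and not subnet.subnet_of(vpc):
        problems.append(f"PublicSubnetCIDR {subnet} is not inside VpcCidrBlock {vpc}")

    sg = parsed.get("SecurityGroupCIDR")
    if sg is not None and sg.prefixlen == 0:
        # Valid CIDR, but worth flagging: every IP address is allowed in.
        problems.append("SecurityGroupCIDR 0.0.0.0/0 allows access from any IP address")
    return problems
```

For example, `validate_network_parameters("10.0.0.0/16", "10.0.1.0/24", "203.0.113.7/32")` returns an empty list, while a subnet outside the VPC range or an open `0.0.0.0/0` rule is reported as a problem string.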
Existing VPC:
In the Private Existing VPC setup, the Application Configuration parameters are identical to those of the Isolated VPC setup. The Instance and Network Configuration details are as follows:
- Instance Configuration: This section contains the details of the EC2 instance. We recommend using the default value for optimum performance.
  - Instance Type: The EC2 instance type. We suggest m5.xlarge for performance and t3.xlarge to save costs.
  - Key Pair: Select a Key Pair from the drop-down to access the EC2 instance.
- Network Configuration: This section contains the network parameters that form the infrastructure. Create a VPC with 2 public and 2 private subnets in advance so that you can select them in these parameters.
  - VpcId: Select the VPC (by ID) in which the application launches.
  - PublicSubnet: Select the first public subnet (by ID) here. An error may occur if a public subnet is not selected.
  - InternetGateway: The ID of the Internet Gateway that the public subnet uses to reach the internet.
  - Security Group CIDR: The CIDR range, such as x.x.x.x/32, that is allowed to access the application. If you have no specific requirement, provide 0.0.0.0/0, which makes the application reachable from any IP address.
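When you collect the existing-VPC values by hand, a quick format check can catch a subnet ID pasted into the VPC field before CloudFormation rejects it. The sketch below assumes the usual AWS ID shape (a type prefix followed by 8 or 17 hexadecimal characters) and returns the values in the `ParameterKey`/`ParameterValue` list shape that CloudFormation accepts for stack parameters. The function name is hypothetical, and only `VpcId`, `PublicSubnet`, and `InternetGateway` are taken from the form above.

```python
import re

# Assumed ID shape: type prefix plus 8 or 17 lowercase hexadecimal characters.
HEX_SUFFIX = r"(?:[0-9a-f]{8}|[0-9a-f]{17})"
ID_PATTERNS = {
    "VpcId": re.compile(rf"^vpc-{HEX_SUFFIX}$"),
    "PublicSubnet": re.compile(rf"^subnet-{HEX_SUFFIX}$"),
    "InternetGateway": re.compile(rf"^igw-{HEX_SUFFIX}$"),
}


def build_stack_parameters(values):
    """Check the ID-shaped values, then return them as a CloudFormation parameter list."""
    for key, pattern in ID_PATTERNS.items():
        candidate = values.get(key, "")
        if not pattern.match(candidate):
            raise ValueError(f"{key} does not look like a valid ID: {candidate!r}")
    # Each entry follows the {"ParameterKey": ..., "ParameterValue": ...} shape.
    return [{"ParameterKey": k, "ParameterValue": v} for k, v in values.items()]
```

The returned list can be saved as JSON and reused if you later script the deployment instead of using the console; the check itself only validates the format and does not confirm that the resources exist in your account.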
Contact Us:
- Support: Need help setting up? Our team is here for you. After your purchase, enjoy one week of free installation support. Just email us at support@digital-alpha.com and we’ll take care of any issues you encounter.
- Authentication for Qdrant: The Qdrant instance in this AMI comes without authentication, so it is accessible to anyone allowed by the Security Group CIDR settings.
  - You can enable API-key-based authentication for added security. Qdrant currently runs as a Docker container with minimal security settings.
  - Follow the Qdrant Security blog for more information.
  - Or contact us for assistance with the authentication setup.
- HTTPS connectivity for Airbyte and Qdrant: In the current AMI, Airbyte and Qdrant run over HTTP.
  - To secure Qdrant with HTTPS, follow the Qdrant Security blog.
  - For Airbyte, you will need a domain name and an SSL certificate, which can be obtained through Route 53 and ACM. If you need help with this process, please reach out to us for support.
- Airbyte Connector availability: Airbyte includes a wide range of source and destination connectors. The version included in this AMI (0.58.0) supports numerous file-system-based connectors such as OneDrive, Google Drive, and Confluence, enabling you to sync your knowledge base to vector databases like Qdrant and Pinecone. If you require a custom connector for a specific source or destination, please contact us. We are available to help develop custom connectors to meet your needs.
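To illustrate the API-key option mentioned above: once a key is configured on the Qdrant container (Qdrant reads it from its `service.api_key` setting, which can be supplied through the `QDRANT__SERVICE__API_KEY` environment variable), clients send it in the `api-key` HTTP header. The sketch below only builds the request with Python's standard library. The helper name is hypothetical, and how the key is set on this particular AMI is an assumption to confirm against the Qdrant Security guide.

```python
import urllib.request


def qdrant_request(base_url, path, api_key=None):
    """Build (but do not send) a GET request for a Qdrant REST endpoint."""
    request = urllib.request.Request(base_url.rstrip("/") + path, method="GET")
    if api_key:
        # With authentication enabled, Qdrant expects the key in the "api-key" header.
        request.add_header("api-key", api_key)
    return request
```

Passing the result to `urllib.request.urlopen`, for example with the path `/collections`, lists the collections. Because the AMI serves Qdrant over plain HTTP, the key travels unencrypted until HTTPS is set up, so the two hardening steps are best done together.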
Important Notes:
- Secrets Management: The username and password you provide for the Airbyte application are stored as secrets in AWS Secrets Manager, which the CloudFormation template sets up for you. This keeps sensitive credentials protected and easy to manage.
- Application Access: If your IP address falls within the Security Group CIDR range, you can access:
  - Data Integration Platform at: http://<PUBLIC-IP-OF-EC2>/
  - Vector Database at: http://<PUBLIC-IP-OF-EC2>:6333/dashboard/
- Clean-up made simple: Once you have finished your tasks and are confident in your data backups, you can delete the entire CloudFormation stack from the console UI. Select the stack by name and click Delete. Note: Deletion may fail if other resources depend on resources in the stack. In such cases, delete the dependencies first, then the stack.
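As a small illustration of the Application Access note, the sketch below derives the two URLs from the instance's public IP and checks whether a given client address falls inside the Security Group CIDR. Both function names are hypothetical and the addresses in the usage note are placeholders; the check mirrors the access rule only and does not contact the instance.

```python
import ipaddress


def access_urls(public_ip):
    """Return the Airbyte and Qdrant dashboard URLs for the instance's public IP."""
    ip = ipaddress.ip_address(public_ip)  # raises ValueError on malformed input
    return {
        "airbyte": f"http://{ip}/",
        "qdrant_dashboard": f"http://{ip}:6333/dashboard/",
    }


def client_allowed(client_ip, security_group_cidr):
    """Return True if client_ip falls inside the Security Group CIDR range."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(security_group_cidr)
```

For instance, `client_allowed("203.0.113.7", "203.0.113.0/24")` is true, while a client outside the range would need the Security Group CIDR widened before either URL responds.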