Run on Kubernetes Cluster
A comprehensive Kubernetes-based deployment infrastructure for blockchain indexing and data services, managed with ArgoCD and Terraform.
Architecture Overview
This project deploys a complete blockchain indexing platform on Google Cloud Platform (GCP) using:
- GKE Cluster: Multi-node pool Kubernetes cluster
- ArgoCD: GitOps-based continuous deployment
- Terraform: Infrastructure as Code for GCP resources
- Kustomize: Kubernetes manifest management
Core Services
Data Layer
- TimescaleDB: PostgreSQL-based time-series database with AI extensions
- Indexer Database: Dedicated database for blockchain indexing operations
Application Services
- GraphQL Engine: Hasura GraphQL API for data access
- IPFS Node: InterPlanetary File System for decentralized storage
- Safe Content Service: Content validation and processing
- TimescaleDB Vectorizer Worker: Vector processing for AI/ML workloads
- Histocrawler: Historical data crawling and indexing service
- Image Guard: Image validation and security service
- RPC Proxy: Blockchain RPC request routing and caching
Consumer Services
- Decoded Consumer: Blockchain event decoding and processing
- IPFS Upload Consumer: IPFS content upload and management
- Resolver Consumer: Data resolution and lookup services
Management Tools
- pgAdmin: PostgreSQL administration interface
- Ingress Controller: Traffic routing and load balancing
Infrastructure Components
GKE Cluster Suggested Configuration
- Region: us-west2
- Project: be-cluster
- Network: Custom VPC with private/public subnets
- Node Pools:
  - db-pool: n2-standard-16 (dedicated for databases)
  - app-pool: e2-standard-2 (application services)
  - consumer-pool: custom-4-8192 (data processing)
Storage
- Persistent Volumes: GCP Persistent Disk, provisioned through a storage class that allows volume expansion
- IPFS Storage: 50Gi persistent volume for IPFS data
- Database Storage: 50Gi for TimescaleDB
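The storage setup above could be expressed roughly as follows. This is a minimal sketch, assuming the GKE Persistent Disk CSI driver; the names `resizable-pd` and `ipfs-data` and the `pd-balanced` disk type are illustrative assumptions, not values taken from this repository. The script only writes manifests to a temporary directory and applies nothing.

```shell
#!/usr/bin/env bash
# Sketch only: writes example manifests to a temp directory; nothing is applied.
set -euo pipefail

out_dir="$(mktemp -d)"

# StorageClass that permits volume expansion.
# "resizable-pd" and "pd-balanced" are illustrative assumptions.
cat > "$out_dir/storageclass.yaml" <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: resizable-pd
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-balanced
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
EOF

# 50Gi claim for IPFS data, bound to the class above ("ipfs-data" is hypothetical).
cat > "$out_dir/ipfs-pvc.yaml" <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ipfs-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: resizable-pd
  resources:
    requests:
      storage: 50Gi
EOF

echo "Manifests written to $out_dir"
# To check them against a cluster: kubectl apply --dry-run=client -f "$out_dir"
```

With `allowVolumeExpansion: true`, a volume can later be grown by raising `spec.resources.requests.storage` on the claim; shrinking is not supported.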
Project Structure
gcp-deployment/
├── apps/ # Kubernetes applications
│   ├── consumers/ # Data processing consumers
│   │   ├── decoded/ # Blockchain event decoder
│   │   ├── ipfs-upload/ # IPFS upload processor
│   │   └── resolver/ # Data resolver service
│   ├── graphql/ # Hasura GraphQL engine
│   ├── histocrawler/ # Historical data crawler
│   ├── image-guard/ # Image validation service
│   ├── indexer-db/ # Indexer database
│   ├── ipfs/ # IPFS node
│   ├── pgadmin/ # PostgreSQL admin
│   ├── rpc-proxy/ # RPC request proxy
│   ├── safe-content/ # Content validation service
│   ├── timescale_db/ # TimescaleDB instance
│   ├── timescale_db_vectorizer/ # Vector processing
│   └── ingress/ # Ingress configuration
├── argocd/ # ArgoCD configuration
│   ├── coreapps/ # Core application definitions
│   ├── namespacedapps/ # Namespace-specific apps
│   ├── projects/ # ArgoCD project definitions
│   └── repos/ # Repository secrets
├── terraform/ # Infrastructure as Code
│   └── debug-gke/ # GKE cluster provisioning
└── test-kustomize/ # Kustomize testing
Quick Start
Prerequisites
- Google Cloud SDK
- Terraform >= 1.0
- kubectl
- ArgoCD CLI
1. Deploy Infrastructure
cd terraform/debug-gke
terraform init
terraform plan
terraform apply
2. Configure ArgoCD
# Get GKE credentials
gcloud container clusters get-credentials debug-cluster --region us-west2
# Install ArgoCD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# Apply ArgoCD configuration (-R recurses into the argocd/ subdirectories)
kubectl apply -R -f argocd/
3. Deploy Applications
ArgoCD deploys the applications through GitOps: it monitors the Git repository and syncs any committed changes to the cluster automatically.
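For illustration, an ArgoCD Application with automated sync might look like the manifest generated below. This is a sketch under assumptions: the repository URL, the `default` project, and the target namespace are placeholders, and the actual definitions under argocd/ may be structured differently. Only the `apps/graphql` path comes from the project layout.

```shell
#!/usr/bin/env bash
# Sketch only: renders an example Application manifest to a temp file.
set -euo pipefail

# Placeholder values; replace with the real repository and application name.
repo_url="https://example.com/your-org/gcp-deployment.git"
app_name="graphql"
manifest="$(mktemp)"

cat > "$manifest" <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ${app_name}
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ${repo_url}
    targetRevision: HEAD
    path: apps/${app_name}
  destination:
    server: https://kubernetes.default.svc
    namespace: ${app_name}
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from Git
      selfHeal: true   # revert manual changes made in the cluster
    syncOptions:
      - CreateNamespace=true
EOF

cat "$manifest"
```

The `automated` block is what makes ArgoCD apply changes without a manual sync; leaving it out keeps the application in manual-sync mode.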
Configuration
Environment Variables
Key services require environment-specific configuration:
- GraphQL Engine: Database connection, CORS settings
- TimescaleDB: PostgreSQL credentials, AI extensions
- IPFS: Storage paths, network configuration
- Safe Content: Content validation rules
- Histocrawler: Blockchain endpoints, indexing parameters
- Image Guard: Image scanning policies, security rules
- RPC Proxy: Upstream RPC endpoints, caching configuration
- Consumers: Event processing queues, database connections
Secrets Management
Secrets are managed through Kubernetes secrets and external secret providers:
- Database credentials
- API keys
- Service account tokens
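As a rough illustration of the plain Kubernetes secret case, the script below renders a Secret manifest for database credentials. The secret name `timescaledb-credentials`, the key names, and the `DB_USER` / `DB_PASSWORD` environment variables are assumptions made for this sketch; the real deployment may use different names or an external secret provider instead.

```shell
#!/usr/bin/env bash
# Sketch only: renders an example Secret manifest; nothing is applied.
set -euo pipefail

# Read credentials from the environment; the defaults are placeholders only.
db_user="${DB_USER:-postgres}"
db_password="${DB_PASSWORD:-change-me}"

# mktemp creates the file readable by the current user only.
secret_file="$(mktemp)"

# stringData lets the API server handle base64 encoding on apply.
cat > "$secret_file" <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: timescaledb-credentials
type: Opaque
stringData:
  POSTGRES_USER: "${db_user}"
  POSTGRES_PASSWORD: "${db_password}"
EOF

echo "Secret manifest written to $secret_file"
# Apply with: kubectl apply -n <namespace> -f "$secret_file"
```

A rendered file like this contains the password in clear text, so it should be deleted after use and never committed to Git. Values containing double quotes or backslashes would need escaping before being embedded this way.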
Monitoring & Observability
Health Checks
- Liveness probes configured for all services
- Readiness probes for database services
- Custom health endpoints for GraphQL and IPFS
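The probe setup for the GraphQL engine could be sketched as a Kustomize patch like the one below. It assumes Hasura's `/healthz` endpoint on port 8080 and a container named `graphql-engine`; the timing values are illustrative and not taken from the deployed manifests.

```shell
#!/usr/bin/env bash
# Sketch only: renders an example probe patch for the GraphQL Deployment.
set -euo pipefail

patch_file="$(mktemp)"

# Container name and timings are assumptions; tune them for the real workload.
cat > "$patch_file" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: graphql-engine
spec:
  template:
    spec:
      containers:
        - name: graphql-engine
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 5
            failureThreshold: 3
EOF

echo "Probe patch written to $patch_file"
```

A failing liveness probe restarts the container, while a failing readiness probe only removes the pod from service endpoints, so the readiness check can afford to be the stricter of the two.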
Logging
- Structured logging enabled for GraphQL engine
- Query logging for debugging
- WebSocket and HTTP request logging
Security
Network Security
- Private GKE cluster with private nodes
- VPC-native networking
- NAT gateway for outbound internet access
- Ingress controller for external access
Access Control
- Workload Identity for GCP service accounts
- Kubernetes RBAC
- ArgoCD project-based access control
Development
Local Development
# Test Kustomize configurations
cd test-kustomize
kubectl kustomize . | kubectl apply --dry-run=client -f -
# Validate manifests (run from the repository root)
cd ..
kubectl kustomize apps/graphql/ | kubectl apply --dry-run=client -f -
Adding New Services
- Create service directory in apps/
- Add Kubernetes manifests (deployment, service, etc.)
- Create ArgoCD application definition
- Update project permissions if needed
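The first two steps can be sketched as a small scaffolding script. Everything here is hypothetical: the `scaffold_service` helper, the `my-service` name, the image reference, and the port numbers are placeholders, and the existing services in apps/ may follow a different layout. The demo runs against a throwaway directory rather than the repository.

```shell
#!/usr/bin/env bash
# Sketch only: scaffolds apps/<service>/ with a minimal Kustomize layout.
set -euo pipefail

# scaffold_service ROOT NAME IMAGE (hypothetical helper, not part of the repo)
scaffold_service() {
  local root="$1" name="$2" image="$3"
  local dir="$root/apps/$name"
  mkdir -p "$dir"

  cat > "$dir/kustomization.yaml" <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
EOF

  cat > "$dir/deployment.yaml" <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: $name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: $name
  template:
    metadata:
      labels:
        app: $name
    spec:
      containers:
        - name: $name
          image: $image
          ports:
            - containerPort: 8080
EOF

  cat > "$dir/service.yaml" <<EOF
apiVersion: v1
kind: Service
metadata:
  name: $name
spec:
  selector:
    app: $name
  ports:
    - port: 80
      targetPort: 8080
EOF

  # Print the created directory so callers can capture it.
  echo "$dir"
}

# Demo in a temp directory; the service name and image are placeholders.
demo_root="$(mktemp -d)"
service_dir="$(scaffold_service "$demo_root" my-service example.com/my-service:latest)"
ls "$service_dir"
```

The generated directory can then be checked with `kubectl kustomize` before an ArgoCD application definition is pointed at it.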
CI/CD Pipeline
The deployment follows GitOps principles:
- Code changes pushed to Git repository
- ArgoCD detects changes automatically
- Applications updated in Kubernetes cluster
- Health checks validate deployment
Scaling
Horizontal Scaling
- Application services can scale horizontally via HPA
- Database services use StatefulSets for data persistence
- IPFS and GraphQL support multiple replicas
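An HPA for one of the application services might look like the manifest rendered below. It is a sketch that assumes the target is a Deployment named `graphql-engine`; the replica bounds and the 70% CPU target are illustrative values, not settings from this repository.

```shell
#!/usr/bin/env bash
# Sketch only: renders an example HorizontalPodAutoscaler manifest.
set -euo pipefail

hpa_file="$(mktemp)"

# Replica bounds and CPU target are assumptions chosen for illustration.
cat > "$hpa_file" <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: graphql-engine
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: graphql-engine
  minReplicas: 2
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
EOF

echo "HPA manifest written to $hpa_file"
```

Utilization-based scaling is computed relative to the containers' CPU requests, so the target Deployment needs `resources.requests.cpu` set for this to work.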
Vertical Scaling
- Node pools can be resized via Terraform
- Storage volumes support online resizing
- Resource limits configured per service
Troubleshooting
Common Issues
- Database Connection: Check TimescaleDB service and secrets
- IPFS Storage: Verify PVC and storage class
- GraphQL Health: Check liveness probe and database connectivity
- ArgoCD Sync: Verify repository access and permissions
- Consumer Processing: Check event queue connectivity and processing status
- Histocrawler: Verify blockchain endpoint accessibility
- Image Guard: Check image scanning service health
- RPC Proxy: Validate upstream RPC endpoint connectivity
Debug Commands
# Check pod status
kubectl get pods -A
# View logs
kubectl logs -f deployment/graphql-engine
# Check ArgoCD applications
argocd app list
# Check for drift between Terraform state and deployed infrastructure (run from terraform/debug-gke)
terraform plan