Skip to main content

Run on Kubernetes Cluster

A comprehensive Kubernetes-based deployment infrastructure for blockchain indexing and data services, managed with ArgoCD and Terraform.

Architecture Overview​

This project deploys a complete blockchain indexing platform on Google Cloud Platform (GCP) using:

  • GKE Cluster: Multi-node pool Kubernetes cluster
  • ArgoCD: GitOps-based continuous deployment
  • Terraform: Infrastructure as Code for GCP resources
  • Kustomize: Kubernetes manifest management

Core Services​

Data Layer​

  • TimescaleDB: Time-series database with PostgreSQL extensions and AI capabilities
  • Indexer Database: Dedicated database for blockchain indexing operations

Application Services​

  • GraphQL Engine: Hasura GraphQL API for data access
  • IPFS Node: InterPlanetary File System for decentralized storage
  • Safe Content Service: Content validation and processing
  • TimescaleDB Vectorizer Worker: Vector processing for AI/ML workloads
  • Histocrawler: Historical data crawling and indexing service
  • Image Guard: Image validation and security service
  • RPC Proxy: Blockchain RPC request routing and caching

Consumer Services​

  • Decoded Consumer: Blockchain event decoding and processing
  • IPFS Upload Consumer: IPFS content upload and management
  • Resolver Consumer: Data resolution and lookup services

Management Tools​

  • pgAdmin: PostgreSQL administration interface
  • Ingress Controller: Traffic routing and load balancing

Infrastructure Components​

GKE Cluster Suggested Configuration​

  • Region: us-west2
  • Project: be-cluster
  • Network: Custom VPC with private/public subnets
  • Node Pools:
    • db-pool: n2-standard-16 (dedicated for databases)
    • app-pool: e2-standard-2 (application services)
    • consumer-pool: custom-4-8192 (data processing)

Storage​

  • Persistent Volumes: GCP Persistent Disk with resizable storage class
  • IPFS Storage: 50Gi persistent volume for IPFS data
  • Database Storage: 50Gi for TimescaleDB

Project Structure​

gcp-deployment/
β”œβ”€β”€ apps/ # Kubernetes applications
β”‚ β”œβ”€β”€ consumers/ # Data processing consumers
β”‚ β”‚ β”œβ”€β”€ decoded/ # Blockchain event decoder
β”‚ β”‚ β”œβ”€β”€ ipfs-upload/ # IPFS upload processor
β”‚ β”‚ └── resolver/ # Data resolver service
β”‚ β”œβ”€β”€ graphql/ # Hasura GraphQL engine
β”‚ β”œβ”€β”€ histocrawler/ # Historical data crawler
β”‚ β”œβ”€β”€ image-guard/ # Image validation service
β”‚ β”œβ”€β”€ indexer-db/ # Indexer database
β”‚ β”œβ”€β”€ ipfs/ # IPFS node
β”‚ β”œβ”€β”€ pgadmin/ # PostgreSQL admin
β”‚ β”œβ”€β”€ rpc-proxy/ # RPC request proxy
β”‚ β”œβ”€β”€ safe-content/ # Content validation service
β”‚ β”œβ”€β”€ timescale_db/ # TimescaleDB instance
β”‚ β”œβ”€β”€ timescale_db_vectorizer/ # Vector processing
β”‚ └── ingress/ # Ingress configuration
β”œβ”€β”€ argocd/ # ArgoCD configuration
β”‚ β”œβ”€β”€ coreapps/ # Core application definitions
β”‚ β”œβ”€β”€ namespacedapps/ # Namespace-specific apps
β”‚ β”œβ”€β”€ projects/ # ArgoCD project definitions
β”‚ └── repos/ # Repository secrets
β”œβ”€β”€ terraform/ # Infrastructure as Code
β”‚ └── debug-gke/ # GKE cluster provisioning
└── test-kustomize/ # Kustomize testing

Quick Start​

Prerequisites​

  • Google Cloud SDK
  • Terraform >= 1.0
  • kubectl
  • ArgoCD CLI

1. Deploy Infrastructure​

cd terraform/debug-gke
terraform init
terraform plan
terraform apply

2. Configure ArgoCD​

# Get GKE credentials
gcloud container clusters get-credentials debug-cluster --region us-west2

# Install ArgoCD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Apply ArgoCD configuration
kubectl apply -f argocd/

3. Deploy Applications​

Applications are automatically deployed via ArgoCD GitOps. The system monitors the Git repository and applies changes automatically.

Configuration​

Environment Variables​

Key services require environment-specific configuration:

  • GraphQL Engine: Database connection, CORS settings
  • TimescaleDB: PostgreSQL credentials, AI extensions
  • IPFS: Storage paths, network configuration
  • Safe Content: Content validation rules
  • Histocrawler: Blockchain endpoints, indexing parameters
  • Image Guard: Image scanning policies, security rules
  • RPC Proxy: Upstream RPC endpoints, caching configuration
  • Consumers: Event processing queues, database connections

Secrets Management​

Secrets are managed through Kubernetes secrets and external secret providers:

  • Database credentials
  • API keys
  • Service account tokens

Monitoring & Observability​

Health Checks​

  • Liveness probes configured for all services
  • Readiness probes for database services
  • Custom health endpoints for GraphQL and IPFS

Logging​

  • Structured logging enabled for GraphQL engine
  • Query logging for debugging
  • WebSocket and HTTP request logging

Security​

Network Security​

  • Private GKE cluster with private nodes
  • VPC-native networking
  • NAT gateway for outbound internet access
  • Ingress controller for external access

Access Control​

  • Workload Identity for GCP service accounts
  • Kubernetes RBAC
  • ArgoCD project-based access control

Development​

Local Development​

# Test Kustomize configurations
cd test-kustomize
kubectl kustomize . | kubectl apply --dry-run=client

# Validate manifests
kubectl kustomize apps/graphql/ | kubectl apply --dry-run=client

Adding New Services​

  1. Create service directory in apps/
  2. Add Kubernetes manifests (deployment, service, etc.)
  3. Create ArgoCD application definition
  4. Update project permissions if needed

CI/CD Pipeline​

The deployment follows GitOps principles:

  1. Code changes pushed to Git repository
  2. ArgoCD detects changes automatically
  3. Applications updated in Kubernetes cluster
  4. Health checks validate deployment

Scaling​

Horizontal Scaling​

  • Application services can scale horizontally via HPA
  • Database services use StatefulSets for data persistence
  • IPFS and GraphQL support multiple replicas

Vertical Scaling​

  • Node pools can be resized via Terraform
  • Storage volumes support online resizing
  • Resource limits configured per service

Troubleshooting​

Common Issues​

  1. Database Connection: Check TimescaleDB service and secrets
  2. IPFS Storage: Verify PVC and storage class
  3. GraphQL Health: Check liveness probe and database connectivity
  4. ArgoCD Sync: Verify repository access and permissions
  5. Consumer Processing: Check event queue connectivity and processing status
  6. Histocrawler: Verify blockchain endpoint accessibility
  7. Image Guard: Check image scanning service health
  8. RPC Proxy: Validate upstream RPC endpoint connectivity

Debug Commands​

# Check pod status
kubectl get pods -A

# View logs
kubectl logs -f deployment/graphql-engine

# Check ArgoCD applications
argocd app list

# Validate Terraform state
terraform plan

Additional Resources​