Setting up proxy configuration for Vertex AI Python client library

ExcitedGamer85 · August 20, 2025, 11:48pm

I need help setting up proxy settings for the Vertex AI Python client library in my work environment. We have a corporate proxy that all network traffic must go through.

I want to configure the proxy settings just for the Vertex AI client without affecting other applications running on the same system. Is there a way to set proxy configuration per client instance?

Here’s my current code setup:

from google.cloud import aiplatform
from vertexai.generative_models import GenerativeModel

class AIClient:
    
    def create_response(self, auth_credentials, prompt_data):
        aiplatform.init(project="my-project", location="us-central1", credentials=auth_credentials)
        llm_model = GenerativeModel("gemini-pro")
        config = {
            "max_output_tokens": 4096,
            "temperature": 0.8,
            "top_p": 0.1
        }
        return llm_model.generate_content(
            contents=prompt_data,
            generation_config=config,
            stream=True
        )

I tried using environment variables like HTTP_PROXY and HTTPS_PROXY but this affects our entire system and causes problems with other services.

I also attempted to set the proxy variables in code before each API call and then remove them after, but I’m worried this might cause threading issues or other unexpected behavior.

Is there a built-in way to configure proxy settings specifically for the Vertex AI client? Any suggestions would be really helpful!

ClimbingLion · August 29, 2025, 2:28am

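You can pass proxy config straight to the gRPC channel with Vertex AI. gRPC reads the grpc_proxy env variable (it is checked before https_proxy and http_proxy), or you can create a custom channel with proxy settings and hand it to the underlying client's transport; as far as I know aiplatform.init() doesn't take a channel itself. Way cleaner than dealing with the requests transport.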
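A minimal sketch of the custom-channel idea, assuming the "grpc.http_proxy" channel argument from gRPC core and the secure_authorized_channel helper from google-auth. The function names proxy_channel_options and build_proxied_channel, the endpoint, and the proxy URL are made up for illustration. It only builds the channel; wiring it into a Vertex AI client depends on the transport classes of the generated client and is not shown.

```python
def proxy_channel_options(proxy_url):
    """Channel arguments that route one gRPC channel through an HTTP proxy.

    Assumption: "grpc.http_proxy" is the per-channel proxy argument defined by
    gRPC core; it applies to this channel only, unlike environment variables.
    """
    return [("grpc.http_proxy", proxy_url)]


def build_proxied_channel(credentials, proxy_url,
                          target="us-central1-aiplatform.googleapis.com:443"):
    """Create an authorized channel whose API traffic goes through proxy_url."""
    # Imported lazily so the options helper above works without these packages.
    from google.auth.transport import grpc as google_auth_grpc
    from google.auth.transport.requests import Request

    # Request() handles token refresh over HTTP and is not proxied here; pass
    # a requests.Session with proxies to Request if auth traffic needs it too.
    return google_auth_grpc.secure_authorized_channel(
        credentials,
        Request(),
        target,
        options=proxy_channel_options(proxy_url),
    )
```

Because the proxy is a property of the channel, other clients and libraries in the same process are left alone.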

nateharris · August 28, 2025, 9:38am

Best way is to create a custom transport with proxy settings for your Google client. Here’s what I use in production:

import requests
from google.auth.transport.requests import Request
from google.cloud import aiplatform
from vertexai.generative_models import GenerativeModel

class AIClient:
    def __init__(self, proxy_url=None):
        self.proxy_url = proxy_url
        self._setup_transport()
    
    def _setup_transport(self):
        if self.proxy_url:
            session = requests.Session()
            session.proxies = {
                'http': self.proxy_url,
                'https': self.proxy_url
            }
            self.transport = Request(session)
        else:
            self.transport = None
    
    def create_response(self, auth_credentials, prompt_data):
        # Refresh the token through the proxied session (auth traffic only)
        if self.transport and hasattr(auth_credentials, 'refresh'):
            auth_credentials.refresh(self.transport)
        
        aiplatform.init(
            project="my-project", 
            location="us-central1", 
            credentials=auth_credentials
        )
        
        llm_model = GenerativeModel("gemini-pro")
        config = {
            "max_output_tokens": 4096,
            "temperature": 0.8,
            "top_p": 0.1
        }
        return llm_model.generate_content(
            contents=prompt_data,
            generation_config=config,
            stream=True
        )

# Usage
client = AIClient(proxy_url="http://your-proxy:8080")

Proxy only affects auth requests for this specific client. I’ve used this across multiple Google Cloud services and it works great without messing with global environment variables.

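Key part is creating a custom requests session with proxy config and using it for credential refresh. Note that the API calls themselves don't go through this session: generate_content uses the client's own transport (gRPC by default), so that traffic still needs its own proxy route.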

sophialee92 · August 27, 2025, 12:15pm

I hit this same problem at my last company and found a workaround that’s been solid. Instead of messing with the transport layer, I built a context manager that temporarily sets proxy environment variables just for Vertex AI calls, then restores everything after. One caveat: os.environ is shared by the whole process, so this keeps the settings from leaking past the call, but other threads can still see the proxy variables while the context is active.

import os
from contextlib import contextmanager
from google.cloud import aiplatform
from vertexai.generative_models import GenerativeModel

@contextmanager
def proxy_context(http_proxy, https_proxy):
    # Remember the current values (None means the variable was not set)
    original_http = os.environ.get('HTTP_PROXY')
    original_https = os.environ.get('HTTPS_PROXY')

    os.environ['HTTP_PROXY'] = http_proxy
    os.environ['HTTPS_PROXY'] = https_proxy

    try:
        yield
    finally:
        # Compare against None so an empty-string value is restored, not dropped
        if original_http is not None:
            os.environ['HTTP_PROXY'] = original_http
        else:
            os.environ.pop('HTTP_PROXY', None)
        if original_https is not None:
            os.environ['HTTPS_PROXY'] = original_https
        else:
            os.environ.pop('HTTPS_PROXY', None)

class AIClient:
    def __init__(self, proxy_http, proxy_https):
        self.proxy_http = proxy_http
        self.proxy_https = proxy_https
    
    def create_response(self, auth_credentials, prompt_data):
        # Generator: the stream is consumed inside the context, so the proxy
        # variables are still set while the responses are being fetched
        with proxy_context(self.proxy_http, self.proxy_https):
            aiplatform.init(project="my-project", location="us-central1", credentials=auth_credentials)
            llm_model = GenerativeModel("gemini-pro")
            config = {
                "max_output_tokens": 4096,
                "temperature": 0.8,
                "top_p": 0.1
            }
            yield from llm_model.generate_content(
                contents=prompt_data,
                generation_config=config,
                stream=True
            )

Proxy settings only kick in during actual Vertex AI calls and automatically clean up afterwards. Been running this in production for six months without any problems.
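
If several threads call into the context manager at once, one option is to serialize them with a lock. The following is a minimal sketch under that assumption; locked_proxy_env and _PROXY_ENV_LOCK are hypothetical names, not part of any library. It only protects callers that go through this helper: code elsewhere in the process that reads the environment without taking the lock can still see the proxy variables while the block is running.

```python
import os
import threading
from contextlib import contextmanager

# One process-wide lock, because os.environ is shared by every thread.
_PROXY_ENV_LOCK = threading.RLock()


@contextmanager
def locked_proxy_env(http_proxy, https_proxy):
    """Set HTTP_PROXY/HTTPS_PROXY for the duration of the block, under a lock."""
    overrides = {"HTTP_PROXY": http_proxy, "HTTPS_PROXY": https_proxy}
    with _PROXY_ENV_LOCK:
        # Save the previous values; None marks a variable that was not set.
        saved = {name: os.environ.get(name) for name in overrides}
        os.environ.update(overrides)
        try:
            yield
        finally:
            # Restore exactly what was there before, removing what was absent.
            for name, value in saved.items():
                if value is None:
                    os.environ.pop(name, None)
                else:
                    os.environ[name] = value
```

The trade-off is throughput: calls made inside the block run one at a time, so this suits low-volume use better than a busy multi-threaded service.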