Using a non-transparent proxy

Refer to this section if your environment requires all internet traffic to go through an internet proxy. You may use a proxy server to control the connections that are allowed from your VPC or VNet and prevent unattended connections initiated from your environment.

Proxy servers can be used for:
  • FreeIPA backups: Backups created on an hourly basis are uploaded to cloud storage S3/ADLS Gen2.
  • Parcel downloads: Although CDP currently supports only pre-warmed images, parcels must still be downloaded from archive.cloudera.com when an upgrade is performed.
  • Cluster Connectivity Manager (CCM): Communication via CCMv1 and CCMv2.

Because this section addresses an environment with no internet access, a proxy server is used here only when CCM is in use.

The following CDP services are supported by this feature:

CDP service AWS Azure GCP
Data Lake GA GA GA
FreeIPA GA GA GA
Data Engineering GA
Data Hub GA GA GA
Data Warehouse GA
DataFlow GA
Machine Learning GA
Operational Database

Note that to use a non-transparent proxy with CDP data services (such as Data Engineering, Data Warehouse, DataFlow, and Machine Learning), you must first configure it at the environment level and then configure it again when enabling or activating the data service.

Setting up a non-transparent proxy in CDP

To set up a proxy server, register an HTTP proxy server as a shared resource, and then add that shared resource when you set up your environment.

Required role: EnvironmentCreator can register a proxy in CDP. Owner or SharedResourceUser can view the proxy details. Owner can delete the proxy registration from CDP.

Steps

  1. Log in to the CDP web interface.

  2. Navigate to the Management Console.
  3. Select Shared Resources > Proxies from the left navigation pane.

  4. Click Create Proxy Configuration.

  5. Enter the information for your proxy server:
    • Name (Required): Provide a name for the proxy. CDP uses this name to refer to this specific proxy.
    • Description: Optionally, provide a longer description for this proxy.
    • Protocol (Required): Select the protocol used by the proxy: HTTP or HTTPS.
    • Server Host (Required): Provide the proxy server's host.
    • Server Port (Required): Provide the proxy server's port.
    • No Proxy Hosts: Designate specific IP addresses, domains, or subdomains that bypass the proxy. This setting can be useful for locally resolvable and internal endpoints, for example the CCMv2 agent or the metering agent.

      Enter the values for this field as a comma-separated list. For example: 172.100.0.110,domainname.com,my.host.com

      Note the following guidelines:
        • The period character (".") is allowed as a prefix for domain names only.
        • CIDR notation is not allowed.
    • User name: If needed, provide a user name to access the proxy.
    • Password: If needed, provide a password to access the proxy.
  6. Click REGISTER.
  7. Click Environments in the left navigation pane, then click Register Environment.

  8. Add your environment information, navigating through the Register Environment and Data Lake Scaling steps.

  9. When you reach the Region, Networking and Security steps, choose the Proxy you registered.

  10. Finish setting up your Environment.
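
The No Proxy Hosts guidelines in step 5 can be checked before you fill in the form. The following is a minimal Bash sketch, not part of CDP: the function name validate_no_proxy is hypothetical, and treating an empty entry as an error is an assumption rather than a documented CDP rule. It covers only the two guidelines listed in step 5 and makes no claim about how CDP itself validates the field.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the CDP CLI): checks a comma-separated
# No Proxy Hosts value against the two guidelines listed in step 5.
validate_no_proxy() {
  local list="$1" entry
  local -a entries
  # Matches a leading period followed by something shaped like an IPv4 address.
  local dotted_ip_re='^\.[0-9]+(\.[0-9]+){3}$'

  # Split the value on commas into an array.
  IFS=',' read -r -a entries <<< "$list"

  for entry in "${entries[@]}"; do
    case "$entry" in
      "")
        # Assumption: an empty entry (for example "a,,b") is a typo.
        echo "empty entry in list" >&2
        return 1 ;;
      */*)
        echo "CIDR notation is not allowed: $entry" >&2
        return 1 ;;
    esac
    # The period prefix is allowed for domain names only.
    if [[ $entry =~ $dotted_ip_re ]]; then
      echo "period prefix is allowed for domain names only: $entry" >&2
      return 1
    fi
  done
  return 0
}
```

For example, validate_no_proxy "172.100.0.110,domainname.com,my.host.com" returns success, while a value such as 10.0.0.0/16 is rejected because it uses CIDR notation.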

As an alternative to using the UI, you can register the proxy using the CLI.
  1. Use the following command:

    cdp environments create-proxy-config \
      --proxy-config-name companyProxy \
      --host 10.102.0.19 \
      --port 3128 \
      --user squid \
      --password squid \
      --protocol http
  2. Provide the proxyConfigName in the environment JSON:

      ...
      "subnetIds": [
        "subnet-1",
        "subnet-2",
        "subnet-3"
      ],
      "proxyConfigName": "companyProxy"
    }
  3. Alternatively, pass the proxy name with the --proxy-config-name argument of the environment creation command:

    AWS:

    cdp environments create-aws-environment \
      --cli-input-json '{...}' \
      --proxy-config-name companyProxy
    Azure:
    cdp environments create-azure-environment \
      --cli-input-json '{...}' \
      --proxy-config-name companyProxy
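
Before relying on the registered proxy, it can be useful to confirm from a host inside the VPC or VNet that the proxy actually forwards traffic. The sketch below assumes curl is installed and reuses the example values from the create-proxy-config command above. The function names proxy_url and check_proxy are hypothetical helpers, not CDP commands, and archive.cloudera.com is used as the target only because it is the parcel source mentioned earlier in this section.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for a manual connectivity check; not part of the CDP CLI.

# Build the proxy URL from the same protocol, host, and port values
# that were passed to create-proxy-config.
proxy_url() {
  local protocol="$1" host="$2" port="$3"
  printf '%s://%s:%s\n' "$protocol" "$host" "$port"
}

# Send a HEAD request to the target through the proxy and print the
# HTTP status code. User name and password are optional; pass them
# only if the proxy requires authentication.
check_proxy() {
  local url="$1" target="$2" user="${3:-}" password="${4:-}"
  local -a auth=()
  if [ -n "$user" ]; then
    auth=(--proxy-user "$user:$password")
  fi
  curl --silent --head --max-time 15 --output /dev/null \
       --write-out '%{http_code}\n' \
       --proxy "$url" "${auth[@]}" "$target"
}
```

A possible invocation with the example values is check_proxy "$(proxy_url http 10.102.0.19 3128)" https://archive.cloudera.com squid squid. A non-zero exit status from curl suggests that the proxy host, port, or credentials need another look before the environment is registered.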