Anthos Service Mesh Workshop: Lab Guide

1. ALPHA WORKSHOP

Link to the workshop codelab: bit.ly/asm-workshop

2. Overview

Architecture Diagram

This hands-on workshop shows how to set up globally distributed services on GCP in production. The main technologies used are Google Kubernetes Engine (GKE) for compute and the Istio service mesh for secure connectivity, observability, and advanced traffic shaping. All the practices and tools used in this workshop are the ones you would use in production.

Agenda

Module 0 - Introduction and Platform Setup
Introduction and architecture
Introduction to Service Mesh and Istio/ASM
Lab: Infrastructure Setup: User Workflow
Break
Q&A
Module 1 - Install, secure and monitor applications with ASM
Repo model: Infrastructure and Kubernetes repos explained
Lab: Deploy sample application
Distributed Services and Observability
Lunch
Lab: Observability with Stackdriver
Q&A
Module 2 - DevOps - Canary rollouts, policy/RBAC
Multi-cluster service discovery and security/policy
Lab: Mutual TLS
Canary Deployments
Lab: Canary Deployments
Secure multi-cluster global load balancing
Break
Lab: Authorization Policy
Q&A
Module 3 - Infra Ops - Platform upgrades
Distributed Services building blocks
Lab: Infrastructure Scaling
Next steps

Slides

The slides for this workshop can be found at the following link:

ASM Workshop Slides

Prerequisites

The following are required before you proceed with this workshop:

A GCP Organization node
A billing account ID (your user must be Billing Admin on this billing account)
The Organization Administrator IAM role at the org level for your user

3. Infrastructure Setup - Admin Workflow

Bootstrap Workshop Script Explained

A script called bootstrap_workshop.sh sets up the initial environment for the workshop. You can use this script to set up a single environment for yourself, or multiple environments for multiple users if you are running this workshop as training for several people.

The bootstrap workshop script requires the following inputs:

Organization name (for example yourcompany.com) - The organization in which you create environments for the workshop.
Billing ID (for example 12345-12345-12345) - The billing ID used to bill all resources used during the workshop.
Workshop number (for example 01) - A two-digit number. It is used when you run multiple workshops in one day and want to keep track of them separately. Workshop numbers are also used to derive project IDs, and separate workshop numbers make it easier to get unique project IDs each time. In addition to the workshop number, the current date (formatted as YYMMDD) is also used for project IDs. The combination of date and workshop number provides unique project IDs.
Start user number (for example 1) - The number of the first user in the workshop. For example, to create a workshop for 10 users, set a start user number of 1 and an end user number of 10.
End user number (for example 10) - The number of the last user in the workshop. To set up a single environment (for example, for yourself), make the start and end user numbers the same.

Admin GCS bucket (for example my-gcs-bucket-name) - A GCS bucket used to store workshop-related information. The cleanup_workshop.sh script uses this information to gracefully delete all the resources created by the bootstrap workshop script. Admins who create workshops must have read/write permissions to this bucket.
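
As a rough sketch of how the date and the workshop number combine, the snippet below builds an identifier from a YYMMDD date and a two-digit workshop number, mirroring the `date '+%y%m%d'` expression used later in this guide. The function name workshop_id is a hypothetical helper and is not part of the workshop scripts, which may build their IDs differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compose a workshop ID from a date (YYMMDD) and a
# two-digit workshop number. Illustration only; not taken from the repo.
workshop_id() {
  local workshop_number="$1"
  local date_part="${2:-$(date '+%y%m%d')}"   # default: today's date
  printf '%s-%s\n' "${date_part}" "${workshop_number}"
}

workshop_id 01 200131   # prints 200131-01
```

Two workshops run on the same day with different numbers (for example 01 and 02) therefore get different identifiers.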

The bootstrap workshop script takes the values above and acts as a wrapper that calls the setup-terraform-admin-project.sh script. The setup-terraform-admin-project.sh script creates the workshop environment for a single user.

Admin permissions required for bootstrapping the workshop

There are two types of users in this workshop. ADMIN_USER creates and deletes resources for the workshop. MY_USER performs the steps in the workshop and has access only to their own resources. ADMIN_USER has access to all user setups. If you are creating this setup for yourself, ADMIN_USER and MY_USER are the same. If you are an instructor creating this workshop for multiple students, ADMIN_USER and MY_USER are different.

The ADMIN_USER requires the following organization-level permissions:

Owner - Project Owner permission on all projects in the organization.
Folder Admin - Ability to create and delete folders in the organization. Every user gets a single folder that holds all of their resources.
Organization Administrator
Project Creator - Ability to create projects in the organization.
Project Deleter - Ability to delete projects in the organization.
Project IAM Admin - Ability to create IAM rules in all projects in the organization.

In addition, ADMIN_USER must also be the Billing Administrator for the billing ID used for the workshop.

User schema and permissions for the workshop

If you plan to create this workshop for users in your organization other than yourself, you must follow a specific user naming scheme for the MY_USER accounts. In the bootstrap_workshop.sh script, you provide a start and an end user number. These numbers are used to create the following user names:

user<3 digit user number>@<organization_name>

For example, if you run the bootstrap workshop script with a start user number of 1 and an end user number of 3 in an organization called yourcompany.com, workshop environments are created for the following users:

user001@yourcompany.com
user002@yourcompany.com
user003@yourcompany.com
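
The naming scheme can be sketched as a small loop that zero-pads each user number to three digits. The function name list_workshop_users is a hypothetical helper for illustration; it only prints names and does not create any accounts.

```shell
#!/usr/bin/env bash
# Sketch: print the lab user names for a range of user numbers.
# list_workshop_users is a hypothetical helper, not part of the workshop repo.
list_workshop_users() {
  local org_name="$1" n="$2" end="$3"
  while [ "${n}" -le "${end}" ]; do
    # %03d zero-pads the user number to three digits (1 -> 001)
    printf 'user%03d@%s\n' "${n}" "${org_name}"
    n=$((n + 1))
  done
}

list_workshop_users yourcompany.com 1 3
```

Run with the values from the example above, this prints the same three user names.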

These users are assigned the project Owner role for the specific projects created by the setup-terraform-admin-project.sh script. You must adhere to this user naming scheme when using the bootstrap script. See how to add multiple users at once in GSuite.

Tools required for the workshop

This workshop is intended to be bootstrapped from Cloud Shell. The following tools are required.

gcloud (version >= 270)
kubectl
sed (works with sed on Cloud Shell/Linux, not on Mac OS)
git (make sure you are up to date)
sudo apt update
sudo apt install git
jq
envsubst
kustomize
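
One way to confirm the tools are available before starting is a quick check with `command -v`. This is a minimal sketch; missing_tools is a hypothetical helper name, and it only checks that each tool is on the PATH, not its version.

```shell
#!/usr/bin/env bash
# Sketch: print the name of every listed tool that is not found on PATH.
# missing_tools is a hypothetical helper, not part of the workshop repo.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "${tool}" >/dev/null 2>&1 || echo "${tool}"
  done
}

missing_tools gcloud kubectl sed git jq envsubst kustomize
```

An empty result means every listed tool was found.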

Set up the workshop for yourself (single-user setup)

Open Cloud Shell and perform all of the actions below in Cloud Shell. Click the link below.

CLOUD SHELL

Verify that you are logged in to gcloud as the intended admin user.

gcloud config list

Create a WORKDIR and clone the workshop repo.

mkdir asm-workshop
cd asm-workshop
export WORKDIR=`pwd`
git clone https://github.com/GoogleCloudPlatform/anthos-service-mesh-workshop.git asm

Define your organization name, billing ID, workshop number, and the admin GCS bucket to use for the workshop. Review the permissions required for setting up the workshop in the sections above.

gcloud organizations list
export ORGANIZATION_NAME=<ORGANIZATION NAME>

gcloud beta billing accounts list
export ADMIN_BILLING_ID=<ADMIN_BILLING ID>

export WORKSHOP_NUMBER=<two digit number for example 01>

export ADMIN_STORAGE_BUCKET=<ADMIN CLOUD STORAGE BUCKET>

Run the bootstrap_workshop.sh script. This script can take a few minutes to complete.

cd asm
./scripts/bootstrap_workshop.sh --org-name ${ORGANIZATION_NAME} --billing-id ${ADMIN_BILLING_ID} --workshop-num ${WORKSHOP_NUMBER} --admin-gcs-bucket ${ADMIN_STORAGE_BUCKET} --set-up-for-admin

After the bootstrap_workshop.sh script completes, a GCP folder is created for each user within the organization. Within the folder, a Terraform admin project is created. The Terraform admin project is used to create the remaining GCP resources required for this workshop. The required APIs are enabled in the Terraform admin project, and Cloud Build is used to apply the Terraform plans. The Cloud Build service account is given the IAM roles it needs to create resources on GCP. Finally, a remote backend is configured in a Google Cloud Storage (GCS) bucket to store the Terraform state for all GCP resources.

To view the Cloud Build tasks in the Terraform admin project, you need the Terraform admin project ID. It is stored in the vars/vars.sh file under your asm directory. This directory is persisted only if you are setting up the workshop for yourself as an administrator.

Source the variables file to set the environment variables.

echo "export WORKDIR=$WORKDIR" >> $WORKDIR/asm/vars/vars.sh
source $WORKDIR/asm/vars/vars.sh

Set up the workshop for multiple users (multi-user setup)

Open Cloud Shell and perform all of the actions below in Cloud Shell. Click the link below.

CLOUD SHELL

Verify that you are logged in to gcloud as the intended admin user.

gcloud config list

Create a WORKDIR and clone the workshop repo.

mkdir asm-workshop
cd asm-workshop
export WORKDIR=`pwd`
git clone https://github.com/GoogleCloudPlatform/anthos-service-mesh-workshop.git asm

Define your organization name, billing ID, workshop number, start and end user numbers, and the admin GCS bucket to use for the workshop. Review the permissions required for setting up the workshop in the sections above.

gcloud organizations list
export ORGANIZATION_NAME=<ORGANIZATION NAME>

gcloud beta billing accounts list
export ADMIN_BILLING_ID=<BILLING ID>

export WORKSHOP_NUMBER=<two digit number for example 01>

export START_USER_NUMBER=<number for example 1>

export END_USER_NUMBER=<number greater or equal to START_USER_NUM>

export ADMIN_STORAGE_BUCKET=<ADMIN CLOUD STORAGE BUCKET>

Run the bootstrap_workshop.sh script. This script can take a few minutes to complete.

cd asm
./scripts/bootstrap_workshop.sh --org-name ${ORGANIZATION_NAME} --billing-id ${ADMIN_BILLING_ID} --workshop-num ${WORKSHOP_NUMBER} --start-user-num ${START_USER_NUMBER} --end-user-num ${END_USER_NUMBER} --admin-gcs-bucket ${ADMIN_STORAGE_BUCKET}

Get the workshop.txt file from the admin GCS bucket to retrieve the Terraform admin project IDs.

export WORKSHOP_ID="$(date '+%y%m%d')-${WORKSHOP_NUMBER}"
gsutil cp gs://${ADMIN_STORAGE_BUCKET}/${ORGANIZATION_NAME}/${WORKSHOP_ID}/workshop.txt .

4. Lab Setup and Preparation

Choose your lab path

The labs in this workshop can be performed in one of two ways:

The "easy fast track interactive scripts" way
The "manual copy and paste each instruction" way

The fast track scripts method lets you run a single interactive script for each lab, which walks you through the lab by automatically running its commands. The commands run in batches, with a concise explanation of each step and what it accomplishes. After each batch, you are prompted to move on to the next one, so you can run the labs at your own pace. The fast track scripts are idempotent, meaning you can run them multiple times and get the same result.

The fast track scripts appear at the top of every lab in a green box.

The copy-and-paste method is the traditional way of copying and pasting individual command blocks, with explanations of the commands. This method is intended to be run only once. There is no guarantee that rerunning commands in this method gives the same result.

When performing the labs, please pick one of the two methods.
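
To make the difference concrete, here is a minimal sketch of an idempotent step: it appends a line to a file only if the line is not already present, so a second run changes nothing. The helper name append_once and the temporary file are illustrative assumptions; the fast track scripts may achieve idempotency differently.

```shell
#!/usr/bin/env bash
# Sketch of an idempotent step: append a line to a file only if it is not
# already there, so running it twice leaves the file unchanged.
# append_once is a hypothetical helper, not part of the workshop scripts.
append_once() {
  local line="$1" file="$2"
  touch "${file}"
  # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
  grep -qxF -- "${line}" "${file}" || echo "${line}" >> "${file}"
}

demo_file="$(mktemp)"
append_once 'export PATH=$PATH:$HOME/bin' "${demo_file}"
append_once 'export PATH=$PATH:$HOME/bin' "${demo_file}"   # no-op on the second run
wc -l < "${demo_file}"   # the file still holds a single line
```

By contrast, a plain `echo ... >> file` adds a duplicate line on every run, which is why rerunning copy-and-paste commands is not guaranteed to give the same result.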

Fast Track Script Setup

Get User info

This workshop is performed using a temporary user account (or lab account) created by the workshop administrator. The lab account owns all projects in the workshop. The workshop administrator provides the lab account credentials (username and password) to the user performing the workshop. All of the user's projects are prefixed with the username of the lab account; for example, for lab account user001@yourcompany.com, the Terraform admin project ID would be user001-200131-01-tf-abcde, and so on for the rest of the projects. Each user must log in with the lab account provided by the workshop administrator and perform the workshop using that account.

Open Cloud Shell by clicking the link below.

CLOUD SHELL

Log in with the lab account credentials (do not log in with your corporate or personal account). The lab account looks like userXYZ@<workshop_domain>.com.
Since this is a new account, you are prompted to accept the Google Terms of Service. Click Accept.

On the next screen, select the checkbox to agree to the Google Terms of Service and click Start Cloud Shell.

This step provisions a small Linux Debian VM for you to use to access GCP resources. Each account gets a Cloud Shell VM. Logging in with the lab account provisions the VM and logs you in using the lab account credentials. In addition to Cloud Shell, a Code Editor is also provisioned, which makes it easier to edit configuration files (Terraform, YAML, and so on). By default, the Cloud Shell screen is split into the Cloud Shell environment (at the bottom) and the Cloud Code Editor (at the top). The pencil and shell prompt icons in the top right corner let you toggle between the two (shell and code editor). You can also drag the middle separator bar (up or down) to change the size of each window manually.

Create a WORKDIR for this workshop. The WORKDIR is a folder from which you perform all the labs for this workshop. Run the following commands in Cloud Shell to create the WORKDIR.

mkdir -p ${HOME}/asm-workshop
cd ${HOME}/asm-workshop
export WORKDIR=`pwd`

Save the lab account username as a variable for use in this workshop. This is the same account you logged in to Cloud Shell with.

export MY_USER=<LAB ACCOUNT EMAIL PROVIDED BY THE WORKSHOP ADMIN>
# For example export MY_USER=user001@gcpworkshops.com

Run the following commands to verify that the WORKDIR and MY_USER variables are set correctly.

echo "WORKDIR set to ${WORKDIR}" && echo "MY_USER set to ${MY_USER}"

Clone the workshop repo.

git clone https://github.com/GoogleCloudPlatform/anthos-service-mesh-workshop.git ${WORKDIR}/asm

5. Infrastructure Setup - User Workflow

Objective: Verify the infrastructure and the Istio installation.

Install workshop tools
Clone the workshop repo
Verify the Infrastructure install
Verify the k8s-repo install
Verify the Istio installation

Copy-and-Paste Method Lab Instructions

Get User info

The administrator setting up the workshop needs to provide the username and password to the user. All of the user's projects are prefixed with the username; for example, for user user001@yourcompany.com, the Terraform admin project ID would be user001-200131-01-tf-abcde, and so on for the rest of the projects. Each user has access only to their own workshop environment.

Tools required for the workshop

gcloud (version >= 270)
kubectl
sed (works with sed on Cloud Shell/Linux, not on Mac OS)
git (make sure you are up to date)
sudo apt update
sudo apt install git
jq
envsubst
kustomize
pv

Access the Terraform admin project

After the bootstrap_workshop.sh script completes, a GCP folder is created for each user within the organization. Within the folder, a Terraform admin project is created. The Terraform admin project is used to create the remaining GCP resources required for this workshop. The setup-terraform-admin-project.sh script enables the required APIs in the Terraform admin project. Cloud Build is used to apply the Terraform plans. Through the script, the Cloud Build service account is given the IAM roles it needs to create resources on GCP. Finally, a remote backend is configured in a Google Cloud Storage (GCS) bucket to store the Terraform state for all GCP resources.

To view the Cloud Build tasks in the Terraform admin project, you need the Terraform admin project ID. It is stored in the admin GCS bucket that was specified in the bootstrap script. If you run the bootstrap script for multiple users, all of the Terraform admin project IDs are in the GCS bucket.

Open Cloud Shell (if it is not already open from the Lab Setup and Preparation section) by clicking the link below.

CLOUD SHELL

Install kustomize (if it is not already installed) in the $HOME/bin folder and add the $HOME/bin folder to $PATH.

mkdir -p $HOME/bin
cd $HOME/bin
curl -s "https://raw.githubusercontent.com/\
kubernetes-sigs/kustomize/master/hack/install_kustomize.sh"  | bash
cd $HOME
export PATH=$PATH:${HOME}/bin
echo 'export PATH=$PATH:$HOME/bin' >> $HOME/.bashrc

Install pv and move it to $HOME/bin/pv.

sudo apt-get update && sudo apt-get -y install pv
sudo mv /usr/bin/pv ${HOME}/bin/pv

Update your bash prompt.

cp $WORKDIR/asm/scripts/krompt.bash $HOME/.krompt.bash
echo "export PATH=\$PATH:\$HOME/bin" >> $HOME/.asm-workshop.bash
echo "source $HOME/.krompt.bash" >> $HOME/.asm-workshop.bash

echo "alias asm-init='source $HOME/.asm-workshop.bash'" >> $HOME/.bashrc
echo "source $HOME/.asm-workshop.bash" >> $HOME/.bashrc
source $HOME/.bashrc

Verify that you are logged in to gcloud with the intended user account.

echo "Check logged in user output from the next command is $MY_USER"
gcloud config list account --format=json | jq -r .core.account

Get your Terraform admin project ID by running the following command:

export TF_ADMIN=$(gcloud projects list | grep tf- | awk '{ print $1 }')
echo $TF_ADMIN
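
The grep above assumes exactly one project matches. A minimal sketch of a stricter variant is shown below: it reads project IDs on stdin and fails unless exactly one contains "-tf-". The helper name pick_tf_admin and the sample project IDs are assumptions for illustration and are not part of the workshop scripts.

```shell
#!/usr/bin/env bash
# Sketch: pick the Terraform admin project from a list of project IDs read on
# stdin, and fail unless exactly one ID contains "-tf-".
# pick_tf_admin is a hypothetical helper, not part of the workshop scripts.
pick_tf_admin() {
  local matches count
  matches="$(grep -- '-tf-' || true)"
  count="$(printf '%s' "${matches}" | grep -c . || true)"   # non-empty lines
  if [ "${count}" -ne 1 ]; then
    echo "expected exactly one Terraform admin project, found ${count}" >&2
    return 1
  fi
  echo "${matches}"
}

# Example with made-up project IDs; in the lab the input could come from
# something like: gcloud projects list --format="value(projectId)"
printf '%s\n' user001-200131-01-tf-abcde user001-200131-01-ops-abcde | pick_tf_admin
```

With the sample input this prints user001-200131-01-tf-abcde. If the account can see zero or several matching projects, the function returns a non-zero status instead of silently setting a wrong value.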

All resources associated with the workshop are stored as variables in a vars.sh file kept in a GCS bucket in the Terraform admin project. Get the vars.sh file for your Terraform admin project.

mkdir -p $WORKDIR/asm/vars
gsutil cp gs://$TF_ADMIN/vars/vars.sh $WORKDIR/asm/vars/vars.sh
echo "export WORKDIR=$WORKDIR" >> $WORKDIR/asm/vars/vars.sh

Click the displayed link to open the Cloud Build page for the Terraform admin project and verify that the build completed successfully.

source $WORKDIR/asm/vars/vars.sh
echo "https://console.cloud.google.com/cloud-build/builds?project=${TF_ADMIN}"

If you are accessing the Cloud Console for the first time, agree to the Google Terms of Service.

Now that you are on the Cloud Build page, click the History link in the left-hand navigation and click the latest build to view the details of the initial Terraform apply. The following resources are created as part of the Terraform script. You can also refer to the architecture diagram above.

4 GCP projects in the organization. The provided billing account is associated with each project.
One project is the network host project for the shared VPC. No other resources are created in this project.
One project is the ops project used for the GKE clusters that run the Istio control plane.
Two projects represent two different development teams working on their respective services.
Two GKE clusters are created in each of the three ops, dev1, and dev2 projects.
A CSR repo named k8s-repo is created, containing six folders for Kubernetes manifest files, one folder per GKE cluster. This repo is used to deploy Kubernetes manifests to the clusters in a GitOps fashion.
A Cloud Build trigger is created so that any commit to the master branch of k8s-repo deploys the Kubernetes manifests to the GKE clusters from their respective folders.

Once the build completes in the Terraform admin project, another build starts in the ops project. Click the displayed link to open the Cloud Build page for the ops project and verify that the k8s-repo Cloud Build finished successfully.

echo "https://console.cloud.google.com/cloud-build/builds?project=${TF_VAR_ops_project_name}"

Verify installation

Create kubeconfig files for all clusters. Run the following script.

$WORKDIR/asm/scripts/setup-gke-vars-kubeconfig.sh

This script creates a new kubeconfig file called kubemesh in the gke folder.

Change the KUBECONFIG variable to point to the new kubeconfig file.

source $WORKDIR/asm/vars/vars.sh
export KUBECONFIG=$WORKDIR/asm/gke/kubemesh

Add the vars.sh and KUBECONFIG settings to the .bashrc in Cloud Shell so that they are sourced every time Cloud Shell restarts.

echo "source ${WORKDIR}/asm/vars/vars.sh" >> $HOME/.bashrc
echo "export KUBECONFIG=${WORKDIR}/asm/gke/kubemesh" >> $HOME/.bashrc

List your cluster contexts. You should see six clusters.

kubectl config view -ojson | jq -r '.clusters[].name'

    `Output (do not copy)`

gke_tf05-01-ops_us-central1_gke-asm-2-r2-prod
gke_tf05-01-ops_us-west1_gke-asm-1-r1-prod
gke_tf05-02-dev1_us-west1-a_gke-1-apps-r1a-prod
gke_tf05-02-dev1_us-west1-b_gke-2-apps-r1b-prod
gke_tf05-03-dev2_us-central1-a_gke-3-apps-r2a-prod
gke_tf05-03-dev2_us-central1-b_gke-4-apps-r2b-prod

Verify Istio Installation

Ensure Istio is installed on both ops clusters by checking that all pods are running and all jobs have completed.

kubectl --context ${OPS_GKE_1} get pods -n istio-system
kubectl --context ${OPS_GKE_2} get pods -n istio-system

    `Output (do not copy)`

NAME                                      READY   STATUS    RESTARTS   AGE
grafana-5f798469fd-z9f98                  1/1     Running   0          6m21s
istio-citadel-568747d88-qdw64             1/1     Running   0          6m26s
istio-egressgateway-8f454cf58-ckw7n       1/1     Running   0          6m25s
istio-galley-6b9495645d-m996v             2/2     Running   0          6m25s
istio-ingressgateway-5df799fdbd-8nqhj     1/1     Running   0          2m57s
istio-pilot-67fd786f65-nwmcb              2/2     Running   0          6m24s
istio-policy-74cf89cb66-4wrpl             2/2     Running   1          6m25s
istio-sidecar-injector-759bf6b4bc-mw4vf   1/1     Running   0          6m25s
istio-telemetry-77b6dfb4ff-zqxzz          2/2     Running   1          6m24s
istio-tracing-cd67ddf8-n4d7k              1/1     Running   0          6m25s
istiocoredns-5f7546c6f4-g7b5c             2/2     Running   0          6m39s
kiali-7964898d8c-5twln                    1/1     Running   0          6m23s
prometheus-586d4445c7-xhn8d               1/1     Running   0          6m25s

    `Output (do not copy)`

NAME                                      READY   STATUS    RESTARTS   AGE
grafana-5f798469fd-2s8k4                  1/1     Running   0          59m
istio-citadel-568747d88-87kdj             1/1     Running   0          59m
istio-egressgateway-8f454cf58-zj9fs       1/1     Running   0          60m
istio-galley-6b9495645d-qfdr6             2/2     Running   0          59m
istio-ingressgateway-5df799fdbd-2c9rc     1/1     Running   0          60m
istio-pilot-67fd786f65-nzhx4              2/2     Running   0          59m
istio-policy-74cf89cb66-4bc7f             2/2     Running   3          59m
istio-sidecar-injector-759bf6b4bc-grk24   1/1     Running   0          59m
istio-telemetry-77b6dfb4ff-6zr94          2/2     Running   4          60m
istio-tracing-cd67ddf8-grs9g              1/1     Running   0          60m
istiocoredns-5f7546c6f4-gxd66             2/2     Running   0          60m
kiali-7964898d8c-nhn52                    1/1     Running   0          59m
prometheus-586d4445c7-xr44v               1/1     Running   0          59m
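
Scanning the tables by eye works for a handful of pods. As a sketch of automating the check, the filter below reads `kubectl get pods` table output on stdin and prints only the pods that are not healthy. The helper name unhealthy_pods is hypothetical, and the CrashLoopBackOff row in the sample input is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: read `kubectl get pods` table output on stdin and list pods whose
# STATUS is not Running (Completed pods are skipped) or whose READY count is
# incomplete. unhealthy_pods is a hypothetical helper.
unhealthy_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")              # READY column, for example 2/2
    if ($3 == "Completed") next        # finished jobs are fine
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# In the lab: kubectl --context ${OPS_GKE_1} get pods -n istio-system | unhealthy_pods
unhealthy_pods <<'EOF'
NAME                            READY   STATUS             RESTARTS   AGE
istio-pilot-67fd786f65-nwmcb    2/2     Running            0          6m24s
istio-policy-74cf89cb66-4wrpl   1/2     CrashLoopBackOff   1          6m25s
EOF
```

With the sample input this prints istio-policy-74cf89cb66-4wrpl. An empty result means every pod in the table is running with all of its containers ready.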

Ensure Istio is installed on both dev1 clusters. Only Citadel, sidecar-injector, and coredns run in the dev1 clusters. They share the Istio control plane running in the ops-1 cluster.

kubectl --context ${DEV1_GKE_1} get pods -n istio-system
kubectl --context ${DEV1_GKE_2} get pods -n istio-system

Ensure Istio is installed on both dev2 clusters. Only Citadel, sidecar-injector, and coredns run in the dev2 clusters. They share the Istio control plane running in the ops-2 cluster.

kubectl --context ${DEV2_GKE_1} get pods -n istio-system
kubectl --context ${DEV2_GKE_2} get pods -n istio-system

    `Output (do not copy)`

NAME                                      READY   STATUS    RESTARTS   AGE
istio-citadel-568747d88-4lj9b             1/1     Running   0          66s
istio-sidecar-injector-759bf6b4bc-ks5br   1/1     Running   0          66s
istiocoredns-5f7546c6f4-qbsqm             2/2     Running   0          78s

Verify service discovery for shared control planes

Optionally, verify that the secrets are deployed.

kubectl --context ${OPS_GKE_1} get secrets -l istio/multiCluster=true -n istio-system
kubectl --context ${OPS_GKE_2} get secrets -l istio/multiCluster=true -n istio-system

    `Output (do not copy)`

For OPS_GKE_1:
NAME                  TYPE     DATA   AGE
gke-1-apps-r1a-prod   Opaque   1      8m7s
gke-2-apps-r1b-prod   Opaque   1      8m7s
gke-3-apps-r2a-prod   Opaque   1      44s
gke-4-apps-r2b-prod   Opaque   1      43s

For OPS_GKE_2:
NAME                  TYPE     DATA   AGE
gke-1-apps-r1a-prod   Opaque   1      40s
gke-2-apps-r1b-prod   Opaque   1      40s
gke-3-apps-r2a-prod   Opaque   1      8m4s
gke-4-apps-r2b-prod   Opaque   1      8m4s

In this workshop, you use a single shared VPC in which all GKE clusters are created. To discover services across clusters, you use kubeconfig files (one for each application cluster) created as secrets in the ops clusters. Pilot uses these secrets to discover services by querying the Kube API server of the application clusters (authenticated via the secrets above). Both ops clusters can authenticate to all app clusters using the kubeconfig-created secrets, so the ops clusters can discover services automatically. This requires that Pilot in the ops clusters has access to the Kube API server of all other clusters. If Pilot cannot reach the Kube API servers, you must manually add remote services as ServiceEntries. You can think of ServiceEntries as DNS entries in your service registry: each ServiceEntry defines a service using a fully qualified DNS name (FQDN) and an IP address at which it can be reached. See the Istio Multicluster documentation for more information.
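
For the manual fallback, a ServiceEntry might look like the sketch below, which writes a manifest to a temporary file. The service name, host name, port, and IP address are made-up placeholders, and this manifest is not part of the workshop, which relies on the kubeconfig secrets instead.

```shell
#!/usr/bin/env bash
# Sketch: write a ServiceEntry manifest for a remote service.
# The name, host, port, and IP address below are made-up placeholders.
manifest="$(mktemp)"
cat > "${manifest}" <<'EOF'
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: remote-frontend
spec:
  hosts:
  - frontend.remote.example.internal   # FQDN used to address the service
  location: MESH_INTERNAL
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: STATIC
  endpoints:
  - address: 10.0.0.10                 # IP address where the service is reachable
EOF
echo "wrote ${manifest}"
```

The hosts entry carries the FQDN and the endpoints entry carries the reachable IP address, which are the two pieces of information described above.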

6. Объяснение репозитория инфраструктуры

Создание облачной инфраструктуры

Ресурсы GCP для семинара создаются с помощью Cloud Build и репозитория CSR infrastructure . Вы только что запустили скрипт начальной загрузки (расположенный по адресу scripts/bootstrap_workshop.sh ) из локального терминала. Скрипт начальной загрузки создает папку GCP, проект администрирования Terraform и соответствующие разрешения IAM для учетной записи службы Cloud Build . Проект администрирования Terraform используется для хранения состояний Terraform, логов и различных скриптов. Он содержит репозитории CSR infrastructure и k8s_repo . Эти репозитории подробно описаны в следующем разделе. Никакие другие ресурсы семинара не создаются в проекте администрирования Terraform. Учетная запись службы Cloud Build в проекте администрирования Terraform используется для создания ресурсов для семинара.

Файл cloudbuild.yaml расположенный в папке infrastructure , используется для сборки ресурсов GCP для семинара. Он создает пользовательский образ сборщика со всеми инструментами, необходимыми для создания ресурсов GCP. Эти инструменты включают gcloud SDK, Terraform и другие утилиты, такие как Python, Git, jq и т. д. Пользовательский образ сборщика запускает команды terraform plan и apply для каждого ресурса. Файлы Terraform для каждого ресурса находятся в отдельных папках (подробности в следующем разделе). Ресурсы собираются по одному и в порядке, в котором они обычно собираются (например, проект GCP собирается до того, как в проекте создаются ресурсы). Для получения более подробной информации ознакомьтесь с файлом cloudbuild.yaml .

Cloud Build запускается всякий раз, когда в репозиторий infrastructure вносится коммит. Любые изменения, внесенные в инфраструктуру, сохраняются как инфраструктура как код (IaC) и фиксируются в репозитории. Состояние вашего мастер-класса всегда хранится в этом репозитории.

Структура папок — команды, среды и ресурсы.

Репозиторий Infrastructure настраивает ресурсы инфраструктуры GCP для семинара. Он структурирован на папки и подпапки. Базовые папки в репозитории представляют team , которая владеет конкретными ресурсами GCP. Следующий уровень папок представляет конкретную environment для команды (например, dev, stage, prod). Следующий уровень папок внутри среды представляет конкретный resource (например, host_project, gke_clusters и т. д.). Необходимые скрипты и файлы Terraform находятся в папках ресурсов.

В этом семинаре представлены следующие четыре типа команд:

infrastructure - represents the cloud infrastructure team. They are responsible for creating the GCP resources for all other teams. They use the Terraform admin project for their resources. The infrastructure repo itself is in the Terraform admin project, as are the Terraform state files (explained below). These resources are created by a bash script during the bootstrapping process (see Module 0 - Administrator Workflow for details).
network - represents the networking team. They are responsible for VPC and networking resources. They own the following GCP resources.
host project - represents the shared VPC host project.
shared VPC - represents the shared VPC, subnets, secondary IP ranges, routes and firewall rules.
ops - represents the operations/devops team. They own the following resources.
ops project - represents a project for all ops resources.
gke clusters - one ops GKE cluster per region. The Istio control plane is installed in each of the ops GKE clusters.
k8s-repo - a CSR repo that contains GKE manifests for all GKE clusters.
apps - represents the application teams. This workshop simulates two teams, app1 and app2. They own the following resources.
app projects - every app team gets its own set of projects. This allows them to control billing and IAM for their specific project.
gke clusters - these are the application clusters where the application containers/Pods run.
gce instances - optional, for teams with applications that run on GCE instances. In this workshop, app1 has a couple of GCE instances where part of the application runs.

In this workshop, the same app (the Hipster shop app) represents both app1 and app2.

Provider, states and outputs - backends and shared states

The google and google-beta providers are located at gcp/[environment]/gcp/provider.tf. The provider.tf file is symlinked in every resource folder. This allows you to change the provider in one place instead of managing providers for each resource individually.
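The symlink arrangement can be sketched as follows. This is a minimal illustration in a scratch directory: the resource folder names (ops_project, ops_gke) and the provider file contents are assumptions made for the example, not the workshop's actual files.

```shell
# Scratch layout; the folder names below are illustrative assumptions.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/gcp/prod/gcp" \
         "$demo_root/ops/prod/ops_project" \
         "$demo_root/ops/prod/ops_gke"

# One shared provider definition (placeholder contents).
cat > "$demo_root/gcp/prod/gcp/provider.tf" <<'EOF'
provider "google" {}
provider "google-beta" {}
EOF

# Symlink the shared file into every resource folder.
for resource_dir in "$demo_root"/ops/prod/*/; do
  ln -s "$demo_root/gcp/prod/gcp/provider.tf" "${resource_dir}provider.tf"
done

# Each resource folder now sees the same provider.tf.
ls -l "$demo_root"/ops/prod/*/provider.tf
```

Because every resource folder points at the same file, a provider change made once is picked up by all resources on the next Terraform run.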

Every resource contains a backend.tf file which defines the location of the resource's tfstate file. This backend.tf file is generated from a template (located at templates/backend.tf_tmpl) using a script (located at scripts/setup_terraform_admin_project) and then placed in the respective resource folder. Google Cloud Storage (GCS) buckets are used for backends. The GCS bucket folder name matches the resource name. All resource backends reside in the Terraform admin project.
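As a rough sketch of how a backend.tf could be rendered from a template, the example below uses sed on a scratch copy. The template contents, the BUCKET_NAME and RESOURCE_NAME tokens, the bucket name, and the render_backend helper are all hypothetical; the real template and script in the infrastructure repo may work differently.

```shell
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/templates" "$demo_dir/ops/prod/ops_gke"

# Hypothetical template; BUCKET_NAME and RESOURCE_NAME are placeholder
# tokens invented for this sketch.
cat > "$demo_dir/templates/backend.tf_tmpl" <<'EOF'
terraform {
  backend "gcs" {
    bucket = "BUCKET_NAME"
    prefix = "RESOURCE_NAME"
  }
}
EOF

# render_backend BUCKET RESOURCE TEMPLATE OUTPUT
render_backend() {
  sed -e "s/BUCKET_NAME/$1/" -e "s/RESOURCE_NAME/$2/" "$3" > "$4"
}

# The bucket name is made up; the prefix matches the resource name.
render_backend "my-tf-admin-bucket" "ops_gke" \
  "$demo_dir/templates/backend.tf_tmpl" \
  "$demo_dir/ops/prod/ops_gke/backend.tf"

cat "$demo_dir/ops/prod/ops_gke/backend.tf"
```

Using the resource name as the prefix mirrors the convention described above, where the GCS bucket folder name matches the resource name.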

Resources with interdependent values contain an output.tf file. The required output values are stored in the tfstate file defined in the backend for that particular resource. For example, in order to create a GKE cluster in a project, you need to know the project ID. The project ID is written via output.tf to the tfstate file, where it can be read through a terraform_remote_state data source in the GKE cluster resource.

The shared_state file is a terraform_remote_state data source pointing at a resource's tfstate file. A shared_state_[resource_name].tf file (or files) exists in the resource folders that require outputs from other resources. For example, the ops_gke resource folder contains shared_state files from the ops_project and shared_vpc resources, because you need the project ID and the VPC details to create GKE clusters in the ops project. The shared_state files are generated from a template (located at templates/shared_state.tf_tmpl) using a script (located at scripts/setup_terraform_admin_project). All resources' shared_state files are placed in the gcp/[environment]/shared_states folder. The required shared_state files are symlinked in the respective resource folders. Placing all of the shared_state files in one folder and symlinking them in the appropriate resource folders makes it easy to manage all state files in a single place.

Variables

All resource values are stored as environment variables. These variables are stored (as export statements) in a file called vars.sh, located in a GCS bucket in the Terraform admin project. It contains the organization ID, billing account, project IDs, GKE cluster details, etc. You can download vars.sh and run source vars.sh from any terminal to get the values for your setup.
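The effect of sourcing such a file can be sketched with a stand-in vars.sh. The values below are made up for illustration; the real file is downloaded from the GCS bucket and contains many more variables.

```shell
demo_dir="$(mktemp -d)"

# Stand-in for the real vars.sh. All values here are made up.
cat > "$demo_dir/vars.sh" <<'EOF'
export TF_VAR_ops_project_name="example-ops-project"
export OPS_GKE_1_CLUSTER="gke-asm-1-r1-prod"
EOF

# Sourcing runs the export statements in the current shell, so the
# variables stay available to later commands in the same terminal.
. "$demo_dir/vars.sh"

echo "ops project:   $TF_VAR_ops_project_name"
echo "ops cluster 1: $OPS_GKE_1_CLUSTER"
```

Sourcing (rather than executing) the file matters here: running it as a separate script would set the variables only in a child process.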

Terraform variables are stored in vars.sh as TF_VAR_[variable name]. These variables are used to generate a variables.tfvars file in the respective resource folder. The variables.tfvars file contains all of the variables with their values. It is generated from a template file in the same folder using a script (located at scripts/setup_terraform_admin_project).
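To illustrate the relationship between TF_VAR_ environment variables and a tfvars file, the sketch below writes one `name = "value"` line per exported variable. Note that this derives the file directly from the environment rather than from a template, so it is a simplified stand-in for the workshop script, not a copy of it. The variable TF_VAR_region and all values are hypothetical.

```shell
# Made-up values for illustration only.
export TF_VAR_ops_project_name="example-ops-project"
export TF_VAR_region="us-west1"   # hypothetical variable name

tfvars_file="$(mktemp)"

# Turn every TF_VAR_name=value pair into a `name = "value"` line.
# Assumes single-line values without embedded double quotes.
env | grep '^TF_VAR_' | sort | while IFS='=' read -r name value; do
  printf '%s = "%s"\n' "${name#TF_VAR_}" "$value"
done > "$tfvars_file"

cat "$tfvars_file"
```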

K8s repo explained

k8s_repo is a CSR repo (separate from the infrastructure repo) located in the Terraform admin project. It is used to store and apply GKE manifests to all GKE clusters. k8s_repo is created by the infrastructure Cloud Build (see the previous section for details). During the initial infrastructure Cloud Build process, a total of six GKE clusters are created. Six folders are created in k8s_repo. Each folder (named after its GKE cluster) corresponds to one GKE cluster and contains that cluster's resource manifest files. Similar to building infrastructure, Cloud Build is used to apply the Kubernetes manifests to all GKE clusters using k8s_repo. Cloud Build is triggered any time there is a commit to the k8s_repo repo. As with infrastructure, all Kubernetes manifests are stored as code in the k8s_repo repository, and the state of each GKE cluster is always stored in its respective folder.

As part of the initial infrastructure build, k8s_repo is created and Istio is installed on all clusters.

Projects, GKE clusters and namespaces

The resources in this workshop are divided into different GCP projects. Projects should match the organizational (or team) structure of your company. Teams (in your organization) responsible for different projects/products/resources use different GCP projects. Having separate projects allows you to create separate sets of IAM permissions and manage billing at the project level. Quotas are also managed at the project level.

Five teams are represented in this workshop, each with its own project.

The infrastructure team that builds GCP resources uses the Terraform admin project. They manage infrastructure as code in a CSR repo (called infrastructure) and store all Terraform state information pertaining to resources built in GCP in GCS buckets. They control access to the CSR repo and the Terraform state GCS buckets.
The network team that builds the shared VPC uses the host project. This project contains the VPC, subnets, routes and firewall rules. Having a shared VPC allows them to centrally manage networking for GCP resources. All projects use this single shared VPC for networking.
The ops/platform team that builds GKE clusters and ASM/Istio control planes uses the ops project. They manage the lifecycle of the GKE clusters and the service mesh. They are responsible for hardening the clusters and for managing the resiliency and scale of the Kubernetes platform. In this workshop, you use the gitops method of deploying resources to Kubernetes. A CSR repo (called k8s_repo) exists in the ops project.
Lastly, the dev1 and dev2 teams (representing two development teams) that build applications use their own dev1 and dev2 projects. These are the applications and services you provide to your customers. They are built on the platform that the ops team manages. The resources (Deployments, Services etc) are pushed to the k8s_repo and get deployed to the appropriate clusters. It is important to note that this workshop does not focus on CI/CD best practices and tooling. You use Cloud Build to automate deploying Kubernetes resources to the GKE clusters directly. In real world production scenarios, you would use a proper CI/CD solution to deploy applications to GKE clusters.

There are two types of GKE clusters in this workshop.

Ops clusters - used by the ops team to run devops tools. In this workshop, they run the ASM/Istio control plane to manage the service mesh.
Application (apps) clusters - used by the development teams to run applications. In this workshop, the Hipster shop app is used.

Separating the ops/admin tooling from the clusters running the application allows you to manage the life cycle of each resource independently. The two types of clusters also exist in different projects pertaining to the team/product that uses them which makes IAM permissions also easier to manage.

There are a total of six GKE clusters. Two regional ops clusters are created in the ops project. ASM/Istio control plane is installed on both ops clusters. Each ops cluster is in a different region. In addition, there are four zonal application clusters. These are created in their own projects. This workshop simulates two development teams each with their own projects. Each project contains two app clusters. App clusters are zonal clusters in different zones. The four app clusters are located in two regions and four zones. This way you get regional and zonal redundancy.

The application used in this workshop, the Hipster Shop app, is deployed on all four app clusters. Each microservice lives in its own namespace in every app cluster. Hipster shop app Deployments (Pods) are not deployed on the ops clusters. However, the namespaces and Service resources for all microservices are also created in the ops clusters. ASM/Istio control plane uses the Kubernetes service registries for service discovery. In the absence of Services (in the ops clusters), you would have to manually create ServiceEntries for each service running in the app cluster.

You deploy a 10-tier microservices application in this workshop. The application is a web-based e-commerce app called "Hipster Shop" where users can browse items, add them to the cart, and purchase them.

Kubernetes manifests and k8s_repo

You use the k8s_repo to add Kubernetes resources to all GKE clusters. You do this by copying Kubernetes manifests and committing to the k8s_repo . All commits to the k8s_repo trigger a Cloud Build job which deploys the Kubernetes manifests to the respective cluster. Each cluster's manifest is located in a separate folder named the same as the cluster name.

The six cluster names are:

gke-asm-1-r1-prod - the regional ops cluster in region 1
gke-asm-2-r2-prod - the regional ops cluster in region 2
gke-1-apps-r1a-prod - the app cluster in region 1 zone a
gke-2-apps-r1b-prod - the app cluster in region 1 zone b
gke-3-apps-r2a-prod - the app cluster in region 2 zone a
gke-4-apps-r2b-prod - the app cluster in region 2 zone b

The k8s_repo has folders corresponding to these clusters. Any manifest placed in these folders gets applied to the corresponding GKE cluster. Manifests for each cluster are placed in sub-folders (within the cluster's main folder) for ease of management. In this workshop, you use Kustomize to keep track of resources that get deployed. Please refer to the Kustomize official documentation for more details.

7. Deploy the Sample App

Objective: Deploy Hipster shop app on apps clusters

Clone k8s-repo
Copy Hipster shop manifests to all apps clusters
Create Services for Hipster shop app in the ops clusters
Set up loadgenerators in the ops clusters to test global connectivity
Verify secure connectivity to the Hipster shop app

Copy-and-Paste Method Lab Instructions

Clone the ops project source repo

As part of the initial Terraform infrastructure build, the k8s-repo is already created in the ops project.

Create an empty directory for git repo:

mkdir $WORKDIR/k8s-repo

Initialize the git repo and add the remote:

cd $WORKDIR/k8s-repo
git init && git remote add origin \
https://source.developers.google.com/p/$TF_VAR_ops_project_name/r/k8s-repo

Set the local git configuration and pull master from the remote repo.

git config --local user.email $MY_USER
git config --local user.name "K8s repo user"
git config --local \
credential.'https://source.developers.google.com'.helper gcloud.sh
git pull origin master

Copy manifests, commit and push

Copy the Hipster Shop namespaces and services to the source repo for all clusters.

cp -r $WORKDIR/asm/k8s_manifests/prod/app/namespaces \
$WORKDIR/k8s-repo/$DEV1_GKE_1_CLUSTER/app/.
cp -r $WORKDIR/asm/k8s_manifests/prod/app/namespaces \
$WORKDIR/k8s-repo/$DEV1_GKE_2_CLUSTER/app/.
cp -r $WORKDIR/asm/k8s_manifests/prod/app/namespaces \
$WORKDIR/k8s-repo/$DEV2_GKE_1_CLUSTER/app/.
cp -r $WORKDIR/asm/k8s_manifests/prod/app/namespaces \
$WORKDIR/k8s-repo/$DEV2_GKE_2_CLUSTER/app/.
cp -r $WORKDIR/asm/k8s_manifests/prod/app/namespaces \
$WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/app/.
cp -r $WORKDIR/asm/k8s_manifests/prod/app/namespaces \
$WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/app/.

cp -r $WORKDIR/asm/k8s_manifests/prod/app/services \
$WORKDIR/k8s-repo/$DEV1_GKE_1_CLUSTER/app/.
cp -r $WORKDIR/asm/k8s_manifests/prod/app/services \
$WORKDIR/k8s-repo/$DEV1_GKE_2_CLUSTER/app/.
cp -r $WORKDIR/asm/k8s_manifests/prod/app/services \
$WORKDIR/k8s-repo/$DEV2_GKE_1_CLUSTER/app/.
cp -r $WORKDIR/asm/k8s_manifests/prod/app/services \
$WORKDIR/k8s-repo/$DEV2_GKE_2_CLUSTER/app/.
cp -r $WORKDIR/asm/k8s_manifests/prod/app/services \
$WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/app/.
cp -r $WORKDIR/asm/k8s_manifests/prod/app/services \
$WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/app/.
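If you prefer a loop over the repeated cp commands, the copy step can be sketched as a small helper. The copy_app_dirs function is hypothetical (it is not part of the workshop repo) and assumes each cluster folder already contains an app directory.

```shell
# copy_app_dirs SRC_APP_DIR REPO_DIR "dir1 dir2 ..." CLUSTER...
# Copies each named directory from SRC_APP_DIR into REPO_DIR/CLUSTER/app/.
# The body runs in a subshell so its variables do not leak into the caller.
copy_app_dirs() (
  src="$1"
  repo="$2"
  dirs="$3"
  shift 3
  for cluster in "$@"; do
    for dir in $dirs; do
      cp -r "$src/$dir" "$repo/$cluster/app/."
    done
  done
)

# Workshop usage (same effect as the twelve cp commands above):
# copy_app_dirs "$WORKDIR/asm/k8s_manifests/prod/app" "$WORKDIR/k8s-repo" \
#   "namespaces services" \
#   "$DEV1_GKE_1_CLUSTER" "$DEV1_GKE_2_CLUSTER" \
#   "$DEV2_GKE_1_CLUSTER" "$DEV2_GKE_2_CLUSTER" \
#   "$OPS_GKE_1_CLUSTER" "$OPS_GKE_2_CLUSTER"
```

The explicit cp commands remain the reference steps for the lab; the helper only shortens them.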

Copy the app folder kustomization.yaml to all clusters.

cp $WORKDIR/asm/k8s_manifests/prod/app/kustomization.yaml \
$WORKDIR/k8s-repo/$DEV1_GKE_1_CLUSTER/app/
cp $WORKDIR/asm/k8s_manifests/prod/app/kustomization.yaml \
$WORKDIR/k8s-repo/$DEV1_GKE_2_CLUSTER/app/
cp $WORKDIR/asm/k8s_manifests/prod/app/kustomization.yaml \
$WORKDIR/k8s-repo/$DEV2_GKE_1_CLUSTER/app/
cp $WORKDIR/asm/k8s_manifests/prod/app/kustomization.yaml \
$WORKDIR/k8s-repo/$DEV2_GKE_2_CLUSTER/app/
cp $WORKDIR/asm/k8s_manifests/prod/app/kustomization.yaml \
$WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/app/
cp $WORKDIR/asm/k8s_manifests/prod/app/kustomization.yaml \
$WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/app/

Copy the Hipster Shop Deployments, RBAC and PodSecurityPolicy to the source repo for the apps clusters.

cp -r $WORKDIR/asm/k8s_manifests/prod/app/deployments \
$WORKDIR/k8s-repo/$DEV1_GKE_1_CLUSTER/app/
cp -r $WORKDIR/asm/k8s_manifests/prod/app/deployments \
$WORKDIR/k8s-repo/$DEV1_GKE_2_CLUSTER/app/
cp -r $WORKDIR/asm/k8s_manifests/prod/app/deployments \
$WORKDIR/k8s-repo/$DEV2_GKE_1_CLUSTER/app/
cp -r $WORKDIR/asm/k8s_manifests/prod/app/deployments \
$WORKDIR/k8s-repo/$DEV2_GKE_2_CLUSTER/app/

cp -r $WORKDIR/asm/k8s_manifests/prod/app/rbac \
$WORKDIR/k8s-repo/$DEV1_GKE_1_CLUSTER/app/
cp -r $WORKDIR/asm/k8s_manifests/prod/app/rbac \
$WORKDIR/k8s-repo/$DEV1_GKE_2_CLUSTER/app/
cp -r $WORKDIR/asm/k8s_manifests/prod/app/rbac \
$WORKDIR/k8s-repo/$DEV2_GKE_1_CLUSTER/app/
cp -r $WORKDIR/asm/k8s_manifests/prod/app/rbac \
$WORKDIR/k8s-repo/$DEV2_GKE_2_CLUSTER/app/
cp -r $WORKDIR/asm/k8s_manifests/prod/app/podsecuritypolicies \
$WORKDIR/k8s-repo/$DEV1_GKE_1_CLUSTER/app/
cp -r $WORKDIR/asm/k8s_manifests/prod/app/podsecuritypolicies \
$WORKDIR/k8s-repo/$DEV1_GKE_2_CLUSTER/app/
cp -r $WORKDIR/asm/k8s_manifests/prod/app/podsecuritypolicies \
$WORKDIR/k8s-repo/$DEV2_GKE_1_CLUSTER/app/
cp -r $WORKDIR/asm/k8s_manifests/prod/app/podsecuritypolicies \
$WORKDIR/k8s-repo/$DEV2_GKE_2_CLUSTER/app/

Remove the cartservice deployment, rbac and podsecuritypolicy from all but one dev cluster. Hipster Shop was not built for multi-cluster deployment, so to avoid inconsistent results, only one cartservice is used.

rm $WORKDIR/k8s-repo/$DEV1_GKE_2_CLUSTER/app/deployments/app-cart-service.yaml
rm $WORKDIR/k8s-repo/$DEV1_GKE_2_CLUSTER/app/podsecuritypolicies/cart-psp.yaml
rm $WORKDIR/k8s-repo/$DEV1_GKE_2_CLUSTER/app/rbac/cart-rbac.yaml

rm $WORKDIR/k8s-repo/$DEV2_GKE_1_CLUSTER/app/deployments/app-cart-service.yaml
rm $WORKDIR/k8s-repo/$DEV2_GKE_1_CLUSTER/app/podsecuritypolicies/cart-psp.yaml
rm $WORKDIR/k8s-repo/$DEV2_GKE_1_CLUSTER/app/rbac/cart-rbac.yaml

rm $WORKDIR/k8s-repo/$DEV2_GKE_2_CLUSTER/app/deployments/app-cart-service.yaml
rm $WORKDIR/k8s-repo/$DEV2_GKE_2_CLUSTER/app/podsecuritypolicies/cart-psp.yaml
rm $WORKDIR/k8s-repo/$DEV2_GKE_2_CLUSTER/app/rbac/cart-rbac.yaml

Add cartservice deployment, rbac and podsecuritypolicy to kustomization.yaml in the first dev cluster only.

cd ${WORKDIR}/k8s-repo/${DEV1_GKE_1_CLUSTER}/app
cd deployments && kustomize edit add resource app-cart-service.yaml
cd ../podsecuritypolicies && kustomize edit add resource cart-psp.yaml
cd ../rbac && kustomize edit add resource cart-rbac.yaml
cd ${WORKDIR}/asm

Remove podsecuritypolicies, deployments and rbac directories from ops clusters kustomization.yaml

sed -i -e '/- deployments\//d' -e '/- podsecuritypolicies\//d' \
  -e '/- rbac\//d' \
$WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/app/kustomization.yaml
sed -i -e '/- deployments\//d' -e '/- podsecuritypolicies\//d' \
  -e '/- rbac\//d' \
$WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/app/kustomization.yaml
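To see what these sed expressions do before running them against the repo, you can try them on a scratch file. The kustomization.yaml contents below are an assumed shape for illustration; the real file may list different entries.

```shell
demo_file="$(mktemp)"

# Assumed shape of the app kustomization.yaml; the real file may differ.
cat > "$demo_file" <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- namespaces/
- services/
- deployments/
- podsecuritypolicies/
- rbac/
EOF

# Same expressions as the workshop command, applied to the scratch file.
# Each -e deletes the lines matching one directory entry.
sed -i -e '/- deployments\//d' -e '/- podsecuritypolicies\//d' \
  -e '/- rbac\//d' "$demo_file"

cat "$demo_file"
```

After the edit, only the namespaces and services entries remain, which matches the ops clusters' role: they hold the namespaces and Services but run no Hipster Shop Deployments.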

Replace the PROJECT_ID in the RBAC manifests.

sed -i 's/\${PROJECT_ID}/'${TF_VAR_dev1_project_name}'/g' \
${WORKDIR}/k8s-repo/${DEV1_GKE_1_CLUSTER}/app/rbac/*
sed -i 's/\${PROJECT_ID}/'${TF_VAR_dev1_project_name}'/g' \
${WORKDIR}/k8s-repo/${DEV1_GKE_2_CLUSTER}/app/rbac/*
sed -i 's/\${PROJECT_ID}/'${TF_VAR_dev2_project_name}'/g' \
${WORKDIR}/k8s-repo/${DEV2_GKE_1_CLUSTER}/app/rbac/*
sed -i 's/\${PROJECT_ID}/'${TF_VAR_dev2_project_name}'/g' \
${WORKDIR}/k8s-repo/${DEV2_GKE_2_CLUSTER}/app/rbac/*
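The quoting in these sed commands is easy to misread, so here is a minimal sketch on a scratch file. The manifest line and the project name are made up; only the substitution pattern mirrors the workshop command (with the variable additionally double-quoted).

```shell
demo_file="$(mktemp)"
project_name="example-dev1-project"   # made-up value

# A one-line stand-in for a manifest containing the literal token.
printf '%s\n' 'serviceAccount: app@${PROJECT_ID}.iam.gserviceaccount.com' \
  > "$demo_file"

# The single-quoted parts reach sed untouched (\$ matches a literal dollar
# sign); the shell expands only the variable placed between them.
sed -i 's/\${PROJECT_ID}/'"${project_name}"'/g' "$demo_file"

cat "$demo_file"
```

The key point is that `${PROJECT_ID}` is a literal placeholder inside the manifest, not a shell variable, which is why it sits inside single quotes.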

Copy the IngressGateway and VirtualService manifests to the source repo for the ops clusters.

cp -r $WORKDIR/asm/k8s_manifests/prod/app-ingress/* \
$WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/app-ingress/
cp -r $WORKDIR/asm/k8s_manifests/prod/app-ingress/* \
$WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/app-ingress/

Copy the Config Connector resources to one of the clusters in each project.

cp -r $WORKDIR/asm/k8s_manifests/prod/app-cnrm/* \
$WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/app-cnrm/
cp -r $WORKDIR/asm/k8s_manifests/prod/app-cnrm/* \
$WORKDIR/k8s-repo/$DEV1_GKE_1_CLUSTER/app-cnrm/
cp -r $WORKDIR/asm/k8s_manifests/prod/app-cnrm/* \
$WORKDIR/k8s-repo/$DEV2_GKE_1_CLUSTER/app-cnrm/

Replace the PROJECT_ID in the Config Connector manifests.

sed -i 's/\${PROJECT_ID}/'$TF_VAR_ops_project_name'/g' \
$WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/app-cnrm/*
sed -i 's/\${PROJECT_ID}/'$TF_VAR_dev1_project_name'/g' \
$WORKDIR/k8s-repo/$DEV1_GKE_1_CLUSTER/app-cnrm/*
sed -i 's/\${PROJECT_ID}/'$TF_VAR_dev2_project_name'/g' \
$WORKDIR/k8s-repo/$DEV2_GKE_1_CLUSTER/app-cnrm/*

Copy loadgenerator manifests (Deployment, PodSecurityPolicy and RBAC) to the ops clusters. The Hipster shop app is exposed using a global Google Cloud Load Balancer (GCLB). GCLB receives client traffic (destined to frontend) and sends it to the closest instance of the Service. Putting a loadgenerator on both ops clusters ensures that traffic is sent to both Istio Ingress gateways running in the ops clusters. Load balancing is explained in detail in the following section.

cp -r $WORKDIR/asm/k8s_manifests/prod/app-loadgenerator/. \
$WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/app-loadgenerator/.
cp -r $WORKDIR/asm/k8s_manifests/prod/app-loadgenerator/. \
$WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/app-loadgenerator/.

Replace the ops project ID in the loadgenerator manifests for both ops clusters.

sed -i 's/OPS_PROJECT_ID/'$TF_VAR_ops_project_name'/g'  \
$WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/app-loadgenerator/loadgenerator-deployment.yaml
sed -i 's/OPS_PROJECT_ID/'$TF_VAR_ops_project_name'/g' \
$WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/app-loadgenerator/loadgenerator-rbac.yaml
sed -i 's/OPS_PROJECT_ID/'$TF_VAR_ops_project_name'/g' \
$WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/app-loadgenerator/loadgenerator-deployment.yaml
sed -i 's/OPS_PROJECT_ID/'$TF_VAR_ops_project_name'/g' \
$WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/app-loadgenerator/loadgenerator-rbac.yaml

Add the loadgenerator resources to kustomization.yaml for both ops clusters.

cd $WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/app-loadgenerator/
kustomize edit add resource loadgenerator-psp.yaml
kustomize edit add resource loadgenerator-rbac.yaml
kustomize edit add resource loadgenerator-deployment.yaml

cd $WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/app-loadgenerator/
kustomize edit add resource loadgenerator-psp.yaml
kustomize edit add resource loadgenerator-rbac.yaml
kustomize edit add resource loadgenerator-deployment.yaml

Commit to k8s-repo .

cd $WORKDIR/k8s-repo
git add . && git commit -am "create app namespaces and install hipster shop"
git push --set-upstream origin master

View the status of the Ops project Cloud Build in a previously opened tab or by clicking the following link:

echo "https://console.cloud.google.com/cloud-build/builds?project=${TF_VAR_ops_project_name}"

Verify Application deployment

Verify pods in all application namespaces except cart are in Running state in all dev clusters.

for ns in ad checkout currency email frontend payment product-catalog recommendation shipping; do
  kubectl --context $DEV1_GKE_1 get pods -n $ns;
  kubectl --context $DEV1_GKE_2 get pods -n $ns;
  kubectl --context $DEV2_GKE_1 get pods -n $ns;
  kubectl --context $DEV2_GKE_2 get pods -n $ns;
done;

Output (do not copy)

NAME                               READY   STATUS    RESTARTS   AGE
currencyservice-5c5b8876db-pvc6s   2/2     Running   0          13m
NAME                               READY   STATUS    RESTARTS   AGE
currencyservice-5c5b8876db-xlkl9   2/2     Running   0          13m
NAME                               READY   STATUS    RESTARTS   AGE
currencyservice-5c5b8876db-zdjkg   2/2     Running   0          115s
NAME                               READY   STATUS    RESTARTS   AGE
currencyservice-5c5b8876db-l748q   2/2     Running   0          82s

NAME                            READY   STATUS    RESTARTS   AGE
emailservice-588467b8c8-gk92n   2/2     Running   0          13m
NAME                            READY   STATUS    RESTARTS   AGE
emailservice-588467b8c8-rvzk9   2/2     Running   0          13m
NAME                            READY   STATUS    RESTARTS   AGE
emailservice-588467b8c8-mt925   2/2     Running   0          117s
NAME                            READY   STATUS    RESTARTS   AGE
emailservice-588467b8c8-klqn7   2/2     Running   0          84s

NAME                        READY   STATUS    RESTARTS   AGE
frontend-64b94cf46f-kkq7d   2/2     Running   0          13m
NAME                        READY   STATUS    RESTARTS   AGE
frontend-64b94cf46f-lwskf   2/2     Running   0          13m
NAME                        READY   STATUS    RESTARTS   AGE
frontend-64b94cf46f-zz7xs   2/2     Running   0          118s
NAME                        READY   STATUS    RESTARTS   AGE
frontend-64b94cf46f-2vtw5   2/2     Running   0          85s

NAME                              READY   STATUS    RESTARTS   AGE
paymentservice-777f6c74f8-df8ml   2/2     Running   0          13m
NAME                              READY   STATUS    RESTARTS   AGE
paymentservice-777f6c74f8-bdcvg   2/2     Running   0          13m
NAME                              READY   STATUS    RESTARTS   AGE
paymentservice-777f6c74f8-jqf28   2/2     Running   0          117s
NAME                              READY   STATUS    RESTARTS   AGE
paymentservice-777f6c74f8-95x2m   2/2     Running   0          86s

NAME                                     READY   STATUS    RESTARTS   AGE
productcatalogservice-786dc84f84-q5g9p   2/2     Running   0          13m
NAME                                     READY   STATUS    RESTARTS   AGE
productcatalogservice-786dc84f84-n6lp8   2/2     Running   0          13m
NAME                                     READY   STATUS    RESTARTS   AGE
productcatalogservice-786dc84f84-gf9xl   2/2     Running   0          119s
NAME                                     READY   STATUS    RESTARTS   AGE
productcatalogservice-786dc84f84-v7cbr   2/2     Running   0          86s

NAME                                     READY   STATUS    RESTARTS   AGE
recommendationservice-5fdf959f6b-2ltrk   2/2     Running   0          13m
NAME                                     READY   STATUS    RESTARTS   AGE
recommendationservice-5fdf959f6b-dqd55   2/2     Running   0          13m
NAME                                     READY   STATUS    RESTARTS   AGE
recommendationservice-5fdf959f6b-jghcl   2/2     Running   0          119s
NAME                                     READY   STATUS    RESTARTS   AGE
recommendationservice-5fdf959f6b-kkspz   2/2     Running   0          87s

NAME                              READY   STATUS    RESTARTS   AGE
shippingservice-7bd5f569d-qqd9n   2/2     Running   0          13m
NAME                              READY   STATUS    RESTARTS   AGE
shippingservice-7bd5f569d-xczg5   2/2     Running   0          13m
NAME                              READY   STATUS    RESTARTS   AGE
shippingservice-7bd5f569d-wfgfr   2/2     Running   0          2m
NAME                              READY   STATUS    RESTARTS   AGE
shippingservice-7bd5f569d-r6t8v   2/2     Running   0          88s

Verify pods in cart namespace are in Running state in first dev cluster only.

kubectl --context $DEV1_GKE_1 get pods -n cart;

Output (do not copy)

NAME                           READY   STATUS    RESTARTS   AGE
cartservice-659c9749b4-vqnrd   2/2     Running   0          17m
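Rather than reading the output by eye, the check can be scripted. The pods_all_running helper below is a hypothetical sketch that inspects only the STATUS column of kubectl get pods; it does not check the READY column (2/2), so treat it as a quick sanity check rather than a full readiness test.

```shell
# pods_all_running CONTEXT NAMESPACE
# Succeeds only if the namespace has at least one pod and every pod's
# STATUS column (third column) reads "Running".
pods_all_running() {
  kubectl --context "$1" get pods -n "$2" --no-headers 2>/dev/null |
    awk '$3 != "Running" { bad = 1 } END { exit (NR == 0 || bad) }'
}

# Example loop over the dev cluster contexts used in this workshop:
# for ctx in "$DEV1_GKE_1" "$DEV1_GKE_2" "$DEV2_GKE_1" "$DEV2_GKE_2"; do
#   for ns in ad checkout currency email frontend payment \
#             product-catalog recommendation shipping; do
#     pods_all_running "$ctx" "$ns" || echo "not ready: $ctx/$ns"
#   done
# done
```

An empty namespace is reported as not ready, which catches the case where a Cloud Build job has not yet applied the manifests.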

Access the Hipster Shop app

Global load balancing

You now have the Hipster Shop app deployed to all four app clusters. These clusters are in two regions and four zones. Clients can access the Hipster shop app by accessing the frontend service. The frontend service runs on all four app clusters. A Google Cloud Load Balancer (GCLB) is used to get client traffic to all four instances of the frontend service.

Istio Ingress gateways only run in the ops clusters and act as a regional load balancer to the two zonal application clusters within the region. GCLB uses the two Istio ingress gateways (running in the two ops clusters) as backends to the global frontend service. The Istio Ingress gateways receive the client traffic from the GCLB and then send the client traffic onwards to the frontend Pods running in the application clusters.

Alternatively, you can put Istio Ingress gateways on the application clusters directly and the GCLB can use those as backends.

GKE Autoneg controller

Istio Ingress gateway Kubernetes Service registers itself as a backend to the GCLB using Network Endpoint Groups (NEGs). NEGs allow for container-native load balancing using GCLBs. NEGs are created through a special annotation on a Kubernetes Service, so it can register itself to the NEG Controller. Autoneg controller is a special GKE controller that automates the creation of NEGs as well as assigning them as backends to a GCLB using Service annotations. Istio control planes including the Istio ingress gateways are deployed during the initial infrastructure Terraform Cloud Build. The GCLB and autoneg configuration is done as part of the initial Terraform infrastructure Cloud Build.

Secure Ingress using Cloud Endpoints and managed certs

GCP Managed certs are used to secure the client traffic to the frontend GCLB service. GCLB uses managed certs for the global frontend service and the certificate is terminated at the GCLB. In this workshop, you use Cloud Endpoints as the domain for the managed cert. Alternatively, you can use your domain and a DNS name for the frontend to create GCP managed certs.

To access the Hipster shop, click on the link output of the following command.

echo "https://frontend.endpoints.$TF_VAR_ops_project_name.cloud.goog"

You can check that the certificate is valid by clicking the lock symbol in the URL bar of your Chrome tab.
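The same check can be approximated from a terminal. The frontend_http_status helper below is a hypothetical sketch that assumes curl is installed; because curl verifies the server certificate by default, an invalid or not-yet-provisioned certificate causes curl to fail with an error instead of printing a status code.

```shell
# frontend_http_status PROJECT_NAME
# Prints the HTTP status code returned by the frontend over HTTPS.
frontend_http_status() {
  curl -sS -o /dev/null -w '%{http_code}\n' \
    "https://frontend.endpoints.$1.cloud.goog"
}

# Workshop usage:
# frontend_http_status "$TF_VAR_ops_project_name"
```

Managed certificates can take some time to provision, so a failure shortly after the build completes does not necessarily indicate a misconfiguration.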

Verify global load balancing

As part of the application deployment, load generators were deployed in both ops clusters. They generate test traffic to the GCLB Hipster shop Cloud Endpoints link. Verify that the GCLB is receiving traffic and sending it to both Istio Ingress gateways.

Get the GCLB > Monitoring link for the ops project where the Hipster shop GCLB is created.

echo "https://console.cloud.google.com/net-services/loadbalancing/details/http/istio-ingressgateway?project=$TF_VAR_ops_project_name&cloudshell=false&tab=monitoring&duration=PT1H"

In the Backend dropdown menu, change the selection from All backends to istio-ingressgateway.

Note that traffic is going to both istio-ingressgateways.

There are three NEGs created per istio-ingressgateway . Since the ops clusters are regional clusters, one NEG is created for each zone in the region. The istio-ingressgateway Pods, however, run in a single zone per region. Traffic is shown going to the istio-ingressgateway Pods.

Load generators are running in both ops clusters, simulating client traffic from the two regions they are in. The load generated in ops cluster region 1 is sent to the istio-ingressgateway in region 1. Likewise, the load generated in ops cluster region 2 is sent to the istio-ingressgateway in region 2.

8. Observability with Stackdriver

Objective: Connect Istio telemetry to Stackdriver and validate.

Install istio-telemetry resources
Create/update Istio Services dashboards
View container logs
View distributed tracing in Stackdriver

Copy-and-Paste Method Lab Instructions

One of Istio's major features is built-in observability ("o11y"). This means that even with black-box, uninstrumented containers, operators can still observe the traffic going in and out of these containers, providing services to customers. This observation takes the shape of a few different methods: metrics, logs, and traces.

We will also utilize the built-in load generation system in Hipster Shop. Observability doesn't work very well in a static system with no traffic, so load generation helps us see how it works. This load is already running; now we'll be able to see it.

Install the istio to stackdriver config file.

cd $WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/istio-telemetry
kustomize edit add resource istio-telemetry.yaml

cd $WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/istio-telemetry
kustomize edit add resource istio-telemetry.yaml

Commit to k8s-repo.

cd $WORKDIR/k8s-repo
git add . && git commit -am "Install istio to stackdriver configuration"
git push

View the status of the Ops project Cloud Build in a previously opened tab or by clicking the following link:

echo "https://console.cloud.google.com/cloud-build/builds?project=${TF_VAR_ops_project_name}"

Verify the Istio → Stackdriver integration

Get the Stackdriver Handler CRD.

kubectl --context $OPS_GKE_1 get handler -n istio-system

The output should show a handler named stackdriver:

NAME            AGE
kubernetesenv   12d
prometheus      12d
stackdriver     69s      # <== NEW!

Verify that the Istio metrics export to Stackdriver is working. Click the link output from this command:

echo "https://console.cloud.google.com/monitoring/metrics-explorer?cloudshell=false&project=$TF_VAR_ops_project_name"

You will be prompted to create a new Workspace, named after the Ops project. Choose OK. If it prompts you about the new UI, dismiss the dialog.

In the Metrics Explorer, under "Find resource type and metric", type "istio" to see options like "Server Request Count" on the "Kubernetes Container" resource type. This shows us that the metrics are flowing from the mesh into Stackdriver.

(You will have to Group By destination_service_name label if you want to see the lines below.)

Visualizing metrics with Dashboards:

Now that our metrics are in the Stackdriver APM system, we want a way to visualize them. In this section, we will install a pre-built dashboard which shows us three of the four "Golden Signals" of metrics: Traffic (requests per second), Latency (in this case, 99th and 50th percentile), and Errors (we're excluding Saturation in this example).

Istio's Envoy proxy gives us several metrics, but these are a good set to start with (exhaustive list is here). Note that each metric (for example istio_tcp_received_bytes_total) has a set of labels that can be used for filtering and aggregating, such as destination_service, source_workload_namespace, and response_code.

Now let's add our pre-canned metrics dashboard. We are going to use the Dashboard API directly. You wouldn't normally do this by hand-generating API calls; it would be part of an automation system, or you would build the dashboard manually in the web UI. This will get us started quickly:

sed -i 's/OPS_PROJECT/'${TF_VAR_ops_project_name}'/g' \
$WORKDIR/asm/k8s_manifests/prod/app-telemetry/services-dashboard.json
OAUTH_TOKEN=$(gcloud auth application-default print-access-token)
curl -X POST -H "Authorization: Bearer $OAUTH_TOKEN" -H "Content-Type: application/json" \
https://monitoring.googleapis.com/v1/projects/$TF_VAR_ops_project_name/dashboards \
 -d @$WORKDIR/asm/k8s_manifests/prod/app-telemetry/services-dashboard.json

Navigate to the output link below to view the newly added "Services dashboard".

echo "https://console.cloud.google.com/monitoring/dashboards/custom/servicesdash?cloudshell=false&project=$TF_VAR_ops_project_name"

We could edit the dashboard in place using the UI, but in our case we are going to quickly add a new graph using the API. To do that, pull down the latest version of the dashboard, apply your edits, then push it back up using the HTTP PATCH method.

You can get an existing dashboard by querying the monitoring API. Get the existing dashboard that was just added:

curl -X GET -H "Authorization: Bearer $OAUTH_TOKEN" -H "Content-Type: application/json" \
https://monitoring.googleapis.com/v1/projects/$TF_VAR_ops_project_name/dashboards/servicesdash > /tmp/services-dashboard.json

Add a new graph (50th %ile latency): [API reference]

Now we can add a new graph widget to our dashboard in code. This change can be reviewed by peers and checked into version control. Here is a widget to add that shows 50th percentile latency (median latency).

Try editing the dashboard you just got, adding a new stanza:

NEW_CHART=${WORKDIR}/asm/k8s_manifests/prod/app-telemetry/new-chart.json
jq --argjson newChart "$(<$NEW_CHART)" '.gridLayout.widgets += [$newChart]' /tmp/services-dashboard.json > /tmp/patched-services-dashboard.json

Update the existing services dashboard:

curl -X PATCH -H "Authorization: Bearer $OAUTH_TOKEN" -H "Content-Type: application/json" \
https://monitoring.googleapis.com/v1/projects/$TF_VAR_ops_project_name/dashboards/servicesdash \
 -d @/tmp/patched-services-dashboard.json
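
The three API calls in this section (POST to create, GET to fetch, PATCH to update) differ only in the HTTP method and in whether the dashboard ID is appended to the URL. A minimal sketch of that URL pattern follows; `dashboards_url` is a hypothetical helper name and `my-ops-project` is a placeholder, neither is part of the workshop repo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the Dashboards API URL used by the curl calls above.
# Usage: dashboards_url PROJECT [DASHBOARD_ID]
dashboards_url() {
  local project="$1" dashboard_id="${2:-}"
  local base="https://monitoring.googleapis.com/v1/projects/${project}/dashboards"
  if [ -n "$dashboard_id" ]; then
    printf '%s/%s\n' "$base" "$dashboard_id"   # GET or PATCH an existing dashboard
  else
    printf '%s\n' "$base"                      # POST a new dashboard to the collection
  fi
}

# Placeholder project name (assumption); in the lab this is $TF_VAR_ops_project_name.
dashboards_url "my-ops-project"
dashboards_url "my-ops-project" "servicesdash"
```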

View the updated dashboard by navigating to the following output link:

echo "https://console.cloud.google.com/monitoring/dashboards/custom/servicesdash?cloudshell=false&project=$TF_VAR_ops_project_name"

Do some simple Logs Analysis.

Istio provides a set of structured logs for all in-mesh network traffic and uploads them to Stackdriver Logging to allow cross-cluster analysis in one powerful tool. Logs are annotated with service-level metadata such as the cluster, container, app, connection_id, etc.

An example log entry (in this case, Envoy proxy's accesslog) might look like this (trimmed):

*** DO NOT PASTE *** 
 logName: "projects/PROJECTNAME-11932-01-ops/logs/server-tcp-accesslog-stackdriver.instance.istio-system" 
labels: {
  connection_id: "fbb46826-96fd-476c-ac98-68a9bd6e585d-1517191"   
  destination_app: "redis-cart"   
  destination_ip: "10.16.1.7"   
  destination_name: "redis-cart-6448dcbdcc-cj52v"   
  destination_namespace: "cart"   
  destination_owner: "kubernetes://apis/apps/v1/namespaces/cart/deployments/redis-cart"   
  destination_workload: "redis-cart"   
  source_ip: "10.16.2.8"   
  total_received_bytes: "539"   
  total_sent_bytes: "569" 
...  
 }

View your logs here:

echo "https://console.cloud.google.com/logs/viewer?cloudshell=false&project=$TF_VAR_ops_project_name"

You can view Istio's control plane logs by selecting Resource > Kubernetes Container, and searching for "pilot".

Here, we can see the Istio Control Plane pushing proxy config to the sidecar proxies for each sample app service. "CDS," "LDS," and "RDS" represent different Envoy APIs (more information).

Beyond Istio's logs, you can also find container logs as well as infrastructure and other GCP service logs, all in the same interface. Here are some sample log queries for GKE. The logs viewer also allows you to create metrics out of logs (e.g., "count every error that matches some string"), which can be used on a dashboard or as part of an alert. Logs can also be streamed to other analysis tools such as BigQuery.

Some sample filters for hipster shop:

resource.type="k8s_container" labels.destination_app="productcatalogservice"

resource.type="k8s_container" resource.labels.namespace_name="cart"
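
Both sample filters pair a resource type with one label match, so they can be generated for any app or namespace. The sketch below only composes the filter strings; the function names are hypothetical and nothing here calls the Logging API.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that compose the two filter shapes shown above.

# Filter on the Istio-provided destination_app label.
filter_by_destination_app() {
  printf 'resource.type="k8s_container" labels.destination_app="%s"\n' "$1"
}

# Filter on the Kubernetes namespace of the container.
filter_by_namespace() {
  printf 'resource.type="k8s_container" resource.labels.namespace_name="%s"\n' "$1"
}

filter_by_destination_app "productcatalogservice"
filter_by_namespace "cart"
```

Paste the printed string into the logs viewer's filter box.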

Check out Distributed Traces.

Now that you're working with a distributed system, debugging needs a new tool: Distributed Tracing. This tool allows you to discover statistics about how your services are interacting (such as finding outlying slow events in the picture below), as well as dive into raw sample traces to investigate the details of what is really going on.

The Timeline View shows all requests over time, graphed by their latency, or time spent between initial request, through the Hipster stack, to finally respond to the end user. The higher up the dots, the slower (and less-happy!) the user's experience.

You can click on a dot to find the detailed Waterfall View of that particular request. This ability to find the raw details of a particular request (not just aggregate statistics) is vital to understanding the interplay between services, especially when hunting down rare, but bad, interactions between services.

The Waterfall View should be familiar to anyone who has used a debugger, but in this case instead of showing time spent in different processes of a single application, this is showing time spent traversing our mesh, between services, running in separate containers.

Here you can find your Traces:

echo "https://console.cloud.google.com/traces/overview?cloudshell=false&project=$TF_VAR_ops_project_name"

An example screenshot of the tool:

9. Mutual TLS Authentication

Objective: Secure connectivity between microservices (AuthN).

Enable mesh wide mTLS
Verify mTLS by inspecting logs

Copy-and-Paste Method Lab Instructions

Now that our apps are installed and Observability is set up, we can start securing the connections between services and make sure it keeps working.

For example, we can see on the Kiali dashboard that our services are not using mTLS (no "lock" icon). But the traffic is flowing and the system is working fine. Our Stackdriver Golden Metrics dashboard is giving us some peace of mind that things are working overall.

Check MeshPolicy in the ops clusters. Note that mTLS is PERMISSIVE, allowing both mTLS and plaintext (non-mTLS) traffic.

kubectl --context $OPS_GKE_1 get MeshPolicy -o json | jq '.items[].spec'
kubectl --context $OPS_GKE_2 get MeshPolicy -o json | jq '.items[].spec'

Output (do not copy):

{
  "peers": [
    {
      "mtls": {
        "mode": "PERMISSIVE"
      }
    }
  ]
}

Istio is configured on all clusters using the Istio operator, which uses the IstioControlPlane custom resource (CR). We will configure mTLS in all clusters by updating the IstioControlPlane CR and updating the k8s-repo. Setting global > mTLS > enabled: true in the IstioControlPlane CR results in the following two changes to the Istio control plane:

MeshPolicy is set to turn on mTLS mesh wide for all Services running in all clusters.
A DestinationRule is created to allow ISTIO_MUTUAL traffic between Services running in all clusters.

We will apply a kustomize patch to the IstioControlPlane CR to enable mTLS cluster-wide. Copy the patch to the relevant directory for each cluster and add it as a kustomize patch.

cp -r $WORKDIR/asm/k8s_manifests/prod/app-mtls/mtls-kustomize-patch-replicated.yaml \
$WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/istio-controlplane/mtls-kustomize-patch.yaml
cd $WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/istio-controlplane
kustomize edit add patch mtls-kustomize-patch.yaml

cp -r $WORKDIR/asm/k8s_manifests/prod/app-mtls/mtls-kustomize-patch-replicated.yaml \
$WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/istio-controlplane/mtls-kustomize-patch.yaml
cd $WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/istio-controlplane
kustomize edit add patch mtls-kustomize-patch.yaml

cp -r $WORKDIR/asm/k8s_manifests/prod/app-mtls/mtls-kustomize-patch-shared.yaml \
$WORKDIR/k8s-repo/$DEV1_GKE_1_CLUSTER/istio-controlplane/mtls-kustomize-patch.yaml
cd $WORKDIR/k8s-repo/$DEV1_GKE_1_CLUSTER/istio-controlplane
kustomize edit add patch mtls-kustomize-patch.yaml

cp -r $WORKDIR/asm/k8s_manifests/prod/app-mtls/mtls-kustomize-patch-shared.yaml \
$WORKDIR/k8s-repo/$DEV1_GKE_2_CLUSTER/istio-controlplane/mtls-kustomize-patch.yaml
cd $WORKDIR/k8s-repo/$DEV1_GKE_2_CLUSTER/istio-controlplane
kustomize edit add patch mtls-kustomize-patch.yaml

cp -r $WORKDIR/asm/k8s_manifests/prod/app-mtls/mtls-kustomize-patch-shared.yaml \
$WORKDIR/k8s-repo/$DEV2_GKE_1_CLUSTER/istio-controlplane/mtls-kustomize-patch.yaml
cd $WORKDIR/k8s-repo/$DEV2_GKE_1_CLUSTER/istio-controlplane
kustomize edit add patch mtls-kustomize-patch.yaml

cp -r $WORKDIR/asm/k8s_manifests/prod/app-mtls/mtls-kustomize-patch-shared.yaml \
$WORKDIR/k8s-repo/$DEV2_GKE_2_CLUSTER/istio-controlplane/mtls-kustomize-patch.yaml
cd $WORKDIR/k8s-repo/$DEV2_GKE_2_CLUSTER/istio-controlplane
kustomize edit add patch mtls-kustomize-patch.yaml
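
The six blocks above repeat one pattern: the ops clusters get the "replicated" patch and the app clusters get the "shared" patch. As a rough sketch, the same plan can be expressed as a loop. This version is a dry run that only prints the commands; the fallback cluster names (`ops-1`, `dev1-1`, and so on) and the fallback `WORKDIR` are placeholders, and `plan_mtls_patch` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the cp/kustomize commands instead of running them.
# Fallback values are placeholders (assumptions); in the lab the real values come
# from $WORKDIR, $OPS_GKE_1_CLUSTER, $DEV1_GKE_1_CLUSTER, and so on.
WORKDIR="${WORKDIR:-/tmp/workdir}"
PATCH_DIR="$WORKDIR/asm/k8s_manifests/prod/app-mtls"

plan_mtls_patch() {
  local variant="$1" cluster_dir="$2"   # variant: "replicated" (ops) or "shared" (apps)
  local target="$WORKDIR/k8s-repo/$cluster_dir/istio-controlplane"
  echo "cp $PATCH_DIR/mtls-kustomize-patch-$variant.yaml $target/mtls-kustomize-patch.yaml"
  echo "(cd $target && kustomize edit add patch mtls-kustomize-patch.yaml)"
}

# Ops clusters host the replicated control plane.
for c in "${OPS_GKE_1_CLUSTER:-ops-1}" "${OPS_GKE_2_CLUSTER:-ops-2}"; do
  plan_mtls_patch replicated "$c"
done

# App clusters use the shared control plane.
for c in "${DEV1_GKE_1_CLUSTER:-dev1-1}" "${DEV1_GKE_2_CLUSTER:-dev1-2}" \
         "${DEV2_GKE_1_CLUSTER:-dev2-1}" "${DEV2_GKE_2_CLUSTER:-dev2-2}"; do
  plan_mtls_patch shared "$c"
done
```

Printing the plan first makes it easy to review the target paths before touching k8s-repo.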

Commit to k8s-repo.

cd $WORKDIR/k8s-repo
git add . && git commit -am "turn mTLS on"
git push

View the status of the Ops project Cloud Build in a previously opened tab or by clicking the following link:

echo "https://console.cloud.google.com/cloud-build/builds?project=${TF_VAR_ops_project_name}"

Verify mTLS

Check MeshPolicy once more in ops clusters. Note mTLS is no longer PERMISSIVE and will only allow for mTLS traffic.

kubectl --context $OPS_GKE_1 get MeshPolicy -o json | jq '.items[].spec'
kubectl --context $OPS_GKE_2 get MeshPolicy -o json | jq '.items[].spec'

Output (do not copy):

{
  "peers": [
    {
      "mtls": {}
    }
  ]
}
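
The two MeshPolicy outputs in this section differ only in the mtls block: an explicit PERMISSIVE mode before the change, and an empty block afterwards, which in this policy API means strict mTLS. The sketch below classifies a spec read from stdin; `mtls_mode` is a hypothetical helper that assumes only the output shapes shown here, and it uses grep rather than a full JSON parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads a MeshPolicy spec (JSON) on stdin and prints the effective mode.
# Assumption: only the shapes shown in this section occur; an empty "mtls": {} means STRICT.
mtls_mode() {
  local spec
  spec="$(cat)"
  if printf '%s' "$spec" | grep -q '"mode": *"PERMISSIVE"'; then
    echo "PERMISSIVE"
  elif printf '%s' "$spec" | grep -q '"mtls"'; then
    echo "STRICT"
  else
    echo "NONE"   # no mtls peer configured at all
  fi
}

echo '{"peers":[{"mtls":{"mode":"PERMISSIVE"}}]}' | mtls_mode
echo '{"peers":[{"mtls":{}}]}' | mtls_mode
```

In the lab, the input would be the output of the `kubectl ... | jq '.items[].spec'` commands above.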

Describe the DestinationRule created by the Istio operator controller.

kubectl --context $OPS_GKE_1 get DestinationRule default -n istio-system -o json | jq '.spec'
kubectl --context $OPS_GKE_2 get DestinationRule default -n istio-system -o json | jq '.spec'

Output (do not copy):

{
  "host": "*.local",
  "trafficPolicy": {
    "tls": {
      "mode": "ISTIO_MUTUAL"
    }
  }
}

We can also see the move from HTTP to HTTPS in the logs.

We can expose this particular field from the logs in the UI by clicking on one log entry and then clicking on the value of the field you want to display. In our case, click on "http" next to "protocol":

This results in a nice way to visualize the changeover:

10. Canary Deployments

Objective: Roll out a new version of the frontend Service.

Roll out frontend-v2 (next production version) Service in one region
Use DestinationRules and VirtualServices to slowly steer traffic to frontend-v2
Verify GitOps deployment pipeline by inspecting series of commits to the k8s-repo

Copy-and-Paste Method Lab Instructions

A canary deployment is a progressive rollout of a new service. In a canary deployment, you send an increasing amount of traffic to the new version, while still sending the remaining percentage to the current version. A common pattern is to perform a canary analysis at each stage of traffic splitting, and compare the "golden signals" of the new version (latency, error rate, saturation) against a baseline. This helps prevent outages and ensures the stability of the new "v2" service at every stage of traffic splitting.

In this section, you will learn how to use Cloud Build and Istio traffic policies to create a basic canary deployment for a new version of the frontend service.

First, we'll run the Canary pipeline in the DEV1 region (us-west1), and roll out frontend v2 on both clusters in that region. Second, we'll run the Canary pipeline in the DEV2 region (us-central), and deploy v2 onto both clusters in that region. Running the pipeline on regions in order, versus in parallel across all regions, helps avoid global outages caused by bad configuration, or by bugs in the v2 app itself.

Note: we'll manually trigger the Canary pipeline in both regions, but in production, you would use an automated trigger, for instance based on a new Docker image tag pushed to a registry.

From Cloud Shell, define some env variables to simplify running the rest of the commands.

CANARY_DIR="$WORKDIR/asm/k8s_manifests/prod/app-canary/"
K8S_REPO="$WORKDIR/k8s-repo"

Run the repo-setup.sh script to copy the baseline manifests into k8s-repo.

$CANARY_DIR/repo-setup.sh

The following manifests are copied:

frontend-v2 deployment
frontend-v1 patch (to include the "v1" label, and an image with a "/version" endpoint)
respy, a small pod that prints the HTTP response distribution and helps us visualize the canary deployment in real time.
frontend Istio DestinationRule - splits the frontend Kubernetes Service into two subsets, v1 and v2, based on the "version" deployment label
frontend Istio VirtualService - routes 100% of traffic to frontend v1. This overrides the Kubernetes Service default round-robin behavior, which would immediately send 50% of all Dev1 regional traffic to frontend v2.

Commit changes to k8s-repo:

cd $K8S_REPO 
git add . && git commit -am "frontend canary setup"
git push

View the status of the Ops project Cloud Build in a previously opened tab or by clicking the following link:

echo "https://console.cloud.google.com/cloud-build/builds?project=${TF_VAR_ops_project_name}"

Navigate to Cloud Build in the console for the Ops project. Wait for the Cloud Build pipeline to complete, then get pods in the frontend namespace in both DEV1 clusters. You should see the following:

watch -n 1 kubectl --context $DEV1_GKE_1 get pods -n frontend

Output (do not copy)

NAME                           READY   STATUS    RESTARTS   AGE
frontend-578b5c5db6-h9567      2/2     Running   0          59m
frontend-v2-54b74fc75b-fbxhc   2/2     Running   0          2m26s
respy-5f4664b5f6-ff22r         2/2     Running   0          2m26s

We will use tmux to split our Cloud Shell window into 2 panes:

The bottom pane will be running the watch command to observe the HTTP response distribution for the frontend service.
The top pane will be running the actual canary pipeline script.

Run the command to split the Cloud Shell window and execute the watch command in the bottom pane.

RESPY_POD=$(kubectl --context $DEV1_GKE_1 get pod \
-n frontend -l app=respy -o jsonpath='{..metadata.name}')
export TMUX_SESSION=$(tmux display-message -p '#S')
tmux split-window -d -t $TMUX_SESSION:0 -p33 \
-v "export KUBECONFIG=$WORKDIR/asm/gke/kubemesh; \
kubectl --context $DEV1_GKE_1 exec -n frontend -it \
$RESPY_POD -c respy /bin/sh -- -c 'watch -n 1 ./respy \
--u http://frontend:80/version --c 10 --n 500'; sleep 2"

Output (do not copy)

500 requests to http://frontend:80/version...
+----------+-------------------+
| RESPONSE | % OF 500 REQUESTS |
+----------+-------------------+
| v1       | 100.0%            |
|          |                   |
+----------+-------------------+

Execute the canary pipeline on the Dev1 region. We provide a script that updates frontend-v2 traffic percentages in the VirtualService (updating weights to 20%, 50%, 80%, then 100%). Between updates, the script waits for the Cloud Build pipeline to complete. Note: this script takes about 10 minutes to complete.

K8S_REPO=$K8S_REPO CANARY_DIR=$CANARY_DIR \
OPS_DIR=$OPS_GKE_1_CLUSTER OPS_CONTEXT=$OPS_GKE_1 \
${CANARY_DIR}/auto-canary.sh
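
To make the weight progression concrete, here is a small sketch of the route block a VirtualService might carry at each step. It is illustrative only and is not the workshop's auto-canary.sh: `canary_route` is a hypothetical helper, the host name `frontend` is an assumption, and the subset names v1 and v2 follow the DestinationRule described earlier.

```shell
#!/usr/bin/env bash
# Illustrative only: NOT the workshop's auto-canary.sh.
# Assumption: the VirtualService routes to host "frontend" with subsets v1 and v2.
canary_route() {
  local v2_weight="$1"
  local v1_weight=$((100 - v2_weight))   # the two weights always sum to 100
  cat <<EOF
http:
- route:
  - destination:
      host: frontend
      subset: v1
    weight: ${v1_weight}
  - destination:
      host: frontend
      subset: v2
    weight: ${v2_weight}
EOF
}

# The same steps the script walks through: 20%, 50%, 80%, then 100% to v2.
for step in 20 50 80 100; do
  echo "# v2 at ${step}%"
  canary_route "$step"
done
```

Each step in the lab is committed to k8s-repo separately, which is why the rollout later shows up as one commit per traffic percentage.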

You can see traffic splitting in real time in the bottom window where you're running the respy command. For instance, at the 20% mark:

Output (do not copy)

500 requests to http://frontend:80/version...
+----------+-------------------+
| RESPONSE | % OF 500 REQUESTS |
+----------+-------------------+
| v1       | 79.4%             |
|          |                   |
| v2       | 20.6%             |
|          |                   |
+----------+-------------------+
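
The table above is a tally of which version answered each request. As a rough stand-in (this is not respy's implementation), the same percentages can be computed from a list of responses with standard tools; `response_distribution` is a hypothetical helper that reads one version string per line.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for respy's summary: reads one version string per line on stdin
# and prints each version with its share of the total.
response_distribution() {
  sort | uniq -c | awk '
    { count[$2] = $1; total += $1 }
    END { for (v in count) printf "%s %.1f%%\n", v, 100 * count[v] / total }
  ' | sort
}

# Five sample responses: four from v1, one from v2.
printf 'v1\nv1\nv1\nv1\nv2\n' | response_distribution
```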

Once the Dev1 rollout completes for frontend-v2, you should see a success message at the end of the script:

Output (do not copy)

✅ 100% successfully deployed
🌈 frontend-v2 Canary Complete for gke-asm-1-r1-prod

And all frontend traffic from a Dev1 pod should be going to frontend-v2:

Output (do not copy)

500 requests to http://frontend:80/version...
+----------+-------------------+
| RESPONSE | % OF 500 REQUESTS |
+----------+-------------------+
| v2       | 100.0%            |
|          |                   |
+----------+-------------------+

Close the split pane.

tmux respawn-pane -t ${TMUX_SESSION}:0.1 -k 'exit'

Navigate to Cloud Source Repos at the link generated.

echo https://source.developers.google.com/p/$TF_VAR_ops_project_name/r/k8s-repo

You should see a separate commit for each traffic percentage, with the most recent commit at the top of the list:

Now, you will repeat the same process for the Dev2 region. Note that the Dev2 region is still "locked" on v1. This is because in the baseline repo-setup.sh script, we pushed a VirtualService to explicitly send all traffic to v1. This way, we were able to safely do a regional canary on Dev1, and make sure it ran successfully before rolling out the new version globally.

Run the command to split the Cloud Shell window and execute the watch command in the bottom pane.

RESPY_POD=$(kubectl --context $DEV2_GKE_1 get pod \
-n frontend -l app=respy -o jsonpath='{..metadata.name}')
export TMUX_SESSION=$(tmux display-message -p '#S')
tmux split-window -d -t $TMUX_SESSION:0 -p33 \
-v "export KUBECONFIG=$WORKDIR/asm/gke/kubemesh; \
kubectl --context $DEV2_GKE_1 exec -n frontend -it \
$RESPY_POD -c respy /bin/sh -- -c 'watch -n 1 ./respy \
--u http://frontend:80/version --c 10 --n 500'; sleep 2"

Output (do not copy)

500 requests to http://frontend:80/version...
+----------+-------------------+
| RESPONSE | % OF 500 REQUESTS |
+----------+-------------------+
| v1       | 100.0%            |
|          |                   |
+----------+-------------------+

Execute the canary pipeline on the Dev2 region. We provide a script that updates frontend-v2 traffic percentages in the VirtualService (updating weights to 20%, 50%, 80%, then 100%). Between updates, the script waits for the Cloud Build pipeline to complete. Note: this script takes about 10 minutes to complete.

K8S_REPO=$K8S_REPO CANARY_DIR=$CANARY_DIR \
OPS_DIR=$OPS_GKE_2_CLUSTER OPS_CONTEXT=$OPS_GKE_2 \
${CANARY_DIR}/auto-canary.sh

Output (do not copy)

500 requests to http://frontend:80/version...
+----------+-------------------+
| RESPONSE | % OF 500 REQUESTS |
+----------+-------------------+
| v1       | 100.0%            |
|          |                   |
+----------+-------------------+

From the Respy pod in Dev2, watch traffic from Dev2 pods move progressively from frontend v1 to v2. Once the script completes, you should see:

Output (do not copy)

500 requests to http://frontend:80/version...
+----------+-------------------+
| RESPONSE | % OF 500 REQUESTS |
+----------+-------------------+
| v2       | 100.0%            |
|          |                   |
+----------+-------------------+

Close the split pane.

tmux respawn-pane -t ${TMUX_SESSION}:0.1 -k 'exit'

This section introduced how to use Istio for regional canary deployments. In production, instead of a manual script, you might automatically trigger this canary script as a Cloud Build pipeline, using a trigger such as a new tagged image pushed to a container registry. You would also want to add canary analysis in between each step, analyzing v2's latency and error rate against a predefined safety threshold, before sending over more traffic.
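
A canary analysis step can be as simple as a gate that compares v2's observed error rate against a threshold before the next weight increase. The sketch below shows only that comparison; `canary_within_threshold` is a hypothetical helper, the sample numbers are made up for illustration, and a real pipeline would pull the counts from a monitoring system.

```shell
#!/usr/bin/env bash
# Hypothetical gate: succeeds (exit 0) only if v2's error rate is within the threshold.
# Arguments: error count, request count, maximum error rate in percent (all integers).
canary_within_threshold() {
  local errors="$1" requests="$2" max_pct="$3"
  [ "$requests" -gt 0 ] || return 1   # no traffic observed: do not promote
  # errors/requests <= max_pct/100, rewritten to avoid fractions in integer arithmetic.
  [ $((errors * 100)) -le $((max_pct * requests)) ]
}

# Illustrative inputs only: 3 and 10 errors out of 500 requests, 1% threshold.
if canary_within_threshold 3 500 1; then echo "promote"; else echo "rollback"; fi
if canary_within_threshold 10 500 1; then echo "promote"; else echo "rollback"; fi
```

The first call prints "promote" (0.6% is within 1%) and the second prints "rollback" (2% exceeds 1%).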

11. Authorization Policies

Objective: Set up RBAC between microservices (AuthZ).

Create AuthorizationPolicy to DENY access to a microservice
Create AuthorizationPolicy to ALLOW specific access to a microservice

Copy-and-Paste Method Lab Instructions

Unlike a monolithic application that might be running in one place, globally-distributed microservices apps make calls across network boundaries. This means more points of entry into your applications, and more opportunities for malicious attacks. And because Kubernetes pods have transient IPs, traditional IP-based firewall rules are no longer adequate to secure access between workloads. In a microservices architecture, a new approach to security is needed. Building on Kubernetes security building blocks like service accounts, Istio provides a flexible set of security policies for your applications.

Istio policies cover both authentication and authorization. Authentication verifies identity (is this server who it says it is?), and authorization verifies permissions (is this client allowed to do that?). We covered Istio authentication in the Mutual TLS Authentication section (MeshPolicy). In this section, we will learn how to use Istio authorization policies to control access to one of our application workloads, currencyservice.

First, we'll deploy an AuthorizationPolicy across all 4 Dev clusters, closing off all access to currencyservice, and triggering an error in the frontend. Then, we will allow only the frontend service to access currencyservice.

Inspect the contents of currency-deny-all.yaml. This policy uses Deployment label selectors to restrict access to the currencyservice. Notice that the spec contains only a selector and no rules field: with no rules to match, this policy denies all access to the selected service.

cat $WORKDIR/asm/k8s_manifests/prod/app-authorization/currency-deny-all.yaml

Output (do not copy)

apiVersion: "security.istio.io/v1beta1"
kind: "AuthorizationPolicy"
metadata:
  name: "currency-policy"
  namespace: currency
spec:
  selector:
    matchLabels:
      app: currencyservice

Copy the currency policy into k8s-repo, for the ops clusters in both regions.

cp $WORKDIR/asm/k8s_manifests/prod/app-authorization/currency-deny-all.yaml \
$WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/app-authorization/currency-policy.yaml
cd $WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/app-authorization
kustomize edit add resource currency-policy.yaml
cp $WORKDIR/asm/k8s_manifests/prod/app-authorization/currency-deny-all.yaml \
$WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/app-authorization/currency-policy.yaml
cd $WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/app-authorization
kustomize edit add resource currency-policy.yaml

Push changes.

cd $WORKDIR/k8s-repo 
git add . && git commit -am "AuthorizationPolicy - currency: deny all"
git push

Check the status of the Ops project Cloud Build in a previously opened tab or by clicking the following link:

echo "https://console.cloud.google.com/cloud-build/builds?project=$TF_VAR_ops_project_name"

After the build finishes successfully, try to reach the hipstershop frontend in a browser on the following link:

echo "https://frontend.endpoints.$TF_VAR_ops_project_name.cloud.goog"

You should see an Authorization error from currencyservice:

Let's investigate how the currency service is enforcing this AuthorizationPolicy. First, enable trace-level logs on the Envoy proxy for one of the currency pods, since blocked authorization calls aren't logged by default.

CURRENCY_POD=$(kubectl --context $DEV1_GKE_2 get pod -n currency | grep currency| awk '{ print $1 }')
kubectl --context $DEV1_GKE_2 exec -it $CURRENCY_POD -n \
currency -c istio-proxy -- curl -X POST \
"http://localhost:15000/logging?level=trace"

Get the RBAC (authorization) logs from the currency service's sidecar proxy. You should see an "enforced denied" message, indicating that the currencyservice is set to block all inbound requests.

kubectl --context $DEV1_GKE_2 logs -n currency $CURRENCY_POD \
-c istio-proxy | grep -m 3 rbac

Output (do not copy)

[Envoy (Epoch 0)] [2020-01-30 00:45:50.815][22][debug][rbac] [external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:67] checking request: remoteAddress: 10.16.5.15:37310, localAddress: 10.16.3.8:7000, ssl: uriSanPeerCertificate: spiffe://cluster.local/ns/frontend/sa/frontend, subjectPeerCertificate: , headers: ':method', 'POST'
[Envoy (Epoch 0)] [2020-01-30 00:45:50.815][22][debug][rbac] [external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:118] enforced denied
[Envoy (Epoch 0)] [2020-01-30 00:45:50.815][22][debug][http] [external/envoy/source/common/http/conn_manager_impl.cc:1354] [C115][S17310331589050212978] Sending local reply with details rbac_access_denied
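
The first log line above records who made the denied call: the uriSanPeerCertificate field carries the caller's SPIFFE identity. The sketch below extracts that field from log lines on stdin; `caller_identity` is a hypothetical helper and assumes the log format shown in this output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the caller's SPIFFE identity from Envoy rbac debug lines on stdin.
# Assumes the "uriSanPeerCertificate: <uri>," format shown in the output above.
caller_identity() {
  grep -o 'uriSanPeerCertificate: spiffe://[^,]*' | sed 's/^uriSanPeerCertificate: //'
}

# Shortened copy of the "checking request" line from the sample output.
echo "checking request: ssl: uriSanPeerCertificate: spiffe://cluster.local/ns/frontend/sa/frontend, subjectPeerCertificate: , headers: ':method', 'POST'" \
  | caller_identity
```

In the lab, you could pipe the `kubectl logs ... -c istio-proxy` output into this function to list which identities are being checked.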

Now, let's allow the frontend – but not the other backend services – to access currencyservice. Open currency-allow-frontend.yaml and inspect its contents. Note that we've added the following rule:

cat ${WORKDIR}/asm/k8s_manifests/prod/app-authorization/currency-allow-frontend.yaml

Output (do not copy)

rules:
 - from:
   - source:
       principals: ["cluster.local/ns/frontend/sa/frontend"]

Here, we are whitelisting a specific source.principal (client) to access currency service. This source.principal is defined by its Kubernetes Service Account. In this case, the service account we are whitelisting is the frontend service account in the frontend namespace.

Note: when using Kubernetes Service Accounts in Istio AuthorizationPolicies, you must first enable cluster-wide mutual TLS, as we did in the Mutual TLS Authentication section. This ensures that service account credentials are mounted into requests.

Copy over the updated currency policy

cp $WORKDIR/asm/k8s_manifests/prod/app-authorization/currency-allow-frontend.yaml \
$WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/app-authorization/currency-policy.yaml
cp $WORKDIR/asm/k8s_manifests/prod/app-authorization/currency-allow-frontend.yaml \
$WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/app-authorization/currency-policy.yaml

Push changes.

cd $WORKDIR/k8s-repo
git add . && git commit -am "AuthorizationPolicy - currency: allow frontend"
git push

View the status of the Ops project Cloud Build in a previously opened tab or by clicking the following link:

echo "https://console.cloud.google.com/cloud-build/builds?project=$TF_VAR_ops_project_name"

After the build finishes successfully, open the Hipstershop frontend again. This time you should see no errors on the homepage, because the frontend is explicitly allowed to access the currency service.
Now, try to execute a checkout by adding items to your cart and clicking "place order." This time, you should see a price-conversion error from currency service, because we have only whitelisted the frontend, so the checkoutservice is still unable to access currencyservice.

Finally, let's allow the checkout service access to currency, by adding another rule to our currencyservice AuthorizationPolicy. Note that we are only opening up currency access to the two services that need to access it - frontend and checkout. The other backends will still be blocked.
Open currency-allow-frontend-checkout.yaml and inspect its contents. Notice that the list of rules functions as a logical OR - currency will accept only requests from workloads with either of these two service accounts.

cat ${WORKDIR}/asm/k8s_manifests/prod/app-authorization/currency-allow-frontend-checkout.yaml

Output (do not copy)

apiVersion: "security.istio.io/v1beta1"
kind: "AuthorizationPolicy"
metadata:
  name: "currency-policy"
  namespace: currency
spec:
  selector:
    matchLabels:
      app: currencyservice
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/frontend/sa/frontend"]
  - from:
    - source:
        principals: ["cluster.local/ns/checkout/sa/checkout"]
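
The OR semantics of the rules list can be mimicked in a few lines of shell: a caller is allowed if its principal matches any entry. This is a sketch of the logic only, since the real evaluation happens inside the Envoy sidecar; `principal_for`, `is_allowed`, and `check` are hypothetical helpers, and the `cart/cart` caller is a made-up example of a workload that is not on the list.

```shell
#!/usr/bin/env bash
# Sketch of the OR semantics only; Istio's real evaluation happens in the Envoy sidecar.
# The two principals are the ones listed in the policy above.
ALLOWED_PRINCIPALS="cluster.local/ns/frontend/sa/frontend
cluster.local/ns/checkout/sa/checkout"

# principal_for NAMESPACE SERVICE_ACCOUNT: builds the principal string format used in the policy.
principal_for() {
  printf 'cluster.local/ns/%s/sa/%s\n' "$1" "$2"
}

# is_allowed PRINCIPAL: exit 0 if the principal exactly matches any allowed entry.
is_allowed() {
  printf '%s\n' "$ALLOWED_PRINCIPALS" | grep -Fqx "$1"
}

check() {
  if is_allowed "$(principal_for "$1" "$2")"; then
    echo "$1/$2: ALLOW"
  else
    echo "$1/$2: DENY"
  fi
}

check frontend frontend
check checkout checkout
check cart cart   # hypothetical caller that is not in the policy
```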

Copy the final authorization policy to k8s-repo.

cp $WORKDIR/asm/k8s_manifests/prod/app-authorization/currency-allow-frontend-checkout.yaml \
$WORKDIR/k8s-repo/$OPS_GKE_1_CLUSTER/app-authorization/currency-policy.yaml
cp $WORKDIR/asm/k8s_manifests/prod/app-authorization/currency-allow-frontend-checkout.yaml \
$WORKDIR/k8s-repo/$OPS_GKE_2_CLUSTER/app-authorization/currency-policy.yaml

Push changes

cd $WORKDIR/k8s-repo 
git add . && git commit -am "AuthorizationPolicy - currency: allow frontend and checkout"
git push

View the status of the Ops project Cloud Build in a previously opened tab or by clicking the following link:

echo "https://console.cloud.google.com/cloud-build/builds?project=$TF_VAR_ops_project_name"

After the build finishes successfully, try to execute a checkout - it should work successfully.

This section walked through how to use Istio Authorization Policies to enforce granular access control at the per-service level. In production, you might create one AuthorizationPolicy per service, and (for instance) use an allow-all policy to let all workloads in the same namespace access each other.

12. Infrastructure Scaling

Objective: Scale infrastructure by adding new region, project, and clusters.

Clone the infrastructure repo
Update the terraform files to create new resources
2 subnets in the new region (one for the ops project and one for the new project)
New ops cluster in new region (in the new subnet)
New Istio control plane for the new region
2 apps clusters in the new project in the new region
Commit to infrastructure repo
Verify setup

Copy-and-Paste Method Lab Instructions

There are a number of ways to scale a platform. You can add more compute by adding nodes to existing clusters. You can add more clusters in a region. Or you can add more regions to the platform. The decision on what aspect of the platform to scale depends upon the requirements. For example, if you have clusters in all three zones in a region, adding more nodes (or node pools) to the existing clusters may suffice. However, if you have clusters in two of three zones in a single region, then adding a new cluster in the third zone gives you scaling and an additional fault domain (i.e., a new zone). Another reason for adding a new cluster in a region might be the need to create a single-tenant cluster for regulatory or compliance reasons (for example PCI, or a database cluster that houses PII). As your business and services expand, adding new regions becomes inevitable to provide services closer to the clients.

The current platform consists of two regions and clusters in two zones per region. You can think of scaling the platform in two ways:

Vertically - within each region by adding more compute. This is done either by adding more nodes (or node pools) to existing clusters or by adding new clusters within the region. This is done via the infrastructure repo. The simplest path is adding nodes to existing clusters; no additional configuration is required. Adding new clusters may require additional subnets (and secondary ranges), adding appropriate firewall rules, adding the new clusters to the regional ASM/Istio service mesh control plane, and deploying application resources to the new clusters.
Horizontally - by adding more regions. The current platform gives you a regional template. It consists of a regional ops cluster where the ASM/Istio control plane resides and two (or more) zonal application clusters where application resources are deployed.

In this workshop, you scale the platform "horizontally", as that encompasses the vertical use case steps as well. To scale the platform horizontally by adding a new region (r3), the following resources need to be added:

Subnets in the host project shared VPC in region r3 for the new ops and application clusters.
A regional ops cluster in region r3 where the ASM/Istio control plane resides.
Two zonal application clusters in two zones on region r3.
Update to the k8s-repo:
Deploy ASM/Istio control plane resources to the ops cluster in region r3.
Deploy ASM/Istio shared control plane resources to the app clusters in region r3.
While you don't need to create a new project, the steps in the workshop demonstrate adding a new project dev3 to cover the use case of adding a new team to the platform.

The infrastructure repo is used to add the new resources listed above.

In Cloud Shell, navigate to WORKDIR and clone the infrastructure repo.

mkdir -p $WORKDIR/infra-repo
cd $WORKDIR/infra-repo
git init && git remote add origin https://source.developers.google.com/p/${TF_ADMIN}/r/infrastructure
git config --local user.email ${MY_USER}
git config --local user.name "infra repo user"
git config --local credential.'https://source.developers.google.com'.helper gcloud.sh
git pull origin master

Clone the workshop source repo add-proj branch into the add-proj-repo directory.

cd $WORKDIR
git clone https://github.com/GoogleCloudPlatform/anthos-service-mesh-workshop.git add-proj-repo -b add-proj

Copy files from the add-proj branch in the source workshop repo. The add-proj branch contains the changes for this section.

cp -r $WORKDIR/add-proj-repo/infrastructure/* $WORKDIR/infra-repo/

Replace the infrastructure directory in the add-proj repo directory with a symlink to the infra-repo directory to allow the scripts on the branch to run.

rm -rf $WORKDIR/add-proj-repo/infrastructure
ln -s $WORKDIR/infra-repo $WORKDIR/add-proj-repo/infrastructure

Run the add-project.sh script to copy the shared states and vars to the new project directory structure.

$WORKDIR/add-proj-repo/scripts/add-project.sh app3 $WORKDIR/asm $WORKDIR/infra-repo

Commit and push changes to create new project

cd $WORKDIR/infra-repo
git add .
git status
git commit -m "add new project" && git push origin master

The commit triggers the infrastructure repo to deploy the infrastructure with the new resources. View the Cloud Build progress by opening the link printed by the following command and navigating to the latest build at the top.

echo "https://console.cloud.google.com/cloud-build/builds?project=${TF_ADMIN}"

The last step of the infrastructure Cloud Build creates new Kubernetes resources in the k8s-repo. This triggers the Cloud Build in the k8s-repo (in the ops project). The new Kubernetes resources are for the three new clusters added in the previous step. ASM/Istio control plane and shared control plane resources are added to the new clusters with the k8s-repo Cloud Build.

After the infrastructure Cloud Build finishes successfully, navigate to the latest k8s-repo Cloud Build run by opening the link printed by the following command.

echo "https://console.cloud.google.com/cloud-build/builds?project=${TF_VAR_ops_project_name}"

Run the following script to add the new clusters to the vars and kubeconfig file.

$WORKDIR/add-proj-repo/scripts/setup-gke-vars-kubeconfig-add-proj.sh $WORKDIR/asm

Change the KUBECONFIG variable to point to the new kubeconfig file.

source $WORKDIR/asm/vars/vars.sh
export KUBECONFIG=$WORKDIR/asm/gke/kubemesh

List your cluster contexts. You should see nine clusters: six application clusters and three ops clusters.

kubectl config view -ojson | jq -r '.clusters[].name'

    `Output (do not copy)`

gke_user001-200204-05-dev1-49tqc4_us-west1-a_gke-1-apps-r1a-prod
gke_user001-200204-05-dev1-49tqc4_us-west1-b_gke-2-apps-r1b-prod
gke_user001-200204-05-dev2-49tqc4_us-central1-a_gke-3-apps-r2a-prod
gke_user001-200204-05-dev2-49tqc4_us-central1-b_gke-4-apps-r2b-prod
gke_user001-200204-05-dev3-49tqc4_us-east1-b_gke-5-apps-r3b-prod
gke_user001-200204-05-dev3-49tqc4_us-east1-c_gke-6-apps-r3c-prod
gke_user001-200204-05-ops-49tqc4_us-central1_gke-asm-2-r2-prod
gke_user001-200204-05-ops-49tqc4_us-east1_gke-asm-3-r3-prod
gke_user001-200204-05-ops-49tqc4_us-west1_gke-asm-1-r1-prod
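The cluster list above can also be checked mechanically. The following is a minimal sketch, not part of the workshop scripts: summarize_clusters is a hypothetical helper, and the "-apps-" and "gke-asm-" name patterns are assumptions taken from the sample output above.

```shell
# Summarize cluster names read from stdin, one name per line.
# Assumption: application clusters contain "-apps-" and ops clusters contain
# "gke-asm-", as in the sample output; adjust if your names differ.
summarize_clusters() {
  awk '/-apps-/   { apps++ }
       /gke-asm-/ { ops++ }
       END { printf "apps=%d ops=%d\n", apps, ops }'
}
```

To use it, pipe the output of the kubectl config view command above into summarize_clusters. After this section, the result should report six application clusters and three ops clusters.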

Verify Istio Installation

Ensure Istio is installed on the new ops cluster by checking all pods are running and jobs have completed.

kubectl --context $OPS_GKE_3 get pods -n istio-system

    `Output (do not copy)`

NAME                                      READY   STATUS    RESTARTS   AGE
grafana-5f798469fd-72g6w                  1/1     Running   0          5h12m
istio-citadel-7d8595845-hmmvj             1/1     Running   0          5h12m
istio-egressgateway-779b87c464-rw8bg      1/1     Running   0          5h12m
istio-galley-844ddfc788-zzpkl             2/2     Running   0          5h12m
istio-ingressgateway-59ccd6574b-xfj98     1/1     Running   0          5h12m
istio-pilot-7c8989f5cf-5plsg              2/2     Running   0          5h12m
istio-policy-6674bc7678-2shrk             2/2     Running   3          5h12m
istio-sidecar-injector-7795bb5888-kbl5p   1/1     Running   0          5h12m
istio-telemetry-5fd7cbbb47-c4q7b          2/2     Running   2          5h12m
istio-tracing-cd67ddf8-2qwkd              1/1     Running   0          5h12m
istiocoredns-5f7546c6f4-qhj9k             2/2     Running   0          5h12m
kiali-7964898d8c-l74ww                    1/1     Running   0          5h12m
prometheus-586d4445c7-x9ln6               1/1     Running   0          5h12m

Ensure Istio is installed on both dev3 clusters. Only Citadel, sidecar-injector and coredns run in the dev3 clusters. They share an Istio control plane running in the ops-3 cluster.

kubectl --context $DEV3_GKE_1 get pods -n istio-system
kubectl --context $DEV3_GKE_2 get pods -n istio-system

    `Output (do not copy)`

NAME                                      READY   STATUS    RESTARTS   AGE
istio-citadel-568747d88-4lj9b             1/1     Running   0          66s
istio-sidecar-injector-759bf6b4bc-ks5br   1/1     Running   0          66s
istiocoredns-5f7546c6f4-qbsqm             2/2     Running   0          78s
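Reading pod tables by eye is error-prone when there are many pods. The sketch below shows one way to list only the pods that are not healthy. unready_pods is a hypothetical helper, and it assumes the default kubectl get pods column layout shown above (NAME, READY, STATUS in the first three columns).

```shell
# Read `kubectl get pods` output from stdin and print the name of every pod
# that is neither fully ready nor a completed job.
unready_pods() {
  awk 'NR > 1 && $3 != "Completed" {
         split($2, ready, "/")            # READY column, e.g. "1/2"
         if ($3 != "Running" || ready[1] != ready[2]) print $1
       }'
}
```

For example, pipe kubectl --context $OPS_GKE_3 get pods -n istio-system into unready_pods. No output means every pod is either Running with all containers ready or Completed.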

Verify service discovery for shared control planes

Verify the secrets are deployed in all ops clusters for all six application clusters.

kubectl --context $OPS_GKE_1 get secrets -l istio/multiCluster=true -n istio-system
kubectl --context $OPS_GKE_2 get secrets -l istio/multiCluster=true -n istio-system
kubectl --context $OPS_GKE_3 get secrets -l istio/multiCluster=true -n istio-system

    `Output (do not copy)`

NAME                  TYPE     DATA   AGE
gke-1-apps-r1a-prod   Opaque   1      14h
gke-2-apps-r1b-prod   Opaque   1      14h
gke-3-apps-r2a-prod   Opaque   1      14h
gke-4-apps-r2b-prod   Opaque   1      14h
gke-5-apps-r3b-prod   Opaque   1      5h12m
gke-6-apps-r3c-prod   Opaque   1      5h12m
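The same check can be scripted so that a missing secret stands out. This is a sketch under the assumption that the secret names match the application cluster names, as in the sample output; missing_secrets is a hypothetical helper, not a workshop script.

```shell
# Print each expected secret name (given as arguments) that does not appear
# in the first column of `kubectl get secrets` output read from stdin.
missing_secrets() {
  awk -v expected="$*" '
    NR > 1 { seen[$1] = 1 }               # skip the header row
    END {
      n = split(expected, names, " ")
      for (i = 1; i <= n; i++)
        if (!(names[i] in seen)) print names[i]
    }'
}
```

Pipe each of the three kubectl get secrets commands above into missing_secrets, passing the six expected names (gke-1-apps-r1a-prod through gke-6-apps-r3c-prod) as arguments. Empty output means all six secrets are present in that ops cluster.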

13. Circuit Breaking

Objective: Implement a Circuit Breaker for the shipping Service.

Create a DestinationRule for the shipping Service to implement a circuit breaker
Use fortio (a load gen utility) to validate circuit breaker for the shipping Service by force tripping the circuit

Fast Track Script Lab Instructions

Fast Track Script Lab is coming soon!!

Copy-and-Paste Method Lab Instructions

Now that we've learned some basic monitoring and troubleshooting strategies for Istio-enabled services, let's look at how Istio helps you improve the resilience of your services, reducing the amount of troubleshooting you'll have to do in the first place.

A microservices architecture introduces the risk of cascading failures, where the failure of one service can propagate to its dependencies, and the dependencies of those dependencies, causing a "ripple effect" outage that can potentially affect end-users. Istio provides a Circuit Breaker traffic policy to help you isolate services, protecting downstream (client-side) services from waiting on failing services, and protecting upstream (server-side) services from a sudden flood of downstream traffic when they do come back online. Overall, using Circuit Breakers can help you avoid all your services failing their SLOs because of one backend service that is hanging.

The Circuit Breaker pattern is named for an electrical switch that can "trip" when too much electricity flows through, protecting devices from overload. In an Istio setup, this means that Envoy is the circuit breaker, keeping track of the number of pending requests for a service. In this default closed state, requests flow through Envoy uninterrupted.

But when the number of pending requests exceeds your defined threshold, the circuit breaker trips (opens), and Envoy immediately returns an error. This allows the server to fail fast for the client, and prevents the server application code from receiving the client's request when overloaded.

Then, after your defined timeout, Envoy moves to a half open state, where the server can start receiving requests again in a probationary way, and if it can successfully respond to requests, the circuit breaker closes again, and requests to the server begin to flow again.
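The three states described above can be written down as a small transition table. This is a toy model for illustration only, not how Envoy is implemented; the function name next_state and the event names (overflow, timeout, success, failure) are made up for this sketch.

```shell
# Toy model of the circuit breaker states described above.
# Usage: next_state <current-state> <event>; prints the resulting state.
next_state() {
  case "$1:$2" in
    closed:overflow)   echo open ;;       # too many pending requests: trip
    open:timeout)      echo half-open ;;  # ejection time elapsed: probe again
    half-open:success) echo closed ;;     # probe succeeded: resume traffic
    half-open:failure) echo open ;;       # probe failed: stay tripped
    *)                 echo "$1" ;;       # any other event: state unchanged
  esac
}
```

For instance, next_state closed overflow prints open, and a later timeout followed by a success brings the breaker back to closed.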

This diagram summarizes the Istio circuit breaker pattern. The blue rectangles represent Envoy, the blue-filled circle represents the client, and the white-filled circles represent the server container:

You can define Circuit Breaker policies using Istio DestinationRules. In this section, we'll apply the following policy to enforce a circuit breaker for the shipping service:

Output (do not copy)

apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "shippingservice-shipping-destrule"
  namespace: "shipping"
spec:
  host: "shippingservice.shipping.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
    connectionPool:
      tcp:
        maxConnections: 1
      http:
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1
    outlierDetection:
      consecutiveErrors: 1
      interval: 1s
      baseEjectionTime: 10s
      maxEjectionPercent: 100

There are two DestinationRule fields to note here. connectionPool defines the number of connections and pending requests this service will allow. The outlierDetection field is where we configure how Envoy will determine the threshold at which to open the circuit breaker. Here, every second (interval), Envoy will count the number of errors it received from the server container. If the count reaches the consecutiveErrors threshold, the Envoy circuit breaker will open, and up to 100% (maxEjectionPercent) of shippingservice pods will be shielded from new client requests for 10 seconds (baseEjectionTime). Once the Envoy circuit breaker is open (i.e. active), clients will receive 503 (Service Unavailable) errors. Let's see this in action.

Set environment variables for the k8s-repo and asm dir to simplify commands.

export K8S_REPO="${WORKDIR}/k8s-repo"
export ASM="${WORKDIR}/asm"

Update the k8s-repo

cd $WORKDIR/k8s-repo
git pull
cd $WORKDIR

Update the shipping service DestinationRule on both Ops clusters.

cp $ASM/k8s_manifests/prod/istio-networking/app-shipping-circuit-breaker.yaml ${K8S_REPO}/${OPS_GKE_1_CLUSTER}/istio-networking/app-shipping-circuit-breaker.yaml
cp $ASM/k8s_manifests/prod/istio-networking/app-shipping-circuit-breaker.yaml ${K8S_REPO}/${OPS_GKE_2_CLUSTER}/istio-networking/app-shipping-circuit-breaker.yaml

cd ${K8S_REPO}/${OPS_GKE_1_CLUSTER}/istio-networking/; kustomize edit add resource app-shipping-circuit-breaker.yaml
cd ${K8S_REPO}/${OPS_GKE_2_CLUSTER}/istio-networking/; kustomize edit add resource app-shipping-circuit-breaker.yaml

Copy a Fortio load generator pod into the GKE_1 cluster in the Dev1 region. This is the client pod we'll use to "trip" the circuit breaker for shippingservice.

cp $ASM/k8s_manifests/prod/app/deployments/app-fortio.yaml ${K8S_REPO}/${DEV1_GKE_1_CLUSTER}/app/deployments/
cd ${K8S_REPO}/${DEV1_GKE_1_CLUSTER}/app/deployments; kustomize edit add resource app-fortio.yaml

Commit changes.

cd $K8S_REPO 
git add . && git commit -am "Circuit Breaker: shippingservice"
git push
cd $ASM

Wait for Cloud Build to complete.
Back in Cloud Shell, use the fortio pod to send gRPC traffic to shippingservice with 1 concurrent connection, 1000 requests total - this will not trip the circuit breaker, because we have not exceeded the connectionPool settings yet.

FORTIO_POD=$(kubectl --context ${DEV1_GKE_1} get pod -n shipping | grep fortio | awk '{ print $1 }')

kubectl --context ${DEV1_GKE_1} exec -it $FORTIO_POD -n shipping -c fortio -- /usr/bin/fortio load -grpc -c 1 -n 1000 -qps 0 shippingservice.shipping.svc.cluster.local:50051

Output (do not copy)

Health SERVING : 1000
All done 1000 calls (plus 0 warmup) 4.968 ms avg, 201.2 qps

Now run fortio again, increasing the number of concurrent connections to 2, but keeping the total number of requests constant. We should see up to two-thirds of the requests return an "overflow" error, because the circuit breaker has been tripped: the connectionPool settings in the policy we defined allow only 1 concurrent connection and 1 pending request.

kubectl --context ${DEV1_GKE_1} exec -it $FORTIO_POD -n shipping -c fortio -- /usr/bin/fortio load -grpc -c 2 -n 1000 -qps 0 shippingservice.shipping.svc.cluster.local:50051

Output (do not copy)

18:46:16 W grpcrunner.go:107> Error making grpc call: rpc error: code = Unavailable desc = upstream connect error or disconnect/reset before headers. reset reason: overflow
...

Health ERROR : 625
Health SERVING : 375
All done 1000 calls (plus 0 warmup) 12.118 ms avg, 96.1 qps
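To compare runs, it helps to turn the two Health lines into a single error percentage. The sketch below assumes the fortio summary format shown in the sample output ("Health ERROR : N" and "Health SERVING : N"); fortio_error_pct is a hypothetical helper name.

```shell
# Read fortio gRPC output from stdin and print the percentage of calls that
# returned an error, based on the "Health ERROR" and "Health SERVING" lines.
fortio_error_pct() {
  tr -d '\r' | awk '
    $1 == "Health" && $2 == "ERROR"   { err = $4 }
    $1 == "Health" && $2 == "SERVING" { ok = $4 }
    END {
      total = err + ok
      if (total > 0) printf "%.1f\n", 100 * err / total
      else print "n/a"
    }'
}
```

With the sample output above (625 errors, 375 successes) this prints 62.5. For the first run, which has no ERROR line, it prints 0.0.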

Envoy keeps track of the number of requests it rejected while the circuit breaker was active with the upstream_rq_pending_overflow metric. Let's find this in the fortio pod:

kubectl --context ${DEV1_GKE_1} exec -it $FORTIO_POD -n shipping -c istio-proxy  -- sh -c 'curl localhost:15000/stats' | grep shipping | grep pending

Output (do not copy)

cluster.outbound|50051||shippingservice.shipping.svc.cluster.local.circuit_breakers.default.rq_pending_open: 0
cluster.outbound|50051||shippingservice.shipping.svc.cluster.local.circuit_breakers.high.rq_pending_open: 0
cluster.outbound|50051||shippingservice.shipping.svc.cluster.local.upstream_rq_pending_active: 0
cluster.outbound|50051||shippingservice.shipping.svc.cluster.local.upstream_rq_pending_failure_eject: 9
cluster.outbound|50051||shippingservice.shipping.svc.cluster.local.upstream_rq_pending_overflow: 565
cluster.outbound|50051||shippingservice.shipping.svc.cluster.local.upstream_rq_pending_total: 1433
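If you only need one counter from the stats dump, you can filter by the metric name at the end of each line. This is a minimal sketch that assumes the "name: value" layout shown above; envoy_stat is a hypothetical helper.

```shell
# Read Envoy stats ("name: value" per line) from stdin and print the value of
# every stat whose name ends in ".<metric>", where <metric> is argument 1.
envoy_stat() {
  tr -d '\r' | awk -F': ' -v suffix=".$1" '
    substr($1, length($1) - length(suffix) + 1) == suffix { print $2 }'
}
```

For example, appending | envoy_stat upstream_rq_pending_overflow to the kubectl exec pipeline above prints only the overflow count, which is 565 in the sample output. The tr call strips the carriage returns that kubectl exec -it can add to each line.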

Clean up by removing the circuit breaker policy from both regions.

kubectl --context ${OPS_GKE_1} delete destinationrule shippingservice-circuit-breaker -n shipping 
rm ${K8S_REPO}/${OPS_GKE_1_CLUSTER}/istio-networking/app-shipping-circuit-breaker.yaml
cd ${K8S_REPO}/${OPS_GKE_1_CLUSTER}/istio-networking/; kustomize edit remove resource app-shipping-circuit-breaker.yaml
 

kubectl --context ${OPS_GKE_2} delete destinationrule shippingservice-circuit-breaker -n shipping 
rm ${K8S_REPO}/${OPS_GKE_2_CLUSTER}/istio-networking/app-shipping-circuit-breaker.yaml
cd ${K8S_REPO}/${OPS_GKE_2_CLUSTER}/istio-networking/; kustomize edit remove resource app-shipping-circuit-breaker.yaml
cd $K8S_REPO; git add .; git commit -m "Circuit Breaker: cleanup"; git push origin master

This section demonstrated how to set up a single circuit breaker policy for a service. A best practice is to set up a circuit breaker for any upstream (backend) service that has the potential to hang. By applying Istio circuit breaker policies, you help isolate your microservices, build fault tolerance into your architecture, and reduce the risk of cascading failures under high load.

14. Fault Injection

Objective: Test the resilience of the recommendation Service by introducing delays (before it is pushed to production).

Create a VirtualService for the recommendation Service to introduce a 5s delay
Test the delay using fortio load generator
Remove the delay in the VirtualService and validate

Fast Track Script Lab Instructions

Fast Track Script Lab is coming soon!!

Copy-and-Paste Method Lab Instructions

Adding circuit breaker policies to your services is one way to build resilience into services in production. But circuit breaking results in faults (potentially user-facing errors), which is not ideal. To get ahead of these error cases, and better predict how your downstream services might respond when backends do return errors, you can adopt chaos testing in a staging environment. Chaos testing is the practice of deliberately breaking your services, in order to analyze weak points in the system and improve fault tolerance. You can also use chaos testing to identify ways to mitigate user-facing errors when backends fail - for instance, by displaying a cached result in a frontend.

Using Istio for fault injection is helpful because you can use your production release images, and add the fault at the network layer, instead of modifying source code. In production, you might use a full-fledged chaos testing tool to test resilience at the Kubernetes/compute layer in addition to the network layer.

You can use Istio for chaos testing by applying a VirtualService with the "fault" field. Istio supports two kinds of faults: delay faults (inject a delay before the request is forwarded) and abort faults (inject HTTP errors). In this example, we'll inject a 5-second delay fault into the recommendation service. But this time, instead of using a circuit breaker to "fail fast" against this hanging service, we will force downstream services to endure the full delay.

Set environment variables for the k8s-repo and asm directories, then navigate into the asm directory.

export K8S_REPO="${WORKDIR}/k8s-repo"
export ASM="${WORKDIR}/asm"
cd $ASM

Open k8s_manifests/prod/istio-networking/app-recommendation-vs-fault.yaml to inspect its contents. Notice that Istio has an option to inject the fault into a percentage of the requests. Here, we'll introduce a delay into all recommendationservice requests.

Output (do not copy)

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: recommendation-delay-fault
spec:
  hosts:
  - recommendationservice.recommendation.svc.cluster.local
  http:
  - route:
    - destination:
        host: recommendationservice.recommendation.svc.cluster.local
    fault:
      delay:
        percentage:
          value: 100
        fixedDelay: 5s

Copy the VirtualService into the k8s-repo. We'll inject the fault globally, across both regions.

cp $ASM/k8s_manifests/prod/istio-networking/app-recommendation-vs-fault.yaml ${K8S_REPO}/${OPS_GKE_1_CLUSTER}/istio-networking/app-recommendation-vs-fault.yaml
cd ${K8S_REPO}/${OPS_GKE_1_CLUSTER}/istio-networking/; kustomize edit add resource app-recommendation-vs-fault.yaml

cp $ASM/k8s_manifests/prod/istio-networking/app-recommendation-vs-fault.yaml ${K8S_REPO}/${OPS_GKE_2_CLUSTER}/istio-networking/app-recommendation-vs-fault.yaml
cd ${K8S_REPO}/${OPS_GKE_2_CLUSTER}/istio-networking/; kustomize edit add resource app-recommendation-vs-fault.yaml

Push changes

cd $K8S_REPO 
git add . && git commit -am "Fault Injection: recommendationservice"
git push
cd $ASM

Wait for Cloud Build to complete.
Exec into the fortio pod deployed in the circuit breaker section, and send some traffic to recommendationservice.

FORTIO_POD=$(kubectl --context ${DEV1_GKE_1} get pod -n shipping | grep fortio | awk '{ print $1 }')

kubectl --context ${DEV1_GKE_1} exec -it $FORTIO_POD -n shipping -c fortio -- /usr/bin/fortio load -grpc -c 100 -n 100 -qps 0 recommendationservice.recommendation.svc.cluster.local:8080

Once the fortio command is complete, you should see responses averaging 5s:

Output (do not copy)

Ended after 5.181367359s : 100 calls. qps=19.3
Aggregated Function Time : count 100 avg 5.0996506 +/- 0.03831 min 5.040237641 max 5.177559818 sum 509.965055
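The same observation can be turned into a pass/fail check. The sketch below assumes the "Aggregated Function Time" line format shown in the sample output; delay_observed is a hypothetical helper that succeeds when the reported average is at least the given number of seconds.

```shell
# Read fortio output from stdin. Exit 0 if the average latency reported on the
# "Aggregated Function Time" line is at least <seconds> (argument 1), else 1.
delay_observed() {
  tr -d '\r' | awk -v min="$1" '
    /^Aggregated Function Time/ {
      for (i = 1; i < NF; i++) if ($i == "avg") avg = $(i + 1)
    }
    END { exit !(avg != "" && avg + 0 >= min + 0) }'
}
```

For example, piping the fortio output into delay_observed 5 succeeds while the fault is active and fails once the VirtualService has been removed and latencies return to normal.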

Another way to see the injected fault in action is to open the frontend in a web browser and click on any product. A product page should take 5 extra seconds to load, since it fetches the recommendations that are displayed at the bottom of the page.
Clean up by removing the fault injection VirtualService from both Ops clusters.

kubectl --context ${OPS_GKE_1} delete virtualservice recommendation-delay-fault -n recommendation 
rm ${K8S_REPO}/${OPS_GKE_1_CLUSTER}/istio-networking/app-recommendation-vs-fault.yaml
cd ${K8S_REPO}/${OPS_GKE_1_CLUSTER}/istio-networking/; kustomize edit remove resource app-recommendation-vs-fault.yaml

kubectl --context ${OPS_GKE_2} delete virtualservice recommendation-delay-fault -n recommendation 
rm ${K8S_REPO}/${OPS_GKE_2_CLUSTER}/istio-networking/app-recommendation-vs-fault.yaml
cd ${K8S_REPO}/${OPS_GKE_2_CLUSTER}/istio-networking/; kustomize edit remove resource app-recommendation-vs-fault.yaml

Push changes:

cd $K8S_REPO 
git add . && git commit -am "Fault Injection cleanup / restore"
git push
cd $ASM

15. Monitoring the Istio Control Plane

ASM installs four important control plane components: Pilot, Mixer, Galley and Citadel. Each sends its relevant monitoring metrics to Prometheus, and ASM ships with Grafana dashboards that let operators visualize this monitoring data and assess the health and performance of the control plane.

Viewing the Dashboards

Port-forward your Grafana service installed with Istio

kubectl --context ${OPS_GKE_1} -n istio-system port-forward svc/grafana 3000:3000 >> /dev/null

Open Grafana in your browser
Click on the "Web Preview" icon on the top right corner of your Cloud Shell Window
Click Preview on port 3000 (Note: if the port is not 3000, click on change port and select port 3000)
This will open a tab in your browser with a URL similar to " BASE_URL/?orgId=1&authuser=0&environment_id=default "
View available dashboards
Modify the URL to " BASE_URL/dashboard "
Click on "istio" folder to view available dashboards
Click on any of those dashboards to view the performance of that component. We'll look at the important metrics for each component in the following sections.

Monitoring Pilot

Pilot is the control plane component that distributes networking and policy configuration to the data plane (the Envoy proxies). Pilot tends to scale with the number of workloads and deployments, although not necessarily with the amount of traffic to those workloads. An unhealthy Pilot can:

consume more resources than necessary (CPU and/or RAM)
result in delays in pushing updated configuration information to Envoys

Note: if Pilot is down, or if there are delays, your workloads still serve traffic.

Navigate to " BASE_URL/dashboard/db/istio-pilot-dashboard " in your browser to view Pilot metrics.

Important monitored metrics

Resource Usage

Use the Istio Performance and Scalability page as your guide for acceptable usage numbers. Contact GCP support if you see significantly more sustained resource usage than this.

Pilot Push Information

This section monitors Pilot's pushes of configuration to your Envoy proxies.

Pilot Pushes shows the type of configuration pushed at any given time.
ADS Monitoring shows the number of Virtual Services, Services and Connected Endpoints in the system.
Clusters with no known endpoints shows endpoints that have been configured but do not have any instances running (which may indicate external services, such as *.googleapis.com).
Pilot Errors shows the number of errors encountered over time.
Conflicts shows the number of conflicts, that is, cases of ambiguous configuration on listeners.

If you have Errors or Conflicts, you have bad or inconsistent configuration for one or more of your services. See Troubleshooting the data plane for information.

Envoy Information

This section contains information about the Envoy proxies contacting the control plane. Contact GCP support if you see repeated XDS Connection Failures.

Monitoring Mixer

Mixer is the component that funnels telemetry from the Envoy proxies to telemetry backends (typically Prometheus, Stackdriver, etc). In this capacity, it is not in the data plane. It is deployed as two Kubernetes Deployments of the same component (Mixer), exposed under two different service names (istio-telemetry and istio-policy).

Mixer can also be used to integrate with policy systems. In this capacity, Mixer does affect the data plane, as policy checks to Mixer that fail block access to your services.

Mixer tends to scale with volume of traffic.

Navigate to " BASE_URL/dashboard/db/istio-mixer-dashboard " in your browser to view Mixer metrics.

Important monitored metrics

Resource Usage

Use the Istio Performance and Scalability page as your guide for acceptable usage numbers. Contact GCP support if you see significantly more sustained resource usage than this.

Mixer Overview

Response Duration is an important metric. While reports to Mixer telemetry are not in the datapath, if these latencies are high it will definitely slow down sidecar proxy performance. You should expect the 90th percentile to be in the single-digit milliseconds, and the 99th percentile to be under 100ms.

Adapter Dispatch Duration indicates the latency Mixer is experiencing in calling adapters (through which it sends information to telemetry and logging systems). High latencies here will absolutely affect performance on the mesh. Again, p90 latencies should be under 10ms.

Monitoring Galley

Galley is Istio's configuration validation, ingestion, processing and distribution component. It conveys configuration from the Kubernetes API server to Pilot. Like Pilot, it tends to scale with the number of services and endpoints in the system.

Navigate to " BASE_URL/dashboard/db/istio-galley-dashboard " in your browser to view Galley metrics.

Important monitored metrics

Resource Validation

Resource Validation is the most important metric to follow. It indicates the number of resources of various types, such as DestinationRules, Gateways and ServiceEntries, that are passing or failing validation.

Connected clients

Indicates how many clients are connected to Galley; typically this will be 3 (pilot, istio-telemetry, istio-policy) and will scale as those components scale.

16. Troubleshooting Istio

Troubleshooting the data plane

If your Pilot dashboard indicates that you have configuration issues, you should examine Pilot logs or use istioctl to find configuration problems.

To examine Pilot logs, run kubectl -n istio-system logs istio-pilot-69db46c598-45m44 discovery, replacing istio-pilot-... with the pod identifier for the Pilot instance you want to troubleshoot.

In the resulting log, search for a Push Status message. For example:

2019-11-07T01:16:20.451967Z        info        ads        Push Status: {
    "ProxyStatus": {
        "pilot_conflict_outbound_listener_tcp_over_current_tcp": {
            "0.0.0.0:443": {
                "proxy": "cartservice-7555f749f-k44dg.hipster",
                "message": "Listener=0.0.0.0:443 AcceptedTCP=accounts.google.com,*.googleapis.com RejectedTCP=edition.cnn.com TCPServices=2"
            }
        },
        "pilot_duplicate_envoy_clusters": {
            "outbound|15443|httpbin|istio-egressgateway.istio-system.svc.cluster.local": {
                "proxy": "sleep-6c66c7765d-9r85f.default",
                "message": "Duplicate cluster outbound|15443|httpbin|istio-egressgateway.istio-system.svc.cluster.local found while pushing CDS"
            },
            "outbound|443|httpbin|istio-egressgateway.istio-system.svc.cluster.local": {
                "proxy": "sleep-6c66c7765d-9r85f.default",
                "message": "Duplicate cluster outbound|443|httpbin|istio-egressgateway.istio-system.svc.cluster.local found while pushing CDS"
            },
            "outbound|80|httpbin|istio-egressgateway.istio-system.svc.cluster.local": {
                "proxy": "sleep-6c66c7765d-9r85f.default",
                "message": "Duplicate cluster outbound|80|httpbin|istio-egressgateway.istio-system.svc.cluster.local found while pushing CDS"
            }
        },
        "pilot_eds_no_instances": {
            "outbound_.80_._.frontend-external.hipster.svc.cluster.local": {},
            "outbound|443||*.googleapis.com": {},
            "outbound|443||accounts.google.com": {},
            "outbound|443||metadata.google.internal": {},
            "outbound|80||*.googleapis.com": {},
            "outbound|80||accounts.google.com": {},
            "outbound|80||frontend-external.hipster.svc.cluster.local": {},
            "outbound|80||metadata.google.internal": {}
        },
        "pilot_no_ip": {
            "loadgenerator-778c8489d6-bc65d.hipster": {
                "proxy": "loadgenerator-778c8489d6-bc65d.hipster"
            }
        }
    },
    "Version": "o1HFhx32U4s="
}

The Push Status will indicate any issues that occurred when trying to push the configuration to Envoy proxies – in this case, we see several "Duplicate cluster" messages, which indicate duplicate upstream destinations.
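A Push Status entry can be long, so it is often easier to start from the list of problem categories it contains. This sketch assumes the category keys start with "pilot_", as they do in the sample log above; push_status_categories is a hypothetical helper.

```shell
# Read Pilot log output from stdin and print the distinct Push Status
# categories (JSON keys starting with "pilot_"), one per line, sorted.
push_status_categories() {
  grep -o '"pilot_[a-z_]*":' | tr -d '":' | sort -u
}
```

Pipe the kubectl logs command for your Pilot pod into push_status_categories. For the sample log above, the result lists pilot_conflict_outbound_listener_tcp_over_current_tcp, pilot_duplicate_envoy_clusters, pilot_eds_no_instances and pilot_no_ip.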

For assistance in diagnosing problems, contact Google Cloud support.

Finding configuration errors

To use istioctl to analyze your configuration, run istioctl experimental analyze -k --context $OPS_GKE_1. This analyzes the configuration in your system and reports any problems along with suggested changes. See the documentation for a full list of configuration errors that this command can detect.

17. Cleanup

An administrator runs the cleanup_workshop.sh script to delete resources created by the bootstrap_workshop.sh script. You need the following pieces of information for the cleanup script to run.

Organization name - for example yourcompany.com
Workshop ID - in the form YYMMDD-NN for example 200131-01
Admin GCS bucket - defined in the bootstrap script.

Open Cloud Shell by clicking the link below, and perform all of the following actions there.

CLOUD SHELL

Verify you are logged into gcloud with the intended Admin user.

gcloud config list

Navigate to the asm folder.

cd ${WORKDIR}/asm

Define the organization name, workshop ID, and admin Cloud Storage bucket for the workshop to be deleted.

export ORGANIZATION_NAME=<ORGANIZATION NAME>
export ASM_WORKSHOP_ID=<WORKSHOP ID>
export ADMIN_STORAGE_BUCKET=<ADMIN CLOUD STORAGE BUCKET>
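Because the cleanup script deletes resources, it is worth checking the workshop ID before passing it along. The sketch below only validates the YYMMDD-NN shape described above (six digits, a hyphen, two digits); valid_workshop_id is a hypothetical helper, and it does not check that the workshop actually exists.

```shell
# Succeed if the argument has the form YYMMDD-NN (for example 200131-01).
valid_workshop_id() {
  case "$1" in
    [0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, run valid_workshop_id "${ASM_WORKSHOP_ID}" || echo "unexpected workshop ID format" before invoking the cleanup script.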

Run the cleanup script as follows.

./scripts/cleanup_workshop.sh --workshop-id ${ASM_WORKSHOP_ID} --admin-gcs-bucket ${ADMIN_STORAGE_BUCKET} --org-name ${ORGANIZATION_NAME}
