Kubeflow 1.1
www.kubeflow.org/docs/started/k8s/kfctl-k8s-istio/
Install Kustomize
kubectl.docs.kubernetes.io/installation/kustomize/binaries/
$ curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
$ mv kustomize /usr/local/bin
$ kustomize version
{Version:kustomize/v3.8.8 GitCommit:72262c5e7135045ed51b01e417a7e72f558a22b0 BuildDate:2020-12-10T18:05:35Z GoOs:linux GoArch:amd64}
Install Dynamic Volume Provisioning
github.com/rancher/local-path-provisioner#deployment
$ kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
$ kubectl get sc
NAME PROVISIONER AGE
local-path rancher.io/local-path 81m
※ Add the default-class annotation to the StorageClass, under metadata.
$ kubectl edit sc local-path
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
$ kubectl get sc
NAME PROVISIONER AGE
local-path (default) rancher.io/local-path 81m
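As a non-interactive alternative to kubectl edit, the same annotation can be applied with kubectl patch. The sketch below builds that command in Python. The helper name default_storageclass_patch_cmd is hypothetical, and it assumes kubectl is on the PATH and configured for the target cluster.

```python
import json
import shlex


def default_storageclass_patch_cmd(name):
    """Build a kubectl command that marks StorageClass `name` as the default."""
    patch = {
        "metadata": {
            "annotations": {
                "storageclass.kubernetes.io/is-default-class": "true"
            }
        }
    }
    # The patch body is passed to kubectl as a JSON string via -p.
    return ["kubectl", "patch", "storageclass", name, "-p", json.dumps(patch)]


if __name__ == "__main__":
    # Print the command only; run it yourself or hand the list to subprocess.run.
    print(shlex.join(default_storageclass_patch_cmd("local-path")))
```

The printed command can be pasted into a shell, or the returned list can be passed to subprocess.run(..., check=True) on a machine with cluster access.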
Install Kubeflow
$ yum install wget
$ wget http://github.com/kubeflow/kfctl/releases/download/v1.2.0/kfctl_v1.2.0-0-gbc038f9_linux.tar.gz
$ tar -xvf kfctl_v1.2.0-0-gbc038f9_linux.tar.gz
$ mv kfctl /usr/local/bin
$ export PATH=$PATH:/usr/local/bin
$ export KF_NAME=MyKubeflow
$ export BASE_DIR=$HOME
$ export KF_DIR=${BASE_DIR}/${KF_NAME}
$ export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v1.1-branch/kfdef/kfctl_k8s_istio.v1.1.0.yaml"
$ mkdir -p ${KF_DIR}
$ cd ${KF_DIR}
$ kfctl apply -V -f ${CONFIG_URI}
※ If the installation does not complete properly, apply the troubleshooting steps below, then delete and reinstall.
$ kfctl delete -f kfctl_k8s_istio.v1.1.0.yaml
$ rm -rf ${KF_DIR}
$ mkdir -p ${KF_DIR}
$ cd ${KF_DIR}
$ kfctl apply -V -f ${CONFIG_URI}
※ If an istio-token error like the one below occurs, add the following flags to the API server.
MountVolume.SetUp failed for volume "istio-token"
github.com/kubeflow/manifests/issues/959
$ vi /etc/kubernetes/manifests/kube-apiserver.yaml
- --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
- --service-account-issuer=kubernetes.default.svc
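After editing the manifest, it can help to confirm that both flags are really present. The following is a minimal sketch that scans the manifest text for the two flags listed above. The helper name missing_apiserver_flags is hypothetical, and the check is a plain text match rather than a YAML parse.

```python
REQUIRED_FLAGS = (
    "--service-account-signing-key-file=",
    "--service-account-issuer=",
)


def missing_apiserver_flags(manifest_text):
    """Return the required flag prefixes that no active line of the manifest sets."""
    # Ignore commented-out lines so a disabled flag does not count as set.
    active = [
        line.strip()
        for line in manifest_text.splitlines()
        if not line.strip().startswith("#")
    ]
    return [
        flag for flag in REQUIRED_FLAGS
        if not any(flag in line for line in active)
    ]


if __name__ == "__main__":
    # Sample fragment for illustration; on a real node, read
    # /etc/kubernetes/manifests/kube-apiserver.yaml instead.
    sample = """
    - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
    - --service-account-issuer=kubernetes.default.svc
    """
    print(missing_apiserver_flags(sample))
```

An empty list means both flags were found. The kubelet restarts the API server on its own once the static pod manifest changes.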
※ If the error "unable to get metrics for resource cpu" occurs, install metrics-server.
github.com/kubernetes-sigs/metrics-server
$ wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
$ vi components.yaml
Add --kubelet-insecure-tls to the container args of the Deployment.
$ kubectl apply -f components.yaml
Once the installation completes successfully, the components start up as shown below.
$ kubectl get pod -n kubeflow
NAME READY STATUS RESTARTS AGE
admission-webhook-bootstrap-stateful-set-0 1/1 Running 1 45m
admission-webhook-deployment-5bc5f97cfd-dvsmc 1/1 Running 0 33m
application-controller-stateful-set-0 1/1 Running 0 47m
argo-ui-669bcd8bfc-2d4nz 1/1 Running 0 45m
cache-deployer-deployment-b75f5c5f6-8hjtf 2/2 Running 1 45m
cache-server-85bccd99bd-4hsgm 2/2 Running 0 45m
centraldashboard-68965b5d89-8x6b8 1/1 Running 0 45m
jupyter-web-app-deployment-5dfbb68956-j65jd 1/1 Running 0 45m
katib-controller-76b78f5db-f4pnm 1/1 Running 0 45m
katib-db-manager-67c9554b6d-h5dr7 1/1 Running 0 45m
katib-mysql-5754b5dd66-mtwzd 1/1 Running 0 45m
katib-ui-844b4fc655-mgndk 1/1 Running 0 44m
kfserving-controller-manager-0 2/2 Running 0 45m
kubeflow-pipelines-profile-controller-65b65d97bb-nlz29 1/1 Running 0 45m
metacontroller-0 1/1 Running 0 45m
metadata-db-695fb6f55-qfttw 1/1 Running 0 45m
metadata-deployment-7d77b884b6-j77d7 1/1 Running 4 45m
metadata-envoy-deployment-c5985d64b-kkfjk 1/1 Running 0 45m
metadata-grpc-deployment-9fdb476-w9wvw 1/1 Running 2 44m
metadata-ui-cf67fdb48-5rbfx 1/1 Running 0 45m
metadata-writer-59d755696c-k75pb 2/2 Running 0 45m
minio-6647564c5c-nng67 1/1 Running 0 45m
ml-pipeline-6bc56cd86d-ctvnp 2/2 Running 7 45m
ml-pipeline-persistenceagent-6f99b56974-vnl9q 2/2 Running 0 45m
ml-pipeline-scheduledworkflow-d596b8bd-t768z 2/2 Running 0 45m
ml-pipeline-ui-8695cc6b46-5mj5l 2/2 Running 0 44m
ml-pipeline-viewer-crd-5998ff7f56-fg6nc 2/2 Running 1 44m
ml-pipeline-visualizationserver-cbbb5b5b-429j9 2/2 Running 0 44m
mpi-operator-c747f5bf6-tmf47 1/1 Running 1 44m
mxnet-operator-7cd59d475-4htgf 1/1 Running 1 44m
mysql-76597cf5b5-7tj8x 2/2 Running 0 44m
notebook-controller-deployment-756587d86-4p4xr 1/1 Running 0 44m
profiles-deployment-65fcc9c97-qt2bs 2/2 Running 0 44m
pytorch-operator-5db58565b-76s8c 1/1 Running 1 44m
seldon-controller-manager-6ddf664d54-dhgm7 1/1 Running 1 44m
spark-operatorsparkoperator-85bbf89886-gk6dn 1/1 Running 0 45m
spartakus-volunteer-7566cfd658-wv79t 1/1 Running 0 44m
tf-job-operator-5bf84768bf-z2vjd 1/1 Running 1 44m
workflow-controller-54dccb7dc4-rl4bm 1/1 Running 0 44m
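With this many pods, scanning the list by eye is error-prone. The sketch below parses the text printed by kubectl get pod and returns the pods that are not yet fully ready. The function name pods_not_ready is hypothetical. It treats any status other than Running as not ready, so a pod in Completed state would also be reported.

```python
def pods_not_ready(kubectl_output):
    """Return names of pods that are not Running with all containers ready."""
    not_ready = []
    for line in kubectl_output.strip().splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] == "NAME":
            continue  # skip the header row and malformed lines
        name, ready, status = fields[0], fields[1], fields[2]
        # READY has the form "<ready containers>/<total containers>".
        ready_count, _, total = ready.partition("/")
        if status != "Running" or ready_count != total:
            not_ready.append(name)
    return not_ready


if __name__ == "__main__":
    # Illustrative input; in practice, pass the captured output of
    # `kubectl get pod -n kubeflow`.
    sample = """NAME READY STATUS RESTARTS AGE
minio-6647564c5c-nng67 1/1 Running 0 45m
ml-pipeline-6bc56cd86d-ctvnp 1/2 Running 7 45m
"""
    print(pods_not_ready(sample))
```

An empty result for the kubeflow namespace suggests the installation has settled.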
Accessing the Dashboard
$ kubectl get svc istio-ingressgateway -n istio-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istio-ingressgateway NodePort 10.111.109.51 <none> 15020:31630/TCP,80:31380/TCP,443:31390/TCP,31400:31400/TCP,15029:31116/TCP,15030:31533/TCP,15031:31478/TCP,15032:31955/TCP,15443:30855/TCP 50m
Open http://192.168.19.134:31380 in a browser (the node IP plus the NodePort mapped to port 80).
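The port in the dashboard URL comes from the PORT(S) column: each entry has the form servicePort:nodePort/protocol, and the dashboard is served on service port 80. The sketch below extracts the matching NodePort from that column. The function name node_port_for is hypothetical, and the node IP is the one used in this walkthrough.

```python
def node_port_for(ports_column, service_port):
    """Find the NodePort mapped to `service_port` in a PORT(S) column string."""
    for entry in ports_column.split(","):
        # Each entry looks like "80:31380/TCP"; drop the protocol suffix first.
        mapping, _, _protocol = entry.strip().partition("/")
        port, sep, node_port = mapping.partition(":")
        if sep and int(port) == service_port:
            return int(node_port)
    return None  # no NodePort mapping found for that service port


if __name__ == "__main__":
    ports = "15020:31630/TCP,80:31380/TCP,443:31390/TCP,31400:31400/TCP"
    node_ip = "192.168.19.134"  # node address used in this walkthrough
    print(f"http://{node_ip}:{node_port_for(ports, 80)}")
```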

Creating a Notebook Server


Running Fashion MNIST

import tensorflow as tf


class MyFashionMnist(object):
    def train(self):
        # Load Fashion MNIST and scale pixel values to the range [0, 1].
        (x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
        x_train, x_test = x_train / 255.0, x_test / 255.0

        # Small fully connected classifier: 28x28 image in, 10 class probabilities out.
        model = tf.keras.models.Sequential([
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            tf.keras.layers.Dense(128, activation='relu'),
            tf.keras.layers.Dropout(0.2),
            tf.keras.layers.Dense(10, activation='softmax')
        ])
        model.summary()

        model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])

        # Train for 10 epochs, then report loss and accuracy on the test set.
        model.fit(x_train, y_train, epochs=10)
        model.evaluate(x_test, y_test, verbose=2)


if __name__ == '__main__':
    local_train = MyFashionMnist()
    local_train.train()
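The final Dense(10, activation='softmax') layer outputs one probability per class, so a prediction is read by taking the index of the largest value. The sketch below maps such a vector to a label name using the standard Fashion MNIST label order. The helper label_for is hypothetical and written in plain Python so it runs without TensorFlow. It assumes the trained model is made available outside train() so that one row of model.predict(...) can be passed in.

```python
# Fashion MNIST label names, indexed by the integer class label (0-9).
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def label_for(probabilities):
    """Return the class name with the highest probability."""
    probs = list(probabilities)
    if len(probs) != len(CLASS_NAMES):
        raise ValueError("expected one probability per class")
    # Index of the largest value (argmax) selects the predicted class.
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[best]


if __name__ == "__main__":
    # Made-up probability vector for illustration only.
    example = [0.05, 0.70, 0.05, 0.05, 0.05, 0.02, 0.02, 0.02, 0.02, 0.02]
    print(label_for(example))
```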
