A Disaster Caused by Changing the IP of a LoadBalancer svc
- 2025-08-15 14:24:04

Contents: Background, Modifying externalIPs, api-server error log, Recovery attempt, Lessons

Background
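The k8s cluster had no external load balancer attached, so when I deployed istio the ingressgateway stayed in pending. I therefore manually changed the externalIP of this LB svc, and k8s went down. Here is how it happened.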

Modifying externalIPs
I edited the externalIPs field of the svc, and the api-server went down.

    [root@k8s-worker-node1 cloud-native-istio-archive]# k -n istio-system get svc
    NAME                   TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)                                                                      AGE
    istio-egressgateway    ClusterIP      10.68.66.210   <none>        80/TCP,443/TCP                                                               8d
    istio-ingressgateway   LoadBalancer   10.68.215.92   <pending>     15021:30422/TCP,80:32418/TCP,443:31569/TCP,31400:32664/TCP,15443:31617/TCP   8d
    istiod                 ClusterIP      10.68.49.71    <none>        15010/TCP,15012/TCP,443/TCP,15014/TCP                                        8d
    [root@k8s-worker-node1 cloud-native-istio-archive]# k -n istio-system edit svc istio-ingressgateway
    service/istio-ingressgateway edited
    [root@k8s-worker-node1 cloud-native-istio-archive]# k -n istio-system get svc
    The connection to the server 10.50.10.10:6443 was refused - did you specify the right host or port?
    [root@k8s-worker-node1 cloud-native-istio-archive]# k -n istio-system get svc
    The connection to the server 10.50.10.10:6443 was refused - did you specify the right host or port?

If EXTERNAL-IP has a value (an IP address or a hostname), your environment has an external load balancer that can be used for the Ingress gateway. If the EXTERNAL-IP value is <none> (or stays at <pending>), your environment probably does not provide an external load balancer for the Ingress gateway.
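
The rule above for reading the EXTERNAL-IP column can be sketched as a small check over the text output of the get svc command. This is only an illustration: the function names parse_get_svc and has_external_load_balancer are made up for this example, and the parser assumes the default whitespace-separated column layout shown in the session above.

```python
# Sample text copied from the `k -n istio-system get svc` output in the post.
SAMPLE = """\
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istio-egressgateway ClusterIP 10.68.66.210 <none> 80/TCP,443/TCP 8d
istio-ingressgateway LoadBalancer 10.68.215.92 <pending> 15021:30422/TCP,80:32418/TCP 8d
istiod ClusterIP 10.68.49.71 <none> 15010/TCP,15012/TCP,443/TCP,15014/TCP 8d
"""


def parse_get_svc(output):
    """Map each service name to its EXTERNAL-IP column value.

    Hypothetical helper: assumes the header row names an EXTERNAL-IP
    column and that no column value contains whitespace.
    """
    lines = [line for line in output.splitlines() if line.strip()]
    column = lines[0].split().index("EXTERNAL-IP")
    return {row.split()[0]: row.split()[column] for row in lines[1:]}


def has_external_load_balancer(external_ip):
    """True when EXTERNAL-IP holds an address or hostname.

    <none> and <pending> both mean no external load balancer has
    assigned an address to the service.
    """
    return external_ip.strip() not in ("", "<none>", "<pending>")


if __name__ == "__main__":
    for name, external_ip in parse_get_svc(SAMPLE).items():
        state = "has" if has_external_load_balancer(external_ip) else "lacks"
        print(f"{name}: {state} an external address ({external_ip})")
```

A service that stays at <pending>, as istio-ingressgateway does here, is a sign that the environment has no load balancer to hand out an address, not that the field should be filled in by hand.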

api-server error log

    [root@k8s-worker-node1 cloud-native-istio-archive]# systemctl status kube-apiserver -l
    ● kube-apiserver.service - Kubernetes API Server
       Loaded: loaded (/etc/systemd/system/kube-apiserver.service; enabled; vendor preset: disabled)
       Active: active (running) since Thu 2023-10-19 17:19:09 CST; 1 weeks 1 days ago
         Docs: github /GoogleCloudPlatform/kubernetes
     Main PID: 45101 (kube-apiserver)
        Tasks: 10
       Memory: 470.1M
       CGroup: /system.slice/kube-apiserver.service
               └─45101 /opt/kube/bin/kube-apiserver --allow-privileged=true --anonymous-auth=false --api-audiences=api,istio-ca --authorization-mode=Node,RBAC --bind-address=10.50.10.10 --client-ca-file=/etc/kubernetes/ssl/ca.pem --endpoint-reconciler-type=lease --etcd-cafile=/etc/kubernetes/ssl/ca.pem --etcd-certfile=/etc/kubernetes/ssl/kubernetes.pem --etcd-keyfile=/etc/kubernetes/ssl/kubernetes-key.pem --etcd-servers= 10.50.10.10:2379 --kubelet-certificate-authority=/etc/kubernetes/ssl/ca.pem --kubelet-client-certificate=/etc/kubernetes/ssl/kubernetes.pem --kubelet-client-key=/etc/kubernetes/ssl/kubernetes-key.pem --secure-port=6443 --service-account-issuer= kubernetes.default.svc --service-account-signing-key-file=/etc/kubernetes/ssl/ca-key.pem --service-account-key-file=/etc/kubernetes/ssl/ca.pem --service-cluster-ip-range=10.68.0.0/16 --service-node-port-range=30000-32767 --tls-cert-file=/etc/kubernetes/ssl/kubernetes.pem --tls-private-key-file=/etc/kubernetes/ssl/kubernetes-key.pem --requestheader-client-ca-file=/etc/kubernetes/ssl/ca.pem --requestheader-allowed-names= --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --proxy-client-cert-file=/etc/kubernetes/ssl/aggregator-proxy.pem --proxy-client-key-file=/etc/kubernetes/ssl/aggregator-proxy-key.pem --enable-aggregator-routing=true --v=2

    Oct 27 23:41:20 k8s-worker-node1 kube-apiserver[45101]: "Metadata": null
    Oct 27 23:41:20 k8s-worker-node1 kube-apiserver[45101]: }. Err: connection error: desc = "transport: Error while dialing dial tcp 10.50.10.10:2379: connect: connection refused"
    Oct 27 23:41:25 k8s-worker-node1 kube-apiserver[45101]: W1027 23:41:25.168319 45101 logging.go:59] [core] [Channel #57333 SubChannel #57334] grpc: addrConn.createTransport failed to connect to {
    Oct 27 23:41:25 k8s-worker-node1 kube-apiserver[45101]: "Addr": "10.50.10.10:2379",
    Oct 27 23:41:25 k8s-worker-node1 kube-apiserver[45101]: "ServerName": "10.50.10.10",
    Oct 27 23:41:25 k8s-worker-node1 kube-apiserver[45101]: "Attributes": null,
    Oct 27 23:41:25 k8s-worker-node1 kube-apiserver[45101]: "BalancerAttributes": null,
    Oct 27 23:41:25 k8s-worker-node1 kube-apiserver[45101]: "Type": 0,
    Oct 27 23:41:25 k8s-worker-node1 kube-apiserver[45101]: "Metadata": null
    Oct 27 23:41:25 k8s-worker-node1 kube-apiserver[45101]: }. Err: connection error: desc = "transport: Error while dialing dial tcp 10.50.10.10:2379: connect: connection refused"

Recovery attempt
I restarted the api-server, but it would not come back up because etcd refused the connection (10.50.10.10:2379, as in the log above). The cluster could not be recovered; even GPT4 could not help.
Aside: a moment of remembrance for His Excellency the Grand Secretary (中堂大人).
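
Lessons
Do not change the externalIP of an LB svc without a good reason. I made this change by following this blogger's article: blogs /boshen-hzb/p/10679863.html. Please take care not to bring your own cluster down. Any change to a production environment should be made carefully, and you must know in advance what its consequences will be.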
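
One way to act on this lesson is a pre-flight check before writing an address into externalIPs. The sketch below is an illustration only and does not reproduce the outage, since the post does not record which address was entered. The function name external_ip_conflicts is hypothetical, and the choice of what to treat as reserved (the api-server/etcd host address 10.50.10.10 and the service range 10.68.0.0/16, both taken from the flags in the log) is an assumption.

```python
import ipaddress


def external_ip_conflicts(candidate, reserved_addresses, reserved_networks):
    """Return a list of reasons why `candidate` looks unsafe as an externalIP.

    Hypothetical helper: it only compares addresses and knows nothing
    about the live cluster. An empty list means no conflict was found
    among the inputs given, not that the change is safe.
    """
    ip = ipaddress.ip_address(candidate)
    reasons = []
    for address in reserved_addresses:
        if ip == ipaddress.ip_address(address):
            reasons.append(f"{candidate} equals reserved address {address}")
    for network in reserved_networks:
        if ip in ipaddress.ip_network(network):
            reasons.append(f"{candidate} falls inside reserved range {network}")
    return reasons


if __name__ == "__main__":
    # Assumed reserved values, copied from the kube-apiserver flags in the log.
    control_plane = ["10.50.10.10"]      # --bind-address, also the etcd host
    service_ranges = ["10.68.0.0/16"]    # --service-cluster-ip-range
    for candidate in ("10.50.10.10", "10.68.215.92", "192.0.2.10"):
        problems = external_ip_conflicts(candidate, control_plane, service_ranges)
        print(candidate, "->", problems or "no conflict found")
```

A check like this catches only the collisions it is told about; it is a prompt to think through the consequences, not a substitute for testing the change outside production.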