IBM Cloud Pak for Integration – Resource Configuration

By Colin Lim

This blog post aims to help you configure the resources of your Cloud Pak for Integration (CP4I) instance to achieve zero or minimal downtime and minimise unexpected bottlenecks in a Kubernetes/OpenShift environment. The focus is on the container resources that you can define in the CP4I Helm charts and how they impact availability and performance when a Kubernetes node is under resource pressure.

We discuss the two compute resources: CPU and memory.

 

KUBERNETES RESOURCE CLASSIFICATION

Kubernetes classifies these two compute resources differently: CPU is a compressible resource, so a container that exceeds its CPU allocation is throttled, whereas memory is incompressible, so a container that exceeds its memory allocation can be terminated. Because of this classification, the outcome when a Kubernetes node comes under resource pressure is very different for each resource.

 

QUALITY OF SERVICE FOR PODS

In Kubernetes, each Pod is assigned a quality of service (QoS) class based on its CPU and memory resource specification. Kubernetes uses QoS classes to make decisions about scheduling and evicting Pods. Like any other workload in the cluster, each CP4I Pod is assigned a QoS class.

The three QoS classes assigned to Pods, and the resource configuration that produces each of them, are listed below.

  • Best Effort: no container in the Pod defines CPU or memory requests or limits;
  • Burstable: at least one container in the Pod defines a CPU or memory request or limit, but the Guaranteed criteria are not met;
  • Guaranteed: every container in the Pod defines both CPU and memory requests and limits, and the requests equal the limits.
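
As a minimal sketch (the Pod and container names below are illustrative, not taken from a CP4I Helm chart), the following resource specification results in the Guaranteed QoS class, because requests and limits are defined and equal for both CPU and memory:

    apiVersion: v1
    kind: Pod
    metadata:
      name: qos-demo                      # hypothetical name, for illustration only
    spec:
      containers:
        - name: app
          image: registry.example.com/app:latest   # placeholder image
          resources:
            requests:
              cpu: "500m"                 # requests equal limits for every container...
              memory: 512Mi
            limits:
              cpu: "500m"                 # ...so Kubernetes assigns the Guaranteed QoS class
              memory: 512Mi

Raising the limits above the requests (or omitting them) would make the Pod Burstable, while omitting the resources block entirely would leave it as Best Effort.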

POD EVICTION POLICY

When a node is under memory resource pressure, the kubelet is responsible for ranking and evicting Pods.

The order of eviction is described below:

  • Best Effort Pods are evicted first;
  • Burstable Pods that are using more of the starved resource than they requested are evicted next;
  • Guaranteed Pods, and Burstable Pods that are using less than they requested, are evicted last.

*Note: within each QoS category, the Pods are ordered by their Priority (a Pod property) during the eviction ranking process.

Evicted Pods are rescheduled on a different node if they are managed by a ReplicaSet, provided there are sufficient resources to accommodate their requests.
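
Because the eviction ranking within a QoS category takes Pod Priority into account, you can influence that ranking with a PriorityClass. A minimal sketch, assuming an illustrative class name and value (these are not CP4I defaults):

    apiVersion: scheduling.k8s.io/v1
    kind: PriorityClass
    metadata:
      name: integration-critical          # hypothetical name
    value: 100000                         # higher-priority Pods are ranked lower for eviction within a QoS class
    globalDefault: false
    description: "Priority for mission-critical integration workloads"
    ---
    # Reference the class from the workload's Pod template
    apiVersion: v1
    kind: Pod
    metadata:
      name: ace-server-example            # hypothetical name
    spec:
      priorityClassName: integration-critical
      containers:
        - name: app
          image: registry.example.com/app:latest   # placeholder image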

NODE OUT OF MEMORY (OOM)

If a node runs out of memory before the kubelet's eviction policy can kick in, the kernel's OOM killer calculates and ranks processes by their OOM score. The oom_score_adj value is one of the input parameters for the OOM score calculation, and the kubelet sets this adjustment based on the Pod's QoS class:

  QoS Class      oom_score_adj
  Guaranteed     -997
  Burstable      min(max(2, 1000 - (1000 * memoryRequestBytes) / machineMemoryCapacityBytes), 999)
  Best Effort    1000

The OOM killer terminates the process with the highest OOM score. That score is derived from the percentage of node memory the process is using, plus the adjustment value. Based on the table above, Pods with the Best Effort QoS are likely to be killed first: for example, a Best Effort Pod using only 5% of the node's memory still outranks a Guaranteed Pod using 50%, because the +1000 adjustment dominates the score.

MONITORING AND ALERTING

Using the out-of-the-box monitoring dashboards and alerting features in OpenShift and CP4I, you can track and manage Pod resources before a node becomes stressed.

FROM OPENSHIFT

Leverage the predefined Grafana dashboards and the alerting capabilities in Prometheus to monitor each Kubernetes node.
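
As an illustrative sketch (the rule name, namespace and threshold are assumptions, not shipped defaults), a Prometheus alerting rule that fires when a node has less than 10% memory available might look like this:

    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: node-memory-pressure          # hypothetical name
      namespace: openshift-monitoring     # assumed namespace for cluster monitoring rules
    spec:
      groups:
        - name: node-resources
          rules:
            - alert: NodeMemoryPressure
              expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes < 0.10
              for: 5m
              labels:
                severity: warning
              annotations:
                summary: "Node {{ $labels.instance }} has less than 10% memory available"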

Node Level CPU Configuration

 

Node Level Memory Configuration

 

Pod (within an Integration Server) Level CPU Configuration

FROM CP4I MONITORING

Cloud Pak for Integration provides the same monitoring features as OpenShift. Leveraging the same open-source Grafana dashboards, you can monitor the Pods and containers within an application namespace.

 

CP4I RECOMMENDATIONS

CP4I capabilities such as ACE, MQ and APIC usually run mission-critical applications. The following considerations are highly recommended to avoid eviction, and to maintain the application's SLA and/or QoS level, even in a multi-AZ/multi-node architecture.

  • Determine the expected workload of each application Pod to work out its compute resource requirements early, during the testing phase;
  • Aim to configure the CP4I-related Helm charts with the same resource requests and limits values. Set the limits to a higher value if the application workload is bursty and compute resources are limited;
  • Do not leave the resource definitions empty; the CP4I Helm charts provide default values that mitigate this risk;
  • Deploy the CP4I application Pods across at least two nodes;
  • Set a CPU quota in the CP4I project namespace to prevent Kubernetes from overcommitting the compressible CPU resource (see the sketch after this list);
  • Limit the number of Pods per node based on the expected workload;
  • Understand each CP4I application Pod's workload and resource usage profile before introducing or scaling an application in a cluster.
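
As a minimal sketch of the quota recommendation above (the namespace name and values are assumptions, not CP4I defaults), a ResourceQuota such as the following caps the total CPU and memory that Pods in the project can request and consume:

    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: cp4i-compute-quota        # hypothetical name
      namespace: cp4i-apps            # assumed CP4I project namespace
    spec:
      hard:
        requests.cpu: "16"            # total CPU requested by all Pods in the namespace
        requests.memory: 64Gi
        limits.cpu: "24"              # total CPU limit across all Pods in the namespace
        limits.memory: 96Gi

Note that once a ResourceQuota covering compute resources is active in a namespace, every new Pod must specify requests and limits (or inherit them from a LimitRange), which reinforces the recommendation not to leave resource definitions empty.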

