Locating Applications on Cloud Foundry Diego
Cloud Foundry deploys application containers on so-called Diego cells. Each Diego cell runs a number of application containers and exposes the applications through random ports on the Diego cell. This blog post shows two debugging and analysis tricks for Diego using the cfdot command line utility: first, we determine the Cloud Foundry app belonging to a Diego container; second, we locate the containers for a specific application URL.
How to Find Which Apps Are Running in a Diego Cell
Identifying the application that runs in a specific container is useful whenever a container behaves unusually, for example when it uses more CPU or network bandwidth than the average container in your deployment. To find out what is going on, we need to identify the application running inside that container on the Diego cell. Having spotted the container, we already know the IP address of the Diego cell and the host port on which the container is exposed. This is all we need to determine the app that is running in the container.
Entering the Diego Cell
To determine the app running in the container, we first need to open an SSH session on the Diego cell the container is running on. With a Pivotal Cloud Foundry deployment you may SSH into the OpsManager VM and from there continue using the BOSH CLI:
# find the name of the Diego cell we are interested in
# (--vitals adds the load, CPU and memory columns shown below)
$ bosh instances --vitals | grep diego
Instance                                         Process  AZ   IPs           Load              CPU    CPU    CPU    CPU    Memory
                                                 State                       (1m, 5m, 15m)     Total  User   Sys    Wait   Usage
diego_cell/4cc18e61-b1d3-468d-8a72-de3334463ac4  running  AZ1  10.18.144.18  0.37, 0.20, 0.18  -      1.9%   2.3%   0.1%   21% (3.4 GB)
diego_cell/32cbae32-9558-43e8-ab57-efbac71639e2  running  AZ2  10.18.144.19  2.25, 2.33, 2.29  -      13.0%  12.7%  19.1%  27% (4.4 GB)
diego_cell/32e1ae65-ad0a-4232-965a-c8b1da62fa88  running  AZ1  10.18.144.11  0.23, 0.21, 0.23  -      2.6%   2.9%   0.0%   20% (3.3 GB)
diego_cell/5a6d1cf5-752d-4218-b53c-46e952a0da17  running  AZ3  10.18.144.23  0.31, 0.29, 0.28  -      3.4%   3.3%   0.0%   21% (3.5 GB)
...
# We identified unusually high wait times on the 2nd Diego cell.
# Let's open an SSH connection to the suspicious Diego cell:
$ bosh ssh diego_cell/32cbae32-9558-43e8-ab57-efbac71639e2
If you are unfamiliar with the CLI tools and how to authenticate with each of them, make sure to read our blog post on authenticating Cloud Foundry CLI tools.
Using cfdot to Access the Diego Brain
Now that we are on a Diego cell, we have access to the cfdot command line tool, which comes preinstalled on each Diego component. cfdot ships with many commands for inspecting (and even changing!) the Diego state.
To see the containers running on the current cell you may use the following snippet:
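# Determine the cell IP address on the outbound interface.
# We read the field after "src" because newer iproute2 versions append
# further fields (such as "uid 0") to the end of the line.
$ CELL_IP=$(ip route get 1 | awk '{for (i = 1; i < NF; i++) if ($i == "src") {print $(i+1); exit}}')
# Get the list of active long-running processes on this cell
$ cfdot actual-lrp-groups | grep "\"$CELL_IP\"" | jq .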
{
"instance": {
"process_guid": "08c9d685-c3db-480c-a0cc-c3ebf8f37fa3-2f2103f5-bfe4-4652-90ae-209a5d5393b1",
"index": 0,
"domain": "cf-apps",
"instance_guid": "a777bf30-522a-4aee-6bae-bcb6",
"address": "10.18.144.19",
"ports": [
{
"container_port": 8080,
"host_port": 61003
},
{
"container_port": 2222,
"host_port": 61004
}
],
"instance_address": "10.255.154.167",
"crash_count": 0,
"state": "RUNNING",
# ...
}
}
# ...
Because actual-lrp-groups outputs a single line per app container, we can simply grep for the specific host port instead of writing a more complicated jq query.
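If grepping for the port turns out to be too loose (the same digits can show up in a GUID or in another field), a jq select can match the fields exactly. The following is a minimal sketch, assuming the JSON layout shown in the sample output above. The function name lrps_on_cell_port is made up for this post and is not part of cfdot.

```shell
# Hypothetical helper (not part of cfdot): reads actual-lrp-groups output
# (one JSON object per line) on stdin and keeps only the instances that run
# on the given cell address and expose the given host port.
# Assumes the field names shown in the sample output above.
lrps_on_cell_port() {
  jq -c --arg ip "$1" --argjson port "$2" '
    select(.instance.address == $ip
           and any(.instance.ports[]?; .host_port == $port))'
}
```

On a Diego cell this could be used as: cfdot actual-lrp-groups | lrps_on_cell_port "$CELL_IP" 61003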
The application GUID as known to the Cloud Foundry API is contained in the process_guid element: its first 36 characters are the app GUID, which we can cut out to identify the app (the remainder is the app version GUID):
$ CELL_IP=$(ip route get 1 | awk '{for (i = 1; i < NF; i++) if ($i == "src") {print $(i+1); exit}}')
$ cfdot actual-lrp-groups | grep "\"$CELL_IP\"" | grep <DIEGO_CELL_PORT> | jq -r '.instance.process_guid' | cut -c1-36
c5c223a0-8541-4b4f-88f9-43e513c612f1
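The split can be spelled out for both halves of the process GUID. This is a small sketch, assuming the process GUID always consists of two 36-character GUIDs joined by a hyphen, as in the sample output earlier in this post. The function names are invented for illustration.

```shell
# Hypothetical helpers: split a Diego process GUID of the form
# <app GUID>-<app version GUID> (36 characters each, joined by "-").
app_guid()     { printf '%s\n' "$1" | cut -c1-36; }
version_guid() { printf '%s\n' "$1" | cut -c38-73; }

# Process GUID taken from the sample actual-lrp-groups output above
pg="08c9d685-c3db-480c-a0cc-c3ebf8f37fa3-2f2103f5-bfe4-4652-90ae-209a5d5393b1"
echo "app:     $(app_guid "$pg")"
echo "version: $(version_guid "$pg")"
```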
To access the app information, use the CF API from a machine where the cf command line is available (it is not installed on Diego components):
$ cf curl "/v3/apps/c5c223a0-8541-4b4f-88f9-43e513c612f1"
{
"guid": "c5c223a0-8541-4b4f-88f9-43e513c612f1",
"name": "awesome-application",
"state": "STARTED",
"created_at": "2018-09-23T09:09:20Z",
"updated_at": "2018-09-23T14:52:58Z",
# ...
How to Find Out on Which Diego Cells an App Is Running
Given the route of a Cloud Foundry application, we sometimes want to know exactly on which Diego cells and in which containers the application instances run.
To do this, we can use the cfdot tool again: first we query the desired long-running processes (LRPs) for the application route, then we find the matching instances in the actual LRPs:
# find the process GUIDs for the route (for example "my-awesome-app.cfapps.io")
$ PGUIDS=$(cfdot desired-lrps | grep <APP_ROUTE> | jq -r '.process_guid')
$ echo "$PGUIDS"
1145ece5-3dcd-4580-a91c-89d57b26d579-842d8fb3-c29b-4157-bfe3-6a3f19a92439
5b127875-92e8-458c-b5ce-5b3773086c25-72e66556-f5f2-4547-9ed6-f18fd1ba8471
Of course, grepping may yield more process GUIDs than the ones we are looking for (a route that is a substring of another route matches as well), so double-check that the result only covers the URLs you are interested in. You may get fancy and write a jq select query instead of grepping to get a precise result. Once you have the process GUIDs in a variable, you can use them to locate the containers as follows:
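Such a jq select query could look like the sketch below. It rests on an assumption about the desired LRP layout that is not shown in this post: that HTTP routes are stored under the routes element in a "cf-router" list whose entries carry a hostnames array. Please verify this against the output of cfdot desired-lrps in your deployment before relying on it. The function name is made up for illustration.

```shell
# Hypothetical helper: reads desired-lrps output (one JSON object per line)
# on stdin and prints the process GUIDs whose HTTP routes contain exactly
# the given hostname.
# ASSUMPTION: routes live under .routes["cf-router"] as a list of objects
# with a "hostnames" array; check this against your own cfdot output.
process_guids_for_route() {
  jq -r --arg host "$1" '
    select(any(.routes["cf-router"][]?.hostnames[]?; . == $host))
    | .process_guid'
}
```

On a Diego cell this could be used as: PGUIDS=$(cfdot desired-lrps | process_guids_for_route my-awesome-app.cfapps.io). Unlike grep, the comparison is an exact string match, so a route such as not-my-awesome-app.cfapps.io is not picked up by accident.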
$ cfdot actual-lrp-groups | grep "$PGUIDS" | jq '{address: .instance.instance_address, ports: .instance.ports}'
{
"address": "10.255.172.142",
"ports": [
{
"container_port": 8080,
"host_port": 61027
},
{
"container_port": 2222,
"host_port": 61028
}
]
}
# ...
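To see the cell address and the container address side by side, the same data can be reshaped into one line per port mapping. This is a minimal sketch, assuming that address holds the Diego cell IP and instance_address holds the container IP, as the sample output in the first part of this post suggests. The function name lrp_endpoints is invented for this example.

```shell
# Hypothetical helper: reads actual-lrp-groups output on stdin and prints one
# line per port mapping in the form
#   <cell IP>:<host port> -> <container IP>:<container port>
# Assumes .instance.address is the cell IP and .instance.instance_address is
# the container IP, as in the sample output earlier in this post.
lrp_endpoints() {
  jq -r '.instance | select(. != null) | . as $i | $i.ports[]?
    | "\($i.address):\(.host_port) -> \($i.instance_address):\(.container_port)"'
}
```

On a Diego cell this could be used as: cfdot actual-lrp-groups | grep "$PGUIDS" | lrp_endpoints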
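Note that the jq query reports instance_address, which is the container's internal IP; the IP address of the Diego cell itself is in .instance.address. Once you know the Diego cell IP address and the host ports on which the containers are exposed, you can connect to the Diego cell for further investigation of the application containers.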
Conclusion
The cfdot command line tool is a powerful way to debug a Diego deployment.
As it is preinstalled and pre-configured on Diego cells (the certificate environment variables are already in place :-) ), it is a tool for advanced debugging that any Cloud Foundry operator should know.
If you did something cool using cfdot, please let us know in the comments!